koko210Serve 711101816a Fix: Apply twscrape monkey patch to resolve 'Failed to parse scripts' error
Twitter changed their JavaScript response format to include unquoted keys in JSON objects, which breaks twscrape's parser. This fix applies a monkey patch that uses regex to quote the unquoted keys before parsing.

This resolves the issue preventing figurine notifications from being sent for the past several days.

Reference: https://github.com/vladkens/twscrape/issues/284
2025-12-10 09:48:25 +02:00
2025-12-07 17:15:09 +02:00
2025-12-07 17:15:09 +02:00

🎤 Miku Discord Bot 💙

Miku Banner Docker Python Discord.py

The world's #1 Virtual Idol, now in your Discord server! 🌱

FeaturesQuick StartArchitectureAPIContributing


🌟 About

Meet Hatsune Miku - a fully-featured, AI-powered Discord bot that brings the cheerful, energetic virtual idol to life in your server! Powered by local LLMs (Llama 3.1), vision models (MiniCPM-V), and a sophisticated autonomous behavior system, Miku can chat, react, share content, and even change her profile picture based on her mood!

Why This Bot?

  • 🎭 14 Dynamic Moods - From bubbly to melancholy, Miku's personality adapts
  • 🤖 Smart Autonomous Behavior - Context-aware decisions without spamming
  • 👁️ Vision Capabilities - Analyzes images, videos, and GIFs in conversations
  • 🎨 Auto Profile Pictures - Fetches & crops anime faces from Danbooru based on mood
  • 💬 DM Support - Personal conversations with mood tracking
  • 🐦 Twitter Integration - Shares Miku-related tweets and figurine announcements
  • 🎮 ComfyUI Integration - Natural language image generation requests
  • 🔊 Voice Chat Ready - Fish.audio TTS integration (docs included)
  • 📊 RESTful API - Full control via HTTP endpoints
  • 🐳 Production Ready - Docker Compose with GPU support

Features

🧠 AI & LLM Integration

  • Local LLM Processing with Llama 3.1 8B (via llama.cpp + llama-swap)
  • Automatic Model Switching - Text ↔️ Vision models swap on-demand
  • OpenAI-Compatible API - Easy migration and integration
  • Conversation History - Per-user context with RAG-style retrieval
  • Smart Prompting - Mood-aware system prompts with personality profiles

🎭 Mood & Personality System

14 Available Moods (click to expand)
  • 😊 Neutral - Classic cheerful Miku
  • 😴 Asleep - Sleepy and minimally responsive
  • 😪 Sleepy - Getting tired, simple responses
  • 🎉 Excited - Extra energetic and enthusiastic
  • 💫 Bubbly - Playful and giggly
  • 🤔 Curious - Inquisitive and wondering
  • 😳 Shy - Blushing and hesitant
  • 🤪 Silly - Goofy and fun-loving
  • 😠 Angry - Frustrated or upset
  • 😤 Irritated - Mildly annoyed
  • 😢 Melancholy - Sad and reflective
  • 😏 Flirty - Playful and teasing
  • 💕 Romantic - Sweet and affectionate
  • 🎯 Serious - Focused and thoughtful
  • Per-Server Mood Tracking - Different moods in different servers
  • DM Mood Persistence - Separate mood state for private conversations
  • Automatic Mood Shifts - Responds to conversation sentiment

🤖 Autonomous Behavior System V2

The bot features a sophisticated context-aware decision engine that makes Miku feel alive:

  • Intelligent Activity Detection - Tracks message frequency, user presence, and channel activity
  • Non-Intrusive - Won't spam or interrupt important conversations
  • Mood-Based Personality - Behavioral patterns change with mood
  • Multiple Action Types:
    • 💬 General conversation starters
    • 👋 Engaging specific users
    • 🐦 Sharing Miku tweets
    • 💬 Joining ongoing conversations
    • 🎨 Changing profile pictures
    • 😊 Reacting to messages

Rate Limiting: Minimum 30-second cooldown between autonomous actions to prevent spam.

👁️ Vision & Media Processing

  • Image Analysis - Describe images shared in chat using MiniCPM-V 4.5
  • Video Understanding - Extracts frames and analyzes video content
  • GIF Support - Processes animated GIFs (converts to MP4 if needed)
  • Embed Content Extraction - Reads Twitter/X embeds without API
  • Face Detection - On-demand anime face detection service (GPU-accelerated)

🎨 Dynamic Profile Picture System

  • Danbooru Integration - Searches for Miku artwork
  • Smart Cropping - Automatic face detection and 1:1 crop
  • Mood-Based Selection - Filters by tags matching current mood
  • Quality Filtering - Only uses high-quality, safe-rated images
  • Fallback System - Graceful degradation if detection fails

🐦 Twitter Features

  • Tweet Sharing - Automatically fetches and shares Miku-related tweets
  • Figurine Notifications - DM subscribers about new Miku figurine releases
  • Embed Compatibility - Uses fxtwitter for better Discord previews
  • Duplicate Prevention - Tracks sent tweets to avoid repeats

🎮 ComfyUI Image Generation

  • Natural Language Detection - "Draw me as Miku swimming in a pool"
  • Workflow Integration - Connects to external ComfyUI instance
  • Smart Prompting - Enhances user requests with context

📡 REST API Dashboard

Full-featured FastAPI server with endpoints for:

  • Mood management (get/set/reset)
  • Conversation history
  • Autonomous actions (trigger manually)
  • Profile picture updates
  • Server configuration
  • DM analysis reports

🔧 Developer Features

  • Docker Compose Setup - One command deployment
  • GPU Acceleration - NVIDIA runtime for models and face detection
  • Health Checks - Automatic service monitoring
  • Volume Persistence - Conversation history and settings saved
  • Hot Reload - Update without restarting (for development)

🚀 Quick Start

Prerequisites

  • Docker & Docker Compose installed
  • NVIDIA GPU with CUDA support (for model inference)
  • Discord Bot Token (Create one here)
  • At least 8GB VRAM recommended (4GB minimum)

Installation

  1. Clone the repository

    git clone https://github.com/yourusername/miku-discord.git
    cd miku-discord
    
  2. Set up your bot token

    Edit docker-compose.yml and replace the DISCORD_BOT_TOKEN:

    environment:
      - DISCORD_BOT_TOKEN=your_token_here
      - OWNER_USER_ID=your_discord_user_id  # For DM reports
    
  3. Add your models

    Place these GGUF models in the models/ directory:

    • Llama-3.1-8B-Instruct-UD-Q4_K_XL.gguf (text model)
    • MiniCPM-V-4_5-Q3_K_S.gguf (vision model)
    • MiniCPM-V-4_5-mmproj-f16.gguf (vision projector)
  4. Launch the bot

    docker-compose up -d
    
  5. Check logs

    docker-compose logs -f miku-bot
    
  6. Access the dashboard

    Open http://localhost:3939 in your browser

Optional: ComfyUI Integration

If you have ComfyUI running, update the path in docker-compose.yml:

volumes:
  - /path/to/your/ComfyUI/output:/app/ComfyUI/output:ro

Optional: Face Detection Service

Start the anime face detector when needed:

docker-compose --profile tools up -d anime-face-detector

Access Gradio UI at http://localhost:7860


🏗️ Architecture

Service Overview

┌─────────────────────────────────────────────────────────────┐
│                        Discord API                          │
└───────────────────────┬─────────────────────────────────────┘
                        │
                        ▼
┌─────────────────────────────────────────────────────────────┐
│                     Miku Bot (Python)                       │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐     │
│  │   Discord    │  │   FastAPI    │  │  Autonomous  │     │
│  │  Event Loop  │  │   Server     │  │    Engine    │     │
│  └──────────────┘  └──────────────┘  └──────────────┘     │
└───────────┬────────────────┬────────────────┬──────────────┘
            │                │                │
            ▼                ▼                ▼
┌─────────────────┐ ┌─────────────────┐ ┌──────────────┐
│   llama-swap    │ │   ComfyUI       │ │ Face Detector│
│  (Model Server) │ │ (Image Gen)     │ │  (On-Demand) │
│                 │ │                 │ │              │
│  • Llama 3.1    │ │  • Workflows    │ │  • Gradio UI │
│  • MiniCPM-V    │ │  • GPU Accel    │ │  • FastAPI   │
│  • Auto-swap    │ │                 │ │              │
└─────────────────┘ └─────────────────┘ └──────────────┘
         │
         ▼
   ┌──────────┐
   │  Models  │
   │  (GGUF)  │
   └──────────┘

Tech Stack

Component Technology
Bot Framework Discord.py 2.0+
LLM Backend llama.cpp + llama-swap
Text Model Llama 3.1 8B Instruct
Vision Model MiniCPM-V 4.5
API Server FastAPI + Uvicorn
Image Gen ComfyUI (external)
Face Detection Anime-Face-Detector (Gradio)
Database JSON files (conversation history, settings)
Containerization Docker + Docker Compose
GPU Runtime NVIDIA Container Toolkit

Key Components

1. llama-swap (Model Server)

  • Automatically loads/unloads models based on requests
  • Prevents VRAM exhaustion by swapping between text and vision models
  • OpenAI-compatible /v1/chat/completions endpoint
  • Configurable TTL (time-to-live) per model

2. Autonomous Engine V2

  • Tracks message activity, user presence, and channel engagement
  • Calculates "engagement scores" per server
  • Makes context-aware decisions without LLM overhead
  • Personality profiles per mood (e.g., shy mood = less engaging)

3. Server Manager

  • Per-guild configuration (mood, sleep state, autonomous settings)
  • Scheduled tasks (bedtime reminders, autonomous ticks)
  • Persistent storage in servers_config.json

4. Conversation History

  • Vector-based RAG (Retrieval Augmented Generation)
  • Stores last 50 messages per user
  • Semantic search using FAISS
  • Context injection for continuity

📡 API Endpoints

The bot runs a FastAPI server on port 3939 with the following endpoints:

Mood Management

Endpoint Method Description
/servers/{guild_id}/mood GET Get current mood for server
/servers/{guild_id}/mood POST Set mood (body: {"mood": "excited"})
/servers/{guild_id}/mood/reset POST Reset to neutral mood
/mood GET Get DM mood (deprecated, use server-specific)

Autonomous Actions

Endpoint Method Description
/autonomous/general POST Make Miku say something random
/autonomous/engage POST Engage a random user
/autonomous/tweet POST Share a Miku tweet
/autonomous/reaction POST React to a recent message
/autonomous/custom POST Custom prompt (body: {"prompt": "..."})

Profile Pictures

Endpoint Method Description
/profile-picture/change POST Change profile picture (body: {"mood": "happy"})
/profile-picture/revert POST Revert to previous picture
/profile-picture/current GET Get current picture metadata

Utilities

Endpoint Method Description
/conversation/reset POST Clear conversation history for user
/logs GET View bot logs (last 1000 lines)
/prompt GET View current system prompt
/ GET Dashboard HTML page

Example Usage

# Set mood to excited
curl -X POST http://localhost:3939/servers/123456789/mood \
  -H "Content-Type: application/json" \
  -d '{"mood": "excited"}'

# Make Miku say something
curl -X POST http://localhost:3939/autonomous/general

# Change profile picture
curl -X POST http://localhost:3939/profile-picture/change \
  -H "Content-Type: application/json" \
  -d '{"mood": "flirty"}'

🎮 Usage Examples

Basic Interaction

User: Hey Miku! How are you today?
Miku: Miku's doing great! 💙 Thanks for asking! ✨

User: Can you see this? [uploads image]
Miku: Ooh! 👀 I see a cute cat sitting on a keyboard! So fluffy! 🐱

Mood Changes

User: /mood excited
Miku: YAYYY!!! 🎉✨ Miku is SO EXCITED right now!!! Let's have fun! 💙🎶

User: What's your favorite food?
Miku: NEGI!! 🌱🌱🌱 Green onions are THE BEST! Want some?! ✨

Image Generation

User: Draw yourself swimming in a pool
Miku: Ooh! Let me create that for you! 🎨✨ [generates image]

Autonomous Behavior

[After detecting activity in #general]
Miku: Hey everyone! 👋 What are you all talking about? 💙

🛠️ Configuration

Model Configuration (llama-swap-config.yaml)

models:
  llama3.1:
    cmd: /app/llama-server --port ${PORT} --model /models/Llama-3.1-8B-Instruct-UD-Q4_K_XL.gguf -ngl 99
    ttl: 1800  # 30 minutes
    
  vision:
    cmd: /app/llama-server --port ${PORT} --model /models/MiniCPM-V-4_5-Q3_K_S.gguf --mmproj /models/MiniCPM-V-4_5-mmproj-f16.gguf
    ttl: 900   # 15 minutes

Environment Variables

Variable Default Description
DISCORD_BOT_TOKEN Required Your Discord bot token
OWNER_USER_ID Optional Your Discord user ID (for DM reports)
LLAMA_URL http://llama-swap:8080 Model server endpoint
TEXT_MODEL llama3.1 Text generation model name
VISION_MODEL vision Vision model name

Persistent Storage

All data is stored in bot/memory/:

  • servers_config.json - Per-server settings
  • autonomous_config.json - Autonomous behavior settings
  • conversation_history/ - User conversation data
  • profile_pictures/ - Downloaded profile pictures
  • dms/ - DM conversation logs
  • figurine_subscribers.json - Figurine notification subscribers

📚 Documentation

Detailed documentation available in the readmes/ directory:


🐛 Troubleshooting

Bot won't start

Check if models are loaded:

docker-compose logs llama-swap

Verify GPU access:

docker run --rm --runtime=nvidia nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi

High VRAM usage

  • Lower the -ngl parameter in llama-swap-config.yaml (reduce GPU layers)
  • Reduce context size with -c parameter
  • Use smaller quantization (Q3 instead of Q4)

Autonomous actions not triggering

  • Check autonomous_config.json - ensure enabled and cooldown settings
  • Verify activity in server (bot tracks engagement)
  • Check logs for decision engine output

Face detection not working

  • Ensure GPU is available: docker-compose --profile tools up -d anime-face-detector
  • Check API health: curl http://localhost:6078/health
  • View Gradio UI: http://localhost:7860

Models switching too frequently

Increase TTL in llama-swap-config.yaml:

ttl: 3600  # 1 hour instead of 30 minutes

Development Setup

For local development without Docker:

# Install dependencies
cd bot
pip install -r requirements.txt

# Set environment variables
export DISCORD_BOT_TOKEN="your_token"
export LLAMA_URL="http://localhost:8080"

# Run the bot
python bot.py

Code Style

  • Use type hints where possible
  • Follow PEP 8 conventions
  • Add docstrings to functions
  • Comment complex logic

📝 License

This project is provided as-is for educational and personal use. Please respect:


🙏 Acknowledgments

  • Crypton Future Media - For creating Hatsune Miku
  • llama.cpp - For efficient local LLM inference
  • mostlygeek/llama-swap - For brilliant model management
  • Discord.py - For the excellent Discord API wrapper
  • OpenAI - For the API standard
  • MiniCPM-V Team - For the amazing vision model
  • Danbooru - For the artwork API

💙 Support

If you enjoy this project:

  • Star this repository
  • 🐛 Report bugs via Issues
  • 💬 Share your Miku bot setup in Discussions
  • 🎤 Listen to some Miku songs!

Made with 💙 by a Miku fan, for Miku fans

"The future begins now!" - Hatsune Miku 🎶

⬆ Back to Top

Description
Llama.cpp-powered Hatsune Miku Discord bot with autonomous features, chat, and image generation
Readme 5.7 MiB
Languages
Python 80.5%
HTML 19.2%
Dockerfile 0.3%