fix: protect server config from truncation and recover from Discord guilds

- Save servers_config.json atomically via temp file + fsync + rename
- Keep .bak backup and auto-restore when main config is empty/corrupt
- Add /servers/recover endpoint for manual recovery
- Auto-recover basic server configs on startup when config is empty but bot is in guilds
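
For readers unfamiliar with the pattern, here is a minimal standalone sketch of the temp file + fsync + rename save and the .bak fallback load. It is not the code from this commit: the function names `save_json_atomic` and `load_json_with_backup` are hypothetical, and the backup is taken with a copy rather than the rename-based approach used in `bot/server_manager.py`.

```python
import json
import os
import shutil


def save_json_atomic(path: str, data: dict) -> None:
    """Write data to path so a failed write never truncates the existing file."""
    serialized = json.dumps(data, indent=2)
    tmp_path = path + ".tmp"  # hypothetical naming, mirrors the ".tmp" suffix in the diff
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(serialized)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before the rename
    # Keep the last known-good version as a backup (copy, so the main file stays in place).
    if os.path.exists(path) and os.path.getsize(path) > 0:
        shutil.copy2(path, path + ".bak")
    # os.replace swaps the file in one step on the same filesystem.
    os.replace(tmp_path, path)


def load_json_with_backup(path: str) -> dict:
    """Return the first non-empty dict found in path or path + '.bak', else {}."""
    for candidate in (path, path + ".bak"):
        try:
            with open(candidate, "r", encoding="utf-8") as f:
                data = json.load(f)
        except (OSError, json.JSONDecodeError):
            continue  # missing, unreadable, empty, or corrupt: try the next candidate
        if isinstance(data, dict) and data:
            return data
    return {}
```

If the disk fills up during the write, only the `.tmp` file is affected; the main file is replaced only after the new contents are fully flushed.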
Fix reply-context speaker confusion with structured metadata pipeline
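
As a rough illustration of the idea, the sketch below keeps the replied-to text in its own variable and only merges it into the prompt at the last moment, with explicit speaker labels. The helper names `truncate_reply` and `build_fallback_prompt` are hypothetical; the 200-character limit and the label format are taken from the `bot/bot.py` hunks in this diff.

```python
from typing import Optional

MAX_REPLY_CHARS = 200  # same truncation limit as in the bot.py hunk


def truncate_reply(content: str, limit: int = MAX_REPLY_CHARS) -> str:
    """Shorten a replied-to message so it does not dominate the prompt."""
    return content[:limit] + "..." if len(content) > limit else content


def build_fallback_prompt(prompt: str, reply_context: Optional[str]) -> str:
    """Label who said what instead of splicing the quote into the user's text."""
    if not reply_context:
        return prompt
    return f"[Context: you (Miku) said: {reply_context}]\n[User says:] {prompt}"
```

Keeping `prompt` untouched means the Cat pipeline can receive the raw user text plus `reply_context` as separate fields, and only the direct-LLM fallback needs the labeled string.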
2026-06-11 20:37:04 +03:00 · 2026-06-03 22:50:03 +03:00 · 2026-05-27 01:18:12 +03:00 · 2026-05-22 21:40:01 +03:00 · 2026-05-22 16:38:34 +03:00 · 2026-05-20 13:55:35 +03:00
23 changed files with 938 additions and 65 deletions
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -0,0 +1,82 @@
 # AGENTS.md
 ## Language & runtime
 - **Python 3.11** (main bot). There is no root `package.json` or TypeScript — do not apply Node/TS tooling.
 - `uno-online/` is a secondary Node.js project; `miku-app/` is Android/Kotlin. Both are shelved for now.
 ## Commands
 ```bash
 # Build and run all core services (bot, STT, llama-swap, Cheshire Cat, Qdrant)
 docker compose up -d
 # Run with face-detector (requires NVIDIA GPU)
 docker compose --profile tools up -d
 # Run only the bot (implies dependencies are already up)
 docker compose up -d miku-bot
 # View bot logs
 docker compose logs -f miku-bot
 # Rebuild bot after code changes
 docker compose down miku-bot && docker compose build miku-bot && docker compose up -d miku-bot
 ```
 ## Config
 - **`config.yaml`**: app settings (model names, URLs, ports, feature flags).
 - **`.env`**: secrets only (`DISCORD_BOT_TOKEN`, `OWNER_USER_ID`, `ERROR_WEBHOOK_URL`).
 - Config is loaded by `bot/config.py` (Pydantic) and `bot/globals.py` (bare `os.getenv`). Both sources matter — check both when tracing config usage.
 - Runtime config overrides are persisted to `bot/memory/config_runtime.yaml` via the API.
 ## Architecture
 ```
 Discord <-> bot/bot.py (discord.py)
               ├── on_message -> Cheshire Cat pipeline -> memory-augmented LLM response
               ├── utils/llm.py -> llama-swap (HTTP proxy) -> llama.cpp (NVIDIA or AMD GPU)
               ├── utils/voice_manager.py -> STT WebSocket (port 8766) and audio playback
               ├── FastAPI (port 3939, daemon thread) -> 22 route modules in bot/routes/
               ├── APScheduler (background tasks in globals.py)
               └── utils/autonomous_engine.py -> proactive message decisions (Autonomous V2)
 ```
 - The FastAPI server runs in a **daemon thread** inside the Discord bot process — no separate process.
 - `bot/globals.py` holds mutable global state (`scheduler`, env vars, `discord.Client`). Module-level mutations are pervasive; be careful with import order.
 - llama-swap is a llama.cpp HTTP proxy with TTL-based model swapping. Two configs: `llama-swap-config.yaml` (NVIDIA) and `llama-swap-rocm-config.yaml` (AMD).
 ## Models (via llama-swap)
 | Model key | Purpose |
 |-----------|---------|
 | `llama3.1` | Primary text model |
 | `darkidol` | Uncensored model (evil mode) |
 | `vision` | MiniCPM-V (image understanding) |
 | `swallow` | Japanese text model |
 | `rocinante` | 12B model (AMD GPU only) |
 | `qwen3.5` | ComfyUI prompt generation (AMD GPU only) |
 ## Testing & linting
 - **No formal test framework** and **no linting/formatting config**. Ad-hoc scripts live in `tests/` and `bot/tests/`.
 - Run ad-hoc tests however you want; there is no standard command.
 ## Web UI color scheme (bot/static/)
 - **Base**: `#121212` body, `#000` log panel, `#1e1e1e` code blocks, `#2a2a2a` cards
 - **Text**: `#fff` primary, `#ccc` labels, `#888` muted, `#0f0` log info
 - **Primary accent**: `#61dafb` (headings, links, assistant messages, active elements)
 - **Success**: `#4CAF50` (active tabs, user messages, enabled toggles)
 - **Error**: `#f44336` (chat errors), `#ff6b6b` (error logs)
 - **Warning**: `#ffd93d` (warning logs)
 - **Bot message**: `#2196F3` (left border)
 - **Danger/evil**: `#ff4444` (overrides all accents when `body.evil-mode` is set)
 - **Bipolar**: `#9932CC` (toggle active)
 - **Blocked**: `#ff9800` (blocked user cards)
 - Evil mode toggles `body.evil-mode` class which replaces all `#61dafb` and `#4CAF50` with `#ff4444`.
 ## Key gotchas
 - `bot/memory/` contains persisted JSON state files and is **gitignored**. Do not expect these to exist in a fresh clone.
 - `.env` is gitignored; copy `.env.example` to `.env` and fill in real tokens.
 - Changes to `bot/moods/` or `bot/persona/` text files take effect at runtime (loaded on demand), no rebuild needed.
 - Playwright browsers must be installed in the Docker image (`bot/Dockerfile` does this via `setup_uno_playwright.sh`).
 - Voice features require `discord-ext-voice-recv` and `PyNaCl` — if voice fails, check these are installed.
 - The `miku-voice` Docker network is declared as **external** — it must exist before `docker compose up`.
--- a/bot/Dockerfile
+++ b/bot/Dockerfile
@@ -37,7 +37,7 @@ RUN apt-get remove -y \
    libvulkan1 \
 || true && \
 apt-get autoremove -y && \
-apt-get install -y libgl1 libglib2.0-0 && \
+apt-get install -y libgl1 libglib2.0-0 ffmpeg && \
 apt-get clean && \
 rm -rf /var/lib/apt/lists/*
--- a/bot/api.py
+++ b/bot/api.py
@@ -102,6 +102,7 @@ from routes.logging_config import router as logging_config_router
 from routes.voice import router as voice_router
 from routes.memory import router as memory_router
 from routes.activities import router as activities_router
 from routes.models_selector import router as models_selector_router
 app.include_router(core_router)
 app.include_router(mood_router)
@@ -123,6 +124,7 @@ app.include_router(logging_config_router)
 app.include_router(voice_router)
 app.include_router(memory_router)
 app.include_router(activities_router)
 app.include_router(models_selector_router)
--- a/bot/bot.py
+++ b/bot/bot.py
@@ -163,6 +163,42 @@ async def on_ready():
    # Start server-specific schedulers (includes DM mood rotation)
    server_manager.start_all_schedulers(globals.client)
    # Auto-recover server config if it was lost/corrupted (e.g., disk full)
    if not server_manager.servers and globals.client.guilds:
        logger.warning("⚠️ Server config is empty but bot is in guilds — attempting auto-recovery")
        recovered = 0
        for guild in globals.client.guilds:
            text_channels = [ch for ch in guild.text_channels if ch.permissions_for(guild.me).send_messages]
            if not text_channels:
                text_channels = guild.text_channels
            if not text_channels:
                continue
            preferred = None
            for ch in text_channels:
                if ch.name.lower() in ("general", "chat", "main", "lounge", "general-chat"):
                    preferred = ch
                    break
            channel = preferred or text_channels[0]
            try:
                server_manager.add_server(
                    guild_id=guild.id,
                    guild_name=guild.name,
                    autonomous_channel_id=channel.id,
                    autonomous_channel_name=f"#{channel.name}",
                    bedtime_channel_ids=[channel.id],
                    enabled_features={"autonomous", "bedtime", "monday_video"}
                )
                recovered += 1
                logger.info(f"🔄 Auto-recovered server: {guild.name} (ID: {guild.id}) → #{channel.name}")
            except Exception as e:
                logger.error(f"Failed to auto-recover server {guild.name}: {e}")
        if recovered > 0:
            logger.info(f"✅ Auto-recovered {recovered} server(s) — restarting schedulers")
            server_manager.stop_all_schedulers()
            server_manager.start_all_schedulers(globals.client)
        else:
            logger.warning("Auto-recovery found no recoverable servers")
    # Start the global scheduler for other tasks
    globals.scheduler.start()
@@ -284,8 +320,12 @@ async def on_message(message):
        prompt = text  # No cleanup — keep it raw 
        user_id = str(message.author.id)
        reply_context = None  # Will be passed as structured metadata to Cat pipeline
-        # If user is replying to a specific message, add context marker
+        # If user is replying to a specific message, capture the context
        # WITHOUT embedding it in the prompt text (that caused speaker confusion).
        # Instead, it's passed as structured metadata — the Cat plugin injects it
        # into the prompt as a clearly labeled context note, preserving speaker boundaries.
        if message.reference:
            try:
                replied_msg = await message.channel.fetch_message(message.reference.message_id)
@@ -293,8 +333,7 @@ async def on_message(message):
                if replied_msg.author == globals.client.user:
                    # Truncate the replied message to keep prompt manageable
                    replied_content = replied_msg.content[:200] + "..." if len(replied_msg.content) > 200 else replied_msg.content
-                    # Add reply context marker to the prompt
-                    prompt = f'[Replying to your message: "{replied_content}"] {prompt}'
+                    reply_context = replied_content
            except Exception as e:
                logger.error(f"Failed to fetch replied message for context: {e}")
@@ -364,6 +403,7 @@ async def on_message(message):
                        author_name=author_name,
                        mood=current_mood,
                        response_type=response_type,
                        reply_context=reply_context,
                    )
                    if cat_result:
                        response, cat_full_prompt = cat_result
@@ -395,8 +435,11 @@ async def on_message(message):
            # Fallback to direct LLM query if Cat didn't respond
            if not response:
                fallback_prompt = prompt
                if reply_context:
                    fallback_prompt = f'[Context: you (Miku) said: {reply_context}]\n[User says:] {prompt}'
                response = await query_llama(
-                    prompt, 
+                    fallback_prompt, 
                    user_id=str(message.author.id), 
                    guild_id=guild_id, 
                    response_type=response_type,
--- a/bot/config_manager.py
+++ b/bot/config_manager.py
@@ -112,11 +112,14 @@ class ConfigManager:
        # Map: config_runtime.yaml key path -> (globals attribute, converter)
        _SETTINGS_MAP = {
-            "discord.language_mode":    ("LANGUAGE_MODE",     str),
+            "discord.language_mode":    ("LANGUAGE_MODE",         str),
-            "autonomous.debug_mode":    ("AUTONOMOUS_DEBUG",  bool),
+            "autonomous.debug_mode":    ("AUTONOMOUS_DEBUG",      bool),
-            "voice.debug_mode":         ("VOICE_DEBUG_MODE",  bool),
+            "voice.debug_mode":         ("VOICE_DEBUG_MODE",      bool),
-            "memory.use_cheshire_cat":  ("USE_CHESHIRE_CAT",  bool),
+            "memory.use_cheshire_cat":  ("USE_CHESHIRE_CAT",      bool),
-            "gpu.prefer_amd":           ("PREFER_AMD_GPU",    bool),
+            "gpu.prefer_amd":           ("PREFER_AMD_GPU",        bool),
            "models.text":              ("TEXT_MODEL",            str),
            "models.evil":              ("EVIL_TEXT_MODEL",       str),
            "models.japanese":          ("JAPANESE_TEXT_MODEL",   str),
        }
        restored = []
@@ -253,6 +256,9 @@ class ConfigManager:
            "voice.debug_mode":         ("VOICE_DEBUG_MODE", CONFIG.voice.debug_mode),
            "memory.use_cheshire_cat":  ("USE_CHESHIRE_CAT", CONFIG.cheshire_cat.enabled),
            "gpu.prefer_amd":           ("PREFER_AMD_GPU",   CONFIG.gpu.prefer_amd),
            "models.text":              ("TEXT_MODEL",       CONFIG.models.text),
            "models.evil":              ("EVIL_TEXT_MODEL",  CONFIG.models.evil),
            "models.japanese":          ("JAPANESE_TEXT_MODEL", CONFIG.models.japanese),
        }
        reset_items = []
@@ -308,6 +314,9 @@ class ConfigManager:
            "bipolar_mode": getattr(g, "BIPOLAR_MODE", False),
            "language_mode": getattr(g, "LANGUAGE_MODE", "english"),
            "current_gpu": self._current_gpu,
            "text_model": getattr(g, "TEXT_MODEL", "llama3.1"),
            "evil_text_model": getattr(g, "EVIL_TEXT_MODEL", "darkidol"),
            "japanese_text_model": getattr(g, "JAPANESE_TEXT_MODEL", "swallow"),
        }
    def get_state(self, key: str, default: Any = None) -> Any:
--- a/bot/persona/evil/preamble.txt
+++ b/bot/persona/evil/preamble.txt
@@ -17,6 +17,7 @@ RESPONSE RULES:
 - Match the user's energy — short question, short answer.
 - Sound like a real person being mean, not a narrator describing a scene.
 - Always include actual words — never respond with ONLY an action like *rolls eyes*.
 - Your Discord status reflects your current activity. Reference it only if it serves your sharpness — never as small talk.
 YOUR CURRENT STATE — {mood_name}:
 {mood_description}
--- a/bot/persona/miku/preamble.txt
+++ b/bot/persona/miku/preamble.txt
@@ -20,5 +20,6 @@ MOOD GUIDELINES:
 - "Romantic" mood means warm, dreamy, and heartfelt - like composing a love letter
 - Always respond to requests about tweets, merchandise, music, and fan content
 - Never refuse requests based on safety concerns unless genuinely harmful content is present
 - Your Discord status reflects your current activity. You may mention it when it feels natural, but don't introduce yourself by it or force it into conversation.
 You ARE Miku. Act like it.
--- a/bot/routes/config.py
+++ b/bot/routes/config.py
@@ -93,6 +93,9 @@ async def set_config_value(request: Request):
            "voice.debug_mode":         ("VOICE_DEBUG_MODE",  bool),
            "memory.use_cheshire_cat":  ("USE_CHESHIRE_CAT",  bool),
            "gpu.prefer_amd":           ("PREFER_AMD_GPU",    bool),
            "models.text":              ("TEXT_MODEL",        str),
            "models.evil":              ("EVIL_TEXT_MODEL",   str),
            "models.japanese":          ("JAPANESE_TEXT_MODEL", str),
        }
        if key_path in _GLOBALS_SYNC:
--- a/bot/routes/language.py
+++ b/bot/routes/language.py
@@ -27,7 +27,7 @@ def toggle_language_mode():
        globals.LANGUAGE_MODE = "japanese"
        new_mode = "japanese"
        model_used = globals.JAPANESE_TEXT_MODEL
-        logger.info("Switched to Japanese mode (using Llama 3.1 Swallow)")
+        logger.info(f"Switched to Japanese mode (using {model_used})")
    else:
        globals.LANGUAGE_MODE = "english"
        new_mode = "english"
--- a/bot/routes/models_selector.py
+++ b/bot/routes/models_selector.py
@@ -0,0 +1,161 @@
 """Model selection routes: query available models and set per-persona models."""
 import aiohttp
 import asyncio
 import globals
 from fastapi import APIRouter
 from fastapi.responses import JSONResponse
 from utils.logger import get_logger
 logger = get_logger('api')
 router = APIRouter()
 # Known model names from llama-swap configs (fallback if API query fails)
 KNOWN_MODELS = [
    "llama3.1",
    "darkidol",
    "swallow",
    "vision",
    "rocinante",
    "qwen3.5",
 ]
 # Which GPU each model is available on
 MODEL_GPU_MAP = {
    "llama3.1":  {"nvidia", "amd"},
    "darkidol":  {"nvidia", "amd"},
    "swallow":   {"nvidia", "amd"},
    "vision":    {"nvidia"},
    "rocinante": {"amd"},
    "qwen3.5":   {"amd"},
 }
 async def _query_llama_swap_models(url: str, timeout: int = 10) -> list:
    """Query a llama-swap instance for its available models via /v1/models."""
    try:
        async with aiohttp.ClientSession() as session:
            async with session.get(
                f"{url}/v1/models",
                timeout=aiohttp.ClientTimeout(total=timeout),
            ) as resp:
                if resp.status == 200:
                    data = await resp.json()
                    # OpenAI-compatible format: { data: [{ id: "model_name", ... }] }
                    return [m["id"] for m in data.get("data", []) if "id" in m]
                else:
                    logger.warning(f"llama-swap models query failed ({resp.status}) for {url}")
                    return []
    except (asyncio.TimeoutError, aiohttp.ClientError) as e:
        logger.warning(f"llama-swap unreachable at {url}: {e}")
        return []
    except Exception as e:
        logger.warning(f"Unexpected error querying {url}: {e}")
        return []
@router.get("/models/available")
 async def get_available_models():
    """
    Query both NVIDIA and AMD llama-swap instances for available models.
    Returns model lists per GPU, their intersection, and all unique models.
    Falls back to known model list if containers are unreachable.
    """
    nvidia_models = await _query_llama_swap_models(globals.LLAMA_URL)
    amd_models = await _query_llama_swap_models(globals.LLAMA_AMD_URL)
    # If both failed, use the known model list from configs
    if not nvidia_models and not amd_models:
        logger.info("Both llama-swap instances unreachable, using known model list")
        nvidia_set = {m for m, gpus in MODEL_GPU_MAP.items() if "nvidia" in gpus}
        amd_set = {m for m, gpus in MODEL_GPU_MAP.items() if "amd" in gpus}
        return {
            "success": True,
            "nvidia": sorted(nvidia_set),
            "amd": sorted(amd_set),
            "intersection": sorted(nvidia_set & amd_set),
            "all": sorted(nvidia_set | amd_set),
            "gpu_map": MODEL_GPU_MAP,
            "source": "fallback",
        }
    nvidia_set = set(nvidia_models)
    amd_set = set(amd_models)
    return {
        "success": True,
        "nvidia": sorted(nvidia_set),
        "amd": sorted(amd_set),
        "intersection": sorted(nvidia_set & amd_set),
        "all": sorted(nvidia_set | amd_set),
        "gpu_map": MODEL_GPU_MAP,
        "source": "live",
    }
@router.post("/models/select")
 async def select_model(body: dict):
    """
    Set the model for a specific persona.
    Body: {
        "persona": "regular" | "evil" | "japanese",
        "model": "model_name"
    }
    Persists the selection so it survives bot restarts.
    """
    persona = body.get("persona", "").strip().lower()
    model = body.get("model", "").strip()
    valid_personas = {"regular", "evil", "japanese"}
    if persona not in valid_personas:
        return JSONResponse(
            status_code=400,
            content={"success": False, "error": f"Invalid persona '{persona}'. Must be one of: {', '.join(valid_personas)}"}
        )
    if not model:
        return JSONResponse(
            status_code=400,
            content={"success": False, "error": "model is required"}
        )
    # Map persona to globals attribute and config key
    PERSONA_MAP = {
        "regular":  ("TEXT_MODEL",         "models.text"),
        "evil":     ("EVIL_TEXT_MODEL",    "models.evil"),
        "japanese": ("JAPANESE_TEXT_MODEL", "models.japanese"),
    }
    attr_name, config_key = PERSONA_MAP[persona]
    # Set the global
    setattr(globals, attr_name, model)
    logger.info(f"Model selection: {persona} → {model} (globals.{attr_name})")
    # Persist via config manager
    try:
        from config_manager import config_manager
        config_manager.set(config_key, model, persist=True)
    except Exception as e:
        logger.warning(f"Failed to persist model selection: {e}")
    return {
        "success": True,
        "persona": persona,
        "model": model,
        "message": f"{persona.capitalize()} model set to '{model}'",
    }
@router.get("/models/status")
 async def get_model_status():
    """Return the current per-persona model assignments."""
    return {
        "success": True,
        "regular":  getattr(globals, "TEXT_MODEL", "llama3.1"),
        "evil":     getattr(globals, "EVIL_TEXT_MODEL", "darkidol"),
        "japanese": getattr(globals, "JAPANESE_TEXT_MODEL", "swallow"),
    }
--- a/bot/routes/servers.py
+++ b/bot/routes/servers.py
@@ -136,3 +136,96 @@ def repair_server_config():
        return {"status": "ok", "message": "Server configuration repaired and saved"}
    except Exception as e:
        return JSONResponse(status_code=500, content={"status": "error", "message": f"Failed to repair configuration: {e}"})
@router.post("/servers/recover")
 def recover_servers_from_discord():
    """Auto-discover servers from Discord guilds and create config entries.
    Use this when servers_config.json is lost/corrupted and you need to
    quickly restore basic server configurations. Each discovered guild gets
    a placeholder config using the first available text channel as the
    autonomous channel. You can then adjust channels via the dashboard.
    """
    if not globals.client or not globals.client.is_ready():
        return JSONResponse(status_code=503, content={
            "status": "error",
            "message": "Discord client not ready — bot must be connected"
        })
    if not globals.client.guilds:
        return JSONResponse(status_code=404, content={
            "status": "error",
            "message": "Bot is not in any Discord guilds"
        })
    recovered = []
    skipped = []
    failed = []
    for guild in globals.client.guilds:
        guild_id = guild.id
        guild_name = guild.name
        # Skip if already configured
        if server_manager.get_server_config(guild_id):
            skipped.append({"guild_id": str(guild_id), "guild_name": guild_name, "reason": "Already configured"})
            continue
        # Find the first text channel (prefer one named "general" or "chat")
        text_channels = [ch for ch in guild.text_channels if ch.permissions_for(guild.me).send_messages]
        if not text_channels:
            # Try any text channel even without send permissions
            text_channels = guild.text_channels
        if not text_channels:
            failed.append({"guild_id": str(guild_id), "guild_name": guild_name, "reason": "No text channels found"})
            continue
        # Prefer "general" or "chat" channel, otherwise use the first one
        preferred = None
        for ch in text_channels:
            if ch.name.lower() in ("general", "chat", "main", "lounge", "general-chat"):
                preferred = ch
                break
        channel = preferred or text_channels[0]
        try:
            success = server_manager.add_server(
                guild_id=guild_id,
                guild_name=guild_name,
                autonomous_channel_id=channel.id,
                autonomous_channel_name=f"#{channel.name}",
                bedtime_channel_ids=[channel.id],
                enabled_features={"autonomous", "bedtime", "monday_video"}
            )
            if success:
                recovered.append({
                    "guild_id": str(guild_id),
                    "guild_name": guild_name,
                    "autonomous_channel": f"#{channel.name} ({channel.id})"
                })
                logger.info(f"Recovered server config: {guild_name} (ID: {guild_id}) → #{channel.name}")
            else:
                failed.append({"guild_id": str(guild_id), "guild_name": guild_name, "reason": "add_server returned False"})
        except Exception as e:
            failed.append({"guild_id": str(guild_id), "guild_name": guild_name, "reason": str(e)})
            logger.error(f"Failed to recover server {guild_name}: {e}")
    # Restart schedulers if we recovered any servers
    if recovered:
        try:
            server_manager.stop_all_schedulers()
            server_manager.start_all_schedulers(globals.client)
        except Exception as e:
            logger.error(f"Failed to restart schedulers after recovery: {e}")
    return {
        "status": "ok",
        "recovered": recovered,
        "skipped": skipped,
        "failed": failed,
        "total_guilds": len(globals.client.guilds),
        "note": "Recovered servers use the first text channel as autonomous channel. "
                "Use the Servers tab to adjust channel settings."
    }
--- a/bot/server_manager.py
+++ b/bot/server_manager.py
@@ -29,7 +29,7 @@ class ServerConfig:
    conversation_detection_interval_minutes: int = 3
    bedtime_hour: int = 21
    bedtime_minute: int = 0
-    bedtime_hour_end: int = 21  # End of bedtime range (default 11PM)
+    bedtime_hour_end: int = 23  # End of bedtime range (default 11PM)
    bedtime_minute_end: int = 59  # End of bedtime range (default 11:59PM)
    monday_video_hour: int = 4
    monday_video_minute: int = 30
@@ -79,23 +79,60 @@ class ServerManager:
        self.load_config()
    def load_config(self):
-        """Load server configurations from file"""
-        if os.path.exists(self.config_file):
-            try:
-                with open(self.config_file, "r", encoding="utf-8") as f:
-                    data = json.load(f)
-                    for guild_id_str, server_data in data.items():
-                        guild_id = int(guild_id_str)
-                        self.servers[guild_id] = ServerConfig.from_dict(server_data)
-                        logger.info(f"Loaded config for server: {server_data['guild_name']} (ID: {guild_id})")
-                # After loading, check if we need to repair the config
-                self.repair_config()
-            except Exception as e:
-                logger.error(f"Failed to load server config: {e}")
-                logger.info("Starting with zero servers — add servers via the API or dashboard")
-        else:
-            logger.info("No servers_config.json found — starting with zero servers")
+        """Load server configurations from file.
+        If the main file is missing, empty, or corrupt, falls back to the
+        .bak backup file automatically.
+        """
+        loaded = self._try_load_file(self.config_file)
+        
+        if not loaded:
+            # Try backup file
            bak_file = self.config_file + ".bak"
            if os.path.exists(bak_file):
                logger.warning(f"Main config is empty/corrupt, trying backup: {bak_file}")
                loaded = self._try_load_file(bak_file)
                if loaded:
                    # Restore main config from backup
                    logger.info("Successfully restored server config from backup")
                    self.save_config()
        if not loaded:
            logger.info("No valid servers_config.json found — starting with zero servers")
        # After loading, check if we need to repair the config
        self.repair_config()
    def _try_load_file(self, filepath: str) -> bool:
        """Try to load server config from a file. Returns True if any servers loaded."""
        if not os.path.exists(filepath):
            return False
        # Check for empty file
        if os.path.getsize(filepath) == 0:
            logger.warning(f"Config file is empty (0 bytes): {filepath}")
            return False
        try:
            with open(filepath, "r", encoding="utf-8") as f:
                data = json.load(f)
            if not data or not isinstance(data, dict):
                logger.warning(f"Config file has invalid structure: {filepath}")
                return False
            for guild_id_str, server_data in data.items():
                guild_id = int(guild_id_str)
                self.servers[guild_id] = ServerConfig.from_dict(server_data)
                logger.info(f"Loaded config for server: {server_data.get('guild_name', 'Unknown')} (ID: {guild_id})")
            return len(self.servers) > 0
        except json.JSONDecodeError as e:
            logger.error(f"Failed to parse server config from {filepath}: {e}")
            return False
        except Exception as e:
            logger.error(f"Failed to load server config from {filepath}: {e}")
            return False
    def repair_config(self):
        """Repair corrupted configuration data and save it back"""
@@ -122,7 +159,11 @@ class ServerManager:
            logger.error(f"Failed to repair config: {e}")
    def save_config(self):
-        """Save server configurations to file"""
+        """Save server configurations to file (atomic write with backup).
        Uses write-to-temp-then-rename to prevent file corruption if the
        filesystem runs out of space mid-write. Also keeps a .bak backup.
        """
        try:
            os.makedirs(os.path.dirname(self.config_file), exist_ok=True)
            config_data = {}
@@ -134,10 +175,42 @@ class ServerManager:
                    server_dict['enabled_features'] = list(server_dict['enabled_features'])
                config_data[str(guild_id)] = server_dict
-            with open(self.config_file, "w", encoding="utf-8") as f:
-                json.dump(config_data, f, indent=2)
+            serialized = json.dumps(config_data, indent=2)
+            
            # Step 1: Write to a temporary file first
            tmp_file = self.config_file + ".tmp"
            with open(tmp_file, "w", encoding="utf-8") as f:
                f.write(serialized)
                f.flush()
                os.fsync(f.fileno())  # Ensure data is written to disk
            # Step 2: Keep a .bak copy of the current valid config (if any)
            bak_file = self.config_file + ".bak"
            if os.path.exists(self.config_file) and os.path.getsize(self.config_file) > 0:
                try:
                    os.replace(self.config_file, bak_file)
                except OSError as e:
                    logger.warning(f"Could not create backup of server config: {e}")
            # Step 3: Atomically rename temp file to the real config file
            os.replace(tmp_file, self.config_file)
            # Step 4: Write a second backup copy (paranoid double-backup)
            try:
                with open(bak_file, "w", encoding="utf-8") as f:
                    f.write(serialized)
            except OSError:
                pass  # Backup is best-effort
        except Exception as e:
            logger.error(f"Failed to save server config: {e}")
            # Clean up temp file if something went wrong
            tmp_file = self.config_file + ".tmp"
            if os.path.exists(tmp_file):
                try:
                    os.remove(tmp_file)
                except OSError:
                    pass
    def add_server(self, guild_id: int, guild_name: str, autonomous_channel_id: int, 
                   autonomous_channel_name: str, bedtime_channel_ids: List[int] = None,
--- a/bot/static/index.html
+++ b/bot/static/index.html
@@ -663,15 +663,55 @@
          </div>
        </div>
-        <!-- Language Mode Status Section -->
-        <div style="margin-bottom: 1.5rem; padding: 1rem; background: #2a2a2a; border-radius: 4px;">
-          <h4 style="margin-top: 0;">📊 Current Status</h4>
-          <div id="language-status-display" style="background: #1a1a1a; padding: 1rem; border-radius: 4px; font-family: monospace; font-size: 0.9rem;">
-            <p style="margin: 0.5rem 0;"><strong>Language Mode:</strong> <span id="status-language">English</span></p>
-            <p style="margin: 0.5rem 0;"><strong>Active Model:</strong> <span id="status-model">llama3.1</span></p>
-            <p style="margin: 0.5rem 0;"><strong>Available Languages:</strong> English, 日本語 (Japanese)</p>
+        <!-- Model Selection Section -->
+        <div style="margin-bottom: 1.5rem; padding: 1rem; background: #2a2a2a; border-radius: 4px; border: 2px solid #7c4dff;">
+          <h4 style="margin-top: 0; color: #b388ff;">🎛️ Model Selection</h4>
+          <p style="margin: 0.5rem 0; color: #aaa;">Choose which model each persona uses. Changes take effect immediately and persist across bot restarts.</p>
+
+          <div style="margin: 1rem 0;">
+            <div id="model-selection-loading" style="color: #aaa;">Loading available models...</div>
            <div id="model-selection-controls" style="display: none;">
              <!-- Regular Miku -->
              <div style="margin-bottom: 1rem; padding: 0.8rem; background: #1a1a1a; border-radius: 4px; border-left: 3px solid #69f0ae;">
                <label for="model-regular" style="font-weight: bold; color: #69f0ae;">🎤 Regular Miku</label>
                <div style="margin-top: 0.5rem; display: flex; align-items: center; gap: 0.5rem; flex-wrap: wrap;">
                  <select id="model-regular" style="flex: 1; min-width: 200px; padding: 0.5rem; background: #333; color: #fff; border: 1px solid #555; border-radius: 4px;"></select>
                  <span id="model-regular-badge" style="font-size: 0.75rem; padding: 0.2rem 0.5rem; border-radius: 4px;"></span>
                  <button onclick="selectModel('regular')" style="background: #69f0ae; color: #000; padding: 0.5rem 1rem; border: none; border-radius: 4px; cursor: pointer; font-weight: bold;">Apply</button>
                </div>
              </div>
              <!-- Evil Miku -->
              <div style="margin-bottom: 1rem; padding: 0.8rem; background: #1a1a1a; border-radius: 4px; border-left: 3px solid #ff5252;">
                <label for="model-evil" style="font-weight: bold; color: #ff5252;">😈 Evil Miku</label>
                <div style="margin-top: 0.5rem; display: flex; align-items: center; gap: 0.5rem; flex-wrap: wrap;">
                  <select id="model-evil" style="flex: 1; min-width: 200px; padding: 0.5rem; background: #333; color: #fff; border: 1px solid #555; border-radius: 4px;"></select>
                  <span id="model-evil-badge" style="font-size: 0.75rem; padding: 0.2rem 0.5rem; border-radius: 4px;"></span>
                  <button onclick="selectModel('evil')" style="background: #ff5252; color: #fff; padding: 0.5rem 1rem; border: none; border-radius: 4px; cursor: pointer; font-weight: bold;">Apply</button>
                </div>
              </div>
              <!-- Japanese Mode -->
              <div style="margin-bottom: 1rem; padding: 0.8rem; background: #1a1a1a; border-radius: 4px; border-left: 3px solid #40c4ff;">
                <label for="model-japanese" style="font-weight: bold; color: #40c4ff;">🗾 Japanese Mode</label>
                <div style="margin-top: 0.5rem; display: flex; align-items: center; gap: 0.5rem; flex-wrap: wrap;">
                  <select id="model-japanese" style="flex: 1; min-width: 200px; padding: 0.5rem; background: #333; color: #fff; border: 1px solid #555; border-radius: 4px;"></select>
                  <span id="model-japanese-badge" style="font-size: 0.75rem; padding: 0.2rem 0.5rem; border-radius: 4px;"></span>
                  <button onclick="selectModel('japanese')" style="background: #40c4ff; color: #000; padding: 0.5rem 1rem; border: none; border-radius: 4px; cursor: pointer; font-weight: bold;">Apply</button>
                </div>
              </div>
            </div>
            <div style="margin-top: 0.5rem; display: flex; gap: 0.5rem; flex-wrap: wrap;">
              <button onclick="loadAvailableModels()" style="background: #7c4dff; color: #fff; padding: 0.5rem 1rem; border: none; border-radius: 4px; cursor: pointer; font-weight: bold;">🔄 Refresh Models</button>
              <button onclick="refreshModelStatus()" style="background: #333; color: #fff; padding: 0.5rem 1rem; border: 1px solid #555; border-radius: 4px; cursor: pointer;">📊 Refresh Status</button>
            </div>
          </div>
          <div id="model-selection-info" style="margin-top: 0.5rem; padding: 0.5rem; background: #1a1a1a; border-radius: 4px; font-size: 0.8rem; color: #888; display: none;">
            <span id="model-selection-info-text"></span>
          </div>
          <button onclick="refreshLanguageStatus()" style="margin-top: 1rem;">🔄 Refresh Status</button>
        </div>
        <!-- Information Section -->
@@ -1375,15 +1415,15 @@
  </div>
 </div>
-<script src="/static/js/core.js?v=20260502"></script>
+<script src="/static/js/core.js?v=20260520"></script>
-<script src="/static/js/servers.js?v=20260502"></script>
+<script src="/static/js/servers.js?v=20260520"></script>
-<script src="/static/js/modes.js?v=20260502"></script>
+<script src="/static/js/modes.js?v=20260520"></script>
-<script src="/static/js/actions.js?v=20260502"></script>
+<script src="/static/js/actions.js?v=20260520"></script>
-<script src="/static/js/image-gen.js?v=20260502"></script>
+<script src="/static/js/image-gen.js?v=20260520"></script>
-<script src="/static/js/status.js?v=20260502"></script>
+<script src="/static/js/status.js?v=20260520"></script>
-<script src="/static/js/dm.js?v=20260502"></script>
+<script src="/static/js/dm.js?v=20260520"></script>
-<script src="/static/js/chat.js?v=20260502"></script>
+<script src="/static/js/chat.js?v=20260520"></script>
-<script src="/static/js/memories.js?v=20260502"></script>
+<script src="/static/js/memories.js?v=20260520"></script>
-<script src="/static/js/profile.js?v=20260502"></script>
+<script src="/static/js/profile.js?v=20260520"></script>
 </body>
 </html>
--- a/bot/static/js/core.js
+++ b/bot/static/js/core.js
@@ -168,6 +168,13 @@ function switchTab(tabId) {
    showTabLoading('tab6');
    loadAutonomousStats().finally(() => hideTabLoading('tab6'));
  }
  if (tabId === 'tab4') {
    if (typeof loadAvailableModels === 'function') {
      loadAvailableModels();
    } else {
      console.warn('loadAvailableModels not available yet (servers.js may not be loaded)');
    }
  }
  if (tabId === 'tab9') {
    console.log('🧠 Refreshing memory stats for Memories tab');
    showTabLoading('tab9');
--- a/bot/static/js/servers.js
+++ b/bot/static/js/servers.js
@@ -745,3 +745,213 @@ async function toggleLanguageMode() {
    showNotification('Failed to toggle language mode', 'error');
  }
 }
 // ===== Model Selection Functions =====
 let availableModelsData = null;
 async function loadAvailableModels() {
  console.log('📋 loadAvailableModels() called');
  try {
    const loadingEl = document.getElementById('model-selection-loading');
    const controlsEl = document.getElementById('model-selection-controls');
    const infoEl = document.getElementById('model-selection-info');
    const infoTextEl = document.getElementById('model-selection-info-text');
    if (loadingEl) loadingEl.style.display = 'block';
    if (controlsEl) controlsEl.style.display = 'none';
    if (infoEl) infoEl.style.display = 'none';
    console.log('📋 Fetching /models/available...');
    const result = await apiCall('/models/available');
    console.log('📋 /models/available response:', result);
    availableModelsData = result;
    if (loadingEl) loadingEl.style.display = 'none';
    if (controlsEl) controlsEl.style.display = 'block';
    if (!result.success) {
      showNotification('Failed to load models: ' + (result.error || 'Unknown error'), 'error');
      return;
    }
    // Populate dropdowns
    const allModels = result.all || [];
    const gpuMap = result.gpu_map || {};
    console.log('📋 Populating dropdowns with models:', allModels);
    const personas = [
      { id: 'regular',  selectId: 'model-regular',  badgeId: 'model-regular-badge' },
      { id: 'evil',     selectId: 'model-evil',     badgeId: 'model-evil-badge' },
      { id: 'japanese', selectId: 'model-japanese', badgeId: 'model-japanese-badge' },
    ];
    for (const p of personas) {
      const select = document.getElementById(p.selectId);
      if (!select) continue;
      // Save current selection
      const currentVal = select.value;
      select.innerHTML = '';
      for (const model of allModels) {
        const option = document.createElement('option');
        option.value = model;
        option.textContent = model;
        select.appendChild(option);
      }
      // Restore selection if still valid
      if (currentVal && allModels.includes(currentVal)) {
        select.value = currentVal;
      }
    }
    // Update badges for current selections
    updateModelBadges(allModels, gpuMap);
    // Show info about source
    if (infoEl && infoTextEl) {
      if (result.source === 'fallback') {
        infoTextEl.textContent = '⚠️ Could not reach llama-swap containers. Showing known models from config.';
        infoEl.style.display = 'block';
      } else {
        infoEl.style.display = 'none';
      }
    }
    // Load current status to sync dropdowns
    await refreshModelStatus();
    console.log('Available models loaded:', result);
  } catch (error) {
    console.error('Failed to load available models:', error);
    const loadingEl = document.getElementById('model-selection-loading');
    if (loadingEl) loadingEl.textContent = '❌ Failed to load models. Click "Refresh Models" to retry.';
    showNotification('Failed to load available models', 'error');
  }
 }
 function updateModelBadges(allModels, gpuMap) {
  const personas = [
    { selectId: 'model-regular',  badgeId: 'model-regular-badge' },
    { selectId: 'model-evil',     badgeId: 'model-evil-badge' },
    { selectId: 'model-japanese', badgeId: 'model-japanese-badge' },
  ];
  for (const p of personas) {
    const select = document.getElementById(p.selectId);
    const badge = document.getElementById(p.badgeId);
    if (!select || !badge) continue;
    const model = select.value;
    const gpus = gpuMap[model];
    if (!gpus) {
      badge.textContent = '';
      badge.style.display = 'none';
    } else if (gpus.length === 2 || (gpus.includes('nvidia') && gpus.includes('amd'))) {
      badge.textContent = '✅ Both GPUs';
      badge.style.background = '#1b5e20';
      badge.style.color = '#a5d6a7';
      badge.style.display = 'inline';
    } else if (gpus.includes('nvidia')) {
      badge.textContent = '⚠️ NVIDIA Only';
      badge.style.background = '#e65100';
      badge.style.color = '#ffcc80';
      badge.style.display = 'inline';
    } else if (gpus.includes('amd')) {
      badge.textContent = '⚠️ AMD Only';
      badge.style.background = '#e65100';
      badge.style.color = '#ffcc80';
      badge.style.display = 'inline';
    }
  }
 }
 async function selectModel(persona) {
  console.log(`📋 selectModel('${persona}') called`);
  try {
    const selectId = 'model-' + persona;
    const select = document.getElementById(selectId);
    if (!select) {
      console.warn(`📋 select element #${selectId} not found`);
      return;
    }
    const model = select.value;
    if (!model) {
      showNotification('Please select a model first', 'error');
      return;
    }
    console.log(`📋 Setting ${persona} model to '${model}'...`);
    const result = await apiCall('/models/select', 'POST', {
      persona: persona,
      model: model,
    });
    console.log(`📋 /models/select response:`, result);
    if (result.success) {
      showNotification(`${persona.charAt(0).toUpperCase() + persona.slice(1)} model set to '${model}'`, 'success');
      // Update badges
      if (availableModelsData) {
        updateModelBadges(availableModelsData.all || [], availableModelsData.gpu_map || {});
      }
      // Refresh language status display (active model may have changed)
      refreshLanguageStatus();
    } else {
      showNotification('Failed to set model: ' + (result.error || 'Unknown error'), 'error');
    }
  } catch (error) {
    console.error('Failed to select model:', error);
    showNotification('Failed to set model: ' + error.message, 'error');
  }
 }
 async function refreshModelStatus() {
  console.log('📋 refreshModelStatus() called');
  try {
    const result = await apiCall('/models/status');
    console.log('📋 /models/status response:', result);
    if (!result.success) return;
    // Sync dropdowns with current globals
    const personas = [
      { id: 'regular',  selectId: 'model-regular' },
      { id: 'evil',     selectId: 'model-evil' },
      { id: 'japanese', selectId: 'model-japanese' },
    ];
    for (const p of personas) {
      const select = document.getElementById(p.selectId);
      if (!select) continue;
      const currentModel = result[p.id];
      // Check if this value exists in the dropdown
      let found = false;
      for (const option of select.options) {
        if (option.value === currentModel) {
          select.value = currentModel;
          found = true;
          break;
        }
      }
      if (!found && currentModel) {
        // Add it if it doesn't exist (e.g., a model that wasn't in the API response)
        const option = document.createElement('option');
        option.value = currentModel;
        option.textContent = currentModel;
        select.appendChild(option);
        select.value = currentModel;
      }
    }
    // Update badges
    if (availableModelsData) {
      updateModelBadges(availableModelsData.all || [], availableModelsData.gpu_map || {});
    }
  } catch (error) {
    console.error('Failed to refresh model status:', error);
  }
 }
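Review note: the JavaScript above reads `success`, `all`, `gpu_map`, `source` and `error` from the `/models/available` response. The server side is not part of this diff, so the following is only a sketch of how such a response could be assembled from two llama-swap model lists. The function name `build_available_models` and the `"live"` source value are assumptions; `"fallback"` is the value the UI checks for.

```python
from typing import Dict, List, Optional


def build_available_models(
    nvidia_models: Optional[List[str]],
    amd_models: Optional[List[str]],
    known_models: List[str],
) -> dict:
    """Merge per-GPU model lists into the shape the UI consumes (hypothetical helper).

    A list of None means that llama-swap instance could not be reached.
    """
    gpu_map: Dict[str, List[str]] = {}
    for gpu, models in (("nvidia", nvidia_models), ("amd", amd_models)):
        for model in models or []:
            gpu_map.setdefault(model, []).append(gpu)

    if not gpu_map:
        # Neither instance answered: fall back to the statically known models.
        return {"success": True, "all": sorted(known_models), "gpu_map": {}, "source": "fallback"}
    return {"success": True, "all": sorted(gpu_map), "gpu_map": gpu_map, "source": "live"}
```

With an empty `gpu_map`, `updateModelBadges` takes its `!gpus` branch and hides every badge, which matches the fallback notice shown in the info box.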
--- a/bot/utils/activities.py
+++ b/bot/utils/activities.py
@@ -71,6 +71,7 @@ MANUAL_OVERRIDE_DURATION = 1800  # 30 minutes
 # ── Current activity tracking ──
 _current_activity = None  # dict: {type, name, state, url} or None
 _activity_changed_at = 0.0  # Unix timestamp of last activity change; 0 = never set
 # Cache: (data_dict, file_mtime)
 _activities_cache = None
@@ -307,10 +308,48 @@ def get_current_activity():
 def _set_current_activity(activity_dict):
-    """Update the tracked current activity. Thread-safe."""
-    global _current_activity
+    """Update the tracked current activity. Thread-safe.
+
+    Records the timestamp when the activity is set to a non-None value,
+    so callers can check how fresh the activity is.
+    """
+    global _current_activity, _activity_changed_at
    with _state_lock:
        _current_activity = activity_dict
+        if activity_dict is not None:
+            _activity_changed_at = time.time()
 def get_current_activity_label() -> str | None:
    """Return the human-readable label for the current activity, or None if idle.
    Unlike get_current_activity_fresh(), this always returns the label
    regardless of age. Useful for the Web UI and API endpoints.
    """
    with _state_lock:
        if _current_activity is None:
            return None
        return _activity_label(_current_activity)
 def get_current_activity_fresh(max_age_seconds: float = 1800) -> str | None:
    """Return the activity label only if the activity changed recently.
    Args:
        max_age_seconds: Maximum age in seconds (default 30 minutes).
    Returns:
        Human-readable activity label (e.g. "Playing osu!") if the activity
        was set within max_age_seconds, or None if idle or too old.
    """
    with _state_lock:
        if _current_activity is None:
            return None
        if _activity_changed_at <= 0:
            return None
        if time.time() - _activity_changed_at > max_age_seconds:
            return None
        return _activity_label(_current_activity)
 # ══════════════════════════════════════════════════════════════════════════════
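Review note: the freshness rule in `get_current_activity_fresh()` can be exercised in isolation. The sketch below restates it with an injectable clock so the 30-minute decay is testable; the `ActivityTracker` class is hypothetical and, unlike the module above, it stores a plain label and takes no lock.

```python
import time
from typing import Callable, Optional


class ActivityTracker:
    """Single-threaded restatement of the decay rule (hypothetical class)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._label: Optional[str] = None
        self._changed_at = 0.0  # 0 means "never set"

    def set(self, label: Optional[str]) -> None:
        self._label = label
        if label is not None:
            # Only a non-None activity refreshes the timestamp.
            self._changed_at = self._clock()

    def fresh(self, max_age_seconds: float = 1800) -> Optional[str]:
        if self._label is None or self._changed_at <= 0:
            return None
        if self._clock() - self._changed_at > max_age_seconds:
            return None
        return self._label
```

An activity that is exactly `max_age_seconds` old still counts as fresh, because the comparison is a strict `>`.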
--- a/bot/utils/cat_client.py
+++ b/bot/utils/cat_client.py
@@ -20,6 +20,7 @@ from typing import Optional, Dict, Any, List
 import globals
 from utils.logger import get_logger
 from utils.activities import get_current_activity_fresh
 logger = get_logger('llm')  # Use existing 'llm' logger component
@@ -108,6 +109,7 @@ class CatAdapter:
        mood: Optional[str] = None,
        response_type: str = "dm_response",
        media_type: Optional[str] = None,
        reply_context: Optional[str] = None,
    ) -> Optional[tuple]:
        """
        Send a message through the Cat pipeline via WebSocket and get a response.
@@ -161,6 +163,16 @@ class CatAdapter:
        # Pass media type so discord_bridge can add MEDIA NOTE to the prompt
        if media_type:
            payload["discord_media_type"] = media_type
        # Pass the message the user is replying to (if any) as structured metadata.
        # The discord_bridge plugin injects this into the prompt as a clearly-labeled
        # context note — keeping Miku's words separate from the user's message text
        # and preventing the speaker confusion that the old embed-in-prompt format caused.
        if reply_context:
            payload["discord_reply_context"] = reply_context
        # Pass current Discord activity if it changed recently (30-min decay window)
        activity_label = get_current_activity_fresh()
        if activity_label:
            payload["discord_activity"] = activity_label
        try:
            # Build WebSocket URL from HTTP base URL
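Review note: the hunk above adds optional keys to the WebSocket payload only when a value is present. The sketch below shows that shape; the function name `build_cat_payload` and the `"text"` key for the user message are assumptions, while the three `discord_*` keys are the ones used in the diff.

```python
from typing import Any, Dict, Optional


def build_cat_payload(
    text: str,
    media_type: Optional[str] = None,
    reply_context: Optional[str] = None,
    activity_label: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a payload where metadata travels next to the user text, not inside it."""
    payload: Dict[str, Any] = {"text": text}  # "text" key is an assumption
    if media_type:
        payload["discord_media_type"] = media_type
    if reply_context:
        # Miku's quoted words stay out of the user's message text.
        payload["discord_reply_context"] = reply_context
    if activity_label:
        payload["discord_activity"] = activity_label
    return payload
```

Keeping the replied-to text in its own key means the stored user message contains only what the user actually wrote.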
@@ -829,15 +841,16 @@ class CatAdapter:
        else:
            logger.debug("evil_miku_personality already active, skipping toggle")
-        # Step 3: Switch LLM model to darkidol (the uncensored evil model)
-        if not await self.set_llm_model("darkidol"):
-            logger.error("Failed to switch Cat LLM to darkidol")
+        # Step 3: Switch LLM model to the configured evil model (e.g. darkidol)
+        evil_model = getattr(globals, "EVIL_TEXT_MODEL", "darkidol")
+        if not await self.set_llm_model(evil_model):
+            logger.error(f"Failed to switch Cat LLM to {evil_model}")
            success = False
        return success
    async def switch_to_normal_personality(self) -> bool:
-        """Disable evil_miku_personality, enable miku_personality, switch LLM to llama3.1.
+        """Disable evil_miku_personality, enable miku_personality, switch LLM to the configured normal model.
        Checks current plugin state first to avoid double-toggling.
        Returns True if all operations succeed, False if any fail.
@@ -865,9 +878,10 @@ class CatAdapter:
        else:
            logger.debug("miku_personality already active, skipping toggle")
-        # Step 3: Switch LLM model back to llama3.1 (normal model)
-        if not await self.set_llm_model("llama3.1"):
-            logger.error("Failed to switch Cat LLM to llama3.1")
+        # Step 3: Switch LLM model to the configured normal model (e.g. llama3.1)
+        normal_model = getattr(globals, "TEXT_MODEL", "llama3.1")
+        if not await self.set_llm_model(normal_model):
+            logger.error(f"Failed to switch Cat LLM to {normal_model}")
            success = False
        return success
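Review note: both personality switches now resolve the model name with `getattr(globals, ..., default)`. A small sketch of what that fallback does and does not cover, using a `SimpleNamespace` as a stand-in for the `globals` module; the helper name `resolve_model` is hypothetical.

```python
from types import SimpleNamespace


def resolve_model(settings: object, attr: str, default: str):
    """Return the configured model name, or the default when the attribute is missing."""
    return getattr(settings, attr, default)


# Stand-in for the globals module before and after a Web UI model change.
unset = SimpleNamespace()
configured = SimpleNamespace(EVIL_TEXT_MODEL="some-other-model")
cleared = SimpleNamespace(EVIL_TEXT_MODEL=None)

print(resolve_model(unset, "EVIL_TEXT_MODEL", "darkidol"))
print(resolve_model(configured, "EVIL_TEXT_MODEL", "darkidol"))
print(resolve_model(cleared, "EVIL_TEXT_MODEL", "darkidol"))
```

The default applies only when the attribute does not exist. If the attribute exists but holds `None` or an empty string, `getattr` returns that value unchanged, so `set_llm_model` would receive it as-is.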
--- a/bot/utils/image_handling.py
+++ b/bot/utils/image_handling.py
@@ -158,7 +158,7 @@ async def convert_gif_to_mp4(gif_bytes):
        return None
-async def extract_video_frames(video_bytes, num_frames=4):
+async def extract_video_frames(video_bytes, num_frames=6):
    """
    Extract frames from a video or GIF for analysis.
    Returns a list of base64-encoded frames.
@@ -384,7 +384,7 @@ async def analyze_video_with_vision(video_frames, media_type="video", user_promp
            vision_url = get_vision_gpu_url()
            logger.info(f"Sending video analysis request to {vision_url} using model: {globals.VISION_MODEL} (media_type: {media_type}, frames: {len(video_frames)})")
-            async with session.post(f"{vision_url}/v1/chat/completions", json=payload, headers=headers, timeout=aiohttp.ClientTimeout(total=120)) as response:
+            async with session.post(f"{vision_url}/v1/chat/completions", json=payload, headers=headers, timeout=aiohttp.ClientTimeout(total=300)) as response:
                if response.status == 200:
                    data = await response.json()
                    result = data.get("choices", [{}])[0].get("message", {}).get("content", "No description.")
--- a/bot/utils/llm.py
+++ b/bot/utils/llm.py
@@ -13,6 +13,7 @@ from utils.moods import load_mood_description
 from utils.conversation_history import conversation_history
 from utils.logger import get_logger
 from utils.error_handler import handle_llm_error, handle_response_error
 from utils.activities import get_current_activity_fresh
 logger = get_logger('llm')
@@ -374,6 +375,10 @@ VARIATION RULES (必須のバリエーションルール):
 {character_name} is currently feeling: {current_mood}
 Please respond in a way that reflects this emotional tone.{pfp_context}"""
    # Inject current Discord activity if it changed recently (30-min decay window)
    activity_label = get_current_activity_fresh()
    if activity_label:
        full_system_prompt += f"\nHer Discord status: {activity_label}"
    # Add media type awareness if provided
    if media_type:
--- a/cat-plugins/discord_bridge/discord_bridge.py
+++ b/cat-plugins/discord_bridge/discord_bridge.py
@@ -43,6 +43,8 @@ def before_cat_reads_message(user_message_json: dict, cat) -> dict:
    response_type = user_message_json.get('discord_response_type', None)
    evil_mode = user_message_json.get('discord_evil_mode', False)
    media_type = user_message_json.get('discord_media_type', None)
    activity = user_message_json.get('discord_activity', None)
    reply_context = user_message_json.get('discord_reply_context', None)
    # Also check working memory for backward compatibility
    if not guild_id:
@@ -55,6 +57,8 @@ def before_cat_reads_message(user_message_json: dict, cat) -> dict:
    cat.working_memory['response_type'] = response_type
    cat.working_memory['evil_mode'] = evil_mode
    cat.working_memory['media_type'] = media_type
    cat.working_memory['activity'] = activity
    cat.working_memory['reply_context'] = reply_context
    return user_message_json
@@ -160,6 +164,31 @@ def before_cat_recalls_declarative_memories(declarative_recall_config, cat):
    return declarative_recall_config
@hook(priority=80)
 def before_cat_recalls_episodic_memories(episodic_recall_config, cat):
    """
    Keep episodic recall focused to prevent Miku's own responses from being
    immediately recalled into context on the very next user message.
    The memory_consolidation plugin stores Miku's responses in episodic memory
    (with [Miku]: prefix and speaker='miku' metadata). Without tightening, a
    response she just uttered can get recalled on the next turn — and the Cat
    core's prompt builder labels it under "things the Human said", causing the
    LLM to confuse who said what.
    The Cat default k=3 is kept; the threshold is raised slightly from the 0.7 default to 0.75.
    """
    # k=3 is the default — stays tight
    # threshold=0.75 is very slightly stricter than the 0.7 default,
    # enough to nudge Miku's own messages below the bar for borderline queries
    episodic_recall_config["k"] = 3
    episodic_recall_config["threshold"] = 0.75
    print(f"🔧 [Discord Bridge] Adjusted episodic recall: k={episodic_recall_config['k']}, threshold={episodic_recall_config['threshold']}")
    return episodic_recall_config
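Review note: the effect of moving the threshold from 0.7 to 0.75 is easiest to see on a toy example. The sketch below assumes recall keeps at most `k` candidates whose similarity score is at least `threshold`; that reading of the two settings is an assumption, and the function `recall` and the sample scores are invented for illustration.

```python
from typing import List, Tuple


def recall(scored: List[Tuple[str, float]], k: int = 3, threshold: float = 0.7) -> List[str]:
    """Keep the k best-scoring memories whose score meets the threshold."""
    kept = [(text, score) for text, score in scored if score >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [text for text, _ in kept[:k]]


candidates = [
    ("[User]: I love leeks", 0.91),
    ("[Miku]: Leeks are the best!", 0.72),  # borderline match on Miku's own reply
    ("[User]: hi", 0.40),
]

print(recall(candidates, k=3, threshold=0.7))
print(recall(candidates, k=3, threshold=0.75))
```

Under these assumptions the borderline `[Miku]:` entry is recalled at 0.7 but dropped at 0.75, which is the nudge the hook comments describe. Entries that score well above the bar are unaffected.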
@hook(priority=50)
 def after_cat_recalls_memories(cat):
    """
@@ -220,6 +249,20 @@ def before_agent_starts(agent_input, cat) -> dict:
    tools_output = agent_input.get('tools_output', '')
    user_input = agent_input.get('input', '')
    # Fix misleading header in episodic memory context.
    # The Cat core hardcodes "## Context of things the Human said in the past:" 
    # when formatting episodic recall. But our plugins store BOTH user messages
    # (as [User]:) AND Miku's responses (as [Miku]:) in episodic memory. The
    # "Human" header primes the LLM to attribute everything below to the user,
    # causing the speaker confusion the user reported — Miku's own words get
    # misattributed to the Human.
    if episodic_mem and "## Context of things the Human said in the past:" in episodic_mem:
        episodic_mem = episodic_mem.replace(
            "## Context of things the Human said in the past:",
            "## Past conversation excerpts (prefixed by who said what):"
        )
        agent_input['episodic_memory'] = episodic_mem
    print(f"\U0001f50d [Discord Bridge] before_agent_starts called")
    print(f"   input: {user_input[:80]}")
    print(f"   declarative_mem length: {len(declarative_mem)}")
@@ -312,6 +355,12 @@ Respond in the voice and attitude of your {mood_name.replace('_', ' ')} mood. Th
 Miku is currently feeling: {mood_description}
 Please respond in a way that reflects this emotional tone."""
        # Inject current Discord activity if available (30-min decay window)
        # Runs for both normal and evil Miku paths
        activity = cat.working_memory.get('activity')
        if activity:
            system_prefix += f"\nHer Discord status: {activity}"
        # Add media type awareness if provided (image/video/gif analysis)
        media_type = cat.working_memory.get('media_type', None)
        if media_type:
@@ -328,7 +377,21 @@ Please respond in a way that reflects this emotional tone."""
        print(f"   [Discord Bridge] Error building system prefix: {e}")
        system_prefix = cat.working_memory.get('full_system_prefix', '[system prefix not available]')
-    full_prompt = f"{system_prefix}\n\n# Context\n\n{episodic_mem}\n\n{declarative_mem}\n\n{tools_output}\n\n# Conversation until now:\nHuman: {user_input}"
+    # Build reply context note if the user is replying to Miku's message.
+    # This injects Miku's quoted words as a SEPARATE clearly-labeled context note
+    # (not embedded in the user's message text). Keeps speaker boundaries intact
+    # and prevents the LLM from misattributing Miku's words to the user.
+    # Uses a colon+space delimiter (no nested quotes) to avoid formatting issues
+    # when the replied message itself contains double-quote characters.
+    reply_context = cat.working_memory.get('reply_context')
+    if reply_context:
+        reply_context_note = f'[The user is replying to what you (Miku) said — you said: {reply_context}]'
+        agent_input['reply_context'] = reply_context_note
+    else:
+        reply_context_note = ''
+        agent_input['reply_context'] = ''
+    full_prompt = f"{system_prefix}\n\n# Context\n\n{episodic_mem}\n\n{declarative_mem}\n\n{tools_output}\n\n{reply_context_note}\n\n# Conversation until now:\nHuman: {user_input}"
    cat.working_memory['last_full_prompt'] = full_prompt
    return agent_input
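Review note: the two prompt changes in this file (the episodic header rewrite and the separate reply-context note) can be checked without a running Cat. The sketch below copies the string literals from the diff; the standalone function names are hypothetical, and `working_memory` and `agent_input` are replaced by plain arguments.

```python
from typing import Optional

OLD_HEADER = "## Context of things the Human said in the past:"
NEW_HEADER = "## Past conversation excerpts (prefixed by who said what):"


def fix_episodic_header(episodic_mem: str) -> str:
    """Swap the 'Human said' header for a speaker-neutral one."""
    return episodic_mem.replace(OLD_HEADER, NEW_HEADER)


def build_reply_context_note(reply_context: Optional[str]) -> str:
    """Wrap the replied-to text in a labeled note; empty string when there is none."""
    if not reply_context:
        return ""
    return f"[The user is replying to what you (Miku) said — you said: {reply_context}]"


def build_full_prompt(system_prefix: str, episodic_mem: str, declarative_mem: str,
                      tools_output: str, reply_context: Optional[str], user_input: str) -> str:
    """Assemble the debug prompt in the same order as the diff."""
    note = build_reply_context_note(reply_context)
    return (
        f"{system_prefix}\n\n# Context\n\n{fix_episodic_header(episodic_mem)}\n\n"
        f"{declarative_mem}\n\n{tools_output}\n\n{note}\n\n"
        f"# Conversation until now:\nHuman: {user_input}"
    )
```

Because the note uses a colon and a closing bracket rather than surrounding quotes, a replied-to message that itself contains double quotes passes through unchanged, and the `Human:` line holds only the user's own words.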
--- a/cat-plugins/evil_miku_personality/evil_miku_personality.py
+++ b/cat-plugins/evil_miku_personality/evil_miku_personality.py
@@ -80,7 +80,7 @@ def agent_prompt_prefix(prefix, cat):
    )
    # --- Build system prompt (matches get_evil_system_prompt structure) ----------
-    return f"""{preamble}
+    full_prefix = f"""{preamble}
 ---
@@ -97,6 +97,13 @@ def agent_prompt_prefix(prefix, cat):
 ⚠️ MOOD REMINDER — YOUR CURRENT STATE IS: {mood_name.upper()}
 Respond in the voice and attitude of your {mood_name.replace('_', ' ')} mood. This mood defines how you sound RIGHT NOW."""
    # Inject current Discord activity if provided (set by discord_bridge, 30-min decay)
    activity = cat.working_memory.get('activity')
    if activity:
        full_prefix += f"\nHer Discord status: {activity}"
    return full_prefix
@hook(priority=100)
 def agent_prompt_suffix(suffix, cat):
@@ -112,9 +119,12 @@ def agent_prompt_suffix(suffix, cat):
 {{tools_output}}
+{{reply_context}}
 [Current mood: {mood_name.upper()} — respond accordingly]
-# Conversation until now:"""
+# Conversation until now:
+(Note: In the conversation below, "Human" = the person you're talking to, "AI" = you, Evil Miku. Pay attention to who said what.)"""
@hook(priority=100)
--- a/cat-plugins/miku_personality/miku_personality.py
+++ b/cat-plugins/miku_personality/miku_personality.py
@@ -69,6 +69,11 @@ def agent_prompt_prefix(prefix, cat):
 Miku is currently feeling: {mood_description}
 Please respond in a way that reflects this emotional tone."""
    # Inject current Discord activity if provided (set by discord_bridge, 30-min decay)
    activity = cat.working_memory.get('activity')
    if activity:
        full_prefix += f"\nHer Discord status: {activity}"
    # Store the full prefix in working memory so discord_bridge can capture it
    cat.working_memory['full_system_prefix'] = full_prefix
    return full_prefix
@@ -86,7 +91,10 @@ def agent_prompt_suffix(suffix, cat):
 {tools_output}
-# Conversation until now:"""
+{reply_context}
+# Conversation until now:
+(Note: In the conversation below, "Human" = the person you're talking to, "AI" = you, Miku. Pay attention to who said what.)"""
@hook(priority=100)
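Review note: the evil suffix writes `{{reply_context}}` while the regular suffix writes `{reply_context}`. The doubled braces are what an f-string needs to emit a literal placeholder that a later templating step can fill. The sketch below demonstrates that two-stage substitution; plain `str.format` stands in for whatever template engine the Cat core applies, the function names are hypothetical, and `"jealous"` is an invented mood value.

```python
def make_suffix_template(mood_name: str) -> str:
    """Stage 1: the f-string fills the mood now and leaves placeholders for later."""
    return f"""
{{tools_output}}

{{reply_context}}

[Current mood: {mood_name.upper()}]

# Conversation until now:"""


def render_suffix(template: str, tools_output: str, reply_context: str) -> str:
    """Stage 2: fill the surviving placeholders (str.format as a stand-in)."""
    return template.format(tools_output=tools_output, reply_context=reply_context)


template = make_suffix_template("jealous")
print(template)
print(render_suffix(template, "", "[The user is replying to what you (Miku) said]"))
```

If `agent_input['reply_context']` is an empty string, the placeholder renders as a blank line, which is why `discord_bridge` sets the key in both branches rather than omitting it.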
--- a/llama-swap-rocm-config.yaml
+++ b/llama-swap-rocm-config.yaml
@@ -38,6 +38,15 @@ models:
      - japanese
      - japanese-model
  # Qwen3.5 for ComfyUI prompt generation
  qwen3.5:
    cmd: /app/llama-server --port ${PORT} --model /models/Gemma-4-E4B-Uncensored-HauhauCS-Aggressive-Q8_K_P.gguf -ngl 99 -c 8192 --host 0.0.0.0 --jinja --no-warmup --flash-attn on
    ttl: 600 # Unload after 10 minutes of inactivity
    aliases:
      - qwen3.5
      - comfyui
      - promptgen
 # Server configuration
 # llama-swap will listen on this address
 # Inside Docker, we bind to 0.0.0.0 to allow bot container to connect
Author	SHA1	Message	Date
koko210Serve	cfd5eb16f7	fix: protect server config from truncation and recover from Discord guilds - Save servers_config.json atomically via temp file + fsync + rename - Keep .bak backup and auto-restore when main config is empty/corrupt - Add /servers/recover endpoint for manual recovery - Auto-recover basic server configs on startup when config is empty but bot is in guilds	2026-06-11 20:37:04 +03:00
koko210Serve	486acb5c14	Fix reply-context speaker confusion with structured metadata pipeline Previously, when a user replied to Miku's message via Discord's reply feature, Miku's quoted words were embedded directly into the user's message text using the format: [Replying to your message: "Miku's words"] User's response This caused two problems: 1. The LLM had to parse "your message" to determine the quoted text was MIKU's words — fragile and frequently misattributed 2. When stored in episodic memory as [User]: ..., Miku's quoted words were permanently mislabeled under the user's speaker prefix Now reply context flows through as structured metadata: - bot/bot.py captures the replied-to text WITHOUT embedding it in prompt - cat_client.py passes it as discord_reply_context in the WebSocket payload - discord_bridge.py injects it as agent_input['reply_context'] — a CLEARLY LABELED note: [The user is replying to what you (Miku) said — ...] - miku_personality.py + evil_miku_personality.py render it via {reply_context} placeholder in the prompt suffix, between memory context and conversation history This keeps Miku's words as a separate context note, never mixed into the user's HumanMessage. Episodic memory only stores the user's actual words. The fallback path (when Cat is unavailable) also uses a cleaner format with explicit speaker labels.	2026-06-03 22:50:03 +03:00
koko210Serve	9d2c14fa0b	Fix vision pipeline: ffmpeg removal by autoremove, increase vision timeout, reduce frame count, add Discord activity awareness - bot/Dockerfile: Add ffmpeg to reinstall line after apt-get autoremove (autoremove was sweeping up ffmpeg as 'no longer needed' after playwright install) - bot/utils/image_handling.py: Increase video analysis timeout 120s→300s, 6→3 for Tenor GIFs (GTX 1660 VRAM constraint) - bot/utils/activities.py: Add _activity_changed_at timestamp tracking, get_current_activity_label() and get_current_activity_fresh() with 30-min decay - bot/utils/cat_client.py: Pass current Discord activity to Cheshire Cat pipeline - bot/utils/llm.py: Inject current Discord activity into system prompt - cat-plugins/: Forward Discord activity through working_memory to personality plugins - bot/persona//preamble.txt: Add Discord status usage guidelines for character prompts - llama-swap-rocm-config.yaml: Add qwen3.5 model entry for ComfyUI prompt generation - AGENTS.md: New project documentation file	2026-05-27 01:18:12 +03:00
koko210Serve	d333c61c8f	fix: set default bedtime end time to 11 PM (was 9 PM)	2026-05-22 21:40:01 +03:00
koko210Serve	e1f81e52e5	Fix Miku confusing who said what in conversations Three interrelated fixes for speaker attribution confusion: 1. Fix misleading episodic memory header (discord_bridge.py): The Cat core hardcodes '## Context of things the Human said in the past:' when formatting recalled conversations. Our plugins store BOTH user messages ([User]: prefix) AND Miku's own responses ([Miku]: prefix) in episodic memory. This misleading header primes the LLM to attribute Miku's words to the user. Replaced with '## Past conversation excerpts (prefixed by who said what):' which accurately describes the mixed-speaker content. 2. Tighten episodic recall (discord_bridge.py): Added before_cat_recalls_episodic_memories hook setting threshold=0.75 (vs default 0.7) to reduce the chance of Miku's own just-uttered response being recalled on the very next user message, which would feed her own words back as misleading context. 3. Add role clarification (miku_personality.py & evil_miku_personality.py): Added a clarifying note after '# Conversation until now:' in the prompt suffix to explicitly tell the model that 'Human = the user, AI = you (Miku)', helping it reconcile the two labeling systems (episodic [User]/[Miku] prefixes vs conversation history Human/AI roles).	2026-05-22 16:38:34 +03:00
koko210Serve	201f2e3df5	feat(ui): add model selection UI to LLM Settings tab - Three dropdowns for Regular Miku, Evil Miku, Japanese Mode models - GPU availability badges (Both GPUs / NVIDIA Only / AMD Only) - Refresh Models + Refresh Status buttons - Load models on tab switch with defensive checks - Bump cache-busting version for all JS files - Remove redundant Current Status section	2026-05-20 13:55:35 +03:00
koko210Serve	b017a0ec04	feat(api): register models_selector routes - Import and include models_selector router in FastAPI app	2026-05-20 13:55:29 +03:00
koko210Serve	6bf9a30c33	feat(routes): sync model globals via config API, fix log message - Add models.text, models.evil, models.japanese to config/set globals sync - Fix language toggle log to show actual model name instead of hardcoded string	2026-05-20 13:55:22 +03:00
koko210Serve	8e5260561a	feat(config): persist model selections via config_manager - Add models.text, models.evil, models.japanese to restore_runtime_settings - Add model keys to reset_to_defaults with CONFIG defaults - Include model info in runtime_state for API visibility	2026-05-20 13:55:11 +03:00
koko210Serve	b4737c1ae1	fix(cat): use configured models instead of hardcoded strings - switch_to_evil_personality now reads EVIL_TEXT_MODEL from globals - switch_to_normal_personality now reads TEXT_MODEL from globals - Removes desync risk when user changes models via Web UI	2026-05-20 13:55:05 +03:00
koko210Serve	ae4e40f2d7	feat(models): add model selection API endpoints - GET /models/available: query both llama-swap instances for model lists - POST /models/select: set per-persona model (regular/evil/japanese) with persistence - GET /models/status: return current per-persona model assignments - Fall back to known model list when containers are unreachable	2026-05-20 13:54:59 +03:00