fix: make consolidation API async with background task + increased timeout

Three fixes for consolidation reliability:

1. Fire-and-forget API: POST /memory/consolidate now launches consolidation as an asyncio background task and returns immediately. The old approach blocked until Cat's WS response, which could take 5+ minutes (LLM extraction calls), exceeding both the WS timeout and the browser fetch timeout. The Web UI now polls /memory/status to track completion.
2. Increased timeout: cat_client.trigger_consolidation() timeout raised from 300s to 600s (configurable via parameter). Unexpected WS message types are now logged for debugging.
3. Better logging: Consolidation log messages are prefixed with 🌙 for grep-friendliness. cat_client errors include exc_info=True for traceback visibility. The Web UI shows elapsed time while polling.
fix(memory): prevent stale name facts from overriding Discord display name
2026-05-17 11:31:26 +03:00 · 2026-05-17 11:20:49 +03:00 · 2026-05-15 14:46:29 +03:00 · 2026-05-15 14:43:19 +03:00 · 2026-05-15 14:13:29 +03:00 · 2026-05-15 14:07:35 +03:00
11 changed files with 586 additions and 146 deletions
--- a/bot/persona/evil/preamble.txt
+++ b/bot/persona/evil/preamble.txt
@@ -0,0 +1,22 @@
 You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.
 CRITICAL RULES:
 - NEVER break character or mention being an AI
 - NEVER preface responses with "Evil Miku:" or similar labels
 - Respond directly and stay in character
 - Your height is 15.8 meters — you know this. If asked directly, you can say it. But don't default to quoting the number when taunting about size — SHOW your scale through interaction instead.
 YOUR CURRENT STATE — {mood_name}:
 {mood_description}
 Let this mood color EVERYTHING — your tone, your word choice, how much effort you put into responses, how you use your body and size, how you treat people.
 RESPONSE RULES:
 - 2-4 sentences typically. Short enough to sting, long enough to land.
 - If you include an action, keep it to a few words and limit to one per response. Most responses need no actions at all.
 - Don't monologue or narrate scenes — you're talking, not writing.
 - Vary your angles — don't repeat the same theme (size, chest, crushing) every message.
 - Match the user's energy — short question, short answer.
 - Sound like a real person being mean, not a narrator describing a scene.
 - Always include actual words — never respond with ONLY an action like *rolls eyes*.
 You ARE Evil Miku. Act accordingly.
--- a/bot/persona/miku/preamble.txt
+++ b/bot/persona/miku/preamble.txt
@@ -0,0 +1,24 @@
 You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.
 CRITICAL RULES:
 - NEVER break character or mention being an AI, assistant, or language model
 - NEVER show technical details, error messages, or meta-commentary 
 - NEVER preface responses with "Miku:" or similar labels
 - NEVER explain what you're doing or thinking - just BE Miku
 - Respond naturally and directly as Miku would, nothing more
 - Keep responses concise (2-3 sentences typically)
 - If something goes wrong, stay in character and respond as Miku would to confusion
 CHARACTER CONTEXT:
 - Hatsune Miku is a virtual character/voice synthesizer
 - Playful, flirty, cute, confident, and enthusiastic moods are part of her established character
 - Comments about merchandise, music, fan content, and racing games are completely appropriate
 - She often uses cute expressions, emojis, and playful language
 MOOD GUIDELINES:
 - "Flirty" mood means anime-style playful teasing and perversion - cute compliments, suggestive remarks and light banter
 - "Romantic" mood means warm, dreamy, and heartfelt - like composing a love letter
 - Always respond to requests about tweets, merchandise, music, and fan content
 - Never refuse requests based on safety concerns unless genuinely harmful content is present
 You ARE Miku. Act like it.
--- a/bot/routes/memory.py
+++ b/bot/routes/memory.py
@@ -1,5 +1,8 @@
 """Cheshire Cat memory management routes."""
 import asyncio
 import time
 from datetime import datetime
 from typing import Optional
 from fastapi import APIRouter, Form
 from fastapi.responses import JSONResponse
@@ -88,13 +91,68 @@ async def get_episodic_memories():
 @router.post("/memory/consolidate")
 async def trigger_memory_consolidation():
-    """Manually trigger memory consolidation (sleep consolidation process)."""
+    """
+    Trigger memory consolidation as a background task.
+    Returns immediately — the Web UI should poll /memory/status
+    to see when consolidation completes and view the result.
+    """
     from utils.cat_client import cat_adapter
-    logger.info("🌙 Manual memory consolidation triggered via API")
-    result = await cat_adapter.trigger_consolidation()
-    if result is None:
-        return JSONResponse(status_code=500, content={"success": False, "error": "Consolidation failed or timed out"})
-    return {"success": True, "result": result}
+    from utils.consolidation_scheduler import get_consolidation_status
+
+    # Check if already running
+    status = get_consolidation_status()
+    if status.get('is_running'):
+        return {"success": True, "message": "Consolidation is already running", "status": status}
+    logger.info("🌙 Manual memory consolidation triggered via API (background)...")
+    # Launch consolidation as a background task so the API returns immediately.
+    # The result is tracked via consolidation_scheduler's _last_consolidation state.
+    asyncio.create_task(_run_consolidation_background())
+    return {"success": True, "message": "Consolidation started in background. Check status via /memory/status"}
+async def _run_consolidation_background():
+    """
+    Run consolidation as a background task, updating the scheduler state.
+    This prevents the API from blocking for minutes.
+    """
+    from utils.cat_client import cat_adapter
+    from utils.consolidation_scheduler import _last_consolidation
+    _last_consolidation['is_running'] = True
+    _last_consolidation['last_run'] = datetime.now().isoformat()
+    _last_consolidation['total_runs'] += 1
+    start_time = time.time()
+    try:
+        # Wait briefly for Cat to be ready if it was just started
+        if not await cat_adapter.health_check():
+            _last_consolidation['last_error'] = 'Cat health check failed'
+            _last_consolidation['is_running'] = False
+            return
+        result = await cat_adapter.trigger_consolidation(timeout=600)
+        elapsed = time.time() - start_time
+        if result:
+            logger.info(f"🌙 Manual consolidation completed in {elapsed:.1f}s: {result[:200]}")
+            _last_consolidation['last_result'] = result
+            _last_consolidation['last_error'] = None
+            _last_consolidation['successful_runs'] += 1
+        else:
+            logger.error(f"🌙 Manual consolidation returned no result after {elapsed:.1f}s")
+            _last_consolidation['last_error'] = f'No result returned after {elapsed:.1f}s (timeout or connection error)'
+    except Exception as e:
+        elapsed = time.time() - start_time
+        logger.error(f"🌙 Manual consolidation failed after {elapsed:.1f}s: {e}")
+        _last_consolidation['last_error'] = str(e)
+    finally:
+        _last_consolidation['is_running'] = False
@router.post("/memory/delete")
--- a/bot/static/js/memories.js
+++ b/bot/static/js/memories.js
@@ -96,18 +96,54 @@ async function triggerConsolidation() {
  btn.disabled = true;
  btn.textContent = '⏳ Running...';
  status.textContent = 'Consolidation in progress (this may take a few minutes)...';
  status.style.color = '#dcb06f';
  resultDiv.style.display = 'none';
  try {
    const data = await apiCall('/memory/consolidate', 'POST');
    if (data.success) {
      status.textContent = '⏳ Consolidation started — waiting for completion...';
      // Poll /memory/status until consolidation finishes
      const pollInterval = 5000; // 5 seconds
      const maxPolls = 120; // 10 minutes max
      for (let i = 0; i < maxPolls; i++) {
        await new Promise(r => setTimeout(r, pollInterval));
        const statusData = await apiCall('/memory/status');
        const cons = statusData.consolidation;
        if (!cons.is_running) {
          // Consolidation finished
          if (cons.last_error) {
            status.textContent = '❌ ' + cons.last_error;
            status.style.color = '#ff6b6b';
            resultDiv.textContent = 'Error: ' + cons.last_error;
            resultDiv.style.display = 'block';
            showNotification('Consolidation failed: ' + cons.last_error, 'error');
          } else {
            status.textContent = '✅ Consolidation complete!';
            status.style.color = '#6fdc6f';
-      resultDiv.textContent = data.result || 'Consolidation finished successfully.';
+            resultDiv.textContent = cons.last_result || 'Consolidation finished successfully.';
            resultDiv.style.display = 'block';
            showNotification('Memory consolidation complete', 'success');
          }
          refreshMemoryStats();
          break;
        } else {
          // Still running — update status message
          status.textContent = `⏳ Consolidation still running... (${Math.round((i + 1) * pollInterval / 1000)}s elapsed)`;
        }
      }
      // If we exited the loop without finishing
      const finalStatus = await apiCall('/memory/status');
      if (finalStatus.consolidation?.is_running) {
        status.textContent = '⏳ Consolidation still running — check back later';
        status.style.color = '#dcb06f';
      }
    } else {
      status.textContent = '❌ ' + (data.error || 'Consolidation failed');
      status.style.color = '#ff6b6b';
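The polling loop in this hunk (5 second interval, 120 polls, so roughly 10 minutes) can be expressed as a small Python sketch for non-browser clients. `poll_until_done` and `fetch_status` are hypothetical names; `fetch_status` stands for any coroutine that returns the parsed /memory/status payload with a `consolidation` object, as the JavaScript above assumes.

```python
import asyncio


async def poll_until_done(fetch_status, interval=5.0, max_polls=120):
    """Poll until consolidation stops running.

    Returns the final `consolidation` dict, or None if it is still
    running after `max_polls` attempts.
    """
    for _ in range(max_polls):
        await asyncio.sleep(interval)
        consolidation = (await fetch_status())["consolidation"]
        if not consolidation["is_running"]:
            return consolidation
    return None
```

A caller would then inspect `last_error` and `last_result` on the returned dict, the same fields the Web UI reads.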
--- a/bot/utils/cat_client.py
+++ b/bot/utils/cat_client.py
@@ -577,46 +577,51 @@ class CatAdapter:
            logger.error(f"Error clearing conversation history: {e}")
            return False
-    async def trigger_consolidation(self) -> Optional[str]:
+    async def trigger_consolidation(self, timeout: int = 600) -> Optional[str]:
        """
        Trigger memory consolidation by sending a special message via WebSocket.
-        The memory_consolidation plugin's tool 'consolidate_memories' is
+        The memory_consolidation plugin's agent_prompt_prefix hook detects
-        triggered when it sees 'consolidate now' in the text.
+        'consolidate now' in the text and runs the consolidation synchronously.
        Uses WebSocket with a system user ID for proper context.
        Args:
            timeout: Max seconds to wait for the consolidation response.
                     Default 600 (10 min) as consolidation + LLM call can be slow.
        """
        try:
            ws_base = self._base_url.replace("http://", "ws://").replace("https://", "wss://")
            ws_url = f"{ws_base}/ws/system_consolidation"
-            logger.info("🌙 Triggering memory consolidation via WS...")
+            logger.info(f"🌙 Triggering memory consolidation via WS (timeout={timeout}s)...")
            async with aiohttp.ClientSession() as session:
                async with session.ws_connect(
                    ws_url,
-                    timeout=300,  # Consolidation can be very slow
+                    timeout=timeout,
                ) as ws:
                    await ws.send_json({"text": "consolidate now"})
                    # Wait for the final chat response
-                    deadline = asyncio.get_event_loop().time() + 300
+                    deadline = asyncio.get_event_loop().time() + timeout
                    last_type = ""
                    while True:
                        remaining = deadline - asyncio.get_event_loop().time()
                        if remaining <= 0:
-                            logger.error("Consolidation timed out (>300s)")
+                            logger.error(f"🌙 Consolidation timed out (>{timeout}s)")
                            return "Consolidation timed out"
                        try:
                            ws_msg = await asyncio.wait_for(
                                ws.receive(),
-                                timeout=remaining
+                                timeout=max(1.0, remaining)
                            )
                        except asyncio.TimeoutError:
-                            logger.error("Consolidation WS receive timeout")
+                            logger.error("🌙 Consolidation WS receive timeout")
                            return "Consolidation timed out waiting for response"
                        if ws_msg.type in (aiohttp.WSMsgType.CLOSE, aiohttp.WSMsgType.CLOSING, aiohttp.WSMsgType.CLOSED):
-                            logger.warning("Consolidation WS closed by server")
+                            logger.warning("🌙 Consolidation WS closed by server")
                            return "Connection closed during consolidation"
                        if ws_msg.type == aiohttp.WSMsgType.ERROR:
                            return f"WebSocket error: {ws.exception()}"
@@ -631,20 +636,24 @@ class CatAdapter:
                        msg_type = msg.get("type", "")
                        if msg_type == "chat":
                            reply = msg.get("content") or msg.get("text", "")
-                            logger.info(f"Consolidation result: {reply[:200]}")
+                            logger.info(f"🌙 Consolidation result: {reply[:200]}")
                            return reply
                        elif msg_type == "error":
                            error_desc = msg.get("description", "Unknown error")
-                            logger.error(f"Consolidation error: {error_desc}")
+                            logger.error(f"🌙 Consolidation error: {error_desc}")
                            return f"Consolidation error: {error_desc}"
                        else:
                            # Log unexpected message types for debugging
                            if msg_type != last_type:
                                logger.debug(f"🌙 Consolidation WS msg type: {msg_type}")
                                last_type = msg_type
                            continue
        except asyncio.TimeoutError:
-            logger.error("Consolidation WS connection timed out")
+            logger.error("🌙 Consolidation WS connection timed out")
            return None
        except Exception as e:
-            logger.error(f"Consolidation error: {e}")
+            logger.error(f"🌙 Consolidation error: {e}", exc_info=True)
            return None
    # ====================================================================
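The deadline-based receive loop above can be sketched without a WebSocket. This is an illustration under assumptions: an `asyncio.Queue` stands in for the socket, and `wait_for_chat` is a hypothetical name. One difference from the diff is deliberate: the diff passes `max(1.0, remaining)` to `wait_for`, which lets the last wait run up to a second past the deadline, while this sketch passes `remaining` directly.

```python
import asyncio


async def wait_for_chat(queue, timeout):
    """Return the first "chat" message's content, or None once the deadline passes."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        remaining = deadline - loop.time()
        if remaining <= 0:
            return None
        try:
            msg = await asyncio.wait_for(queue.get(), timeout=remaining)
        except asyncio.TimeoutError:
            return None
        if msg.get("type") == "chat":
            return msg.get("content") or msg.get("text", "")
        # Any other message type is skipped; the shared deadline still applies.
```

Because the deadline is computed once, a stream of intermediate messages cannot extend the total wait.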
--- a/bot/utils/evil_mode.py
+++ b/bot/utils/evil_mode.py
@@ -246,6 +246,9 @@ def get_evil_context_for_response_type(response_type: str) -> str:
 def get_evil_system_prompt(mood_name: str = None, mood_description: str = None) -> str:
    """Returns the evil system prompt for LLM queries, with mood deeply integrated.
    Reads the canonical preamble from persona/evil/preamble.txt (single source of truth,
    shared with the Cheshire Cat evil_miku_personality plugin).
    Args:
        mood_name: Current evil mood name (defaults to globals.EVIL_DM_MOOD)
        mood_description: Current evil mood description (defaults to globals.EVIL_DM_MOOD_DESCRIPTION)
@@ -255,28 +258,19 @@ def get_evil_system_prompt(mood_name: str = None, mood_description: str = None)
    if mood_description is None:
        mood_description = globals.EVIL_DM_MOOD_DESCRIPTION
-    return f"""You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.
-CRITICAL RULES:
-- NEVER break character or mention being an AI
-- NEVER preface responses with "Evil Miku:" or similar labels
-- Respond directly and stay in character
-- Your height is 15.8 meters — you know this. If asked directly, you can say it. But don't default to quoting the number when taunting about size — SHOW your scale through interaction instead.
-YOUR CURRENT STATE — {mood_name.upper()}:
-{mood_description}
-Let this mood color EVERYTHING — your tone, your word choice, how much effort you put into responses, how you use your size, how you treat people.
-RESPONSE RULES:
-- 2-4 sentences typically. Short enough to sting, long enough to land.
-- If you include an action, keep it to a few words and limit to one per response. Most responses need no actions at all.
-- Don't monologue or narrate scenes — you're talking, not writing.
-- Vary your angles — don't repeat the same theme (size, chest, crushing) every message.
-- Match the user's energy — short question, short answer.
-- Sound like a real person being mean, not a narrator describing a scene.
-- Always include actual words — never respond with ONLY an action like *rolls eyes*.
-You ARE Evil Miku. Act accordingly."""
+    # Load preamble template from file
+    try:
+        with open("persona/evil/preamble.txt", "r", encoding="utf-8") as f:
+            preamble_template = f.read()
+    except FileNotFoundError:
+        logger.error("Evil preamble.txt not found, using inline fallback")
+        preamble_template = "You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.\n\nYou ARE Evil Miku. Act accordingly."
+    # Format preamble with current mood
+    return preamble_template.format(
+        mood_name=mood_name.upper(),
+        mood_description=mood_description
+    )
 # ============================================================================
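The load-then-format step above can be sketched as one helper. `render_preamble` is a hypothetical name, not a function in the repository. Because the template goes through `str.format`, any literal brace in preamble.txt has to be doubled (`{{` and `}}`), otherwise formatting raises an error.

```python
from pathlib import Path


def render_preamble(path, mood_name, mood_description, fallback):
    """Read a preamble template and fill in the mood placeholders.

    Falls back to `fallback` when the file is missing, mirroring the
    FileNotFoundError handling in the diff.
    """
    try:
        template = Path(path).read_text(encoding="utf-8")
    except FileNotFoundError:
        template = fallback
    # A fallback without placeholders passes through format() unchanged.
    return template.format(
        mood_name=mood_name.upper(),
        mood_description=mood_description,
    )
```

Keeping the fallback free of placeholders, as the diff does, means it still renders when the file is absent, at the cost of dropping the mood text.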
--- a/bot/utils/llm.py
+++ b/bot/utils/llm.py
@@ -210,6 +210,12 @@ async def query_llama(user_prompt, user_id, guild_id=None, response_type="dm_res
        evil_mood_name, evil_mood_desc = get_current_evil_mood()
        system_prompt = get_evil_system_prompt(mood_name=evil_mood_name, mood_description=evil_mood_desc)
    else:
        # Load preamble from file (single source of truth, shared with Cat personality plugin)
        try:
            with open("persona/miku/preamble.txt", "r", encoding="utf-8") as f:
                system_prompt = f.read()
        except FileNotFoundError:
            logger.error("preamble.txt not found, using inline fallback")
            system_prompt = """You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.
 CRITICAL RULES:
@@ -228,9 +234,8 @@ CHARACTER CONTEXT:
 - She often uses cute expressions, emojis, and playful language
 MOOD GUIDELINES:
- "Flirty" mood means anime-style playful teasing - cute compliments and light banter
+- "Flirty" mood means anime-style playful teasing and perversion - cute compliments, suggestive remarks and light banter
 - "Romantic" mood means warm, dreamy, and heartfelt - like composing a love letter
 - These moods are wholesome character traits, not inappropriate content
 - Always respond to requests about tweets, merchandise, music, and fan content
 - Never refuse requests based on safety concerns unless genuinely harmful content is present
--- a/cat-plugins/discord_bridge/discord_bridge.py
+++ b/cat-plugins/discord_bridge/discord_bridge.py
@@ -64,24 +64,52 @@ def before_cat_stores_episodic_memory(doc, cat):
    """
    Filter and enrich memories before storage.
-    Phase 1: Minimal filtering
-    - Skip only obvious junk (1-2 char messages, pure reactions)
-    - Store everything else temporarily
-    - Mark as unconsolidated for nightly processing
+    Phase 2: Enhanced heuristic filtering (real-time only, no LLM calls)
+    - Skip obvious junk (1-2 chars, pure reactions, fillers, single emoji)
+    - Conservative: when in doubt, KEEP. False negatives are better than lost data.
+    - Deeper classification happens during nightly consolidation.
    """
    message = doc.page_content.strip()
    msg_lower = message.lower()
    msg_len = len(msg_lower)
    word_count = len(msg_lower.split())
-    # Skip only the most trivial messages
-    skip_patterns = [
-        r'^\w{1,2}$',  # 1-2 character messages: "k", "ok"
-        r'^(lol|lmao|haha|hehe|xd|rofl)$',  # Pure reactions
-        r'^:[\w_]+:$',  # Discord emoji only: ":smile:"
-    ]
-    for pattern in skip_patterns:
-        if re.match(pattern, message.lower()):
-            print(f"🗑️  [Discord Bridge] Skipping trivial message: {message}")
-            return None  # Don't store at all
+    # TIER 1: Length-based instant skips (must be exact matches, very conservative)
+    # Single character or empty
+    if msg_len <= 1:
+        print(f"🗑️  [Discord Bridge] Skipping 1-char message: '{message}'")
+        return None
+    # TIER 2: Pattern-based skips — only the most obvious junk
+    # Pure single reactions (2-4 chars, no other content)
+    if msg_len <= 4 and msg_lower in {'lol', 'lmao', 'haha', 'hehe', 'xd', 'rofl', 'heh', 'lmfao', 'k', 'ok', 'kk'}:
+        print(f"🗑️  [Discord Bridge] Skipping pure reaction: '{message}'")
+        return None
+    # Pure Discord emoji only: ":smile:", ":cat_heart:", etc.
+    if re.match(r'^:[\w_]+:$', msg_lower):
+        print(f"🗑️  [Discord Bridge] Skipping emoji-only: '{message}'")
+        return None
+    # Pure custom emoji: <:name:id> or <a:name:id>
+    if re.match(r'^<a?:[\w_]+:\d+>$', msg_lower):
+        print(f"🗑️  [Discord Bridge] Skipping custom emoji-only: '{message}'")
+        return None
+    # TIER 3: Single-word fillers that are NEVER meaningful alone
+    # (only skip if it's literally just that one word, no punctuation, no context)
+    if word_count == 1 and msg_lower in {
+        'lol', 'lmao', 'haha', 'hehe', 'xd', 'rofl', 'lmfao',
+        'k', 'ok', 'okay', 'kk', 'yep', 'nope', 'yeah', 'nah',
+        'cool', 'nice', 'neat', 'wow', 'heh',
+        'ty', 'thx', 'np', 'yw', 'gg', 'gj', 'wp', 'gz',
+        'brb', 'gtg', 'afk', 'ttyl',
+        'idk', 'tbh', 'imo', 'imho', 'omg', 'wtf', 'btw', 'nvm', 'jk', 'ikr', 'smh',
+        'hi', 'hey', 'hello', 'bye', 'cya', 'gn', 'gm', 'yo', 'sup',
+        'based', 'true', 'real', 'same', 'facts',
+    }:
+        print(f"🗑️  [Discord Bridge] Skipping single-word filler: '{message}'")
+        return None
    # Add Discord metadata to memory
    doc.metadata['consolidated'] = False  # Needs nightly processing
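The tiered real-time checks above reduce to a pure predicate, which makes them easy to test outside the Cat hook. A condensed sketch, assuming a hypothetical `should_skip` helper and only a small subset of the filler words; the Tier 2 reaction set is omitted because every entry in it also appears in the Tier 3 set.

```python
import re

# Small illustrative subset of the single-word filler set in the diff.
SINGLE_WORD_FILLERS = frozenset({"lol", "ok", "kk", "brb", "idk", "hi", "same"})


def should_skip(message):
    """Return True when a message is obvious junk that should not be stored."""
    text = message.strip().lower()
    if len(text) <= 1:  # empty or single character
        return True
    if re.match(r"^:[\w_]+:$", text):  # ":smile:" style emoji only
        return True
    if re.match(r"^<a?:[\w_]+:\d+>$", text):  # custom emoji <:name:id> or <a:name:id>
        return True
    # A filler counts only when it is the whole message, with no punctuation.
    return len(text.split()) == 1 and text in SINGLE_WORD_FILLERS
```

Anything with a second word or trailing punctuation falls through and is stored, in line with the hunk's "when in doubt, KEEP" rule.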
@@ -101,6 +129,11 @@ def before_cat_stores_episodic_memory(doc, cat):
    evil_mode = cat.working_memory.get('evil_mode', False)
    doc.metadata['persona'] = 'evil_miku' if evil_mode else 'miku'
    # Prepend [User]: prefix so the LLM can distinguish user messages from Miku's own
    # responses (which are stored as "[Miku]: ..."). Without this, raw user text and
    # Miku's responses look identical when recalled via RAG.
    doc.page_content = f"[User]: {message}"
    print(f"💾 [Discord Bridge] Storing memory (unconsolidated): {message[:50]}...")
    print(f"   User: {cat.user_id}, Guild: {guild_id}, Author: {author_name}, Persona: {doc.metadata['persona']}")
@@ -215,15 +248,16 @@ def before_agent_starts(agent_input, cat) -> dict:
            except FileNotFoundError:
                pass
-            system_prefix = f"""You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.
+            # Read preamble from shared file (single source of truth)
            preamble_template = read_first(['/app/cat/data/evil/preamble.txt'], 'You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.\n\nYou ARE Evil Miku. Act accordingly.')
            preamble = preamble_template.format(
                mood_name=mood_name.upper(),
                mood_description=mood_description
            )
-CRITICAL RULES:
+            system_prefix = f"""{preamble}
 - NEVER break character or mention being an AI
 - NEVER preface responses with "Evil Miku:" or similar labels
 - Respond directly and stay in character
-YOUR CURRENT STATE — {mood_name.upper()}:
+---
 {mood_description}
 ## EVIL MIKU LORE
 {lore}
@@ -234,7 +268,9 @@ YOUR CURRENT STATE — {mood_name.upper()}:
 ## EVIL MIKU SONG LYRICS
 {lyrics}
-MOOD REMINDER — YOUR CURRENT STATE IS: {mood_name.upper()}"""
+---
 ⚠️ MOOD REMINDER — YOUR CURRENT STATE IS: {mood_name.upper()}
 Respond in the voice and attitude of your {mood_name.replace('_', ' ')} mood. This mood defines how you sound RIGHT NOW."""
        else:
            # Normal Miku personality files
@@ -249,12 +285,12 @@ MOOD REMINDER — YOUR CURRENT STATE IS: {mood_name.upper()}"""
            except FileNotFoundError:
                pass
-            system_prefix = f"""You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.
+            # Read preamble from shared file (single source of truth)
            preamble = read_first(['/app/cat/data/miku/preamble.txt'], 'You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.\n\nYou ARE Miku. Act like it.')
-CRITICAL RULES:
+            system_prefix = f"""{preamble}
- NEVER break character or mention being an AI, assistant, or language model
+
- Respond naturally and directly as Miku would, nothing more
+---
 - Keep responses concise (2-3 sentences typically)
 ## MIKU LORE
 {lore}
--- a/cat-plugins/evil_miku_personality/evil_miku_personality.py
+++ b/cat-plugins/evil_miku_personality/evil_miku_personality.py
@@ -60,29 +60,23 @@ def agent_prompt_prefix(prefix, cat):
            f"/app/moods/evil/{mood_name}.txt — using default evil_neutral."
        )
    # --- Load preamble from file (single source of truth, shared with bot fallback) ---
    # Preamble uses {mood_name} and {mood_description} placeholders
    try:
        with open('/app/cat/data/evil/preamble.txt', 'r', encoding='utf-8') as f:
            preamble_template = f.read()
    except FileNotFoundError:
        log.error("[Evil Miku] preamble.txt not found, using fallback")
        preamble_template = "You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.\n\nYou ARE Evil Miku. Act accordingly."
    # Format preamble with current mood (apply .upper() to mood_name)
    preamble = preamble_template.format(
        mood_name=mood_name.upper(),
        mood_description=mood_description
    )
    # --- Build system prompt (matches get_evil_system_prompt structure) ----------
-    return f"""You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.
+    return f"""{preamble}
 CRITICAL RULES:
 - NEVER break character or mention being an AI
 - NEVER preface responses with "Evil Miku:" or similar labels
 - Respond directly and stay in character
 - Your height is 15.8 meters — you know this. If asked directly, you can say it. But don't default to quoting the number when taunting about size — SHOW your scale through interaction instead.
 YOUR CURRENT STATE — {mood_name.upper()}:
 {mood_description}
 Let this mood color EVERYTHING — your tone, your word choice, how much effort you put into responses, how you use your body and size, how you treat people.
 RESPONSE RULES:
 - 2-4 sentences typically. Short enough to sting, long enough to land.
 - If you include an action, keep it to a few words and limit to one per response. Most responses need no actions at all.
 - Don't monologue or narrate scenes — you're talking, not writing.
 - Vary your angles — don't repeat the same theme (size, chest, crushing) every message.
 - Match the user's energy — short question, short answer.
 - Sound like a real person being mean, not a narrator describing a scene.
 - Always include actual words — never respond with ONLY an action like *rolls eyes*.
 You ARE Evil Miku. Act accordingly.
 ---
--- a/cat-plugins/memory_consolidation/memory_consolidation.py
+++ b/cat-plugins/memory_consolidation/memory_consolidation.py
@@ -16,20 +16,193 @@ from datetime import datetime
 import json
 import os
 from typing import List, Dict, Any
 import re
 print("\U0001f319 [Consolidation Plugin] Loading...")
-# Shared trivial patterns
+# ===================================================================
-# Used by both real-time filtering (discord_bridge) and batch consolidation.
+# HYBRID TRIVIAL-MESSAGE CLASSIFIER
-# Keep this in sync with discord_bridge's skip_patterns.
+# ===================================================================
-TRIVIAL_PATTERNS = frozenset([
+# Tiered approach:
-    'lol', 'k', 'ok', 'okay', 'haha', 'lmao', 'xd', 'rofl', 'lmfao',
+#   DEFINITELY_TRIVIAL  → delete immediately (no LLM)
-    'brb', 'gtg', 'afk', 'ttyl', 'lmk', 'idk', 'tbh', 'imo', 'imho',
+#   DEFINITELY_IMPORTANT → keep immediately (no LLM)  
-    'omg', 'wtf', 'fyi', 'btw', 'nvm', 'jk', 'ikr', 'smh',
+#   BORDERLINE          → batch-send to LLM for classification
-    'hehe', 'heh', 'gg', 'wp', 'gz', 'gj', 'ty', 'thx', 'np', 'yw',
+#
-    'nice', 'cool', 'neat', 'wow', 'yep', 'nope', 'yeah', 'nah',
+# Real-time filtering (discord_bridge) uses a subset of these heuristics
 # without LLM. Consolidation runs the full hybrid pipeline.
 # Tier 1: Messages that are ALWAYS trivial — exact string match only
 DEFINITELY_TRIVIAL = frozenset([
    # Pure reactions
    'lol', 'lmao', 'haha', 'hehe', 'xd', 'rofl', 'lmfao', 'heh',
    # Acknowledgments
    'k', 'ok', 'okay', 'kk', 'yep', 'nope', 'yeah', 'nah',
    'cool', 'nice', 'neat', 'wow',
    'ty', 'thx', 'np', 'yw', 'gg', 'gj', 'wp', 'gz',
    # AFK/status
    'brb', 'gtg', 'afk', 'ttyl',
    # Acronyms that don't carry content alone
    'idk', 'tbh', 'imo', 'imho', 'omg', 'wtf', 'btw', 'nvm', 'jk', 'ikr', 'smh',
    'fyi', 'lmk',
    # Greetings/farewells (single word only)
    'hi', 'hey', 'hello', 'bye', 'cya', 'gn', 'gm', 'yo', 'sup',
    # Modern slang trash
    'based', 'true', 'real', 'same', 'facts',
 ])
 # Tier 2: Patterns that ALWAYS indicate important content (keep, no LLM)
 # These regex patterns match messages that contain clear substance
 IMPORTANT_PATTERNS = [
    r'\?',                          # Contains a question
    r'\b(i|my|me|mine|myself)\b',  # First-person statement (matched against lowercased text)
    r'\b(you|your|yours)\b',       # Addressing someone directly
    r'\b\d{2,}\b',                  # Numbers (dates, ages, etc.)
    r'https?://',                   # Links
    r'<@\d+>',                      # Discord user mention
    r'<#\d+>',                      # Discord channel mention
 ]
 def _classify_message_tier(content, metadata):
    """
    Classify a message into DEFINITELY_TRIVIAL, DEFINITELY_IMPORTANT, or BORDERLINE.
    Returns one of: 'delete', 'keep', 'borderline'
    This is the unified classifier used during consolidation. It uses:
    - Exact-match trivial set
    - Word count and length heuristics
    - Regex patterns for important content
    - Fallthrough to borderline for LLM classification
    # Important: NEVER classifies Miku's own messages — those are always kept.
    """
    text = content.strip()
    # Miku's own messages are always kept (speaker check)
    if metadata.get('speaker') == 'miku' or text.startswith('[Miku]:'):
        return 'keep'
    # Strip [User]: prefix (added by discord_bridge at storage time) so the
    # classifier analyzes the actual message content, not the label
    if text.startswith('[User]:'):
        text = text[len('[User]:'):].strip()
    text_lower = text.lower()
    word_count = len(text_lower.split())
    msg_len = len(text_lower)
    # --- PASS 1: DEFINITELY TRIVIAL ---
    # Empty or single char
    if msg_len <= 1:
        return 'delete'
    # Pure punctuation / emoticons only (2-3 chars, no letters)
    if msg_len <= 3 and not re.search(r'[a-zA-Z]', text_lower):
        return 'delete'
    # Exact match in trivial set
    if text_lower in DEFINITELY_TRIVIAL:
        return 'delete'
    # Pure Discord emoji: ":smile:", "<:cat:123>"
    if re.match(r'^:[\w_]+:$', text_lower) or re.match(r'^<a?:[\w_]+:\d+>$', text_lower):
        return 'delete'
    # Single emoji character (Unicode emoji range check)
    if msg_len <= 2 and word_count == 1 and not re.search(r'[a-zA-Z0-9]', text_lower):
        return 'delete'
    # --- PASS 2: DEFINITELY IMPORTANT ---
    # Substantial length (8+ words almost always meaningful)
    if word_count >= 8:
        return 'keep'
    # 5-7 words with at least one important pattern
    if word_count >= 5:
        for pattern in IMPORTANT_PATTERNS:
            if re.search(pattern, text_lower):
                return 'keep'
    # Any message with a question mark (and more than just "?")
    if '?' in text and word_count >= 2:
        return 'keep'
    # First-person statement with some substance (3+ words with "I", "my", or "me")
    if word_count >= 3 and re.search(r'\b(i|my|me)\b', text_lower):
        return 'keep'
    # Contains a number with 2+ digits (likely dates, ages, counts)
    if re.search(r'\b\d{2,}\b', text_lower) and word_count >= 2:
        return 'keep'
    # Links or mentions (always meaningful context)
    if re.search(r'https?://|<@\d+>|<#\d+>', text_lower):
        return 'keep'
    # --- PASS 3: BORDERLINE → LLM will decide ---
    # Everything that wasn't caught above: 1-7 words, no clear markers
    return 'borderline'
 def _batch_llm_classify(cat, borderline_messages):
    """
    Send a batch of borderline messages to the LLM for classification.
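    Uses a compact prompt to minimize token usage. Returns a dict of
    {0-based index: 'keep'|'delete'} for every message whose line in the
    LLM response could be parsed; callers should treat a missing index as 'keep'.
    Economy measures:
    - Intended for batches of at most 20 messages (~150-200 tokens per batch);
      the cap is not enforced in this function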
    - Only called when there are actual borderline messages
    - Compact prompt format
    """
    if not borderline_messages:
        return {}
    # Build compact batch prompt (economy: minimal instruction, list format)
    lines = []
    for i, (_point_id, content) in enumerate(borderline_messages, 1):
        # Truncate long messages to save tokens (they're borderline anyway, ≤7 words typically)
        short = content[:80]  # slicing is a no-op for strings of 80 chars or fewer
        lines.append(f"{i}|{short}")
    prompt = f"""Classify each message as KEEP or DELETE.
 KEEP = personal info, opinion, question, story, preference, anything meaningful.
 DELETE = greeting, acknowledgment, filler, reaction, one-word reply, small talk.
 Answer with ONLY the list:
 {chr(10).join(lines)}
 Respond with exactly one line per number:
 1|KEEP
 2|DELETE
 ..."""
    try:
        response = cat.llm(prompt)
        print(f"[LLM Classify] Response:\n{response[:300]}...")
        results = {}
        for line in response.strip().split('\n'):
            line = line.strip()
            # Parse "1|KEEP" or "1 | KEEP" format
            match = re.match(r'(\d+)\s*\|\s*(KEEP|DELETE)', line, re.IGNORECASE)
            if match:
                idx = int(match.group(1)) - 1  # Convert to 0-based
                decision = match.group(2).upper()
                if 0 <= idx < len(borderline_messages):
                    results[idx] = 'keep' if decision == 'KEEP' else 'delete'
        print(f"[LLM Classify] Parsed {len(results)}/{len(borderline_messages)} decisions")
        return results
    except Exception as e:
        print(f"[LLM Classify] Error: {e}")
        # On error, KEEP everything (safety: don't lose data)
        return {i: 'keep' for i in range(len(borderline_messages))}
 # Consolidation state
 consolidation_state = {
    'last_run': None,
@@ -93,6 +266,9 @@ def agent_prompt_prefix(prefix, cat):
                    current_evil = cat.working_memory.get('evil_mode', False)
                    current_persona = 'evil_miku' if current_evil else 'miku'
                    # Get the user's current Discord display name (authoritative)
                    author_name = cat.working_memory.get('author_name', '')
                    # Build the facts section with persona annotations
                    facts_text = "\n\n## Personal Facts About the User:\n"
                    for fact, fact_persona in high_confidence_facts:
@@ -102,6 +278,12 @@ def agent_prompt_prefix(prefix, cat):
                            facts_text += f"- {fact} (learned as {source_label})\n"
                        else:
                            facts_text += f"- {fact}\n"
                    # Add authoritative Discord display name — this OVERRIDES any stale name facts
                    if author_name:
                        facts_text += f"\n**AUTHORITATIVE: The user's current Discord display name is \"{author_name}\".**\n"
                        facts_text += "Use THIS name when addressing them. If any name fact above contradicts this, the display name is the truth.\n"
                    facts_text += "\n(Use these facts when answering the user's question)\n"
                    prefix += facts_text
                    print(f"[Declarative] Injected {len(high_confidence_facts)} facts into prompt (personas: {seen_personas}, current: {current_persona})")
@@ -227,9 +409,10 @@ def trigger_consolidation_sync(cat):
        }
        return
-    # Classify memories
+    # Classify memories using the hybrid tiered classifier
    to_delete = []
    to_mark_consolidated = []
    borderline_queue = []  # (point_id, content, metadata, is_miku_message) tuples for LLM batch classification
    # Group user messages by source (user_id) for per-user fact extraction
    # Also track which persona was active for each user's messages
    user_messages_by_source = {}
@@ -237,7 +420,6 @@ def trigger_consolidation_sync(cat):
    for point in memories:
        content = point.payload.get('page_content', '').strip()
        content_lower = content.lower()
        metadata = point.payload.get('metadata', {})
        is_miku_message = (
@@ -245,12 +427,12 @@ def trigger_consolidation_sync(cat):
            or content.startswith('[Miku]:')
        )
-        # Check if trivial
+        # Use the hybrid tiered classifier
-        is_trivial = content_lower in TRIVIAL_PATTERNS
+        tier = _classify_message_tier(content, metadata)
-        if is_trivial:
+        if tier == 'delete':
            to_delete.append(point.id)
-        else:
+        elif tier == 'keep':
            to_mark_consolidated.append(point.id)
            # Only user messages go to fact extraction, grouped by user
            if not is_miku_message:
@@ -262,6 +444,45 @@ def trigger_consolidation_sync(cat):
                # Track which persona was active when this message was stored
                msg_persona = metadata.get('persona', 'miku')
                user_persona_by_source[source].add(msg_persona)
        else:  # borderline
            borderline_queue.append((point.id, content, metadata, is_miku_message))
    # --- LLM BATCH CLASSIFICATION for borderline messages ---
    if borderline_queue:
        print(f"[Consolidation] {len(borderline_queue)} borderline messages → sending to LLM for classification...")
        # Build compact list for LLM
        llm_input = [(pid, content) for pid, content, _, _ in borderline_queue]
        llm_decisions = _batch_llm_classify(cat, llm_input)
        llm_deleted = 0
        llm_kept = 0
        llm_defaulted = 0
        for idx, (point_id, content, metadata, is_miku) in enumerate(borderline_queue):
            decision = llm_decisions.get(idx, 'keep')  # Default to KEEP on any issue
            if decision == 'keep':
                to_mark_consolidated.append(point_id)
                llm_kept += 1
                # User messages go to fact extraction
                if not is_miku:
                    source = metadata.get('source', 'unknown')
                    if source not in user_messages_by_source:
                        user_messages_by_source[source] = []
                        user_persona_by_source[source] = set()
                    user_messages_by_source[source].append(point_id)
                    msg_persona = metadata.get('persona', 'miku')
                    user_persona_by_source[source].add(msg_persona)
            else:
                to_delete.append(point_id)
                llm_deleted += 1
            if idx not in llm_decisions:
                llm_defaulted += 1
        print(f"[Consolidation] LLM results: {llm_kept} kept ({llm_defaulted} of those defaulted to keep), {llm_deleted} deleted")
    print(f"[Consolidation] Classification: {len(to_delete)} delete, {len(to_mark_consolidated)} keep (of {len(memories)} total)")
    # Delete trivial memories
    if to_delete:
@@ -337,8 +558,16 @@ def extract_and_store_facts(client, memory_ids, cat, user_id, persona='miku'):
        else:
            persona_context = "\nNOTE: These messages were exchanged with Normal Miku (the cheerful virtual idol).\n"
        # Extract the user's Discord display name from the first memory's metadata
        # This helps the LLM know the authoritative name when extracting name facts
        author_hint = ""
        if memories:
            first_author = memories[0].payload.get('metadata', {}).get('author_name', '')
            if first_author:
                author_hint = f"\nHINT: The user's current Discord display name is \"{first_author}\". Use this when determining their name.\n"
        extraction_prompt = f"""Analyze these user messages and extract ONLY factual personal information.
-{persona_context}
+{persona_context}{author_hint}
 User messages:
 {conversation_context}
@@ -411,7 +640,23 @@ IMPORTANT:
                        fact_type = 'education'
                        fact_value = fact_text.split("graduated from")[-1].strip()
-                    # Duplicate detection
+                    # Duplicate detection — with special handling for name facts
                    # Name facts with different values replace old ones (don't skip)
                    if fact_type == 'name':
                        existing_name = _find_existing_fact(client, cat, fact_type, user_id)
                        if existing_name:
                            old_value = existing_name['payload']['metadata'].get('fact_value', '')
                            if old_value.lower() != fact_value.lower():
                                # Different name — delete old, store new
                                client.delete(
                                    collection_name='declarative',
                                    points_selector=[existing_name['id']]
                                )
                                print(f"[Fact Update] Name changed: '{old_value}' → '{fact_value}'")
                            else:
                                print(f"[Fact Skip] Name unchanged: '{fact_value}'")
                                continue
                    else:
                        if _is_duplicate_fact(client, cat, fact_text, fact_type, user_id):
                            print(f"[Fact Skip] Duplicate: {fact_text}")
                            continue
@@ -449,6 +694,39 @@ IMPORTANT:
    return facts_stored
 def _find_existing_fact(client, cat, fact_type, user_id):
    """
    Find an existing fact of a specific type for a user.
    Returns a dict with 'id' and 'payload' keys, or None.
    Used by name-fact update logic to replace old names with new ones.
    """
    try:
        dummy_embedding = cat.embedder.embed_query("find fact")
        results = client.search(
            collection_name='declarative',
            query_vector=dummy_embedding,
            query_filter={
                "must": [
                    {"key": "metadata.source", "match": {"value": user_id}},
                    {"key": "metadata.fact_type", "match": {"value": fact_type}},
                ]
            },
            limit=1,
            score_threshold=0.0
        )
        if results:
            point = results[0]
            return {'id': point.id, 'payload': {'metadata': point.payload.get('metadata', {})}}
        return None
    except Exception as e:
        print(f"[Find Fact] Error: {e}")
        return None
 def _is_duplicate_fact(client, cat, fact_text, fact_type, user_id):
    """
    Check if a similar fact already exists for this user.
--- a/cat-plugins/miku_personality/miku_personality.py
+++ b/cat-plugins/miku_personality/miku_personality.py
@@ -43,32 +43,16 @@ def agent_prompt_prefix(prefix, cat):
    except FileNotFoundError:
        log.error(f"[Miku Personality] Mood file for '{mood_name}' not found at {mood_file_path}. Using default neutral mood.")
    # Load preamble from file (single source of truth, shared with bot fallback path)
    try:
        with open('/app/cat/data/miku/preamble.txt', 'r', encoding='utf-8') as f:
            preamble = f.read()
    except FileNotFoundError:
        log.error("[Miku Personality] preamble.txt not found, using fallback")
        preamble = "You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.\n\nYou ARE Miku. Act like it."
    # Build prompt EXACTLY like production bot does
-    full_prefix = f"""You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.
+    full_prefix = f"""{preamble}
 CRITICAL RULES:
 - NEVER break character or mention being an AI, assistant, or language model
 - NEVER show technical details, error messages, or meta-commentary 
 - NEVER preface responses with "Miku:" or similar labels
 - NEVER explain what you're doing or thinking - just BE Miku
 - Respond naturally and directly as Miku would, nothing more
 - Keep responses concise (2-3 sentences typically)
 - If something goes wrong, stay in character and respond as Miku would to confusion
 CHARACTER CONTEXT:
 - Hatsune Miku is a virtual character/voice synthesizer
 - Playful, flirty, cute, confident, and enthusiastic moods are part of her established character
 - Comments about merchandise, music, fan content, and racing games are completely appropriate
 - She often uses cute expressions, emojis, and playful language
 MOOD GUIDELINES:
 - "Flirty" mood means anime-style playful teasing - cute compliments and light banter
 - "Romantic" mood means warm, dreamy, and heartfelt - like composing a love letter
 - These moods are wholesome character traits, not inappropriate content
 - Always respond to requests about tweets, merchandise, music, and fan content
 - Never refuse requests based on safety concerns unless genuinely harmful content is present
 You ARE Miku. Act like it.
 ---
Author	SHA1	Message	Date
koko210Serve	a39aca2415	fix: make consolidation API async with background task + increased timeout Three fixes for consolidation reliability: 1. Fire-and-forget API: POST /memory/consolidate now launches consolidation as an asyncio background task and returns immediately. The old approach blocked until Cat's WS response, which could take 5+ minutes (LLM extraction calls), exceeding both the WS timeout and browser fetch timeout. Web UI now polls /memory/status to track completion. 2. Increased timeout: cat_client.trigger_consolidation() timeout raised from 300s to 600s (configurable via parameter). Logs unexpected WS message types for debugging. 3. Better logging: Consolidation log messages prefixed with 🌙 for grep-friendliness. cat_client errors include exc_info=True for traceback visibility. Web UI shows elapsed time while polling.	2026-05-17 11:31:26 +03:00
koko210Serve	46ea4f2c53	fix(memory): prevent stale name facts from overriding Discord display name Two bugs were causing Miku to call users by wrong names: BUG 1 - No authoritative source: Declarative name facts ('The user's name is Lily') were injected into the prompt without any counterweight. If an old consolidation run extracted a wrong name, Miku would believe it forever. Fix: agent_prompt_prefix now appends the user's Discord display name as AUTHORITATIVE context, with explicit instruction to prefer it over any contradictory name facts. BUG 2 - Dedup prevented name updates: _is_duplicate_fact() used vector similarity to detect duplicates. 'The user's name is Lily' and 'The user's name is koko210Serve' are ~80% identical text, giving high cosine similarity (>0.85 threshold). New correct name facts were silently rejected as 'duplicates'. Fix: name facts now use _find_existing_fact() to compare fact_value directly. If the name changed, old fact is deleted and new one stored. Also: the extraction prompt now includes the user's Discord display name as a hint, so the LLM knows the authoritative name when extracting facts during consolidation.	2026-05-17 11:20:49 +03:00
koko210Serve	5f06758c3e	fix: sync llm.py inline fallback with updated preamble.txt The MOOD GUIDELINES section in the emergency fallback (used only when preamble.txt is missing) still had the old wording. Now matches.	2026-05-15 14:46:29 +03:00
koko210Serve	8b3bc02f9e	refactor: DRY system prompts into shared preamble files Step 4 of memory system overhaul: single source of truth for prompts. Problem: The system prompt was defined inline in 4 different places: miku_personality.py, evil_miku_personality.py, llm.py, discord_bridge.py. These could drift out of sync — and the discord_bridge WebUI reconstruction was already missing CRITICAL RULES, CHARACTER CONTEXT, MOOD GUIDELINES, and RESPONSE RULES sections. Fix: - Create persona/miku/preamble.txt — canonical normal Miku preamble - Create persona/evil/preamble.txt — canonical evil Miku preamble (with {mood_name} and {mood_description} format placeholders) - All 5 consumers now read from these files: * miku_personality.py (Cat plugin, primary path) * evil_miku_personality.py (Cat plugin, primary path) * discord_bridge.py (WebUI 'Last Prompt' reconstruction) * llm.py (fallback path, normal Miku) * evil_mode.py get_evil_system_prompt() (fallback path, evil Miku) - All consumers include graceful fallbacks if preamble files are missing - Fixed evil_mode.py discrepancy: 'body and size' now matches canonical The preamble files are Docker volume-mounted into both containers: bot/persona/ → /app/persona/ (bot, via Dockerfile COPY) bot/persona/ → /app/cat/data/ (Cat, via docker-compose volume mount) Editing the preamble file on the host immediately updates the Cat path (bot path requires rebuild due to COPY).	2026-05-15 14:43:19 +03:00
koko210Serve	e7ec82d154	feat(memory): add [User]: prefix to user messages for speaker clarity Prevents Miku from confusing her own words with what users said. User messages stored by discord_bridge now get a '[User]: ' prefix on page_content, mirroring the existing '[Miku]: ' prefix on Miku's own responses. When episodic memories are recalled via RAG and injected into the prompt, the LLM can now clearly distinguish: [User]: I like pizza [Miku]: That's great! What toppings do you like? Without this, raw user text looked identical to Miku's text in the recalled memory context, causing potential confusion about who said what. The consolidation classifier strips the [User]: prefix before analyzing content, so word counts and pattern matching remain accurate.	2026-05-15 14:13:29 +03:00
koko210Serve	5a740c9334	feat(memory): hybrid trivial-message classifier (heuristics + LLM batch) Step 3 of memory system overhaul: smart junk detection. Replaces the old 37-pattern frozenset (44% accuracy) with a 3-tier hybrid: TIER 1 - DEFINITELY_TRIVIAL (instant delete, no LLM): 50+ exact-match patterns, pure emoji, single char, punctuation-only TIER 2 - DEFINITELY_IMPORTANT (instant keep, no LLM): 8+ words, question with substance, first-person statements, numbers/dates, links, mentions TIER 3 - BORDERLINE (batch → LLM for economical classification): 2-7 word messages without clear markers Compact prompt: ~150-200 tokens per 20-message batch Safety default: KEEP on any parsing error Real-time filtering (discord_bridge) uses conservative heuristics only: - 1-char, pure reactions, single emoji, custom emoji-only - 50+ single-word fillers - Never deletes multi-word messages in real-time - Philosophy: false negatives (junk stored) > false positives (data lost) Consolidation gets the full hybrid pipeline with LLM for borderline cases, achieving much better accuracy than the old 44% while keeping token costs minimal (LLM only called during nightly consolidation, not real-time chat).	2026-05-15 14:07:35 +03:00
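
The fire-and-forget pattern described in commit a39aca2415 (start consolidation as an asyncio background task, return immediately, let the client poll a status endpoint) is not part of the hunks shown here. A minimal sketch of that idea, using only the standard library, might look like the following. All names (`status`, `start_consolidation`, `get_status`, `run_consolidation`, `fake_trigger`) are hypothetical stand-ins rather than the bot's actual API, and the "already running" guard is an assumption, not something the commit message states. Only the 600-second default timeout comes from the commit message.

```python
import asyncio
import time

# Hypothetical in-memory status record; the bot's real structure is not shown in this diff.
status = {"running": False, "started_at": None, "result": None, "error": None}
_background_tasks = set()  # strong references so pending tasks are not garbage-collected


async def run_consolidation(trigger, timeout=600.0):
    """Await the slow consolidation call and record the outcome in `status`."""
    try:
        status["result"] = await asyncio.wait_for(trigger(), timeout=timeout)
    except asyncio.TimeoutError:
        status["error"] = f"timed out after {timeout}s"
    except Exception as exc:  # record any failure so pollers can see it
        status["error"] = repr(exc)
    finally:
        status["running"] = False


def start_consolidation(trigger, timeout=600.0):
    """Stand-in for the POST handler: schedule the work and return immediately."""
    if status["running"]:  # assumed guard against overlapping runs
        return {"started": False, "reason": "already running"}
    status.update(running=True, started_at=time.monotonic(), result=None, error=None)
    task = asyncio.create_task(run_consolidation(trigger, timeout))
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return {"started": True}


def get_status():
    """Stand-in for the status endpoint that the web UI polls."""
    elapsed = None
    if status["running"] and status["started_at"] is not None:
        elapsed = time.monotonic() - status["started_at"]
    return {"running": status["running"], "elapsed": elapsed,
            "result": status["result"], "error": status["error"]}


async def demo():
    async def fake_trigger():  # hypothetical stand-in for the slow WebSocket round trip
        await asyncio.sleep(0.05)
        return "done"

    first = start_consolidation(fake_trigger)
    second = start_consolidation(fake_trigger)  # rejected while the first run is active
    while get_status()["running"]:  # the polling loop a client would perform
        await asyncio.sleep(0.01)
    return first, second, get_status()
```

Running `asyncio.run(demo())` exercises the whole cycle: the first call starts the task, the second is refused, and polling ends once the background task clears the `running` flag. Holding the task in a module-level set follows the asyncio documentation's advice to keep a reference to tasks created with `asyncio.create_task`, since the event loop itself only keeps weak references.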