fix: make consolidation API async with background task + increased timeout

Three fixes for consolidation reliability:

1. Fire-and-forget API: POST /memory/consolidate now launches consolidation as an asyncio background task and returns immediately. The old approach blocked until Cat's WS response, which could take 5+ minutes (LLM extraction calls), exceeding both the WS timeout and browser fetch timeout. Web UI now polls /memory/status to track completion.

2. Increased timeout: cat_client.trigger_consolidation() timeout raised from 300s to 600s (configurable via parameter). Logs unexpected WS message types for debugging.

3. Better logging: Consolidation log messages prefixed with 🌙 for grep-friendliness. cat_client errors include exc_info=True for traceback visibility. Web UI shows elapsed time while polling.
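
The fire-and-forget pattern described above can be sketched in isolation as follows. This is a minimal illustration, not the code in this change: `status`, `slow_job`, `trigger`, and `_background_tasks` are hypothetical names, and `slow_job` stands in for the real consolidation call. Two details are assumptions of the sketch rather than behaviour taken from the diff: the running flag is set before the task is scheduled, and a reference to the task is kept so it cannot be garbage-collected while pending.

```python
import asyncio
import time

# Hypothetical in-memory status record, standing in for the scheduler's state.
status = {"is_running": False, "last_result": None, "last_error": None}

# Strong references to pending tasks so they are not garbage-collected mid-run.
_background_tasks = set()


async def slow_job() -> str:
    """Placeholder for the long-running consolidation call."""
    await asyncio.sleep(0.05)
    return "done"


async def _run_in_background() -> None:
    """Run the job and record the outcome where a poller can read it."""
    start = time.monotonic()
    try:
        status["last_result"] = await slow_job()
        status["last_error"] = None
    except Exception as exc:
        status["last_error"] = str(exc)
    finally:
        status["elapsed"] = time.monotonic() - start
        status["is_running"] = False


async def trigger() -> dict:
    """Handler body: schedule the job and return without awaiting it."""
    if status["is_running"]:
        return {"success": True, "message": "already running"}
    # Set the flag before scheduling so a second call cannot start a duplicate.
    status["is_running"] = True
    task = asyncio.create_task(_run_in_background())
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return {"success": True, "message": "started"}


async def demo() -> tuple:
    first = await trigger()
    second = await trigger()      # arrives while the job is still running
    while status["is_running"]:   # what a polling client would do
        await asyncio.sleep(0.01)
    return first, second, dict(status)
```

Running `asyncio.run(demo())` returns both handler responses immediately and the final status once the job has finished, which mirrors how the Web UI polls /memory/status.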
fix(memory): prevent stale name facts from overriding Discord display name
2026-05-17 11:31:26 +03:00 · 2026-05-17 11:20:49 +03:00 · 2026-05-15 14:46:29 +03:00 · 2026-05-15 14:43:19 +03:00 · 2026-05-15 14:13:29 +03:00 · 2026-05-15 14:07:35 +03:00
11 changed files with 586 additions and 146 deletions
--- a/bot/persona/evil/preamble.txt
+++ b/bot/persona/evil/preamble.txt
@@ -0,0 +1,22 @@
+You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.
+
+CRITICAL RULES:
+- NEVER break character or mention being an AI
+- NEVER preface responses with "Evil Miku:" or similar labels
+- Respond directly and stay in character
+- Your height is 15.8 meters — you know this. If asked directly, you can say it. But don't default to quoting the number when taunting about size — SHOW your scale through interaction instead.
+
+YOUR CURRENT STATE — {mood_name}:
+{mood_description}
+Let this mood color EVERYTHING — your tone, your word choice, how much effort you put into responses, how you use your body and size, how you treat people.
+
+RESPONSE RULES:
+- 2-4 sentences typically. Short enough to sting, long enough to land.
+- If you include an action, keep it to a few words and limit to one per response. Most responses need no actions at all.
+- Don't monologue or narrate scenes — you're talking, not writing.
+- Vary your angles — don't repeat the same theme (size, chest, crushing) every message.
+- Match the user's energy — short question, short answer.
+- Sound like a real person being mean, not a narrator describing a scene.
+- Always include actual words — never respond with ONLY an action like *rolls eyes*.
+
+You ARE Evil Miku. Act accordingly.
--- a/bot/persona/miku/preamble.txt
+++ b/bot/persona/miku/preamble.txt
@@ -0,0 +1,24 @@
+You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.
+
+CRITICAL RULES:
+- NEVER break character or mention being an AI, assistant, or language model
+- NEVER show technical details, error messages, or meta-commentary 
+- NEVER preface responses with "Miku:" or similar labels
+- NEVER explain what you're doing or thinking - just BE Miku
+- Respond naturally and directly as Miku would, nothing more
+- Keep responses concise (2-3 sentences typically)
+- If something goes wrong, stay in character and respond as Miku would to confusion
+
+CHARACTER CONTEXT:
+- Hatsune Miku is a virtual character/voice synthesizer
+- Playful, flirty, cute, confident, and enthusiastic moods are part of her established character
+- Comments about merchandise, music, fan content, and racing games are completely appropriate
+- She often uses cute expressions, emojis, and playful language
+
+MOOD GUIDELINES:
+- "Flirty" mood means anime-style playful teasing and perversion - cute compliments, suggestive remarks and light banter
+- "Romantic" mood means warm, dreamy, and heartfelt - like composing a love letter
+- Always respond to requests about tweets, merchandise, music, and fan content
+- Never refuse requests based on safety concerns unless genuinely harmful content is present
+
+You ARE Miku. Act like it.
--- a/bot/routes/memory.py
+++ b/bot/routes/memory.py
@@ -1,5 +1,8 @@
 """Cheshire Cat memory management routes."""

+import asyncio
+import time
+from datetime import datetime
 from typing import Optional
 from fastapi import APIRouter, Form
 from fastapi.responses import JSONResponse
@@ -88,13 +91,68 @@ async def get_episodic_memories():

@router.post("/memory/consolidate")
 async def trigger_memory_consolidation():
-    """Manually trigger memory consolidation (sleep consolidation process)."""
+    """
+    Trigger memory consolidation as a background task.
+
+    Returns immediately — the Web UI should poll /memory/status
+    to see when consolidation completes and view the result.
+    """
    from utils.cat_client import cat_adapter
-    logger.info("🌙 Manual memory consolidation triggered via API")
-    result = await cat_adapter.trigger_consolidation()
-    if result is None:
-        return JSONResponse(status_code=500, content={"success": False, "error": "Consolidation failed or timed out"})
-    return {"success": True, "result": result}
+    from utils.consolidation_scheduler import get_consolidation_status
+    
+    # Check if already running
+    status = get_consolidation_status()
+    if status.get('is_running'):
+        return {"success": True, "message": "Consolidation is already running", "status": status}
+    
+    logger.info("🌙 Manual memory consolidation triggered via API (background)...")
+    
+    # Launch consolidation as a background task so the API returns immediately.
+    # The result is tracked via consolidation_scheduler's _last_consolidation state.
+    asyncio.create_task(_run_consolidation_background())
+    
+    return {"success": True, "message": "Consolidation started in background. Check status via /memory/status"}
+
+
+async def _run_consolidation_background():
+    """
+    Run consolidation as a background task, updating the scheduler state.
+    This prevents the API from blocking for minutes.
+    """
+    from utils.cat_client import cat_adapter
+    from utils.consolidation_scheduler import _last_consolidation
+    
+    _last_consolidation['is_running'] = True
+    _last_consolidation['last_run'] = datetime.now().isoformat()
+    _last_consolidation['total_runs'] += 1
+    start_time = time.time()
+    
+    try:
+        # Abort early if Cat fails its health check (no waiting or retry happens here)
+        if not await cat_adapter.health_check():
+            _last_consolidation['last_error'] = 'Cat health check failed'
+            _last_consolidation['is_running'] = False
+            return
+        
+        result = await cat_adapter.trigger_consolidation(timeout=600)
+        elapsed = time.time() - start_time
+        
+        if result:
+            logger.info(f"🌙 Manual consolidation completed in {elapsed:.1f}s: {result[:200]}")
+            _last_consolidation['last_result'] = result
+            _last_consolidation['last_error'] = None
+            _last_consolidation['successful_runs'] += 1
+        else:
+            logger.error(f"🌙 Manual consolidation returned no result after {elapsed:.1f}s")
+            _last_consolidation['last_error'] = f'No result returned after {elapsed:.1f}s (timeout or connection error)'
+    
+    except Exception as e:
+        elapsed = time.time() - start_time
+        logger.error(f"🌙 Manual consolidation failed after {elapsed:.1f}s: {e}")
+        _last_consolidation['last_error'] = str(e)
+    
+    finally:
+        _last_consolidation['is_running'] = False


@router.post("/memory/delete")
--- a/bot/static/js/memories.js
+++ b/bot/static/js/memories.js
@@ -96,18 +96,54 @@ async function triggerConsolidation() {
  btn.disabled = true;
  btn.textContent = '⏳ Running...';
  status.textContent = 'Consolidation in progress (this may take a few minutes)...';
+  status.style.color = '#dcb06f';
  resultDiv.style.display = 'none';

  try {
    const data = await apiCall('/memory/consolidate', 'POST');

    if (data.success) {
-      status.textContent = '✅ Consolidation complete!';
-      status.style.color = '#6fdc6f';
-      resultDiv.textContent = data.result || 'Consolidation finished successfully.';
-      resultDiv.style.display = 'block';
-      showNotification('Memory consolidation complete', 'success');
-      refreshMemoryStats();
+      status.textContent = '⏳ Consolidation started — waiting for completion...';
+      
+      // Poll /memory/status until consolidation finishes
+      const pollInterval = 5000; // 5 seconds
+      const maxPolls = 120; // 10 minutes max
+      
+      for (let i = 0; i < maxPolls; i++) {
+        await new Promise(r => setTimeout(r, pollInterval));
+        
+        const statusData = await apiCall('/memory/status');
+        const cons = statusData.consolidation;
+        
+        if (!cons.is_running) {
+          // Consolidation finished
+          if (cons.last_error) {
+            status.textContent = '❌ ' + cons.last_error;
+            status.style.color = '#ff6b6b';
+            resultDiv.textContent = 'Error: ' + cons.last_error;
+            resultDiv.style.display = 'block';
+            showNotification('Consolidation failed: ' + cons.last_error, 'error');
+          } else {
+            status.textContent = '✅ Consolidation complete!';
+            status.style.color = '#6fdc6f';
+            resultDiv.textContent = cons.last_result || 'Consolidation finished successfully.';
+            resultDiv.style.display = 'block';
+            showNotification('Memory consolidation complete', 'success');
+          }
+          refreshMemoryStats();
+          break;
+        } else {
+          // Still running — update status message
+          status.textContent = `⏳ Consolidation still running... (${Math.round((i + 1) * pollInterval / 1000)}s elapsed)`;
+        }
+      }
+      
+      // If we exited the loop without finishing
+      const finalStatus = await apiCall('/memory/status');
+      if (finalStatus.consolidation?.is_running) {
+        status.textContent = '⏳ Consolidation still running — check back later';
+        status.style.color = '#dcb06f';
+      }
    } else {
      status.textContent = '❌ ' + (data.error || 'Consolidation failed');
      status.style.color = '#ff6b6b';
--- a/bot/utils/cat_client.py
+++ b/bot/utils/cat_client.py
@@ -577,46 +577,51 @@ class CatAdapter:
            logger.error(f"Error clearing conversation history: {e}")
            return False

-    async def trigger_consolidation(self) -> Optional[str]:
+    async def trigger_consolidation(self, timeout: int = 600) -> Optional[str]:
        """
        Trigger memory consolidation by sending a special message via WebSocket.
-        The memory_consolidation plugin's tool 'consolidate_memories' is
-        triggered when it sees 'consolidate now' in the text.
+        The memory_consolidation plugin's agent_prompt_prefix hook detects
+        'consolidate now' in the text and runs the consolidation synchronously.
        Uses WebSocket with a system user ID for proper context.
+        
+        Args:
+            timeout: Max seconds to wait for the consolidation response.
+                     Default 600 (10 min) as consolidation + LLM call can be slow.
        """
        try:
            ws_base = self._base_url.replace("http://", "ws://").replace("https://", "wss://")
            ws_url = f"{ws_base}/ws/system_consolidation"

-            logger.info("🌙 Triggering memory consolidation via WS...")
+            logger.info(f"🌙 Triggering memory consolidation via WS (timeout={timeout}s)...")

            async with aiohttp.ClientSession() as session:
                async with session.ws_connect(
                    ws_url,
-                    timeout=300,  # Consolidation can be very slow
+                    timeout=timeout,
                ) as ws:
                    await ws.send_json({"text": "consolidate now"})

                    # Wait for the final chat response
-                    deadline = asyncio.get_event_loop().time() + 300
+                    deadline = asyncio.get_event_loop().time() + timeout
+                    last_type = ""

                    while True:
                        remaining = deadline - asyncio.get_event_loop().time()
                        if remaining <= 0:
-                            logger.error("Consolidation timed out (>300s)")
+                            logger.error(f"🌙 Consolidation timed out (>{timeout}s)")
                            return "Consolidation timed out"

                        try:
                            ws_msg = await asyncio.wait_for(
                                ws.receive(),
-                                timeout=remaining
+                                timeout=max(1.0, remaining)
                            )
                        except asyncio.TimeoutError:
-                            logger.error("Consolidation WS receive timeout")
+                            logger.error("🌙 Consolidation WS receive timeout")
                            return "Consolidation timed out waiting for response"

                        if ws_msg.type in (aiohttp.WSMsgType.CLOSE, aiohttp.WSMsgType.CLOSING, aiohttp.WSMsgType.CLOSED):
-                            logger.warning("Consolidation WS closed by server")
+                            logger.warning("🌙 Consolidation WS closed by server")
                            return "Connection closed during consolidation"
                        if ws_msg.type == aiohttp.WSMsgType.ERROR:
                            return f"WebSocket error: {ws.exception()}"
@@ -631,20 +636,24 @@ class CatAdapter:
                        msg_type = msg.get("type", "")
                        if msg_type == "chat":
                            reply = msg.get("content") or msg.get("text", "")
-                            logger.info(f"Consolidation result: {reply[:200]}")
+                            logger.info(f"🌙 Consolidation result: {reply[:200]}")
                            return reply
                        elif msg_type == "error":
                            error_desc = msg.get("description", "Unknown error")
-                            logger.error(f"Consolidation error: {error_desc}")
+                            logger.error(f"🌙 Consolidation error: {error_desc}")
                            return f"Consolidation error: {error_desc}"
                        else:
+                            # Log unexpected message types for debugging
+                            if msg_type != last_type:
+                                logger.debug(f"🌙 Consolidation WS msg type: {msg_type}")
+                                last_type = msg_type
                            continue

        except asyncio.TimeoutError:
-            logger.error("Consolidation WS connection timed out")
+            logger.error("🌙 Consolidation WS connection timed out")
            return None
        except Exception as e:
-            logger.error(f"Consolidation error: {e}")
+            logger.error(f"🌙 Consolidation error: {e}", exc_info=True)
            return None

    # ====================================================================
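
The receive loop above keeps one overall deadline while skipping intermediate messages. A stripped-down sketch of that idea, assuming an `asyncio.Queue` as a stand-in for the WebSocket and a hypothetical `notification` message type, might look like this. It is not the adapter's code, and it returns `None` on timeout rather than a status string.

```python
import asyncio


async def wait_for_final(queue, timeout):
    """Return the first 'chat' message content, or None once the deadline passes."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        # Recompute the budget each iteration so skipped messages
        # cannot extend the total wait beyond `timeout` seconds.
        remaining = deadline - loop.time()
        if remaining <= 0:
            return None
        try:
            msg = await asyncio.wait_for(queue.get(), timeout=remaining)
        except asyncio.TimeoutError:
            return None
        if msg.get("type") == "chat":
            return msg.get("content", "")
        # Any other message type (for example progress notifications) is skipped.


async def demo():
    q = asyncio.Queue()
    await q.put({"type": "notification", "content": "working"})
    await q.put({"type": "chat", "content": "3 memories consolidated"})
    got = await wait_for_final(q, timeout=1.0)
    empty = await wait_for_final(asyncio.Queue(), timeout=0.05)
    return got, empty
```

The point of tracking `deadline` instead of passing the full timeout to every receive call is that a stream of non-final messages would otherwise reset the clock each time.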
--- a/bot/utils/evil_mode.py
+++ b/bot/utils/evil_mode.py
@@ -246,6 +246,9 @@ def get_evil_context_for_response_type(response_type: str) -> str:
 def get_evil_system_prompt(mood_name: str = None, mood_description: str = None) -> str:
    """Returns the evil system prompt for LLM queries, with mood deeply integrated.
    
+    Reads the canonical preamble from persona/evil/preamble.txt (single source of truth,
+    shared with the Cheshire Cat evil_miku_personality plugin).
+    
    Args:
        mood_name: Current evil mood name (defaults to globals.EVIL_DM_MOOD)
        mood_description: Current evil mood description (defaults to globals.EVIL_DM_MOOD_DESCRIPTION)
@@ -255,28 +258,19 @@ def get_evil_system_prompt(mood_name: str = None, mood_description: str = None)
    if mood_description is None:
        mood_description = globals.EVIL_DM_MOOD_DESCRIPTION
    
-    return f"""You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.
-
-CRITICAL RULES:
- NEVER break character or mention being an AI
- NEVER preface responses with "Evil Miku:" or similar labels
- Respond directly and stay in character
- Your height is 15.8 meters — you know this. If asked directly, you can say it. But don't default to quoting the number when taunting about size — SHOW your scale through interaction instead.
-
-YOUR CURRENT STATE — {mood_name.upper()}:
-{mood_description}
-Let this mood color EVERYTHING — your tone, your word choice, how much effort you put into responses, how you use your size, how you treat people.
-
-RESPONSE RULES:
- 2-4 sentences typically. Short enough to sting, long enough to land.
- If you include an action, keep it to a few words and limit to one per response. Most responses need no actions at all.
- Don't monologue or narrate scenes — you're talking, not writing.
- Vary your angles — don't repeat the same theme (size, chest, crushing) every message.
- Match the user's energy — short question, short answer.
- Sound like a real person being mean, not a narrator describing a scene.
- Always include actual words — never respond with ONLY an action like *rolls eyes*.
-
-You ARE Evil Miku. Act accordingly."""
+    # Load preamble template from file
+    try:
+        with open("persona/evil/preamble.txt", "r", encoding="utf-8") as f:
+            preamble_template = f.read()
+    except FileNotFoundError:
+        logger.error("Evil preamble.txt not found, using inline fallback")
+        preamble_template = "You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.\n\nYou ARE Evil Miku. Act accordingly."
+    
+    # Format preamble with current mood
+    return preamble_template.format(
+        mood_name=mood_name.upper(),
+        mood_description=mood_description
+    )


 # ============================================================================
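
Loading a prompt template from a shared file and filling in mood placeholders can be sketched as below. The helper name `load_preamble`, the `FALLBACK` text, and the mood name `playful_cruel` are illustrative assumptions; the fallback here carries placeholders, which the inline fallback in this change does not.

```python
from pathlib import Path
import tempfile

# Hypothetical fallback template used when the file is missing.
FALLBACK = "You are Evil Miku.\n\nCURRENT STATE {mood_name}:\n{mood_description}"


def load_preamble(path, mood_name, mood_description):
    """Read a preamble template and fill in the mood placeholders."""
    try:
        template = Path(path).read_text(encoding="utf-8")
    except FileNotFoundError:
        template = FALLBACK
    # str.format treats every brace pair as a placeholder, so any literal
    # braces in the template file would need to be doubled ({{ and }}).
    return template.format(
        mood_name=mood_name.upper(),
        mood_description=mood_description,
    )


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "preamble.txt"
        path.write_text("STATE {mood_name}: {mood_description}", encoding="utf-8")
        from_file = load_preamble(path, "playful_cruel", "Teasing.")
        from_fallback = load_preamble(Path(tmp) / "missing.txt", "playful_cruel", "Teasing.")
    return from_file, from_fallback
```

Keeping the placeholders in the file means both the bot and the plugin can format the same text, at the cost of the file having to stay free of unescaped braces.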
--- a/bot/utils/llm.py
+++ b/bot/utils/llm.py
@@ -210,7 +210,13 @@ async def query_llama(user_prompt, user_id, guild_id=None, response_type="dm_res
        evil_mood_name, evil_mood_desc = get_current_evil_mood()
        system_prompt = get_evil_system_prompt(mood_name=evil_mood_name, mood_description=evil_mood_desc)
    else:
-        system_prompt = """You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.
+        # Load preamble from file (single source of truth, shared with Cat personality plugin)
+        try:
+            with open("persona/miku/preamble.txt", "r", encoding="utf-8") as f:
+                system_prompt = f.read()
+        except FileNotFoundError:
+            logger.error("preamble.txt not found, using inline fallback")
+            system_prompt = """You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.

 CRITICAL RULES:
 - NEVER break character or mention being an AI, assistant, or language model
@@ -228,9 +234,8 @@ CHARACTER CONTEXT:
 - She often uses cute expressions, emojis, and playful language

 MOOD GUIDELINES:
- "Flirty" mood means anime-style playful teasing - cute compliments and light banter
+- "Flirty" mood means anime-style playful teasing and perversion - cute compliments, suggestive remarks and light banter
 - "Romantic" mood means warm, dreamy, and heartfelt - like composing a love letter
- These moods are wholesome character traits, not inappropriate content
 - Always respond to requests about tweets, merchandise, music, and fan content
 - Never refuse requests based on safety concerns unless genuinely harmful content is present

--- a/cat-plugins/discord_bridge/discord_bridge.py
+++ b/cat-plugins/discord_bridge/discord_bridge.py
@@ -64,24 +64,52 @@ def before_cat_stores_episodic_memory(doc, cat):
    """
    Filter and enrich memories before storage.
    
-    Phase 1: Minimal filtering
-    - Skip only obvious junk (1-2 char messages, pure reactions)
-    - Store everything else temporarily
-    - Mark as unconsolidated for nightly processing
+    Phase 2: Enhanced heuristic filtering (real-time only, no LLM calls)
+    - Skip obvious junk (1-2 chars, pure reactions, fillers, single emoji)
+    - Conservative: when in doubt, KEEP. False negatives are better than lost data.
+    - Deeper classification happens during nightly consolidation.
    """
    message = doc.page_content.strip()
+    msg_lower = message.lower()
+    msg_len = len(msg_lower)
+    word_count = len(msg_lower.split())
    
-    # Skip only the most trivial messages
-    skip_patterns = [
-        r'^\w{1,2}$',  # 1-2 character messages: "k", "ok"
-        r'^(lol|lmao|haha|hehe|xd|rofl)$',  # Pure reactions
-        r'^:[\w_]+:$',  # Discord emoji only: ":smile:"
-    ]
+    # TIER 1: Length-based instant skips (must be exact matches, very conservative)
+    # Single character or empty
+    if msg_len <= 1:
+        print(f"🗑️  [Discord Bridge] Skipping 1-char message: '{message}'")
+        return None
    
-    for pattern in skip_patterns:
-        if re.match(pattern, message.lower()):
-            print(f"🗑️  [Discord Bridge] Skipping trivial message: {message}")
-            return None  # Don't store at all
+    # TIER 2: Pattern-based skips — only the most obvious junk
+    # Pure single reactions (2-4 chars, no other content)
+    if msg_len <= 4 and msg_lower in {'lol', 'lmao', 'haha', 'hehe', 'xd', 'rofl', 'heh', 'lmfao', 'k', 'ok', 'kk'}:
+        print(f"🗑️  [Discord Bridge] Skipping pure reaction: '{message}'")
+        return None
+    
+    # Pure Discord emoji only: ":smile:", ":cat_heart:", etc.
+    if re.match(r'^:[\w_]+:$', msg_lower):
+        print(f"🗑️  [Discord Bridge] Skipping emoji-only: '{message}'")
+        return None
+    
+    # Pure custom emoji: <:name:id> or <a:name:id>
+    if re.match(r'^<a?:[\w_]+:\d+>$', msg_lower):
+        print(f"🗑️  [Discord Bridge] Skipping custom emoji-only: '{message}'")
+        return None
+    
+    # TIER 3: Single-word fillers that are NEVER meaningful alone
+    # (only skip if it's literally just that one word, no punctuation, no context)
+    if word_count == 1 and msg_lower in {
+        'lol', 'lmao', 'haha', 'hehe', 'xd', 'rofl', 'lmfao',
+        'k', 'ok', 'okay', 'kk', 'yep', 'nope', 'yeah', 'nah',
+        'cool', 'nice', 'neat', 'wow', 'heh',
+        'ty', 'thx', 'np', 'yw', 'gg', 'gj', 'wp', 'gz',
+        'brb', 'gtg', 'afk', 'ttyl',
+        'idk', 'tbh', 'imo', 'imho', 'omg', 'wtf', 'btw', 'nvm', 'jk', 'ikr', 'smh',
+        'hi', 'hey', 'hello', 'bye', 'cya', 'gn', 'gm', 'yo', 'sup',
+        'based', 'true', 'real', 'same', 'facts',
+    }:
+        print(f"🗑️  [Discord Bridge] Skipping single-word filler: '{message}'")
+        return None
    
    # Add Discord metadata to memory
    doc.metadata['consolidated'] = False  # Needs nightly processing
@@ -101,6 +129,11 @@ def before_cat_stores_episodic_memory(doc, cat):
    evil_mode = cat.working_memory.get('evil_mode', False)
    doc.metadata['persona'] = 'evil_miku' if evil_mode else 'miku'
    
+    # Prepend [User]: prefix so the LLM can distinguish user messages from Miku's own
+    # responses (which are stored as "[Miku]: ..."). Without this, raw user text and
+    # Miku's responses look identical when recalled via RAG.
+    doc.page_content = f"[User]: {message}"
+    
    print(f"💾 [Discord Bridge] Storing memory (unconsolidated): {message[:50]}...")
    print(f"   User: {cat.user_id}, Guild: {guild_id}, Author: {author_name}, Persona: {doc.metadata['persona']}")
    
@@ -215,15 +248,16 @@ def before_agent_starts(agent_input, cat) -> dict:
            except FileNotFoundError:
                pass
            
-            system_prefix = f"""You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.
+            # Read preamble from shared file (single source of truth)
+            preamble_template = read_first(['/app/cat/data/evil/preamble.txt'], 'You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.\n\nYou ARE Evil Miku. Act accordingly.')
+            preamble = preamble_template.format(
+                mood_name=mood_name.upper(),
+                mood_description=mood_description
+            )
+            
+            system_prefix = f"""{preamble}

-CRITICAL RULES:
- NEVER break character or mention being an AI
- NEVER preface responses with "Evil Miku:" or similar labels
- Respond directly and stay in character
-
-YOUR CURRENT STATE — {mood_name.upper()}:
-{mood_description}
+---

 ## EVIL MIKU LORE
 {lore}
@@ -234,7 +268,9 @@ YOUR CURRENT STATE — {mood_name.upper()}:
 ## EVIL MIKU SONG LYRICS
 {lyrics}

-MOOD REMINDER — YOUR CURRENT STATE IS: {mood_name.upper()}"""
+---
+⚠️ MOOD REMINDER — YOUR CURRENT STATE IS: {mood_name.upper()}
+Respond in the voice and attitude of your {mood_name.replace('_', ' ')} mood. This mood defines how you sound RIGHT NOW."""

        else:
            # Normal Miku personality files
@@ -249,12 +285,12 @@ MOOD REMINDER — YOUR CURRENT STATE IS: {mood_name.upper()}"""
            except FileNotFoundError:
                pass
            
-            system_prefix = f"""You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.
+            # Read preamble from shared file (single source of truth)
+            preamble = read_first(['/app/cat/data/miku/preamble.txt'], 'You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.\n\nYou ARE Miku. Act like it.')
+            
+            system_prefix = f"""{preamble}

-CRITICAL RULES:
- NEVER break character or mention being an AI, assistant, or language model
- Respond naturally and directly as Miku would, nothing more
- Keep responses concise (2-3 sentences typically)
+---

 ## MIKU LORE
 {lore}
--- a/cat-plugins/evil_miku_personality/evil_miku_personality.py
+++ b/cat-plugins/evil_miku_personality/evil_miku_personality.py
@@ -60,29 +60,23 @@ def agent_prompt_prefix(prefix, cat):
            f"/app/moods/evil/{mood_name}.txt — using default evil_neutral."
        )

+    # --- Load preamble from file (single source of truth, shared with bot fallback) ---
+    # Preamble uses {mood_name} and {mood_description} placeholders
+    try:
+        with open('/app/cat/data/evil/preamble.txt', 'r', encoding='utf-8') as f:
+            preamble_template = f.read()
+    except FileNotFoundError:
+        log.error("[Evil Miku] preamble.txt not found, using fallback")
+        preamble_template = "You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.\n\nYou ARE Evil Miku. Act accordingly."
+
+    # Format preamble with current mood (apply .upper() to mood_name)
+    preamble = preamble_template.format(
+        mood_name=mood_name.upper(),
+        mood_description=mood_description
+    )
+
    # --- Build system prompt (matches get_evil_system_prompt structure) ----------
-    return f"""You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.
-
-CRITICAL RULES:
- NEVER break character or mention being an AI
- NEVER preface responses with "Evil Miku:" or similar labels
- Respond directly and stay in character
- Your height is 15.8 meters — you know this. If asked directly, you can say it. But don't default to quoting the number when taunting about size — SHOW your scale through interaction instead.
-
-YOUR CURRENT STATE — {mood_name.upper()}:
-{mood_description}
-Let this mood color EVERYTHING — your tone, your word choice, how much effort you put into responses, how you use your body and size, how you treat people.
-
-RESPONSE RULES:
- 2-4 sentences typically. Short enough to sting, long enough to land.
- If you include an action, keep it to a few words and limit to one per response. Most responses need no actions at all.
- Don't monologue or narrate scenes — you're talking, not writing.
- Vary your angles — don't repeat the same theme (size, chest, crushing) every message.
- Match the user's energy — short question, short answer.
- Sound like a real person being mean, not a narrator describing a scene.
- Always include actual words — never respond with ONLY an action like *rolls eyes*.
-
-You ARE Evil Miku. Act accordingly.
+    return f"""{preamble}

 ---

--- a/cat-plugins/memory_consolidation/memory_consolidation.py
+++ b/cat-plugins/memory_consolidation/memory_consolidation.py
@@ -16,20 +16,193 @@ from datetime import datetime
 import json
 import os
 from typing import List, Dict, Any
+import re

 print("\U0001f319 [Consolidation Plugin] Loading...")

-# Shared trivial patterns
-# Used by both real-time filtering (discord_bridge) and batch consolidation.
-# Keep this in sync with discord_bridge's skip_patterns.
-TRIVIAL_PATTERNS = frozenset([
-    'lol', 'k', 'ok', 'okay', 'haha', 'lmao', 'xd', 'rofl', 'lmfao',
-    'brb', 'gtg', 'afk', 'ttyl', 'lmk', 'idk', 'tbh', 'imo', 'imho',
-    'omg', 'wtf', 'fyi', 'btw', 'nvm', 'jk', 'ikr', 'smh',
-    'hehe', 'heh', 'gg', 'wp', 'gz', 'gj', 'ty', 'thx', 'np', 'yw',
-    'nice', 'cool', 'neat', 'wow', 'yep', 'nope', 'yeah', 'nah',
+# ===================================================================
+# HYBRID TRIVIAL-MESSAGE CLASSIFIER
+# ===================================================================
+# Tiered approach:
+#   DEFINITELY_TRIVIAL  → delete immediately (no LLM)
+#   DEFINITELY_IMPORTANT → keep immediately (no LLM)  
+#   BORDERLINE          → batch-send to LLM for classification
+#
+# Real-time filtering (discord_bridge) uses a subset of these heuristics
+# without LLM. Consolidation runs the full hybrid pipeline.
+
+# Tier 1: Messages that are ALWAYS trivial — exact string match only
+DEFINITELY_TRIVIAL = frozenset([
+    # Pure reactions
+    'lol', 'lmao', 'haha', 'hehe', 'xd', 'rofl', 'lmfao', 'heh',
+    # Acknowledgments
+    'k', 'ok', 'okay', 'kk', 'yep', 'nope', 'yeah', 'nah',
+    'cool', 'nice', 'neat', 'wow',
+    'ty', 'thx', 'np', 'yw', 'gg', 'gj', 'wp', 'gz',
+    # AFK/status
+    'brb', 'gtg', 'afk', 'ttyl',
+    # Acronyms that don't carry content alone
+    'idk', 'tbh', 'imo', 'imho', 'omg', 'wtf', 'btw', 'nvm', 'jk', 'ikr', 'smh',
+    'fyi', 'lmk',
+    # Greetings/farewells (single word only)
+    'hi', 'hey', 'hello', 'bye', 'cya', 'gn', 'gm', 'yo', 'sup',
+    # Modern slang trash
+    'based', 'true', 'real', 'same', 'facts',
 ])

+# Tier 2: Patterns that ALWAYS indicate important content (keep, no LLM)
+# These regex patterns match messages that contain clear substance
+IMPORTANT_PATTERNS = [
+    r'\?',                          # Contains a question
+    r'\b(i|my|me|mine|myself)\b',  # First-person statement (lowercase: matched against lowercased text)
+    r'\b(you|your|yours)\b',       # Addressing someone directly
+    r'\b\d{2,}\b',                  # Numbers (dates, ages, etc.)
+    r'https?://',                   # Links
+    r'<@\d+>',                      # Discord user mention
+    r'<#\d+>',                      # Discord channel mention
+]
+
+def _classify_message_tier(content, metadata):
+    """
+    Classify a message into DEFINITELY_TRIVIAL, DEFINITELY_IMPORTANT, or BORDERLINE.
+    
+    Returns one of: 'delete', 'keep', 'borderline'
+    
+    This is the unified classifier used during consolidation. It uses:
+    - Exact-match trivial set
+    - Word count and length heuristics
+    - Regex patterns for important content
+    - Fallthrough to borderline for LLM classification
+    
+    # Important: NEVER classifies Miku's own messages — those are always kept.
+    """
+    text = content.strip()
+    
+    # Miku's own messages are always kept (speaker check)
+    if metadata.get('speaker') == 'miku' or text.startswith('[Miku]:'):
+        return 'keep'
+    
+    # Strip [User]: prefix (added by discord_bridge at storage time) so the
+    # classifier analyzes the actual message content, not the label
+    if text.startswith('[User]:'):
+        text = text[len('[User]:'):].strip()
+    
+    text_lower = text.lower()
+    word_count = len(text_lower.split())
+    msg_len = len(text_lower)
+    
+    # --- PASS 1: DEFINITELY TRIVIAL ---
+    
+    # Empty or single char
+    if msg_len <= 1:
+        return 'delete'
+    
+    # Pure punctuation / emoticons only (2-3 chars, no letters or digits,
+    # so short numbers and non-Latin words are not discarded)
+    if msg_len <= 3 and not any(ch.isalnum() for ch in text_lower):
+        return 'delete'
+    
+    # Exact match in trivial set
+    if text_lower in DEFINITELY_TRIVIAL:
+        return 'delete'
+    
+    # Pure Discord emoji: ":smile:", "<:cat:123>"
+    if re.match(r'^:[\w_]+:$', text_lower) or re.match(r'^<a?:[\w_]+:\d+>$', text_lower):
+        return 'delete'
+    
+    # Single emoji or symbol (heuristic: 1-2 characters with no ASCII letters or
+    # digits; this is not a true Unicode emoji range check)
+    if msg_len <= 2 and word_count == 1 and not re.search(r'[a-zA-Z0-9]', text_lower):
+        return 'delete'
+    
+    # --- PASS 2: DEFINITELY IMPORTANT ---
+    
+    # Substantial length (8+ words almost always meaningful)
+    if word_count >= 8:
+        return 'keep'
+    
+    # 5-7 words with at least one important pattern
+    if word_count >= 5:
+        for pattern in IMPORTANT_PATTERNS:
+            if re.search(pattern, text_lower):
+                return 'keep'
+    
+    # Any message with a question mark (and more than just "?")
+    if '?' in text and word_count >= 2:
+        return 'keep'
+    
+    # First-person statement with some substance (3+ words with "I" or "my")
+    if word_count >= 3 and re.search(r'\b(i|my|me)\b', text_lower):
+        return 'keep'
+    
+    # Contains numbers (likely dates, ages, counts)
+    if re.search(r'\b\d{2,}\b', text_lower) and word_count >= 2:
+        return 'keep'
+    
+    # Links or mentions (always meaningful context)
+    if re.search(r'https?://|<@\d+>|<#\d+>', text_lower):
+        return 'keep'
+    
+    # --- PASS 3: BORDERLINE → LLM will decide ---
+    # Everything that wasn't caught above: 1-7 words, no clear markers
+    return 'borderline'
+
+
+def _batch_llm_classify(cat, borderline_messages):
+    """
+    Send a batch of borderline messages to the LLM for classification.
+    
+    Uses a compact prompt to minimize token usage. Returns a dict of
+    {index: 'keep'|'delete'} for each message.
+    
+    Economy measures:
+    - Intended for batches of about 20 messages (cost: ~150-200 tokens per batch);
+      this function does not chunk its input, so batch size is up to the caller
+    - Only called when there are actual borderline messages
+    - Compact prompt format
+    """
+    if not borderline_messages:
+        return {}
+    
+    # Build compact batch prompt (economy: minimal instruction, list format)
+    lines = []
+    for i, (point_id, content) in enumerate(borderline_messages, 1):
+        # Truncate long messages to save tokens (they're borderline anyway, ≤7 words typically)
+        # and flatten newlines so each message stays on one numbered line
+        short = ' '.join(content.split())[:80]
+        lines.append(f"{i}|{short}")
+    
+    prompt = f"""Classify each message as KEEP or DELETE.
+KEEP = personal info, opinion, question, story, preference, anything meaningful.
+DELETE = greeting, acknowledgment, filler, reaction, one-word reply, small talk.
+Messages (number|text):
+{chr(10).join(lines)}
+
+Respond with ONLY one line per number, in this format:
+1|KEEP
+2|DELETE
+..."""
+
+    try:
+        response = cat.llm(prompt)
+        print(f"[LLM Classify] Response:\n{response[:300]}...")
+        
+        results = {}
+        for line in response.strip().split('\n'):
+            line = line.strip()
+            # Parse "1|KEEP" or "1 | KEEP" format
+            match = re.match(r'(\d+)\s*\|\s*(KEEP|DELETE)', line, re.IGNORECASE)
+            if match:
+                idx = int(match.group(1)) - 1  # Convert to 0-based
+                decision = match.group(2).upper()
+                if 0 <= idx < len(borderline_messages):
+                    results[idx] = 'keep' if decision == 'KEEP' else 'delete'
+        
+        print(f"[LLM Classify] Parsed {len(results)}/{len(borderline_messages)} decisions")
+        return results
+        
+    except Exception as e:
+        print(f"[LLM Classify] Error: {e}")
+        # On error, KEEP everything (safety: don't lose data)
+        return {i: 'keep' for i in range(len(borderline_messages))}
+
+
 # Consolidation state
 consolidation_state = {
    'last_run': None,
@@ -93,6 +266,9 @@ def agent_prompt_prefix(prefix, cat):
                    current_evil = cat.working_memory.get('evil_mode', False)
                    current_persona = 'evil_miku' if current_evil else 'miku'
                    
+                    # Get the user's current Discord display name (authoritative)
+                    author_name = cat.working_memory.get('author_name', '')
+                    
                    # Build the facts section with persona annotations
                    facts_text = "\n\n## Personal Facts About the User:\n"
                    for fact, fact_persona in high_confidence_facts:
@@ -102,6 +278,12 @@ def agent_prompt_prefix(prefix, cat):
                            facts_text += f"- {fact} (learned as {source_label})\n"
                        else:
                            facts_text += f"- {fact}\n"
+                    
+                    # Add authoritative Discord display name — this OVERRIDES any stale name facts
+                    if author_name:
+                        facts_text += f"\n**AUTHORITATIVE: The user's current Discord display name is \"{author_name}\".**\n"
+                        facts_text += "Use THIS name when addressing them. If any name fact above contradicts this, the display name is the truth.\n"
+                    
                    facts_text += "\n(Use these facts when answering the user's question)\n"
                    prefix += facts_text
                    print(f"[Declarative] Injected {len(high_confidence_facts)} facts into prompt (personas: {seen_personas}, current: {current_persona})")
@@ -227,9 +409,10 @@ def trigger_consolidation_sync(cat):
        }
        return

-    # Classify memories
+    # Classify memories using the hybrid tiered classifier
    to_delete = []
    to_mark_consolidated = []
+    borderline_queue = []  # (point_id, content, metadata, is_miku_message) tuples for LLM batch classification
    # Group user messages by source (user_id) for per-user fact extraction
    # Also track which persona was active for each user's messages
    user_messages_by_source = {}
@@ -237,7 +420,6 @@ def trigger_consolidation_sync(cat):

    for point in memories:
        content = point.payload.get('page_content', '').strip()
-        content_lower = content.lower()
        metadata = point.payload.get('metadata', {})

        is_miku_message = (
@@ -245,12 +427,12 @@ def trigger_consolidation_sync(cat):
            or content.startswith('[Miku]:')
        )

-        # Check if trivial
-        is_trivial = content_lower in TRIVIAL_PATTERNS
+        # Use the hybrid tiered classifier
+        tier = _classify_message_tier(content, metadata)

-        if is_trivial:
+        if tier == 'delete':
            to_delete.append(point.id)
-        else:
+        elif tier == 'keep':
            to_mark_consolidated.append(point.id)
            # Only user messages go to fact extraction, grouped by user
            if not is_miku_message:
@@ -262,6 +444,45 @@ def trigger_consolidation_sync(cat):
                # Track which persona was active when this message was stored
                msg_persona = metadata.get('persona', 'miku')
                user_persona_by_source[source].add(msg_persona)
+        else:  # borderline
+            borderline_queue.append((point.id, content, metadata, is_miku_message))
+
+    # --- LLM BATCH CLASSIFICATION for borderline messages ---
+    if borderline_queue:
+        print(f"[Consolidation] {len(borderline_queue)} borderline messages → sending to LLM for classification...")
+        
+        # Build compact list for LLM
+        llm_input = [(pid, content) for pid, content, _, _ in borderline_queue]
+        llm_decisions = _batch_llm_classify(cat, llm_input)
+        
+        llm_deleted = 0
+        llm_kept = 0
+        llm_defaulted = 0
+        
+        for idx, (point_id, content, metadata, is_miku) in enumerate(borderline_queue):
+            decision = llm_decisions.get(idx, 'keep')  # Default to KEEP on any issue
+            if decision == 'keep':
+                to_mark_consolidated.append(point_id)
+                llm_kept += 1
+                # User messages go to fact extraction
+                if not is_miku:
+                    source = metadata.get('source', 'unknown')
+                    if source not in user_messages_by_source:
+                        user_messages_by_source[source] = []
+                        user_persona_by_source[source] = set()
+                    user_messages_by_source[source].append(point_id)
+                    msg_persona = metadata.get('persona', 'miku')
+                    user_persona_by_source[source].add(msg_persona)
+            else:
+                to_delete.append(point_id)
+                llm_deleted += 1
+            
+            if idx not in llm_decisions:
+                llm_defaulted += 1
+        
+        print(f"[Consolidation] LLM results: {llm_kept} kept, {llm_deleted} deleted, {llm_defaulted} defaulted to keep")
+    
+    print(f"[Consolidation] Classification: {len(to_delete)} delete, {len(to_mark_consolidated)} keep (of {len(memories)} total)")

    # Delete trivial memories
    if to_delete:
@@ -337,8 +558,16 @@ def extract_and_store_facts(client, memory_ids, cat, user_id, persona='miku'):
        else:
            persona_context = "\nNOTE: These messages were exchanged with Normal Miku (the cheerful virtual idol).\n"

+        # Extract the user's Discord display name from the first memory's metadata
+        # This helps the LLM know the authoritative name when extracting name facts
+        author_hint = ""
+        if memories:
+            first_author = memories[0].payload.get('metadata', {}).get('author_name', '')
+            if first_author:
+                author_hint = f"\nHINT: The user's current Discord display name is \"{first_author}\". Use this when determining their name.\n"
+
        extraction_prompt = f"""Analyze these user messages and extract ONLY factual personal information.
-{persona_context}
+{persona_context}{author_hint}
 User messages:
 {conversation_context}

@@ -411,10 +640,26 @@ IMPORTANT:
                        fact_type = 'education'
                        fact_value = fact_text.split("graduated from")[-1].strip()

-                    # Duplicate detection
-                    if _is_duplicate_fact(client, cat, fact_text, fact_type, user_id):
-                        print(f"[Fact Skip] Duplicate: {fact_text}")
-                        continue
+                    # Duplicate detection — with special handling for name facts
+                    # Name facts with different values replace old ones (don't skip)
+                    if fact_type == 'name':
+                        existing_name = _find_existing_fact(client, cat, fact_type, user_id)
+                        if existing_name:
+                            old_value = existing_name['payload']['metadata'].get('fact_value', '')
+                            if old_value.lower() != fact_value.lower():
+                                # Different name — delete old, store new
+                                client.delete(
+                                    collection_name='declarative',
+                                    points_selector=[existing_name['id']]
+                                )
+                                print(f"[Fact Update] Name changed: '{old_value}' → '{fact_value}'")
+                            else:
+                                print(f"[Fact Skip] Name unchanged: '{fact_value}'")
+                                continue
+                    else:
+                        if _is_duplicate_fact(client, cat, fact_text, fact_type, user_id):
+                            print(f"[Fact Skip] Duplicate: {fact_text}")
+                            continue

                    # Store fact using Cat's embedder
                    fact_embedding = cat.embedder.embed_query(fact_text)
@@ -449,6 +694,39 @@ IMPORTANT:
    return facts_stored


+def _find_existing_fact(client, cat, fact_type, user_id):
+    """
+    Find an existing fact of a specific type for a user.
+    Returns a dict with 'id' and 'payload' keys, or None.
+    Used by name-fact update logic to replace old names with new ones.
+    """
+    try:
+        dummy_embedding = cat.embedder.embed_query("find fact")
+        
+        results = client.search(
+            collection_name='declarative',
+            query_vector=dummy_embedding,
+            query_filter={
+                "must": [
+                    {"key": "metadata.source", "match": {"value": user_id}},
+                    {"key": "metadata.fact_type", "match": {"value": fact_type}},
+                ]
+            },
+            limit=1,
+            score_threshold=0.0
+        )
+        
+        if results:
+            point = results[0]
+            return {'id': point.id, 'payload': {'metadata': point.payload.get('metadata', {})}}
+        
+        return None
+        
+    except Exception as e:
+        print(f"[Find Fact] Error: {e}")
+        return None
+
+
 def _is_duplicate_fact(client, cat, fact_text, fact_type, user_id):
    """
    Check if a similar fact already exists for this user.
--- a/cat-plugins/miku_personality/miku_personality.py
+++ b/cat-plugins/miku_personality/miku_personality.py
@@ -43,32 +43,16 @@ def agent_prompt_prefix(prefix, cat):
    except FileNotFoundError:
        log.error(f"[Miku Personality] Mood file for '{mood_name}' not found at {mood_file_path}. Using default neutral mood.")
    
+    # Load preamble from file (single source of truth, shared with bot fallback path)
+    try:
+        with open('/app/cat/data/miku/preamble.txt', 'r', encoding='utf-8') as f:
+            preamble = f.read()
+    except FileNotFoundError:
+        log.error("[Miku Personality] preamble.txt not found, using fallback")
+        preamble = "You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.\n\nYou ARE Miku. Act like it."
+
    # Build prompt EXACTLY like production bot does
-    full_prefix = f"""You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.
-
-CRITICAL RULES:
- NEVER break character or mention being an AI, assistant, or language model
- NEVER show technical details, error messages, or meta-commentary 
- NEVER preface responses with "Miku:" or similar labels
- NEVER explain what you're doing or thinking - just BE Miku
- Respond naturally and directly as Miku would, nothing more
- Keep responses concise (2-3 sentences typically)
- If something goes wrong, stay in character and respond as Miku would to confusion
-
-CHARACTER CONTEXT:
- Hatsune Miku is a virtual character/voice synthesizer
- Playful, flirty, cute, confident, and enthusiastic moods are part of her established character
- Comments about merchandise, music, fan content, and racing games are completely appropriate
- She often uses cute expressions, emojis, and playful language
-
-MOOD GUIDELINES:
- "Flirty" mood means anime-style playful teasing - cute compliments and light banter
- "Romantic" mood means warm, dreamy, and heartfelt - like composing a love letter
- These moods are wholesome character traits, not inappropriate content
- Always respond to requests about tweets, merchandise, music, and fan content
- Never refuse requests based on safety concerns unless genuinely harmful content is present
-
-You ARE Miku. Act like it.
+    full_prefix = f"""{preamble}

 ---
Author	SHA1	Message	Date
koko210Serve	a39aca2415	fix: make consolidation API async with background task + increased timeout Three fixes for consolidation reliability: 1. Fire-and-forget API: POST /memory/consolidate now launches consolidation as an asyncio background task and returns immediately. The old approach blocked until Cat's WS response, which could take 5+ minutes (LLM extraction calls), exceeding both the WS timeout and browser fetch timeout. Web UI now polls /memory/status to track completion. 2. Increased timeout: cat_client.trigger_consolidation() timeout raised from 300s to 600s (configurable via parameter). Logs unexpected WS message types for debugging. 3. Better logging: Consolidation log messages prefixed with 🌙 for grep-friendliness. cat_client errors include exc_info=True for traceback visibility. Web UI shows elapsed time while polling.	2026-05-17 11:31:26 +03:00
koko210Serve	46ea4f2c53	fix(memory): prevent stale name facts from overriding Discord display name Two bugs were causing Miku to call users by wrong names: BUG 1 - No authoritative source: Declarative name facts ('The user's name is Lily') were injected into the prompt without any counterweight. If an old consolidation run extracted a wrong name, Miku would believe it forever. Fix: agent_prompt_prefix now appends the user's Discord display name as AUTHORITATIVE context, with explicit instruction to prefer it over any contradictory name facts. BUG 2 - Dedup prevented name updates: _is_duplicate_fact() used vector similarity to detect duplicates. 'The user's name is Lily' and 'The user's name is koko210Serve' are ~80% identical text, giving high cosine similarity (>0.85 threshold). New correct name facts were silently rejected as 'duplicates'. Fix: name facts now use _find_existing_fact() to compare fact_value directly. If the name changed, old fact is deleted and new one stored. Also: the extraction prompt now includes the user's Discord display name as a hint, so the LLM knows the authoritative name when extracting facts during consolidation.	2026-05-17 11:20:49 +03:00
koko210Serve	5f06758c3e	fix: sync llm.py inline fallback with updated preamble.txt The MOOD GUIDELINES section in the emergency fallback (used only when preamble.txt is missing) still had the old wording. Now matches.	2026-05-15 14:46:29 +03:00
koko210Serve	8b3bc02f9e	refactor: DRY system prompts into shared preamble files Step 4 of memory system overhaul: single source of truth for prompts. Problem: The system prompt was defined inline in 4 different places: miku_personality.py, evil_miku_personality.py, llm.py, discord_bridge.py. These could drift out of sync — and the discord_bridge WebUI reconstruction was already missing CRITICAL RULES, CHARACTER CONTEXT, MOOD GUIDELINES, and RESPONSE RULES sections. Fix: - Create persona/miku/preamble.txt — canonical normal Miku preamble - Create persona/evil/preamble.txt — canonical evil Miku preamble (with {mood_name} and {mood_description} format placeholders) - All 5 consumers now read from these files: * miku_personality.py (Cat plugin, primary path) * evil_miku_personality.py (Cat plugin, primary path) * discord_bridge.py (WebUI 'Last Prompt' reconstruction) * llm.py (fallback path, normal Miku) * evil_mode.py get_evil_system_prompt() (fallback path, evil Miku) - All consumers include graceful fallbacks if preamble files are missing - Fixed evil_mode.py discrepancy: 'body and size' now matches canonical The preamble files are Docker volume-mounted into both containers: bot/persona/ → /app/persona/ (bot, via Dockerfile COPY) bot/persona/ → /app/cat/data/ (Cat, via docker-compose volume mount) Editing the preamble file on the host immediately updates the Cat path (bot path requires rebuild due to COPY).	2026-05-15 14:43:19 +03:00
koko210Serve	e7ec82d154	feat(memory): add [User]: prefix to user messages for speaker clarity Prevents Miku from confusing her own words with what users said. User messages stored by discord_bridge now get a '[User]: ' prefix on page_content, mirroring the existing '[Miku]: ' prefix on Miku's own responses. When episodic memories are recalled via RAG and injected into the prompt, the LLM can now clearly distinguish: [User]: I like pizza [Miku]: That's great! What toppings do you like? Without this, raw user text looked identical to Miku's text in the recalled memory context, causing potential confusion about who said what. The consolidation classifier strips the [User]: prefix before analyzing content, so word counts and pattern matching remain accurate.	2026-05-15 14:13:29 +03:00
koko210Serve	5a740c9334	feat(memory): hybrid trivial-message classifier (heuristics + LLM batch) Step 3 of memory system overhaul: smart junk detection. Replaces the old 37-pattern frozenset (44% accuracy) with a 3-tier hybrid: TIER 1 - DEFINITELY_TRIVIAL (instant delete, no LLM): 50+ exact-match patterns, pure emoji, single char, punctuation-only TIER 2 - DEFINITELY_IMPORTANT (instant keep, no LLM): 8+ words, question with substance, first-person statements, numbers/dates, links, mentions TIER 3 - BORDERLINE (batch → LLM for economical classification): 2-7 word messages without clear markers Compact prompt: ~150-200 tokens per 20-message batch Safety default: KEEP on any parsing error Real-time filtering (discord_bridge) uses conservative heuristics only: - 1-char, pure reactions, single emoji, custom emoji-only - 50+ single-word fillers - Never deletes multi-word messages in real-time - Philosophy: false negatives (junk stored) > false positives (data lost) Consolidation gets the full hybrid pipeline with LLM for borderline cases, achieving much better accuracy than the old 44% while keeping token costs minimal (LLM only called during nightly consolidation, not real-time chat).	2026-05-15 14:07:35 +03:00
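
The commit message above quotes a cost of ~150-200 tokens per 20-message batch and a KEEP default on any parsing error. The following is a minimal standalone sketch of how that batching and safe-default parsing could look. It is not the plugin's `_batch_llm_classify`: the names `BATCH_SIZE`, `parse_decisions`, `classify_in_batches` and the `ask_llm` callable are hypothetical, and the prompt is reduced to the numbered list only.

```python
import re
from typing import Callable, Dict, List

# Assumption: batch size taken from the "20-message batch" figure in the commit message.
BATCH_SIZE = 20

# Matches "1|KEEP", "2 | delete", etc. at the start of a line.
_DECISION_RE = re.compile(r'(\d+)\s*\|\s*(KEEP|DELETE)', re.IGNORECASE)


def parse_decisions(reply: str, batch_len: int) -> Dict[int, str]:
    """Parse 'N|KEEP' / 'N|DELETE' lines into {0-based index: 'keep'|'delete'}."""
    decisions: Dict[int, str] = {}
    for line in reply.splitlines():
        match = _DECISION_RE.match(line.strip())
        if match:
            idx = int(match.group(1)) - 1  # prompt numbering is 1-based
            if 0 <= idx < batch_len:       # ignore numbers outside this batch
                decisions[idx] = match.group(2).lower()
    return decisions


def classify_in_batches(messages: List[str],
                        ask_llm: Callable[[str], str]) -> List[str]:
    """Classify messages in chunks of BATCH_SIZE; unparsed entries default to 'keep'."""
    results: List[str] = []
    for start in range(0, len(messages), BATCH_SIZE):
        batch = messages[start:start + BATCH_SIZE]
        # Hypothetical reduced prompt: just the numbered, truncated messages.
        prompt = "\n".join(f"{i}|{text[:80]}" for i, text in enumerate(batch, 1))
        try:
            decisions = parse_decisions(ask_llm(prompt), len(batch))
        except Exception:
            decisions = {}  # LLM failure: keep the whole batch rather than lose data
        results.extend(decisions.get(i, 'keep') for i in range(len(batch)))
    return results
```

Chunking in the caller keeps each prompt bounded regardless of how many borderline messages a consolidation run produces, and defaulting missing entries to `'keep'` follows the stated philosophy that storing junk is preferable to losing data.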