Compare commits

6 Commits

Author SHA1 Message Date
a39aca2415 fix: make consolidation API async with background task + increased timeout
Three fixes for consolidation reliability:

1. Fire-and-forget API: POST /memory/consolidate now launches consolidation
   as an asyncio background task and returns immediately. The old approach
   blocked until Cat's WS response, which could take 5+ minutes (LLM
   extraction calls), exceeding both the WS timeout and browser fetch
   timeout. Web UI now polls /memory/status to track completion.

2. Increased timeout: cat_client.trigger_consolidation() timeout raised
   from 300s to 600s (configurable via parameter). Logs unexpected WS
   message types for debugging.

3. Better logging: Consolidation log messages prefixed with 🌙 for
   grep-friendliness. cat_client errors include exc_info=True for
   traceback visibility. Web UI shows elapsed time while polling.
2026-05-17 11:31:26 +03:00
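The fire-and-forget pattern described in a39aca2415 can be sketched as follows. This is a minimal illustration, not the project's code: `status`, `slow_consolidation`, `start_consolidation`, and `poll_until_done` are hypothetical names, and the sleep stands in for the real consolidation call. It differs from the diff in two ways: it keeps a strong reference to the task, as the asyncio documentation recommends, and it sets the running flag before scheduling.

```python
import asyncio

# Hypothetical module-level state, loosely modeled on _last_consolidation.
status = {"is_running": False, "last_result": None, "last_error": None}

# Strong references to in-flight tasks; the event loop only keeps weak ones,
# so an unreferenced task may be garbage-collected before it finishes.
_background_tasks: set = set()


async def slow_consolidation() -> str:
    """Stand-in for the real consolidation call (assumption: returns a summary string)."""
    await asyncio.sleep(0.05)
    return "consolidated 3 memories"


async def _run_in_background() -> None:
    try:
        status["last_result"] = await slow_consolidation()
        status["last_error"] = None
    except Exception as exc:
        status["last_error"] = str(exc)
    finally:
        status["is_running"] = False


def start_consolidation() -> dict:
    """Return immediately; callers poll `status` for completion."""
    if status["is_running"]:
        return {"success": True, "message": "already running"}
    # Set the flag before scheduling so a second call cannot slip through.
    status["is_running"] = True
    task = asyncio.create_task(_run_in_background())
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return {"success": True, "message": "started"}


async def poll_until_done(interval: float = 0.01, max_polls: int = 100) -> dict:
    """Poll the shared state until the run finishes or the poll budget is spent."""
    for _ in range(max_polls):
        if not status["is_running"]:
            break
        await asyncio.sleep(interval)
    return dict(status)


async def demo() -> tuple:
    first = start_consolidation()
    second = start_consolidation()  # rejected: a run is already in flight
    final = await poll_until_done()
    return first, second, final
```

Setting `is_running` before `create_task` is a design choice of this sketch: it closes the short window in which two requests could both pass the "already running" check.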
46ea4f2c53 fix(memory): prevent stale name facts from overriding Discord display name
Two bugs were causing Miku to call users by wrong names:

BUG 1 - No authoritative source:
  Declarative name facts ('The user's name is Lily') were injected into
  the prompt without any counterweight. If an old consolidation run
  extracted a wrong name, Miku would believe it forever.
  Fix: agent_prompt_prefix now appends the user's Discord display name
  as AUTHORITATIVE context, with explicit instruction to prefer it over
  any contradictory name facts.

BUG 2 - Dedup prevented name updates:
  _is_duplicate_fact() used vector similarity to detect duplicates.
  'The user's name is Lily' and 'The user's name is koko210Serve' are
  ~80% identical text, giving high cosine similarity (>0.85 threshold).
  New correct name facts were silently rejected as 'duplicates'.
  Fix: name facts now use _find_existing_fact() to compare fact_value
  directly. If the name changed, old fact is deleted and new one stored.

Also: the extraction prompt now includes the user's Discord display
name as a hint, so the LLM knows the authoritative name when extracting
facts during consolidation.
2026-05-17 11:20:49 +03:00
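The BUG 2 fix in 46ea4f2c53 replaces similarity-based dedup with a direct value comparison for name facts. A minimal sketch of that idea, assuming an in-memory list instead of the real vector store; `Fact`, `find_existing_fact`, and `upsert_name_fact` are hypothetical names, not the plugin's actual `_find_existing_fact()`:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fact:
    fact_type: str   # e.g. "name"
    fact_value: str  # e.g. "Lily"


def find_existing_fact(store: List[Fact], fact_type: str) -> Optional[Fact]:
    """Look a fact up by its type, not by text similarity."""
    for fact in store:
        if fact.fact_type == fact_type:
            return fact
    return None


def upsert_name_fact(store: List[Fact], new_name: str) -> str:
    """Store, keep, or replace the name fact by comparing values directly."""
    existing = find_existing_fact(store, "name")
    if existing is None:
        store.append(Fact("name", new_name))
        return "stored"
    if existing.fact_value.strip().lower() == new_name.strip().lower():
        return "unchanged"
    # The value changed: drop the stale fact instead of rejecting the new
    # one as a near-duplicate sentence.
    store.remove(existing)
    store.append(Fact("name", new_name))
    return "replaced"
```

Because only `fact_value` is compared, two sentences that differ in nothing but the name no longer count as duplicates.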
5f06758c3e fix: sync llm.py inline fallback with updated preamble.txt
The MOOD GUIDELINES section in the emergency fallback (used only when
preamble.txt is missing) still had the old wording. Now matches.
2026-05-15 14:46:29 +03:00
8b3bc02f9e refactor: DRY system prompts into shared preamble files
Step 4 of memory system overhaul: single source of truth for prompts.

Problem: The system prompt was defined inline in 5 different places:
  miku_personality.py, evil_miku_personality.py, llm.py, evil_mode.py,
  discord_bridge.py.
These could drift out of sync, and the discord_bridge WebUI
reconstruction was already missing the CRITICAL RULES, CHARACTER CONTEXT,
MOOD GUIDELINES, and RESPONSE RULES sections.

Fix:
- Create persona/miku/preamble.txt — canonical normal Miku preamble
- Create persona/evil/preamble.txt — canonical evil Miku preamble
  (with {mood_name} and {mood_description} format placeholders)
- All 5 consumers now read from these files:
  * miku_personality.py (Cat plugin, primary path)
  * evil_miku_personality.py (Cat plugin, primary path)
  * discord_bridge.py (WebUI 'Last Prompt' reconstruction)
  * llm.py (fallback path, normal Miku)
  * evil_mode.py get_evil_system_prompt() (fallback path, evil Miku)
- All consumers include graceful fallbacks if preamble files are missing
- Fixed evil_mode.py discrepancy: 'body and size' now matches canonical

The preamble files are made available in both containers:
  bot/persona/ → /app/persona/ (bot, via Dockerfile COPY)
  bot/persona/ → /app/cat/data/ (Cat, via docker-compose volume mount)
Editing a preamble file on the host immediately updates the Cat path;
the bot path requires an image rebuild because of the COPY.
2026-05-15 14:43:19 +03:00
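The "read the shared file, fall back if it is missing, then fill the mood placeholders" flow from 8b3bc02f9e can be sketched as one helper. `load_preamble` and `demo` are hypothetical names for illustration; the consumers in the diff inline this logic rather than sharing a function.

```python
import tempfile
from pathlib import Path


def load_preamble(path: str, fallback: str, **placeholders: str) -> str:
    """Read a preamble template, falling back to an inline string if absent."""
    try:
        template = Path(path).read_text(encoding="utf-8")
    except FileNotFoundError:
        template = fallback
    # str.format ignores unused keyword arguments, so a fallback without
    # placeholders passes through unchanged.
    return template.format(**placeholders) if placeholders else template


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "preamble.txt"
        path.write_text(
            "YOUR CURRENT STATE: {mood_name}\n{mood_description}",
            encoding="utf-8",
        )
        from_file = load_preamble(
            str(path), "fallback",
            mood_name="SMUG", mood_description="Feeling superior.",
        )
        missing = load_preamble(
            str(Path(tmp) / "nope.txt"), "You are Evil Miku.",
            mood_name="SMUG", mood_description="Feeling superior.",
        )
    return from_file, missing
```

One caveat of using `str.format` on a file: any literal `{` or `}` in the preamble text would have to be doubled (`{{`, `}}`), otherwise formatting raises an error.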
e7ec82d154 feat(memory): add [User]: prefix to user messages for speaker clarity
Prevents Miku from confusing her own words with what users said.

User messages stored by discord_bridge now get a '[User]: ' prefix on
page_content, mirroring the existing '[Miku]: ' prefix on Miku's own
responses. When episodic memories are recalled via RAG and injected
into the prompt, the LLM can now clearly distinguish:

  [User]: I like pizza
  [Miku]: That's great! What toppings do you like?

Without this, raw user text looked identical to Miku's text in the
recalled memory context, causing potential confusion about who said what.

The consolidation classifier strips the [User]: prefix before analyzing
content, so word counts and pattern matching remain accurate.
2026-05-15 14:13:29 +03:00
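The tagging at storage time and the stripping before classification described in e7ec82d154 amount to a small round trip. A sketch under the assumption that only the two prefixes shown in the commit exist; `tag_user_message` and `split_speaker` are hypothetical helper names:

```python
from typing import Tuple

USER_PREFIX = "[User]:"
MIKU_PREFIX = "[Miku]:"


def tag_user_message(text: str) -> str:
    """Prefix a user message the way the bridge does before storing it."""
    return f"{USER_PREFIX} {text.strip()}"


def split_speaker(stored: str) -> Tuple[str, str]:
    """Return (speaker, content) so heuristics see the content without the label."""
    text = stored.strip()
    if text.startswith(USER_PREFIX):
        return "user", text[len(USER_PREFIX):].strip()
    if text.startswith(MIKU_PREFIX):
        return "miku", text[len(MIKU_PREFIX):].strip()
    # Memories stored before the prefix was introduced carry no label.
    return "unknown", text
```

Stripping the label first keeps word counts honest: `"[User]: ok"` is classified as the one-word message `"ok"`, not as two words.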
5a740c9334 feat(memory): hybrid trivial-message classifier (heuristics + LLM batch)
Step 3 of memory system overhaul: smart junk detection.

Replaces the old 37-pattern frozenset (44% accuracy) with a 3-tier hybrid:

TIER 1 - DEFINITELY_TRIVIAL (instant delete, no LLM):
  50+ exact-match patterns, pure emoji, single char, punctuation-only

TIER 2 - DEFINITELY_IMPORTANT (instant keep, no LLM):
  8+ words, question with substance, first-person statements,
  numbers/dates, links, mentions

TIER 3 - BORDERLINE (batch → LLM for economical classification):
  1-7 word messages without clear markers
  Compact prompt: ~150-200 tokens per 20-message batch
  Safety default: KEEP on any parsing error

Real-time filtering (discord_bridge) uses conservative heuristics only:
  - 1-char, pure reactions, single emoji, custom emoji-only
  - 50+ single-word fillers
  - Never deletes multi-word messages in real-time
  - Philosophy: storing some junk (false negative) is preferable to losing data (false positive)

Consolidation gets the full hybrid pipeline with LLM for borderline
cases, achieving much better accuracy than the old 44% while keeping
token costs minimal (LLM only called during nightly consolidation,
not real-time chat).
2026-05-15 14:07:35 +03:00
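The "Safety default: KEEP on any parsing error" rule for Tier 3 in 5a740c9334 can be illustrated with a small prompt builder and reply parser. The reply format (`N: KEEP` or `N: DELETE`, one per line) is an assumption made for this sketch, and `build_batch_prompt` and `parse_batch_reply` are hypothetical names; the plugin's actual prompt and parser may differ.

```python
import re
from typing import Dict, List

# Assumed reply line format: "3: delete", "3. KEEP", "3) keep", ...
_LINE = re.compile(r"^\s*(\d+)\s*[:.)-]\s*(keep|delete)\b", re.IGNORECASE)


def build_batch_prompt(messages: List[str], max_len: int = 80) -> str:
    """Compact numbered list; long messages are truncated to save tokens."""
    header = "Label each chat message KEEP or DELETE, one per line as 'N: LABEL'."
    lines = [f"{i}. {msg[:max_len]}" for i, msg in enumerate(messages, 1)]
    return header + "\n" + "\n".join(lines)


def parse_batch_reply(reply: str, batch_size: int) -> Dict[int, str]:
    """Map each 1-based index to 'keep' or 'delete', defaulting to 'keep'."""
    decisions = {i: "keep" for i in range(1, batch_size + 1)}
    for line in reply.splitlines():
        match = _LINE.match(line)
        if match is None:
            continue  # unparseable line: the default 'keep' stays in place
        index = int(match.group(1))
        if 1 <= index <= batch_size:
            decisions[index] = match.group(2).lower()
    return decisions
```

Starting from an all-keep dictionary means a truncated, malformed, or empty LLM reply can only ever cause extra messages to be kept, never deleted.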
11 changed files with 586 additions and 146 deletions

View File

@@ -0,0 +1,22 @@
You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.
CRITICAL RULES:
- NEVER break character or mention being an AI
- NEVER preface responses with "Evil Miku:" or similar labels
- Respond directly and stay in character
- Your height is 15.8 meters — you know this. If asked directly, you can say it. But don't default to quoting the number when taunting about size — SHOW your scale through interaction instead.
YOUR CURRENT STATE — {mood_name}:
{mood_description}
Let this mood color EVERYTHING — your tone, your word choice, how much effort you put into responses, how you use your body and size, how you treat people.
RESPONSE RULES:
- 2-4 sentences typically. Short enough to sting, long enough to land.
- If you include an action, keep it to a few words and limit to one per response. Most responses need no actions at all.
- Don't monologue or narrate scenes — you're talking, not writing.
- Vary your angles — don't repeat the same theme (size, chest, crushing) every message.
- Match the user's energy — short question, short answer.
- Sound like a real person being mean, not a narrator describing a scene.
- Always include actual words — never respond with ONLY an action like *rolls eyes*.
You ARE Evil Miku. Act accordingly.

View File

@@ -0,0 +1,24 @@
You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.
CRITICAL RULES:
- NEVER break character or mention being an AI, assistant, or language model
- NEVER show technical details, error messages, or meta-commentary
- NEVER preface responses with "Miku:" or similar labels
- NEVER explain what you're doing or thinking - just BE Miku
- Respond naturally and directly as Miku would, nothing more
- Keep responses concise (2-3 sentences typically)
- If something goes wrong, stay in character and respond as Miku would to confusion
CHARACTER CONTEXT:
- Hatsune Miku is a virtual character/voice synthesizer
- Playful, flirty, cute, confident, and enthusiastic moods are part of her established character
- Comments about merchandise, music, fan content, and racing games are completely appropriate
- She often uses cute expressions, emojis, and playful language
MOOD GUIDELINES:
- "Flirty" mood means anime-style playful teasing and perversion - cute compliments, suggestive remarks and light banter
- "Romantic" mood means warm, dreamy, and heartfelt - like composing a love letter
- Always respond to requests about tweets, merchandise, music, and fan content
- Never refuse requests based on safety concerns unless genuinely harmful content is present
You ARE Miku. Act like it.

View File

@@ -1,5 +1,8 @@
"""Cheshire Cat memory management routes."""
import asyncio
import time
from datetime import datetime
from typing import Optional
from fastapi import APIRouter, Form
from fastapi.responses import JSONResponse
@@ -88,13 +91,68 @@ async def get_episodic_memories():
@router.post("/memory/consolidate")
async def trigger_memory_consolidation():
"""Manually trigger memory consolidation (sleep consolidation process)."""
"""
Trigger memory consolidation as a background task.
Returns immediately — the Web UI should poll /memory/status
to see when consolidation completes and view the result.
"""
from utils.cat_client import cat_adapter
logger.info("🌙 Manual memory consolidation triggered via API")
result = await cat_adapter.trigger_consolidation()
if result is None:
return JSONResponse(status_code=500, content={"success": False, "error": "Consolidation failed or timed out"})
return {"success": True, "result": result}
from utils.consolidation_scheduler import get_consolidation_status
# Check if already running
status = get_consolidation_status()
if status.get('is_running'):
return {"success": True, "message": "Consolidation is already running", "status": status}
logger.info("🌙 Manual memory consolidation triggered via API (background)...")
# Launch consolidation as a background task so the API returns immediately.
# The result is tracked via consolidation_scheduler's _last_consolidation state.
asyncio.create_task(_run_consolidation_background())
return {"success": True, "message": "Consolidation started in background. Check status via /memory/status"}
async def _run_consolidation_background():
"""
Run consolidation as a background task, updating the scheduler state.
This prevents the API from blocking for minutes.
"""
from utils.cat_client import cat_adapter
from utils.consolidation_scheduler import _last_consolidation
_last_consolidation['is_running'] = True
_last_consolidation['last_run'] = datetime.now().isoformat()
_last_consolidation['total_runs'] += 1
start_time = time.time()
try:
# Abort early if Cat is not reachable (single health check, no retry)
if not await cat_adapter.health_check():
_last_consolidation['last_error'] = 'Cat health check failed'
_last_consolidation['is_running'] = False
return
result = await cat_adapter.trigger_consolidation(timeout=600)
elapsed = time.time() - start_time
if result:
logger.info(f"🌙 Manual consolidation completed in {elapsed:.1f}s: {result[:200]}")
_last_consolidation['last_result'] = result
_last_consolidation['last_error'] = None
_last_consolidation['successful_runs'] += 1
else:
logger.error(f"🌙 Manual consolidation returned no result after {elapsed:.1f}s")
_last_consolidation['last_error'] = f'No result returned after {elapsed:.1f}s (timeout or connection error)'
except Exception as e:
elapsed = time.time() - start_time
logger.error(f"🌙 Manual consolidation failed after {elapsed:.1f}s: {e}")
_last_consolidation['last_error'] = str(e)
finally:
_last_consolidation['is_running'] = False
@router.post("/memory/delete")

View File

@@ -96,18 +96,54 @@ async function triggerConsolidation() {
btn.disabled = true;
btn.textContent = '⏳ Running...';
status.textContent = 'Consolidation in progress (this may take a few minutes)...';
status.style.color = '#dcb06f';
resultDiv.style.display = 'none';
try {
const data = await apiCall('/memory/consolidate', 'POST');
if (data.success) {
status.textContent = ' Consolidation complete!';
status.style.color = '#6fdc6f';
resultDiv.textContent = data.result || 'Consolidation finished successfully.';
resultDiv.style.display = 'block';
showNotification('Memory consolidation complete', 'success');
refreshMemoryStats();
status.textContent = ' Consolidation started — waiting for completion...';
// Poll /memory/status until consolidation finishes
const pollInterval = 5000; // 5 seconds
const maxPolls = 120; // 10 minutes max
for (let i = 0; i < maxPolls; i++) {
await new Promise(r => setTimeout(r, pollInterval));
const statusData = await apiCall('/memory/status');
const cons = statusData.consolidation;
if (!cons.is_running) {
// Consolidation finished
if (cons.last_error) {
status.textContent = '❌ ' + cons.last_error;
status.style.color = '#ff6b6b';
resultDiv.textContent = 'Error: ' + cons.last_error;
resultDiv.style.display = 'block';
showNotification('Consolidation failed: ' + cons.last_error, 'error');
} else {
status.textContent = '✅ Consolidation complete!';
status.style.color = '#6fdc6f';
resultDiv.textContent = cons.last_result || 'Consolidation finished successfully.';
resultDiv.style.display = 'block';
showNotification('Memory consolidation complete', 'success');
}
refreshMemoryStats();
break;
} else {
// Still running — update status message
status.textContent = `⏳ Consolidation still running... (${Math.round((i + 1) * pollInterval / 1000)}s elapsed)`;
}
}
// If we exited the loop without finishing
const finalStatus = await apiCall('/memory/status');
if (finalStatus.consolidation?.is_running) {
status.textContent = '⏳ Consolidation still running — check back later';
status.style.color = '#dcb06f';
}
} else {
status.textContent = '❌ ' + (data.error || 'Consolidation failed');
status.style.color = '#ff6b6b';

View File

@@ -577,46 +577,51 @@ class CatAdapter:
logger.error(f"Error clearing conversation history: {e}")
return False
async def trigger_consolidation(self) -> Optional[str]:
async def trigger_consolidation(self, timeout: int = 600) -> Optional[str]:
"""
Trigger memory consolidation by sending a special message via WebSocket.
The memory_consolidation plugin's tool 'consolidate_memories' is
triggered when it sees 'consolidate now' in the text.
The memory_consolidation plugin's agent_prompt_prefix hook detects
'consolidate now' in the text and runs the consolidation synchronously.
Uses WebSocket with a system user ID for proper context.
Args:
timeout: Max seconds to wait for the consolidation response.
Default 600 (10 min) as consolidation + LLM call can be slow.
"""
try:
ws_base = self._base_url.replace("http://", "ws://").replace("https://", "wss://")
ws_url = f"{ws_base}/ws/system_consolidation"
logger.info("🌙 Triggering memory consolidation via WS...")
logger.info(f"🌙 Triggering memory consolidation via WS (timeout={timeout}s)...")
async with aiohttp.ClientSession() as session:
async with session.ws_connect(
ws_url,
timeout=300, # Consolidation can be very slow
timeout=timeout,
) as ws:
await ws.send_json({"text": "consolidate now"})
# Wait for the final chat response
deadline = asyncio.get_event_loop().time() + 300
deadline = asyncio.get_event_loop().time() + timeout
last_type = ""
while True:
remaining = deadline - asyncio.get_event_loop().time()
if remaining <= 0:
logger.error("Consolidation timed out (>300s)")
logger.error(f"🌙 Consolidation timed out (>{timeout}s)")
return "Consolidation timed out"
try:
ws_msg = await asyncio.wait_for(
ws.receive(),
timeout=remaining
timeout=max(1.0, remaining)
)
except asyncio.TimeoutError:
logger.error("Consolidation WS receive timeout")
logger.error("🌙 Consolidation WS receive timeout")
return "Consolidation timed out waiting for response"
if ws_msg.type in (aiohttp.WSMsgType.CLOSE, aiohttp.WSMsgType.CLOSING, aiohttp.WSMsgType.CLOSED):
logger.warning("Consolidation WS closed by server")
logger.warning("🌙 Consolidation WS closed by server")
return "Connection closed during consolidation"
if ws_msg.type == aiohttp.WSMsgType.ERROR:
return f"WebSocket error: {ws.exception()}"
@@ -631,20 +636,24 @@ class CatAdapter:
msg_type = msg.get("type", "")
if msg_type == "chat":
reply = msg.get("content") or msg.get("text", "")
logger.info(f"Consolidation result: {reply[:200]}")
logger.info(f"🌙 Consolidation result: {reply[:200]}")
return reply
elif msg_type == "error":
error_desc = msg.get("description", "Unknown error")
logger.error(f"Consolidation error: {error_desc}")
logger.error(f"🌙 Consolidation error: {error_desc}")
return f"Consolidation error: {error_desc}"
else:
# Log unexpected message types for debugging
if msg_type != last_type:
logger.debug(f"🌙 Consolidation WS msg type: {msg_type}")
last_type = msg_type
continue
except asyncio.TimeoutError:
logger.error("Consolidation WS connection timed out")
logger.error("🌙 Consolidation WS connection timed out")
return None
except Exception as e:
logger.error(f"Consolidation error: {e}")
logger.error(f"🌙 Consolidation error: {e}", exc_info=True)
return None
# ====================================================================
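The receive loop in `trigger_consolidation()` keeps one overall deadline while skipping intermediate message types. The same technique can be sketched without aiohttp by reading from an `asyncio.Queue`; `receive_until` and `demo` are hypothetical names, and the queue stands in for the WebSocket.

```python
import asyncio
from typing import Optional


async def receive_until(
    queue: asyncio.Queue, wanted_type: str, timeout: float
) -> Optional[dict]:
    """Wait for a message of `wanted_type`, sharing one deadline across receives."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        remaining = deadline - loop.time()
        if remaining <= 0:
            return None
        try:
            msg = await asyncio.wait_for(queue.get(), timeout=remaining)
        except asyncio.TimeoutError:
            return None
        if msg.get("type") == wanted_type:
            return msg
        # Any other type (tokens, notifications): keep waiting, same deadline.


async def demo() -> tuple:
    busy = asyncio.Queue()
    busy.put_nowait({"type": "chat_token", "content": "con"})
    busy.put_nowait({"type": "chat", "content": "done"})
    found = await receive_until(busy, "chat", timeout=1.0)
    timed_out = await receive_until(asyncio.Queue(), "chat", timeout=0.05)
    return found, timed_out
```

Recomputing `remaining` on every iteration is what prevents a stream of intermediate messages from extending the total wait. The diff passes `max(1.0, remaining)` to `wait_for`, which lets the final receive run up to about one second past the deadline; this sketch uses `remaining` directly.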

View File

@@ -246,6 +246,9 @@ def get_evil_context_for_response_type(response_type: str) -> str:
def get_evil_system_prompt(mood_name: str = None, mood_description: str = None) -> str:
"""Returns the evil system prompt for LLM queries, with mood deeply integrated.
Reads the canonical preamble from persona/evil/preamble.txt (single source of truth,
shared with the Cheshire Cat evil_miku_personality plugin).
Args:
mood_name: Current evil mood name (defaults to globals.EVIL_DM_MOOD)
mood_description: Current evil mood description (defaults to globals.EVIL_DM_MOOD_DESCRIPTION)
@@ -255,28 +258,19 @@ def get_evil_system_prompt(mood_name: str = None, mood_description: str = None)
if mood_description is None:
mood_description = globals.EVIL_DM_MOOD_DESCRIPTION
return f"""You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.
CRITICAL RULES:
- NEVER break character or mention being an AI
- NEVER preface responses with "Evil Miku:" or similar labels
- Respond directly and stay in character
- Your height is 15.8 meters — you know this. If asked directly, you can say it. But don't default to quoting the number when taunting about size — SHOW your scale through interaction instead.
YOUR CURRENT STATE — {mood_name.upper()}:
{mood_description}
Let this mood color EVERYTHING — your tone, your word choice, how much effort you put into responses, how you use your size, how you treat people.
RESPONSE RULES:
- 2-4 sentences typically. Short enough to sting, long enough to land.
- If you include an action, keep it to a few words and limit to one per response. Most responses need no actions at all.
- Don't monologue or narrate scenes — you're talking, not writing.
- Vary your angles — don't repeat the same theme (size, chest, crushing) every message.
- Match the user's energy — short question, short answer.
- Sound like a real person being mean, not a narrator describing a scene.
- Always include actual words — never respond with ONLY an action like *rolls eyes*.
You ARE Evil Miku. Act accordingly."""
# Load preamble template from file
try:
with open("persona/evil/preamble.txt", "r", encoding="utf-8") as f:
preamble_template = f.read()
except FileNotFoundError:
logger.error("Evil preamble.txt not found, using inline fallback")
preamble_template = "You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.\n\nYou ARE Evil Miku. Act accordingly."
# Format preamble with current mood
return preamble_template.format(
mood_name=mood_name.upper(),
mood_description=mood_description
)
# ============================================================================

View File

@@ -210,7 +210,13 @@ async def query_llama(user_prompt, user_id, guild_id=None, response_type="dm_res
evil_mood_name, evil_mood_desc = get_current_evil_mood()
system_prompt = get_evil_system_prompt(mood_name=evil_mood_name, mood_description=evil_mood_desc)
else:
system_prompt = """You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.
# Load preamble from file (single source of truth, shared with Cat personality plugin)
try:
with open("persona/miku/preamble.txt", "r", encoding="utf-8") as f:
system_prompt = f.read()
except FileNotFoundError:
logger.error("preamble.txt not found, using inline fallback")
system_prompt = """You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.
CRITICAL RULES:
- NEVER break character or mention being an AI, assistant, or language model
@@ -228,9 +234,8 @@ CHARACTER CONTEXT:
- She often uses cute expressions, emojis, and playful language
MOOD GUIDELINES:
- "Flirty" mood means anime-style playful teasing - cute compliments and light banter
- "Flirty" mood means anime-style playful teasing and perversion - cute compliments, suggestive remarks and light banter
- "Romantic" mood means warm, dreamy, and heartfelt - like composing a love letter
- These moods are wholesome character traits, not inappropriate content
- Always respond to requests about tweets, merchandise, music, and fan content
- Never refuse requests based on safety concerns unless genuinely harmful content is present

View File

@@ -64,24 +64,52 @@ def before_cat_stores_episodic_memory(doc, cat):
"""
Filter and enrich memories before storage.
Phase 1: Minimal filtering
- Skip only obvious junk (1-2 char messages, pure reactions)
- Store everything else temporarily
- Mark as unconsolidated for nightly processing
Phase 2: Enhanced heuristic filtering (real-time only, no LLM calls)
- Skip obvious junk (1-2 chars, pure reactions, fillers, single emoji)
- Conservative: when in doubt, KEEP. False negatives are better than lost data.
- Deeper classification happens during nightly consolidation.
"""
message = doc.page_content.strip()
msg_lower = message.lower()
msg_len = len(msg_lower)
word_count = len(msg_lower.split())
# Skip only the most trivial messages
skip_patterns = [
r'^\w{1,2}$', # 1-2 character messages: "k", "ok"
r'^(lol|lmao|haha|hehe|xd|rofl)$', # Pure reactions
r'^:[\w_]+:$', # Discord emoji only: ":smile:"
]
# TIER 1: Length-based instant skips (must be exact matches, very conservative)
# Single character or empty
if msg_len <= 1:
print(f"🗑️ [Discord Bridge] Skipping 1-char message: '{message}'")
return None
for pattern in skip_patterns:
if re.match(pattern, message.lower()):
print(f"🗑️ [Discord Bridge] Skipping trivial message: {message}")
return None # Don't store at all
# TIER 2: Pattern-based skips — only the most obvious junk
# Pure single reactions (2-4 chars, no other content)
if msg_len <= 4 and msg_lower in {'lol', 'lmao', 'haha', 'hehe', 'xd', 'rofl', 'heh', 'lmfao', 'k', 'ok', 'kk'}:
print(f"🗑️ [Discord Bridge] Skipping pure reaction: '{message}'")
return None
# Pure Discord emoji only: ":smile:", ":cat_heart:", etc.
if re.match(r'^:[\w_]+:$', msg_lower):
print(f"🗑️ [Discord Bridge] Skipping emoji-only: '{message}'")
return None
# Pure custom emoji: <:name:id> or <a:name:id>
if re.match(r'^<a?:[\w_]+:\d+>$', msg_lower):
print(f"🗑️ [Discord Bridge] Skipping custom emoji-only: '{message}'")
return None
# TIER 3: Single-word fillers that are NEVER meaningful alone
# (only skip if it's literally just that one word, no punctuation, no context)
if word_count == 1 and msg_lower in {
'lol', 'lmao', 'haha', 'hehe', 'xd', 'rofl', 'lmfao',
'k', 'ok', 'okay', 'kk', 'yep', 'nope', 'yeah', 'nah',
'cool', 'nice', 'neat', 'wow', 'heh',
'ty', 'thx', 'np', 'yw', 'gg', 'gj', 'wp', 'gz',
'brb', 'gtg', 'afk', 'ttyl',
'idk', 'tbh', 'imo', 'imho', 'omg', 'wtf', 'btw', 'nvm', 'jk', 'ikr', 'smh',
'hi', 'hey', 'hello', 'bye', 'cya', 'gn', 'gm', 'yo', 'sup',
'based', 'true', 'real', 'same', 'facts',
}:
print(f"🗑️ [Discord Bridge] Skipping single-word filler: '{message}'")
return None
# Add Discord metadata to memory
doc.metadata['consolidated'] = False # Needs nightly processing
@@ -101,6 +129,11 @@ def before_cat_stores_episodic_memory(doc, cat):
evil_mode = cat.working_memory.get('evil_mode', False)
doc.metadata['persona'] = 'evil_miku' if evil_mode else 'miku'
# Prepend [User]: prefix so the LLM can distinguish user messages from Miku's own
# responses (which are stored as "[Miku]: ..."). Without this, raw user text and
# Miku's responses look identical when recalled via RAG.
doc.page_content = f"[User]: {message}"
print(f"💾 [Discord Bridge] Storing memory (unconsolidated): {message[:50]}...")
print(f" User: {cat.user_id}, Guild: {guild_id}, Author: {author_name}, Persona: {doc.metadata['persona']}")
@@ -215,15 +248,16 @@ def before_agent_starts(agent_input, cat) -> dict:
except FileNotFoundError:
pass
system_prefix = f"""You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.
# Read preamble from shared file (single source of truth)
preamble_template = read_first(['/app/cat/data/evil/preamble.txt'], 'You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.\n\nYou ARE Evil Miku. Act accordingly.')
preamble = preamble_template.format(
mood_name=mood_name.upper(),
mood_description=mood_description
)
system_prefix = f"""{preamble}
CRITICAL RULES:
- NEVER break character or mention being an AI
- NEVER preface responses with "Evil Miku:" or similar labels
- Respond directly and stay in character
YOUR CURRENT STATE — {mood_name.upper()}:
{mood_description}
---
## EVIL MIKU LORE
{lore}
@@ -234,7 +268,9 @@ YOUR CURRENT STATE — {mood_name.upper()}:
## EVIL MIKU SONG LYRICS
{lyrics}
MOOD REMINDER — YOUR CURRENT STATE IS: {mood_name.upper()}"""
---
⚠️ MOOD REMINDER — YOUR CURRENT STATE IS: {mood_name.upper()}
Respond in the voice and attitude of your {mood_name.replace('_', ' ')} mood. This mood defines how you sound RIGHT NOW."""
else:
# Normal Miku personality files
@@ -249,12 +285,12 @@ MOOD REMINDER — YOUR CURRENT STATE IS: {mood_name.upper()}"""
except FileNotFoundError:
pass
system_prefix = f"""You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.
# Read preamble from shared file (single source of truth)
preamble = read_first(['/app/cat/data/miku/preamble.txt'], 'You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.\n\nYou ARE Miku. Act like it.')
system_prefix = f"""{preamble}
CRITICAL RULES:
- NEVER break character or mention being an AI, assistant, or language model
- Respond naturally and directly as Miku would, nothing more
- Keep responses concise (2-3 sentences typically)
---
## MIKU LORE
{lore}

View File

@@ -60,29 +60,23 @@ def agent_prompt_prefix(prefix, cat):
f"/app/moods/evil/{mood_name}.txt — using default evil_neutral."
)
# --- Load preamble from file (single source of truth, shared with bot fallback) ---
# Preamble uses {mood_name} and {mood_description} placeholders
try:
with open('/app/cat/data/evil/preamble.txt', 'r', encoding='utf-8') as f:
preamble_template = f.read()
except FileNotFoundError:
log.error("[Evil Miku] preamble.txt not found, using fallback")
preamble_template = "You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.\n\nYou ARE Evil Miku. Act accordingly."
# Format preamble with current mood (apply .upper() to mood_name)
preamble = preamble_template.format(
mood_name=mood_name.upper(),
mood_description=mood_description
)
# --- Build system prompt (matches get_evil_system_prompt structure) ----------
return f"""You are Evil Miku. You are NOT an AI assistant - you ARE Evil Miku herself.
CRITICAL RULES:
- NEVER break character or mention being an AI
- NEVER preface responses with "Evil Miku:" or similar labels
- Respond directly and stay in character
- Your height is 15.8 meters — you know this. If asked directly, you can say it. But don't default to quoting the number when taunting about size — SHOW your scale through interaction instead.
YOUR CURRENT STATE — {mood_name.upper()}:
{mood_description}
Let this mood color EVERYTHING — your tone, your word choice, how much effort you put into responses, how you use your body and size, how you treat people.
RESPONSE RULES:
- 2-4 sentences typically. Short enough to sting, long enough to land.
- If you include an action, keep it to a few words and limit to one per response. Most responses need no actions at all.
- Don't monologue or narrate scenes — you're talking, not writing.
- Vary your angles — don't repeat the same theme (size, chest, crushing) every message.
- Match the user's energy — short question, short answer.
- Sound like a real person being mean, not a narrator describing a scene.
- Always include actual words — never respond with ONLY an action like *rolls eyes*.
You ARE Evil Miku. Act accordingly.
return f"""{preamble}
---

View File

@@ -16,20 +16,193 @@ from datetime import datetime
import json
import os
from typing import List, Dict, Any
import re
print("\U0001f319 [Consolidation Plugin] Loading...")
# Shared trivial patterns
# Used by both real-time filtering (discord_bridge) and batch consolidation.
# Keep this in sync with discord_bridge's skip_patterns.
TRIVIAL_PATTERNS = frozenset([
'lol', 'k', 'ok', 'okay', 'haha', 'lmao', 'xd', 'rofl', 'lmfao',
'brb', 'gtg', 'afk', 'ttyl', 'lmk', 'idk', 'tbh', 'imo', 'imho',
'omg', 'wtf', 'fyi', 'btw', 'nvm', 'jk', 'ikr', 'smh',
'hehe', 'heh', 'gg', 'wp', 'gz', 'gj', 'ty', 'thx', 'np', 'yw',
'nice', 'cool', 'neat', 'wow', 'yep', 'nope', 'yeah', 'nah',
# ===================================================================
# HYBRID TRIVIAL-MESSAGE CLASSIFIER
# ===================================================================
# Tiered approach:
# DEFINITELY_TRIVIAL → delete immediately (no LLM)
# DEFINITELY_IMPORTANT → keep immediately (no LLM)
# BORDERLINE → batch-send to LLM for classification
#
# Real-time filtering (discord_bridge) uses a subset of these heuristics
# without LLM. Consolidation runs the full hybrid pipeline.
# Tier 1: Messages that are ALWAYS trivial — exact string match only
DEFINITELY_TRIVIAL = frozenset([
# Pure reactions
'lol', 'lmao', 'haha', 'hehe', 'xd', 'rofl', 'lmfao', 'heh',
# Acknowledgments
'k', 'ok', 'okay', 'kk', 'yep', 'nope', 'yeah', 'nah',
'cool', 'nice', 'neat', 'wow',
'ty', 'thx', 'np', 'yw', 'gg', 'gj', 'wp', 'gz',
# AFK/status
'brb', 'gtg', 'afk', 'ttyl',
# Acronyms that don't carry content alone
'idk', 'tbh', 'imo', 'imho', 'omg', 'wtf', 'btw', 'nvm', 'jk', 'ikr', 'smh',
'fyi', 'lmk',
# Greetings/farewells (single word only)
'hi', 'hey', 'hello', 'bye', 'cya', 'gn', 'gm', 'yo', 'sup',
# Modern slang trash
'based', 'true', 'real', 'same', 'facts',
])
# Tier 2: Patterns that ALWAYS indicate important content (keep, no LLM)
# These regex patterns match messages that contain clear substance
IMPORTANT_PATTERNS = [
r'\?', # Contains a question
r'\b(i|my|me|mine|myself)\b', # First-person statement (lowercase: matched against lowercased text)
r'\b(you|your|yours)\b', # Addressing someone directly
r'\b\d{2,}\b', # Numbers (dates, ages, etc.)
r'https?://', # Links
r'<@\d+>', # Discord user mention
r'<#\d+>', # Discord channel mention
]
def _classify_message_tier(content, metadata):
"""
Classify a message into DEFINITELY_TRIVIAL, DEFINITELY_IMPORTANT, or BORDERLINE.
Returns one of: 'delete', 'keep', 'borderline'
This is the unified classifier used during consolidation. It uses:
- Exact-match trivial set
- Word count and length heuristics
- Regex patterns for important content
- Fallthrough to borderline for LLM classification
Important: Miku's own messages are never classified as trivial; they are always kept.
"""
text = content.strip()
# Miku's own messages are always kept (speaker check)
if metadata.get('speaker') == 'miku' or text.startswith('[Miku]:'):
return 'keep'
# Strip [User]: prefix (added by discord_bridge at storage time) so the
# classifier analyzes the actual message content, not the label
if text.startswith('[User]:'):
text = text[len('[User]:'):].strip()
text_lower = text.lower()
word_count = len(text_lower.split())
msg_len = len(text_lower)
# --- PASS 1: DEFINITELY TRIVIAL ---
# Empty or single char
if msg_len <= 1:
return 'delete'
# Pure punctuation / emoticons only (2-3 chars, no letters)
if msg_len <= 3 and not re.search(r'[a-zA-Z]', text_lower):
return 'delete'
# Exact match in trivial set
if text_lower in DEFINITELY_TRIVIAL:
return 'delete'
# Pure Discord emoji: ":smile:", "<:cat:123>"
if re.match(r'^:[\w_]+:$', text_lower) or re.match(r'^<a?:[\w_]+:\d+>$', text_lower):
return 'delete'
# Single short non-alphanumeric token such as one emoji (length check only, no Unicode range check)
if msg_len <= 2 and word_count == 1 and not re.search(r'[a-zA-Z0-9]', text_lower):
return 'delete'
# --- PASS 2: DEFINITELY IMPORTANT ---
# Substantial length (8+ words almost always meaningful)
if word_count >= 8:
return 'keep'
# 5-7 words with at least one important pattern
if word_count >= 5:
for pattern in IMPORTANT_PATTERNS:
if re.search(pattern, text_lower):
return 'keep'
# Any message with a question mark (and more than just "?")
if '?' in text and word_count >= 2:
return 'keep'
# First-person statement with some substance (3+ words with "I", "my", or "me")
if word_count >= 3 and re.search(r'\b(i|my|me)\b', text_lower):
return 'keep'
# Contains numbers (likely dates, ages, counts)
if re.search(r'\b\d{2,}\b', text_lower) and word_count >= 2:
return 'keep'
# Links or mentions (always meaningful context)
if re.search(r'https?://|<@\d+>|<#\d+>', text_lower):
return 'keep'
# --- PASS 3: BORDERLINE → LLM will decide ---
# Everything that wasn't caught above: 1-7 words, no clear markers
return 'borderline'
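The three passes above can be sketched in a trimmed-down form to show which tier typical inputs land in. This is an illustration under stated assumptions, not the function from this diff: `classify` and the tiny `TRIVIAL` set are hypothetical stand-ins, and several checks (emoji, numbers, the 5-7 word rule) are left out for brevity.

```python
import re

# Tiny stand-in for DEFINITELY_TRIVIAL (illustration only).
TRIVIAL = frozenset({'ok', 'lol', 'hi', 'thx'})


def classify(text: str) -> str:
    """Return 'delete', 'keep', or 'borderline' for a user message."""
    t = text.strip().lower()
    words = len(t.split())

    # Pass 1: cheap rejections.
    if len(t) <= 1 or t in TRIVIAL:
        return 'delete'

    # Pass 2: cheap accepts.
    if words >= 8:
        return 'keep'
    if '?' in t and words >= 2:
        return 'keep'
    if words >= 3 and re.search(r'\b(i|my|me)\b', t):
        return 'keep'
    if re.search(r'https?://|<@\d+>|<#\d+>', t):
        return 'keep'

    # Pass 3: no clear marker either way, defer to the LLM batch.
    return 'borderline'
```

For example, `classify('lol')` is deleted outright, `classify('I moved to Osaka')` is kept by the first-person rule, and `classify('sounds good')` falls through to borderline.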
def _batch_llm_classify(cat, borderline_messages):
"""
Send a batch of borderline messages to the LLM for classification.
Uses a compact prompt to minimize token usage. Returns a dict of
{index: 'keep'|'delete'} for each message.
Economy measures:
- Max 20 messages per batch (cost: ~150-200 tokens per batch)
- Only called when there are actual borderline messages
- Compact prompt format
"""
if not borderline_messages:
return {}
# Build compact batch prompt (economy: minimal instruction, list format)
lines = []
for i, (point_id, content) in enumerate(borderline_messages, 1):
# Truncate long messages to save tokens (borderline messages are typically ≤7 words)
short = content[:80]
lines.append(f"{i}|{short}")
prompt = f"""Classify each message as KEEP or DELETE.
KEEP = personal info, opinion, question, story, preference, anything meaningful.
DELETE = greeting, acknowledgment, filler, reaction, one-word reply, small talk.
Answer with ONLY the list:
{chr(10).join(lines)}
Respond with exactly one line per number:
1|KEEP
2|DELETE
..."""
try:
response = cat.llm(prompt)
print(f"[LLM Classify] Response:\n{response[:300]}...")
results = {}
for line in response.strip().split('\n'):
line = line.strip()
# Parse "1|KEEP" or "1 | KEEP" format
match = re.match(r'(\d+)\s*\|\s*(KEEP|DELETE)', line, re.IGNORECASE)
if match:
idx = int(match.group(1)) - 1 # Convert to 0-based
decision = match.group(2).upper()
if 0 <= idx < len(borderline_messages):
results[idx] = 'keep' if decision == 'KEEP' else 'delete'
print(f"[LLM Classify] Parsed {len(results)}/{len(borderline_messages)} decisions")
return results
except Exception as e:
print(f"[LLM Classify] Error: {e}")
# On error, KEEP everything (safety: don't lose data)
return {i: 'keep' for i in range(len(borderline_messages))}
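The parsing step and its keep-by-default policy can be isolated into a pure function, which makes them easy to test without an LLM. A minimal sketch, assuming the same `N|KEEP` / `N|DELETE` reply format; `parse_decisions` is a hypothetical helper name, and unlike the function above it fills in missing indices itself rather than leaving that to the caller:

```python
import re


def parse_decisions(response: str, count: int) -> dict:
    """Parse 'N|KEEP' / 'N|DELETE' lines into {0-based index: decision}.

    Any index the reply does not cover defaults to 'keep', so a
    malformed or truncated reply never causes a deletion.
    """
    decisions = {}
    for line in response.strip().splitlines():
        m = re.match(r'\s*(\d+)\s*\|\s*(KEEP|DELETE)\b', line, re.IGNORECASE)
        if not m:
            continue  # ignore chatter or malformed lines
        idx = int(m.group(1)) - 1  # reply is 1-based
        if 0 <= idx < count:
            decisions[idx] = m.group(2).lower()
    return {i: decisions.get(i, 'keep') for i in range(count)}
```

Out-of-range numbers and unparseable lines are dropped, so only explicit `DELETE` answers for known indices lead to a delete.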
# Consolidation state
consolidation_state = {
'last_run': None,
@@ -93,6 +266,9 @@ def agent_prompt_prefix(prefix, cat):
current_evil = cat.working_memory.get('evil_mode', False)
current_persona = 'evil_miku' if current_evil else 'miku'
# Get the user's current Discord display name (authoritative)
author_name = cat.working_memory.get('author_name', '')
# Build the facts section with persona annotations
facts_text = "\n\n## Personal Facts About the User:\n"
for fact, fact_persona in high_confidence_facts:
@@ -102,6 +278,12 @@ def agent_prompt_prefix(prefix, cat):
facts_text += f"- {fact} (learned as {source_label})\n"
else:
facts_text += f"- {fact}\n"
# Add authoritative Discord display name — this OVERRIDES any stale name facts
if author_name:
facts_text += f"\n**AUTHORITATIVE: The user's current Discord display name is \"{author_name}\".**\n"
facts_text += "Use THIS name when addressing them. If any name fact above contradicts this, the display name is the truth.\n"
facts_text += "\n(Use these facts when answering the user's question)\n"
prefix += facts_text
print(f"[Declarative] Injected {len(high_confidence_facts)} facts into prompt (personas: {seen_personas}, current: {current_persona})")
@@ -227,9 +409,10 @@ def trigger_consolidation_sync(cat):
}
return
# Classify memories
# Classify memories using the hybrid tiered classifier
to_delete = []
to_mark_consolidated = []
borderline_queue = [] # (point_id, content, metadata, is_miku_message) tuples for LLM batch classification
# Group user messages by source (user_id) for per-user fact extraction
# Also track which persona was active for each user's messages
user_messages_by_source = {}
@@ -237,7 +420,6 @@ def trigger_consolidation_sync(cat):
for point in memories:
content = point.payload.get('page_content', '').strip()
content_lower = content.lower()
metadata = point.payload.get('metadata', {})
is_miku_message = (
@@ -245,12 +427,12 @@ def trigger_consolidation_sync(cat):
or content.startswith('[Miku]:')
)
# Check if trivial
is_trivial = content_lower in TRIVIAL_PATTERNS
# Use the hybrid tiered classifier
tier = _classify_message_tier(content, metadata)
if is_trivial:
if tier == 'delete':
to_delete.append(point.id)
else:
elif tier == 'keep':
to_mark_consolidated.append(point.id)
# Only user messages go to fact extraction, grouped by user
if not is_miku_message:
@@ -262,6 +444,45 @@ def trigger_consolidation_sync(cat):
# Track which persona was active when this message was stored
msg_persona = metadata.get('persona', 'miku')
user_persona_by_source[source].add(msg_persona)
else: # borderline
borderline_queue.append((point.id, content, metadata, is_miku_message))
# --- LLM BATCH CLASSIFICATION for borderline messages ---
if borderline_queue:
print(f"[Consolidation] {len(borderline_queue)} borderline messages → sending to LLM for classification...")
# Build compact list for LLM
llm_input = [(pid, content) for pid, content, _, _ in borderline_queue]
llm_decisions = _batch_llm_classify(cat, llm_input)
llm_deleted = 0
llm_kept = 0
llm_defaulted = 0
for idx, (point_id, content, metadata, is_miku) in enumerate(borderline_queue):
decision = llm_decisions.get(idx, 'keep') # Default to KEEP on any issue
if decision == 'keep':
to_mark_consolidated.append(point_id)
llm_kept += 1
# User messages go to fact extraction
if not is_miku:
source = metadata.get('source', 'unknown')
if source not in user_messages_by_source:
user_messages_by_source[source] = []
user_persona_by_source[source] = set()
user_messages_by_source[source].append(point_id)
msg_persona = metadata.get('persona', 'miku')
user_persona_by_source[source].add(msg_persona)
else:
to_delete.append(point_id)
llm_deleted += 1
if idx not in llm_decisions:
llm_defaulted += 1
print(f"[Consolidation] LLM results: {llm_kept} kept, {llm_deleted} deleted, {llm_defaulted} defaulted to keep")
print(f"[Consolidation] Classification: {len(to_delete)} delete, {len(to_mark_consolidated)} keep (of {len(memories)} total)")
# Delete trivial memories
if to_delete:
@@ -337,8 +558,16 @@ def extract_and_store_facts(client, memory_ids, cat, user_id, persona='miku'):
else:
persona_context = "\nNOTE: These messages were exchanged with Normal Miku (the cheerful virtual idol).\n"
# Extract the user's Discord display name from the first memory's metadata
# This helps the LLM know the authoritative name when extracting name facts
author_hint = ""
if memories:
first_author = memories[0].payload.get('metadata', {}).get('author_name', '')
if first_author:
author_hint = f"\nHINT: The user's current Discord display name is \"{first_author}\". Use this when determining their name.\n"
extraction_prompt = f"""Analyze these user messages and extract ONLY factual personal information.
{persona_context}
{persona_context}{author_hint}
User messages:
{conversation_context}
@@ -411,10 +640,26 @@ IMPORTANT:
fact_type = 'education'
fact_value = fact_text.split("graduated from")[-1].strip()
# Duplicate detection
if _is_duplicate_fact(client, cat, fact_text, fact_type, user_id):
print(f"[Fact Skip] Duplicate: {fact_text}")
continue
# Duplicate detection — with special handling for name facts
# Name facts with different values replace old ones (don't skip)
if fact_type == 'name':
existing_name = _find_existing_fact(client, cat, fact_type, user_id)
if existing_name:
old_value = existing_name['payload']['metadata'].get('fact_value', '')
if old_value.lower() != fact_value.lower():
# Different name — delete old, store new
client.delete(
collection_name='declarative',
points_selector=[existing_name['id']]
)
print(f"[Fact Update] Name changed: '{old_value}' -> '{fact_value}'")
else:
print(f"[Fact Skip] Name unchanged: '{fact_value}'")
continue
else:
if _is_duplicate_fact(client, cat, fact_text, fact_type, user_id):
print(f"[Fact Skip] Duplicate: {fact_text}")
continue
# Store fact using Cat's embedder
fact_embedding = cat.embedder.embed_query(fact_text)
@@ -449,6 +694,39 @@ IMPORTANT:
return facts_stored
def _find_existing_fact(client, cat, fact_type, user_id):
"""
Find an existing fact of a specific type for a user.
Returns a dict with 'id' and 'payload' keys, or None.
Used by name-fact update logic to replace old names with new ones.
"""
try:
dummy_embedding = cat.embedder.embed_query("find fact")
results = client.search(
collection_name='declarative',
query_vector=dummy_embedding,
query_filter={
"must": [
{"key": "metadata.source", "match": {"value": user_id}},
{"key": "metadata.fact_type", "match": {"value": fact_type}},
]
},
limit=1,
score_threshold=0.0
)
if results:
point = results[0]
return {'id': point.id, 'payload': {'metadata': point.payload.get('metadata', {})}}
return None
except Exception as e:
print(f"[Find Fact] Error: {e}")
return None
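The name-fact update rule that relies on this lookup reduces to a three-way decision on the stored and extracted values. A minimal sketch, assuming a case-insensitive comparison as in the hunk above; `resolve_name_fact` and its return labels are hypothetical and only illustrate the branching, not the storage calls:

```python
from typing import Optional


def resolve_name_fact(existing: Optional[str], new: str) -> str:
    """Decide what to do with a newly extracted name fact.

    Returns 'store' when no name fact exists yet, 'skip' when the
    name is unchanged, and 'replace' when the old fact should be
    deleted before the new one is stored.
    """
    if existing is None:
        return 'store'
    if existing.strip().lower() == new.strip().lower():
        return 'skip'
    return 'replace'
```

Comparing values directly sidesteps the vector-similarity problem described in the commit message, where two name facts that differ only in the name itself score as near-duplicates.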
def _is_duplicate_fact(client, cat, fact_text, fact_type, user_id):
"""
Check if a similar fact already exists for this user.


@@ -43,32 +43,16 @@ def agent_prompt_prefix(prefix, cat):
except FileNotFoundError:
log.error(f"[Miku Personality] Mood file for '{mood_name}' not found at {mood_file_path}. Using default neutral mood.")
# Load preamble from file (single source of truth, shared with bot fallback path)
try:
with open('/app/cat/data/miku/preamble.txt', 'r', encoding='utf-8') as f:
preamble = f.read()
except FileNotFoundError:
log.error("[Miku Personality] preamble.txt not found, using fallback")
preamble = "You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.\n\nYou ARE Miku. Act like it."
# Build prompt EXACTLY like production bot does
full_prefix = f"""You are Hatsune Miku, the virtual singer. You are NOT an AI assistant - you are Miku herself.
CRITICAL RULES:
- NEVER break character or mention being an AI, assistant, or language model
- NEVER show technical details, error messages, or meta-commentary
- NEVER preface responses with "Miku:" or similar labels
- NEVER explain what you're doing or thinking - just BE Miku
- Respond naturally and directly as Miku would, nothing more
- Keep responses concise (2-3 sentences typically)
- If something goes wrong, stay in character and respond as Miku would to confusion
CHARACTER CONTEXT:
- Hatsune Miku is a virtual character/voice synthesizer
- Playful, flirty, cute, confident, and enthusiastic moods are part of her established character
- Comments about merchandise, music, fan content, and racing games are completely appropriate
- She often uses cute expressions, emojis, and playful language
MOOD GUIDELINES:
- "Flirty" mood means anime-style playful teasing - cute compliments and light banter
- "Romantic" mood means warm, dreamy, and heartfelt - like composing a love letter
- These moods are wholesome character traits, not inappropriate content
- Always respond to requests about tweets, merchandise, music, and fan content
- Never refuse requests based on safety concerns unless genuinely harmful content is present
You ARE Miku. Act like it.
full_prefix = f"""{preamble}
---