feat(memory): add [User]: prefix to user messages for speaker clarity

Prevents Miku from confusing her own words with what users said.

User messages stored by discord_bridge now get a '[User]: ' prefix on
page_content, mirroring the existing '[Miku]: ' prefix on Miku's own
responses. When episodic memories are recalled via RAG and injected
into the prompt, the LLM can now clearly distinguish:

  [User]: I like pizza
  [Miku]: That's great! What toppings do you like?

Without this, raw user text carried no speaker label in the recalled
memory context, so the LLM could confuse who said what.
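
A minimal sketch of the labeling convention, not the plugin's actual code:
`label_message` and `build_recall_context` are hypothetical helpers, and
recalled memories are modeled here as plain (speaker, text) tuples rather
than real vector-store documents.

```python
# Hypothetical helpers illustrating the speaker-prefix convention.
USER_PREFIX = "[User]: "
MIKU_PREFIX = "[Miku]: "


def label_message(speaker: str, text: str) -> str:
    """Prepend a speaker label (assumed speakers: 'user' or 'miku')."""
    prefix = MIKU_PREFIX if speaker == "miku" else USER_PREFIX
    return f"{prefix}{text}"


def build_recall_context(memories: list[tuple[str, str]]) -> str:
    """Join recalled (speaker, text) pairs into one labeled block for the prompt."""
    return "\n".join(label_message(speaker, text) for speaker, text in memories)


context = build_recall_context([
    ("user", "I like pizza"),
    ("miku", "That's great! What toppings do you like?"),
])
print(context)
```

Every recalled line then carries its own speaker label, so the prompt
stays unambiguous whatever order the memories come back in.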

The consolidation classifier strips the [User]: prefix before analyzing
content, so word counts and pattern matching remain accurate.
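
The stripping step could look roughly like this. `strip_user_prefix` and
`content_stats` are hypothetical names used for illustration only; the
classifier itself does this inline in `_classify_message_tier`.

```python
# Hypothetical standalone version of the prefix-stripping step.
USER_PREFIX = "[User]:"


def strip_user_prefix(content: str) -> str:
    """Remove a leading '[User]:' label, if present, plus surrounding whitespace."""
    text = content.strip()
    if text.startswith(USER_PREFIX):
        text = text[len(USER_PREFIX):].strip()
    return text


def content_stats(content: str) -> tuple[int, int]:
    """Return (word_count, msg_len) computed on the unlabeled, lowercased text."""
    text_lower = strip_user_prefix(content).lower()
    return len(text_lower.split()), len(text_lower)


print(content_stats("[User]: I like pizza"))  # (3, 12)
print(content_stats("I like pizza"))          # (3, 12)
```

Without the strip, the label would count as an extra word and add to the
message length, which could push short messages past length-based
thresholds.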
2026-05-15 14:13:29 +03:00
parent 5a740c9334
commit e7ec82d154
2 changed files with 14 additions and 3 deletions

@@ -129,6 +129,11 @@ def before_cat_stores_episodic_memory(doc, cat):
     evil_mode = cat.working_memory.get('evil_mode', False)
     doc.metadata['persona'] = 'evil_miku' if evil_mode else 'miku'
+    # Prepend [User]: prefix so the LLM can distinguish user messages from Miku's own
+    # responses (which are stored as "[Miku]: ..."). Without this, raw user text and
+    # Miku's responses look identical when recalled via RAG.
+    doc.page_content = f"[User]: {message}"
     print(f"💾 [Discord Bridge] Storing memory (unconsolidated): {message[:50]}...")
     print(f" User: {cat.user_id}, Guild: {guild_id}, Author: {author_name}, Persona: {doc.metadata['persona']}")

@@ -77,14 +77,20 @@ def _classify_message_tier(content, metadata):
# Important: NEVER classifies Miku's own messages — those are always kept.
     """
     text = content.strip()
-    text_lower = text.lower()
-    word_count = len(text_lower.split())
-    msg_len = len(text_lower)
     # Miku's own messages are always kept (speaker check)
     if metadata.get('speaker') == 'miku' or text.startswith('[Miku]:'):
         return 'keep'
+    # Strip [User]: prefix (added by discord_bridge at storage time) so the
+    # classifier analyzes the actual message content, not the label
+    if text.startswith('[User]:'):
+        text = text[len('[User]:'):].strip()
+    text_lower = text.lower()
+    word_count = len(text_lower.split())
+    msg_len = len(text_lower)
     # --- PASS 1: DEFINITELY TRIVIAL ---
     # Empty or single char