Prevents Miku from confusing her own words with what users said.
User messages stored by discord_bridge now get a '[User]: ' prefix on
page_content, mirroring the existing '[Miku]: ' prefix on Miku's own
responses. When episodic memories are recalled via RAG and injected
into the prompt, the LLM can now clearly distinguish:
    [User]: I like pizza
    [Miku]: That's great! What toppings do you like?
Without this, raw user text looked identical to Miku's text in the
recalled memory context, causing potential confusion about who said what.
The consolidation classifier strips the '[User]: ' prefix before analyzing
content, so word counts and pattern matching remain accurate.
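A minimal sketch of the prefixing and stripping described above. The helper
names (tag_user_message, strip_user_prefix) are hypothetical and are not taken
from discord_bridge or the consolidation code; only the '[User]: ' prefix
itself comes from this change.

```python
USER_PREFIX = "[User]: "


def tag_user_message(text: str) -> str:
    """Prefix raw user text before it is stored as page_content."""
    # Avoid double-prefixing if the text was already tagged.
    return text if text.startswith(USER_PREFIX) else USER_PREFIX + text


def strip_user_prefix(page_content: str) -> str:
    """Drop the speaker prefix so the classifier sees only the message."""
    if page_content.startswith(USER_PREFIX):
        return page_content[len(USER_PREFIX):]
    return page_content


stored = tag_user_message("I like pizza")
word_count = len(strip_user_prefix(stored).split())
```

Counting words on the stripped text keeps the prefix from inflating the count
by one token.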
Step 3 of memory system overhaul: smart junk detection.
Replaces the old 37-pattern frozenset (44% accuracy) with a 3-tier hybrid:
TIER 1 - DEFINITELY_TRIVIAL (instant delete, no LLM):
  50+ exact-match patterns, pure emoji, single char, punctuation-only
TIER 2 - DEFINITELY_IMPORTANT (instant keep, no LLM):
  8+ words, question with substance, first-person statements,
  numbers/dates, links, mentions
TIER 3 - BORDERLINE (batch → LLM for economical classification):
  2-7 word messages without clear markers
  Compact prompt: ~150-200 tokens per 20-message batch
  Safety default: KEEP on any parsing error
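The three tiers can be sketched roughly as follows. This is an illustration
under stated assumptions, not the shipped classifier: the pattern set is a
small stand-in for the 50+ real entries, the regexes and the 3-word threshold
for "question with substance" are guesses, and sending unmatched single words
to BORDERLINE is an assumed fallback.

```python
import re
from enum import Enum


class Tier(Enum):
    TRIVIAL = "definitely_trivial"
    IMPORTANT = "definitely_important"
    BORDERLINE = "borderline"


# Illustrative subset only; the real list has 50+ exact-match patterns.
TRIVIAL_PATTERNS = frozenset({"lol", "ok", "hi", "yeah", "haha", "thanks"})

FIRST_PERSON = re.compile(r"\b(i|i'm|i've|my|me|mine)\b", re.IGNORECASE)
HAS_DIGIT = re.compile(r"\d")          # numbers and dates
HAS_LINK = re.compile(r"https?://")
HAS_MENTION = re.compile(r"<@!?\d+>")  # Discord user mention syntax


def classify(text: str) -> Tier:
    stripped = text.strip()
    words = stripped.split()

    # Tier 1: cheap checks that delete without calling the LLM.
    if len(stripped) <= 1 or stripped.lower() in TRIVIAL_PATTERNS:
        return Tier.TRIVIAL
    if not any(ch.isalnum() for ch in stripped):
        return Tier.TRIVIAL  # punctuation-only or pure Unicode emoji

    # Tier 2: clear markers that keep without calling the LLM.
    if len(words) >= 8:
        return Tier.IMPORTANT
    if HAS_LINK.search(stripped) or HAS_MENTION.search(stripped):
        return Tier.IMPORTANT
    if HAS_DIGIT.search(stripped) or FIRST_PERSON.search(stripped):
        return Tier.IMPORTANT
    if stripped.endswith("?") and len(words) >= 3:  # assumed threshold
        return Tier.IMPORTANT

    # Tier 3: everything else is batched for the LLM.
    return Tier.BORDERLINE
```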
Real-time filtering (discord_bridge) uses conservative heuristics only:
- 1-char, pure reactions, single emoji, custom emoji-only
- 50+ single-word fillers
- Never deletes multi-word messages in real-time
- Philosophy: prefer false negatives (junk stored) over false positives (data lost)
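A rough sketch of the conservative real-time check, assuming a hypothetical
should_skip_realtime helper. The filler set is a tiny stand-in for the 50+
real entries, and the custom-emoji regex assumes Discord's <:name:id> and
<a:name:id> message syntax. Treating a message made only of custom emoji as
skippable, even when it contains several of them, is an interpretation of
"custom emoji-only".

```python
import re

# Illustrative subset only; the real list has 50+ single-word fillers.
SINGLE_WORD_FILLERS = frozenset({"lol", "ok", "hmm", "yeah", "nice"})
CUSTOM_EMOJI = re.compile(r"<a?:\w+:\d+>")


def should_skip_realtime(text: str) -> bool:
    """Return True only for messages that are safe to drop immediately."""
    stripped = text.strip()
    if len(stripped) <= 1:
        return True  # empty or single character
    if CUSTOM_EMOJI.sub("", stripped).strip() == "":
        return True  # nothing left once custom emoji are removed
    if len(stripped.split()) > 1:
        return False  # never drop multi-word messages in real time
    if stripped.lower() in SINGLE_WORD_FILLERS:
        return True
    # Single token with no letters or digits: Unicode emoji or punctuation.
    return not any(ch.isalnum() for ch in stripped)
```

Anything this check does not recognize is stored, which matches the stated
preference for keeping junk over losing data.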
Consolidation runs the full hybrid pipeline and sends only borderline
cases to the LLM. This improves on the old 44% accuracy while keeping
token costs minimal, because the LLM is called only during nightly
consolidation, not in real-time chat.
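The batch step and its KEEP-on-error default could look something like the
sketch below. The prompt wording, the 'number: label' reply format, and both
function names are assumptions made for illustration; the actual prompt is not
reproduced here.

```python
from typing import List


def build_batch_prompt(messages: List[str]) -> str:
    """Number each borderline message so the reply can refer to it by index."""
    lines = [f"{i}. {m}" for i, m in enumerate(messages, start=1)]
    return (
        "Label each chat message KEEP or DELETE. "
        "Reply with one 'number: label' per line.\n" + "\n".join(lines)
    )


def parse_verdicts(reply: str, batch_size: int) -> List[bool]:
    """Return one keep flag per message; anything unparsable stays KEEP."""
    keep = [True] * batch_size  # safety default
    for line in reply.splitlines():
        number, sep, label = line.partition(":")
        if not sep:
            continue  # no 'number: label' shape on this line
        try:
            index = int(number.strip()) - 1
        except ValueError:
            continue  # non-numeric index, leave the default
        if 0 <= index < batch_size and label.strip().upper() == "DELETE":
            keep[index] = False
    return keep
```

Starting from an all-KEEP list means a truncated or malformed reply can only
fail toward storing data, never toward deleting it.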
Step 1 of memory system overhaul: persona tagging.
- discord_bridge: tag user messages with 'persona' metadata at storage time
- memory_consolidation: tag Miku's own responses with 'persona' metadata
- memory_consolidation: tag declarative facts with source persona during extraction
- memory_consolidation: pass persona context to LLM extraction prompt
- memory_consolidation: annotate cross-persona facts in prompt injection
(e.g., '(learned as Evil Miku)' when Evil facts appear for Normal Miku)
- Web UI: show persona badge (🎤 Miku / 😈 Evil Miku) on facts and episodic
memories in the Memory Management tab
This lets both personas know which version of Miku each memory came from,
enabling Evil Miku to distinguish her own memories from Normal Miku's.
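A small sketch of the tagging and cross-persona annotation, under stated
assumptions: the Memory dataclass stands in for whatever document type the
memory store uses, and the persona keys ("miku", "evil_miku") and helper names
are hypothetical. Only the 'persona' metadata key and the '(learned as Evil
Miku)' annotation come from this change.

```python
from dataclasses import dataclass, field

# Hypothetical persona keys mapped to display names.
PERSONA_LABELS = {"miku": "Miku", "evil_miku": "Evil Miku"}


@dataclass
class Memory:
    page_content: str
    metadata: dict = field(default_factory=dict)


def tag_persona(memory: Memory, persona: str) -> Memory:
    """Record which persona was active when the memory was stored."""
    memory.metadata["persona"] = persona
    return memory


def render_fact(memory: Memory, active_persona: str) -> str:
    """Annotate facts that were learned under a different persona."""
    source = memory.metadata.get("persona")
    if source and source != active_persona:
        label = PERSONA_LABELS.get(source, source)
        return f"{memory.page_content} (learned as {label})"
    return memory.page_content  # same persona, or untagged legacy memory


fact = tag_persona(Memory("User likes pizza"), "evil_miku")
```

Untagged memories from before this change pass through without annotation in
this sketch.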