miku-discord

Author	SHA1	Message	Date
koko210Serve	46ea4f2c53	fix(memory): prevent stale name facts from overriding Discord display name. Two bugs were causing Miku to call users by wrong names. BUG 1 - No authoritative source: declarative name facts ('The user's name is Lily') were injected into the prompt without any counterweight. If an old consolidation run extracted a wrong name, Miku would believe it forever. Fix: agent_prompt_prefix now appends the user's Discord display name as AUTHORITATIVE context, with an explicit instruction to prefer it over any contradictory name facts. BUG 2 - Dedup prevented name updates: _is_duplicate_fact() used vector similarity to detect duplicates. 'The user's name is Lily' and 'The user's name is koko210Serve' are ~80% identical text, giving high cosine similarity (>0.85 threshold), so new, correct name facts were silently rejected as 'duplicates'. Fix: name facts now use _find_existing_fact() to compare fact_value directly. If the name changed, the old fact is deleted and the new one is stored. Also: the extraction prompt now includes the user's Discord display name as a hint, so the LLM knows the authoritative name when extracting facts during consolidation.	2026-05-17 11:20:49 +03:00
koko210Serve	e7ec82d154	feat(memory): add [User]: prefix to user messages for speaker clarity. Prevents Miku from confusing her own words with what users said. User messages stored by discord_bridge now get a '[User]: ' prefix on page_content, mirroring the existing '[Miku]: ' prefix on Miku's own responses. When episodic memories are recalled via RAG and injected into the prompt, the LLM can now clearly distinguish the speakers: '[User]: I like pizza' / '[Miku]: That's great! What toppings do you like?'. Without this, raw user text looked identical to Miku's text in the recalled memory context, causing potential confusion about who said what. The consolidation classifier strips the [User]: prefix before analyzing content, so word counts and pattern matching remain accurate.	2026-05-15 14:13:29 +03:00
koko210Serve	5a740c9334	feat(memory): hybrid trivial-message classifier (heuristics + LLM batch). Step 3 of memory system overhaul: smart junk detection. Replaces the old 37-pattern frozenset (44% accuracy) with a 3-tier hybrid. TIER 1 - DEFINITELY_TRIVIAL (instant delete, no LLM): 50+ exact-match patterns, pure emoji, single char, punctuation-only. TIER 2 - DEFINITELY_IMPORTANT (instant keep, no LLM): 8+ words, question with substance, first-person statements, numbers/dates, links, mentions. TIER 3 - BORDERLINE (batch → LLM for economical classification): 2-7 word messages without clear markers. Compact prompt: ~150-200 tokens per 20-message batch. Safety default: KEEP on any parsing error. Real-time filtering (discord_bridge) uses conservative heuristics only: 1-char, pure reactions, single emoji, custom emoji-only; 50+ single-word fillers; never deletes multi-word messages in real-time. Philosophy: false negatives (junk stored) are preferable to false positives (data lost). Consolidation gets the full hybrid pipeline with LLM for borderline cases, achieving much better accuracy than the old 44% while keeping token costs minimal (the LLM is only called during nightly consolidation, not real-time chat).	2026-05-15 14:07:35 +03:00
koko210Serve	e6e81885b3	feat(memory): tag all memories with source persona (miku/evil_miku). Step 1 of memory system overhaul: persona tagging. discord_bridge: tag user messages with 'persona' metadata at storage time; memory_consolidation: tag Miku's own responses with 'persona' metadata; memory_consolidation: tag declarative facts with source persona during extraction; memory_consolidation: pass persona context to LLM extraction prompt; memory_consolidation: annotate cross-persona facts in prompt injection (e.g., '(learned as Evil Miku)' when Evil facts appear for Normal Miku); Web UI: show persona badge (🎤 Miku / 😈 Evil Miku) on facts and episodic memories in the Memory Management tab. This lets both personas know which version of Miku each memory came from, enabling Evil Miku to distinguish her own memories from Normal Miku's.	2026-05-12 15:12:49 +03:00
koko210Serve	edb88e9ede	fix: Phase 2 integrity review - v2.0.0 rewrite & bugfixes. Memory Consolidation Plugin (828 -> 465 lines): replace SentenceTransformer with cat.embedder.embed_query() for vector consistency; fix per-user fact isolation (source=user_id instead of global); add duplicate fact detection (_is_duplicate_fact, score_threshold=0.85); remove ~350 lines of dead async run_consolidation() code; remove duplicate declarative search in before_cat_sends_message; unify trivial patterns into TRIVIAL_PATTERNS frozenset; remove all sys.stderr.write debug logging; remove sentence-transformers from requirements.txt (no external deps). Loguru Fix (cheshire-cat/cat/log.py): patch Cat v1.6.2 loguru format to provide default extra fields; fixes KeyError: 'original_name' from third-party libs (fastembed); mounted via docker-compose volume. Discord Bridge: copy discord_bridge.py to cat-plugins/ (was empty directory). Test Results (6/7 pass, 100% fact recall): 11 facts extracted, per-user isolation working; duplicate detection effective (+2 on 2nd run); 5/5 natural language recall queries correct.	2026-02-07 19:24:46 +02:00
koko210Serve	83c103324c	feat: Phase 2 Memory Consolidation - Production Ready. Implements an intelligent memory consolidation system with LLM-based fact extraction. Features: bidirectional memory (stores both user and Miku messages); LLM-based fact extraction (replaces regex for intelligent pattern detection); filters Miku's responses during fact extraction (only user messages analyzed); trivial message filtering (removes lol, k, ok, etc.); manual consolidation trigger via 'consolidate now' command; declarative fact recall with semantic search; user separation via metadata (user_id, guild_id); tested: 60% fact recall accuracy, 39 episodic memories, 11 facts extracted. Phase 2 Requirements Complete: ✅ Minimal real-time filtering ✅ Nightly consolidation task (manual trigger works) ✅ Context-aware LLM analysis ✅ Extract declarative facts ✅ Metadata enrichment. Test Results: episodic memories: 39 stored (user + Miku); declarative facts: 11 extracted from user messages only; fact recall accuracy: 3/5 queries (60%); pipeline test: PASS. Ready for production deployment with scheduled consolidation.	2026-02-03 23:17:27 +02:00
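
The three tiers described in commit 5a740c9334 could look roughly like the sketch below. It is an illustration only, not the plugin's code: `Tier`, `classify`, the short `TRIVIAL_PATTERNS` set and every regular expression are assumptions standing in for the real 50+ patterns and heuristics, and the "question with substance" check is left out. Messages that land in `BORDERLINE` would be the ones batched to the LLM during consolidation.

```python
import re
from enum import Enum


class Tier(Enum):
    TRIVIAL = "trivial"        # delete without asking the LLM
    IMPORTANT = "important"    # keep without asking the LLM
    BORDERLINE = "borderline"  # defer to the batched LLM pass


# Hypothetical stand-in for the 50+ exact-match patterns named in the commit.
TRIVIAL_PATTERNS = frozenset({"lol", "k", "ok", "lmao", "hm", "yeah"})

USER_PREFIX = "[User]: "
# Any letter or digit; emoji and punctuation do not match.
_HAS_ALNUM = re.compile(r"[^\W_]")
# One or more Discord custom emoji such as <:smile:123> and nothing else.
_CUSTOM_EMOJI_ONLY = re.compile(r"(?:<a?:\w+:\d+>\s*)+")
# Digits (numbers/dates), links, or Discord user mentions.
_MARKERS = re.compile(r"\d|https?://|<@!?\d+>")
# Very rough first-person detector (assumption, English only).
_FIRST_PERSON = re.compile(r"\b(?:i|my|me)\b", re.IGNORECASE)


def classify(message: str) -> Tier:
    """Assign a stored message to one of the three tiers."""
    # Strip the speaker prefix first so word counts stay accurate.
    text = message.removeprefix(USER_PREFIX).strip()
    words = text.split()

    # Tier 1: single char, exact filler, emoji-only or punctuation-only.
    if (
        len(text) <= 1
        or text.lower() in TRIVIAL_PATTERNS
        or not _HAS_ALNUM.search(text)
        or _CUSTOM_EMOJI_ONLY.fullmatch(text)
    ):
        return Tier.TRIVIAL

    # Tier 2: long messages or messages carrying a clear marker.
    if len(words) >= 8 or _MARKERS.search(text) or _FIRST_PERSON.search(text):
        return Tier.IMPORTANT

    # Tier 3: everything else is left for the LLM batch.
    return Tier.BORDERLINE
```

Checking the trivial tier first means an emoji-only message is never mistaken for one that contains a number, and anything the heuristics cannot place falls through to `BORDERLINE` rather than being deleted, in line with the commit's stated preference for keeping junk over losing data.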

6 Commits
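
The name-fact fix in commit 46ea4f2c53 replaces similarity-based dedup with a direct comparison of the stored value. A minimal sketch of that idea follows, assuming an in-memory store: `Fact`, `FactStore`, `find_existing_fact` and `store_name_fact` are hypothetical names for illustration and do not reproduce the plugin's `_find_existing_fact()` or the vector memory API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fact:
    user_id: str
    fact_type: str
    fact_value: str


class FactStore:
    """In-memory stand-in for the declarative memory collection (assumption)."""

    def __init__(self) -> None:
        self.facts: List[Fact] = []

    def find_existing_fact(self, user_id: str, fact_type: str) -> Optional[Fact]:
        # Look the fact up by owner and type, not by text similarity.
        for fact in self.facts:
            if fact.user_id == user_id and fact.fact_type == fact_type:
                return fact
        return None

    def store_name_fact(self, user_id: str, name: str) -> str:
        """Store a name fact, replacing an older one if the value changed."""
        existing = self.find_existing_fact(user_id, "name")
        if existing is None:
            self.facts.append(Fact(user_id, "name", name))
            return "stored"
        if existing.fact_value == name:
            return "unchanged"
        # Same fact type, different value: drop the stale fact, keep the new one.
        self.facts.remove(existing)
        self.facts.append(Fact(user_id, "name", name))
        return "replaced"


store = FactStore()
first = store.store_name_fact("42", "Lily")
second = store.store_name_fact("42", "koko210Serve")
third = store.store_name_fact("42", "koko210Serve")
```

Comparing values directly sidesteps the problem the commit describes, where two sentences that differ only in the name score above the 0.85 similarity threshold and the newer fact is discarded.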