23 Commits

Author SHA1 Message Date
486acb5c14 Fix reply-context speaker confusion with structured metadata pipeline
Previously, when a user replied to Miku's message via Discord's reply
feature, Miku's quoted words were embedded directly into the user's
message text using the format:
  [Replying to your message: "Miku's words"] User's response

This caused two problems:
1. The LLM had to parse "your message" to determine the quoted text
   was MIKU's words — fragile and frequently misattributed
2. When stored in episodic memory as [User]: ..., Miku's quoted words
   were permanently mislabeled under the user's speaker prefix

Now reply context flows through as structured metadata:
- bot/bot.py captures the replied-to text WITHOUT embedding it in prompt
- cat_client.py passes it as discord_reply_context in the WebSocket payload
- discord_bridge.py injects it as agent_input['reply_context'] — a
  CLEARLY LABELED note: [The user is replying to what you (Miku) said — ...]
- miku_personality.py + evil_miku_personality.py render it via
  {reply_context} placeholder in the prompt suffix, between memory
  context and conversation history

This keeps Miku's words as a separate context note, never mixed into
the user's HumanMessage. Episodic memory only stores the user's actual
words. The fallback path (when Cat is unavailable) also uses a cleaner
format with explicit speaker labels.
2026-06-03 22:50:03 +03:00
9d2c14fa0b Fix vision pipeline: ffmpeg removal by autoremove, increase vision timeout, reduce frame count, add Discord activity awareness
- bot/Dockerfile: Add ffmpeg to reinstall line after apt-get autoremove
  (autoremove was sweeping up ffmpeg as 'no longer needed' after playwright install)
- bot/utils/image_handling.py: Increase video analysis timeout 120s→300s, 6→3 for Tenor GIFs (GTX 1660 VRAM constraint)
- bot/utils/activities.py: Add _activity_changed_at timestamp tracking,
  get_current_activity_label() and get_current_activity_fresh() with 30-min decay
- bot/utils/cat_client.py: Pass current Discord activity to Cheshire Cat pipeline
- bot/utils/llm.py: Inject current Discord activity into system prompt
- cat-plugins/*: Forward Discord activity through working_memory to personality plugins
- bot/persona/*/preamble.txt: Add Discord status usage guidelines for character prompts
- llama-swap-rocm-config.yaml: Add qwen3.5 model entry for ComfyUI prompt generation
- AGENTS.md: New project documentation file
2026-05-27 01:18:12 +03:00
e1f81e52e5 Fix Miku confusing who said what in conversations
Three interrelated fixes for speaker attribution confusion:

1. Fix misleading episodic memory header (discord_bridge.py):
   The Cat core hardcodes '## Context of things the Human said in the past:'
   when formatting recalled conversations. Our plugins store BOTH user messages
   ([User]: prefix) AND Miku's own responses ([Miku]: prefix) in episodic memory.
   This misleading header primes the LLM to attribute Miku's words to the user.
   Replaced with '## Past conversation excerpts (prefixed by who said what):'
   which accurately describes the mixed-speaker content.

2. Tighten episodic recall (discord_bridge.py):
   Added before_cat_recalls_episodic_memories hook setting threshold=0.75
   (vs default 0.7) to reduce the chance of Miku's own just-uttered response
   being recalled on the very next user message, which would feed her own
   words back as misleading context.

3. Add role clarification (miku_personality.py & evil_miku_personality.py):
   Added a clarifying note after '# Conversation until now:' in the prompt
   suffix to explicitly tell the model that 'Human = the user, AI = you (Miku)',
   helping it reconcile the two labeling systems (episodic [User]/[Miku] prefixes
   vs conversation history Human/AI roles).
2026-05-22 16:38:34 +03:00
27f0659cc8 fix: Evil Miku now ignores questions — restore engagement + remove suffix hammer
Preamble:
- Sentence limit 1-3 → 2-4 (revert to original 'sting, then land' range)
- Remove 'if you can say it in one, say it in one' (encouraged lazy dismissals)
- Add engagement rule: 'Always engage with what was said — acknowledge the
  question or statement, then twist the knife. Ignoring isn\'t sharp, it\'s lazy.'

Suffix:
- Remove '[Keep responses short and cutting — 1-3 sentences. No monologues.]'
  The suffix was the LAST thing the model processed, so its brevity hammer
  overpowered the preamble's engagement instruction. Preamble alone is enough.
2026-05-18 11:09:24 +03:00
6b6d705024 fix: Evil Miku wordiness regression from username prompt changes
Five targeted fixes:

1. discord_bridge (priority 100): Skip 'cheerful virtual idol' wrapper and
   'CRITICAL INSTRUCTION' about facts when evil_mode is active. Evil Miku
   gets her own prompt from evil_miku_personality plugin.

2. memory_consolidation (priority 10): Soften fact-usage pressure:
   'Use THESE facts when answering' → 'You may reference these facts if
   relevant to the conversation'. Also soften username command tone.

3. evil_miku_personality (priority 100→101): Bump above discord_bridge
   so Evil Miku's prefix replacement deterministically discards any
   Miku-mode wrappers regardless of plugin load order.

4. evil preamble: Restructure for brevity — add 'Be SHORT and SHARP'
   declaration, move RESPONSE RULES before mood, tighten sentence limit
   from 2-4 to 1-3 with 'if you can say it in one, say it in one.'

5. evil suffix: Add final brevity reminder '[Keep responses short and
   cutting — 1-3 sentences. No monologues.]' right before conversation
   for maximum recency influence.
2026-05-18 10:56:07 +03:00
46ea4f2c53 fix(memory): prevent stale name facts from overriding Discord display name
Two bugs were causing Miku to call users by wrong names:

BUG 1 - No authoritative source:
  Declarative name facts ('The user's name is Lily') were injected into
  the prompt without any counterweight. If an old consolidation run
  extracted a wrong name, Miku would believe it forever.
  Fix: agent_prompt_prefix now appends the user's Discord display name
  as AUTHORITATIVE context, with explicit instruction to prefer it over
  any contradictory name facts.

BUG 2 - Dedup prevented name updates:
  _is_duplicate_fact() used vector similarity to detect duplicates.
  'The user's name is Lily' and 'The user's name is koko210Serve' are
  ~80% identical text, giving high cosine similarity (>0.85 threshold).
  New correct name facts were silently rejected as 'duplicates'.
  Fix: name facts now use _find_existing_fact() to compare fact_value
  directly. If the name changed, old fact is deleted and new one stored.

Also: the extraction prompt now includes the user's Discord display
name as a hint, so the LLM knows the authoritative name when extracting
facts during consolidation.
2026-05-17 11:20:49 +03:00
8b3bc02f9e refactor: DRY system prompts into shared preamble files
Step 4 of memory system overhaul: single source of truth for prompts.

Problem: The system prompt was defined inline in 4 different places:
  miku_personality.py, evil_miku_personality.py, llm.py, discord_bridge.py.
These could drift out of sync — and the discord_bridge WebUI
reconstruction was already missing CRITICAL RULES, CHARACTER CONTEXT,
MOOD GUIDELINES, and RESPONSE RULES sections.

Fix:
- Create persona/miku/preamble.txt — canonical normal Miku preamble
- Create persona/evil/preamble.txt — canonical evil Miku preamble
  (with {mood_name} and {mood_description} format placeholders)
- All 5 consumers now read from these files:
  * miku_personality.py (Cat plugin, primary path)
  * evil_miku_personality.py (Cat plugin, primary path)
  * discord_bridge.py (WebUI 'Last Prompt' reconstruction)
  * llm.py (fallback path, normal Miku)
  * evil_mode.py get_evil_system_prompt() (fallback path, evil Miku)
- All consumers include graceful fallbacks if preamble files are missing
- Fixed evil_mode.py discrepancy: 'body and size' now matches canonical

The preamble files are Docker volume-mounted into both containers:
  bot/persona/ → /app/persona/ (bot, via Dockerfile COPY)
  bot/persona/ → /app/cat/data/ (Cat, via docker-compose volume mount)
Editing the preamble file on the host immediately updates the Cat path
(bot path requires rebuild due to COPY).
2026-05-15 14:43:19 +03:00
e7ec82d154 feat(memory): add [User]: prefix to user messages for speaker clarity
Prevents Miku from confusing her own words with what users said.

User messages stored by discord_bridge now get a '[User]: ' prefix on
page_content, mirroring the existing '[Miku]: ' prefix on Miku's own
responses. When episodic memories are recalled via RAG and injected
into the prompt, the LLM can now clearly distinguish:

  [User]: I like pizza
  [Miku]: That's great! What toppings do you like?

Without this, raw user text looked identical to Miku's text in the
recalled memory context, causing potential confusion about who said what.

The consolidation classifier strips the [User]: prefix before analyzing
content, so word counts and pattern matching remain accurate.
2026-05-15 14:13:29 +03:00
5a740c9334 feat(memory): hybrid trivial-message classifier (heuristics + LLM batch)
Step 3 of memory system overhaul: smart junk detection.

Replaces the old 37-pattern frozenset (44% accuracy) with a 3-tier hybrid:

TIER 1 - DEFINITELY_TRIVIAL (instant delete, no LLM):
  50+ exact-match patterns, pure emoji, single char, punctuation-only

TIER 2 - DEFINITELY_IMPORTANT (instant keep, no LLM):
  8+ words, question with substance, first-person statements,
  numbers/dates, links, mentions

TIER 3 - BORDERLINE (batch → LLM for economical classification):
  2-7 word messages without clear markers
  Compact prompt: ~150-200 tokens per 20-message batch
  Safety default: KEEP on any parsing error

Real-time filtering (discord_bridge) uses conservative heuristics only:
  - 1-char, pure reactions, single emoji, custom emoji-only
  - 50+ single-word fillers
  - Never deletes multi-word messages in real-time
  - Philosophy: false negatives (junk stored) > false positives (data lost)

Consolidation gets the full hybrid pipeline with LLM for borderline
cases, achieving much better accuracy than the old 44% while keeping
token costs minimal (LLM only called during nightly consolidation,
not real-time chat).
2026-05-15 14:07:35 +03:00
e6e81885b3 feat(memory): tag all memories with source persona (miku/evil_miku)
Step 1 of memory system overhaul: persona tagging.

- discord_bridge: tag user messages with 'persona' metadata at storage time
- memory_consolidation: tag Miku's own responses with 'persona' metadata
- memory_consolidation: tag declarative facts with source persona during extraction
- memory_consolidation: pass persona context to LLM extraction prompt
- memory_consolidation: annotate cross-persona facts in prompt injection
  (e.g., '(learned as Evil Miku)' when Evil facts appear for Normal Miku)
- Web UI: show persona badge (🎤 Miku / 😈 Evil Miku) on facts and episodic
  memories in the Memory Management tab

This lets both personas know which version of Miku each memory came from,
enabling Evil Miku to distinguish her own memories from Normal Miku's.
2026-05-12 15:12:49 +03:00
d5b9964ce7 Fix vision pipeline: route images through Cat, pass user question to vision model
- Fix silent None return in analyze_image_with_vision exception handler
- Add None/empty guards after vision analysis in bot.py (image, video, GIF, Tenor)
- Route all image/video/GIF responses through Cheshire Cat pipeline (was
  calling query_llama directly), enabling episodic memory storage for media
  interactions and correct Last Prompt display in Web UI
- Add media_type parameter to cat_adapter.query() and forward as
  discord_media_type in WebSocket payload
- Update discord_bridge plugin to read media_type from payload and inject
  MEDIA NOTE into system prefix in before_agent_starts hook
- Add _extract_vision_question() helper to strip Discord mentions and bot-name
  triggers from user message; pass cleaned question to vision model so specific
  questions (e.g. 'what is the person wearing?') go directly to the vision model
  instead of the generic 'Describe this image in detail.' fallback
- Pass user_prompt to all analyze_image_with_qwen / analyze_video_with_vision
  call sites in bot.py (image, video, GIF, Tenor, embed paths)
- Fix autonomous reaction loops skipping messages that @mention the bot or have
  media attachments in DMs, preventing duplicate vision model calls for images
  already being processed by the main message handler
- Increase vision max_tokens: images 300->800, video/GIF 400->1000 (no VRAM
  impact; KV cache is pre-allocated at model load time)
2026-03-05 21:59:27 +02:00
335b58a867 feat: fix evil mode race conditions, expand moods and PFP detection
bipolar_mode.py:
- Replace unsafe globals.EVIL_MODE temporary overrides with
  force_evil_context parameter to fix async race conditions (3 sites)

moods.py:
- Add 6 new evil mood emojis: bored, manic, jealous, melancholic,
  playful_cruel, contemptuous
- Refactor rotate_dm_mood() to skip when evil mode active (evil mode
  has its own independent 2-hour rotation timer)

persona_dialogue.py:
- Same force_evil_context race condition fix (2 sites)
- Fix over-aggressive response cleanup that stripped common words
  (YES/NO/HIGH) — now uses targeted regex for structural markers only
- Update evil mood multipliers to match new mood set

profile_picture_context:
- Expand PFP detection regex for broader coverage (appearance questions,
  opinion queries, selection/change questions)
- Add plugin.json metadata file
2026-03-04 00:45:23 +02:00
892edf5564 feat: Last Prompt shows full prompt with evil mode awareness
- discord_bridge before_agent_starts now checks evil_mode from
  working_memory to load the correct personality files:
  Normal: miku_lore/prompt/lyrics + /app/moods/{mood}.txt
  Evil: evil_miku_lore/prompt/lyrics + /app/moods/evil/{mood}.txt
- Reads files directly instead of relying on cross-plugin working_memory
- cat_client.query() returns (response, full_prompt) tuple
- Full prompt includes system prefix + recalled memories + conversation
- API /prompt/cat returns full_prompt field
2026-03-01 01:17:06 +02:00
66881f4c88 refactor: deduplicate prompts, reorganize persona files, update paths
Prompt deduplication (~20% reduction, 4,743 chars saved):
- evil_miku_lore.txt: remove intra-file duplication (height rule 2x,
  cruelty-has-substance 2x, music secret 2x, adoration secret 2x),
  trim verbose restatements, cut speech examples from 10 to 6
- evil_miku_prompt.txt: remove entire PERSONALITY section (in lore),
  remove entire RESPONSE STYLE section (now only in preamble),
  soften height from prohibition to knowledge
- miku_lore.txt: remove RELATIONSHIPS section (duplicates FRIENDS)
- miku_prompt.txt: remove duplicate intro, 4 personality traits
  already in lore, FAMOUS SONGS section (in lore), fix response
  length inconsistency (1-2 vs 2-3 -> consistent 2-3)

Preamble updates (evil_mode.py, evil_miku_personality.py, llm.py,
miku_personality.py):
- Response rules now exist in ONE place only (preamble)
- Height rule softened: model knows 15.8m, can say it if asked,
  but won't default to quoting it when taunting
- Response length: 2-4 sentences (was 1-3), removed action template
  list that model was copying literally (*scoffs*, *rolls eyes*)
- Added: always include actual words, never action-only responses
- Normal Miku: trim CHARACTER CONTEXT, fix 1-3 -> 2-3 sentences

Directory reorganization:
- Move 6 persona files to bot/persona/{evil,miku}/ subdirectories
- Update all open() paths in evil_mode.py, context_manager.py,
  voice_manager.py, both Cat plugins
- Dockerfile: 6 COPY lines -> 1 (COPY persona /app/persona)
- docker-compose: 6 file mounts -> 2 directory mounts
  (bot/persona/evil -> cat/data/evil, bot/persona/miku -> cat/data/miku)

Evil Miku system (previously unstaged):
- Full evil mood management: 2h rotation timer, mood persistence,
  10 mood-specific autonomous template pools, mood-aware DMs
- Evil mode toggle with role color/nickname/pfp management
- get_evil_system_prompt() with mood integration

Add test_evil_moods.py: 10-mood x 3-message comprehensive test
2026-02-27 13:14:03 +02:00
9038f442a3 feat(evil-miku): add 10-mood system and Evil Miku Cat plugin
- Add 6 new evil mood files: bored, contemptuous, jealous, manic,
  melancholic, playful_cruel
- Rewrite 4 existing mood files: aggressive, cunning, evil_neutral,
  sarcastic (shorter, more focused descriptions)
- Add evil_miku_personality Cat plugin (parallel to miku_personality)
  with mood-aware system prompt, softened height rule, and balanced
  response length rules (2-4 sentences)
2026-02-27 13:11:37 +02:00
bb5067a89e fix: Add settings.json and enable profile_picture_context plugin
- Added empty settings.json required by Cat plugin system
- Plugin now appears in ACTIVE PLUGINS list
- Enabled via /plugins/toggle API endpoint
- Ready to inject PFP descriptions when user asks about it
2026-02-11 00:09:58 +02:00
eb557f655c feat: Add profile picture context plugin with regex-based injection
- Create profile_picture_context plugin to detect PFP queries via regex
- Inject current_description.txt only when user asks about profile picture
- Mount bot/memory directory in Cat container for PFP access
- Avoids context bloat by only adding PFP description when relevant
- Patterns match: 'what does your pfp look like', 'describe your avatar', etc.
- Works seamlessly with existing profile picture update system
- No manual sync needed - description auto-updates with PFP changes
2026-02-10 23:41:14 +02:00
34167eddae feat: Restore mood system and implement comprehensive memory editor UI
MOOD SYSTEM FIX:
- Mount bot/moods directory in docker-compose.yml for Cat container access
- Update miku_personality plugin to load mood descriptions from .txt files
- Add Cat logger for debugging mood loading (replaces print statements)
- Moods now dynamically loaded from working_memory instead of hardcoded neutral
2026-02-10 22:03:54 +02:00
fbd940e711 fix: Restore declarative memory recall by preserving suffix template
Root cause: The miku_personality plugin's agent_prompt_suffix hook was returning
an empty string, which wiped out the {declarative_memory} and {episodic_memory}
placeholders from the prompt template. This caused the LLM to never receive any
stored facts about users, resulting in hallucinated responses.

Changes:
- miku_personality: Changed agent_prompt_suffix to return the memory context
  section with {episodic_memory}, {declarative_memory}, and {tools_output}
  placeholders instead of empty string

- discord_bridge: Added before_cat_recalls_declarative_memories hook to increase
  k-value from 3 to 10 and lower threshold from 0.7 to 0.5 for better fact
  retrieval. Added agent_prompt_prefix to emphasize factual accuracy. Added
  debug logging via before_agent_starts hook.

Result: Miku now correctly recalls user facts (favorite songs, games, etc.)
from declarative memory with 100% accuracy.

Tested with:
- 'What is my favorite song?' → Correctly answers 'Monitoring (Best Friend Remix) by DECO*27'
- 'Do you remember my favorite song?' → Correctly recalls the song
- 'What is my favorite video game?' → Correctly answers 'Sonic Adventure'
2026-02-09 12:33:31 +02:00
11b90ebb46 fix: Phase 3 bug fixes - memory APIs, username visibility, web UI layout, Docker
**Critical Bug Fixes:**

1. Per-user memory isolation bug
   - Changed CatAdapter from HTTP POST to WebSocket /ws/{user_id}
   - User_id now comes from URL path parameter (true per-user isolation)
   - Verified: Different users can't see each other's memories

2. Memory API 405 errors
   - Replaced non-existent Cat endpoint calls with Qdrant direct queries
   - get_memory_points(): Now uses POST /collections/{collection}/points/scroll
   - delete_memory_point(): Now uses POST /collections/{collection}/points/delete

3. Memory stats showing null counts
   - Reimplemented get_memory_stats() to query Qdrant directly
   - Now returns accurate counts: episodic: 20, declarative: 6, procedural: 4

4. Miku couldn't see usernames
   - Modified discord_bridge before_cat_reads_message hook
   - Prepends [Username says:] to every message text
   - LLM now knows who is texting: [Alice says:] Hello Miku!

5. Web UI Memory tab layout
   - Tab9 was positioned outside .tab-container div (showed to the right)
   - Moved tab9 HTML inside container, before closing divs
   - Memory tab now displays below tab buttons like other tabs

**Code Changes:**

bot/utils/cat_client.py:
- Line 25: Logger name changed to 'llm' (available component)
- get_memory_stats() (lines 256-285): Query Qdrant directly via HTTP GET
- get_memory_points() (lines 275-310): Use Qdrant POST /points/scroll
- delete_memory_point() (lines 350-370): Use Qdrant POST /points/delete

cat-plugins/discord_bridge/discord_bridge.py:
- Fixed .pop() → .get() (UserMessage is Pydantic BaseModelDict)
- Added before_cat_reads_message logic to prepend [Username says:]
- Message format: [Alice says:] message content

Dockerfile.llamaswap-rocm:
- Lines 37-44: Added conditional check for UI directory
- if [ -d ui ] before npm install && npm run build
- Fixes build failure when llama-swap UI dir doesn't exist

bot/static/index.html:
- Moved tab9 from lines 1554-1688 (outside container)
- To position before container closing divs (now inside)
- Memory tab button at line 673: 🧠 Memories

**Testing & Verification:**
 Per-user isolation verified (Docker exec test)
 Memory stats showing real counts (curl test)
 Memory API working (facts/episodic loading)
 Web UI layout fixed (tab displays correctly)
 All 5 services running (llama-swap, llama-swap-amd, qdrant, cat, bot)
 Username prepending working (message context for LLM)

**Result:** All Phase 3 critical bugs fixed and verified working.
2026-02-07 23:27:15 +02:00
14e1a8df51 Phase 3: Unified Cheshire Cat integration with WebSocket-based per-user isolation
Key changes:
- CatAdapter (bot/utils/cat_client.py): WebSocket /ws/{user_id} for chat
  queries instead of HTTP POST (fixes per-user memory isolation when no
  API keys are configured — HTTP defaults all users to user_id='user')
- Memory management API: 8 endpoints for status, stats, facts, episodic
  memories, consolidation trigger, multi-step delete with confirmation
- Web UI: Memory tab (tab9) with collection stats, fact/episodic browser,
  manual consolidation trigger, and 3-step delete flow requiring exact
  confirmation string
- Bot integration: Cat-first response path with query_llama fallback for
  both text and embed responses, server mood detection
- Discord bridge plugin: fixed .pop() to .get() (UserMessage is a Pydantic
  BaseModelDict, not a raw dict), metadata extraction via extra attributes
- Unified docker-compose: Cat + Qdrant services merged into main compose,
  bot depends_on Cat healthcheck
- All plugins (discord_bridge, memory_consolidation, miku_personality)
  consolidated into cat-plugins/ for volume mount
- query_llama deprecated but functional for compatibility
2026-02-07 20:22:03 +02:00
edb88e9ede fix: Phase 2 integrity review - v2.0.0 rewrite & bugfixes
Memory Consolidation Plugin (828 -> 465 lines):
- Replace SentenceTransformer with cat.embedder.embed_query() for vector consistency
- Fix per-user fact isolation: source=user_id instead of global
- Add duplicate fact detection (_is_duplicate_fact, score_threshold=0.85)
- Remove ~350 lines of dead async run_consolidation() code
- Remove duplicate declarative search in before_cat_sends_message
- Unify trivial patterns into TRIVIAL_PATTERNS frozenset
- Remove all sys.stderr.write debug logging
- Remove sentence-transformers from requirements.txt (no external deps)

Loguru Fix (cheshire-cat/cat/log.py):
- Patch Cat v1.6.2 loguru format to provide default extra fields
- Fixes KeyError: 'original_name' from third-party libs (fastembed)
- Mounted via docker-compose volume

Discord Bridge:
- Copy discord_bridge.py to cat-plugins/ (was empty directory)

Test Results (6/7 pass, 100% fact recall):
- 11 facts extracted, per-user isolation working
- Duplicate detection effective (+2 on 2nd run)
- 5/5 natural language recall queries correct
2026-02-07 19:24:46 +02:00
83c103324c feat: Phase 2 Memory Consolidation - Production Ready
Implements intelligent memory consolidation system with LLM-based fact extraction:

Features:
- Bidirectional memory: stores both user and Miku messages
- LLM-based fact extraction (replaces regex for intelligent pattern detection)
- Filters Miku's responses during fact extraction (only user messages analyzed)
- Trivial message filtering (removes lol, k, ok, etc.)
- Manual consolidation trigger via 'consolidate now' command
- Declarative fact recall with semantic search
- User separation via metadata (user_id, guild_id)
- Tested: 60% fact recall accuracy, 39 episodic memories, 11 facts extracted

Phase 2 Requirements Complete:
 Minimal real-time filtering
 Nightly consolidation task (manual trigger works)
 Context-aware LLM analysis
 Extract declarative facts
 Metadata enrichment

Test Results:
- Episodic memories: 39 stored (user + Miku)
- Declarative facts: 11 extracted from user messages only
- Fact recall accuracy: 3/5 queries (60%)
- Pipeline test: PASS

Ready for production deployment with scheduled consolidation.
2026-02-03 23:17:27 +02:00