Fix reply-context speaker confusion with structured metadata pipeline

Previously, when a user replied to Miku's message via Discord's reply feature, Miku's quoted words were embedded directly into the user's message text using the format:

  [Replying to your message: "Miku's words"] User's response

This caused two problems:

1. The LLM had to parse "your message" to determine that the quoted text was MIKU's words, which was fragile and frequently misattributed
2. When stored in episodic memory as [User]: ..., Miku's quoted words were permanently mislabeled under the user's speaker prefix

Now reply context flows through as structured metadata:

- bot/bot.py captures the replied-to text WITHOUT embedding it in the prompt
- cat_client.py passes it as discord_reply_context in the WebSocket payload
- discord_bridge.py injects it as agent_input['reply_context'], a CLEARLY LABELED note: [The user is replying to what you (Miku) said — ...]
- miku_personality.py + evil_miku_personality.py render it via the {reply_context} placeholder in the prompt suffix, between memory context and conversation history

This keeps Miku's words as a separate context note, never mixed into the user's HumanMessage. Episodic memory only stores the user's actual words.

The fallback path (when Cat is unavailable) also uses a cleaner format with explicit speaker labels.
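
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function names build_payload and build_agent_input, the "text" and "user_message" keys, and the wording after the label in the note are all assumptions made for the example. Only discord_reply_context and agent_input['reply_context'] come from the commit message.

```python
from typing import Optional


def build_payload(text: str, reply_context: Optional[str] = None) -> dict:
    """Client side (hypothetical helper): attach the replied-to text as
    metadata instead of embedding it in the user's message text."""
    payload = {"text": text}  # "text" key is an assumption for this sketch
    if reply_context:
        payload["discord_reply_context"] = reply_context
    return payload


def build_agent_input(payload: dict) -> dict:
    """Bridge side (hypothetical helper): turn the metadata into a labeled
    note, or an empty string when the user is not replying to anything."""
    quoted = payload.get("discord_reply_context")
    note = ""
    if quoted:
        # Illustrative wording only; the commit message abbreviates the
        # real note text, so the part after the label is assumed here.
        note = f'[The user is replying to what you (Miku) said: "{quoted}"]'
    return {"user_message": payload["text"], "reply_context": note}


payload = build_payload("I agree!", reply_context="Do you like leeks?")
agent_input = build_agent_input(payload)
print(agent_input["user_message"])
print(agent_input["reply_context"])
```

In this sketch the quoted text never enters user_message, so anything stored from that field contains only the user's own words.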
2026-06-03 22:50:03 +03:00
parent 9d2c14fa0b
commit 486acb5c14
5 changed files with 39 additions and 5 deletions
--- a/bot/utils/cat_client.py
+++ b/bot/utils/cat_client.py
@@ -109,6 +109,7 @@ class CatAdapter:
        mood: Optional[str] = None,
        response_type: str = "dm_response",
        media_type: Optional[str] = None,
+        reply_context: Optional[str] = None,
    ) -> Optional[tuple]:
        """
        Send a message through the Cat pipeline via WebSocket and get a response.
@@ -162,6 +163,12 @@ class CatAdapter:
        # Pass media type so discord_bridge can add MEDIA NOTE to the prompt
        if media_type:
            payload["discord_media_type"] = media_type
+        # Pass the message the user is replying to (if any) as structured metadata.
+        # The discord_bridge plugin injects this into the prompt as a clearly-labeled
+        # context note — keeping Miku's words separate from the user's message text
+        # and preventing the speaker confusion that the old embed-in-prompt format caused.
+        if reply_context:
+            payload["discord_reply_context"] = reply_context
        # Pass current Discord activity if it changed recently (30-min decay window)
        activity_label = get_current_activity_fresh()
        if activity_label: