Fix reply-context speaker confusion with structured metadata pipeline

Previously, when a user replied to Miku's message via Discord's reply
feature, Miku's quoted words were embedded directly into the user's
message text using the format:
  [Replying to your message: "Miku's words"] User's response

This caused two problems:
1. The LLM had to parse "your message" to determine the quoted text
   was MIKU's words — fragile and frequently misattributed
2. When stored in episodic memory as [User]: ..., Miku's quoted words
   were permanently mislabeled under the user's speaker prefix
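
A minimal sketch of the second problem, assuming episodic memory stores each
turn as a speaker-prefixed string. The `store_turn` helper and the sample
messages are hypothetical and only illustrate the mislabeling:

```python
def old_style_prompt(replied_content: str, user_text: str) -> str:
    """Rebuild the old inline format that this commit removes."""
    return f'[Replying to your message: "{replied_content}"] {user_text}'


def store_turn(speaker: str, text: str) -> str:
    """Hypothetical stand-in for an episodic memory write."""
    return f"[{speaker}]: {text}"


# Miku's quoted words end up inside a record attributed to the user.
record = store_turn("User", old_style_prompt("I love leeks!", "Me too"))
print(record)
# [User]: [Replying to your message: "I love leeks!"] Me too
```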

Now reply context flows through as structured metadata:
- bot/bot.py captures the replied-to text WITHOUT embedding it in the prompt
- cat_client.py passes it as discord_reply_context in the WebSocket payload
- discord_bridge.py injects it as agent_input['reply_context'] — a
  CLEARLY LABELED note: [The user is replying to what you (Miku) said — ...]
- miku_personality.py + evil_miku_personality.py render it via a
  {reply_context} placeholder in the prompt suffix, between the memory
  context and the conversation history
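
The flow above can be sketched roughly as follows. This is an illustration
under assumptions, not the code in this commit: the function names, the
payload's `text` key, and the `input` key are hypothetical, and the note
wording is a placeholder because the full label is elided above. Only
`discord_reply_context` and `agent_input['reply_context']` come from the
description.

```python
from typing import Optional


def build_ws_payload(text: str, reply_context: Optional[str] = None) -> dict:
    """Client side: keep the user's text and the quoted text in separate fields."""
    payload = {"text": text}  # "text" key is an assumption
    if reply_context:
        payload["discord_reply_context"] = reply_context
    return payload


def inject_reply_context(agent_input: dict, payload: dict) -> dict:
    """Plugin side: turn the metadata into a labeled note for the prompt suffix."""
    quoted = payload.get("discord_reply_context")
    # Placeholder wording; the real note text is not shown in this commit message.
    note = f'[The user is replying to what you (Miku) said: "{quoted}"]' if quoted else ""
    agent_input["reply_context"] = note
    return agent_input


payload = build_ws_payload("Me too", reply_context="I love leeks!")
agent_input = inject_reply_context({"input": payload["text"]}, payload)
plain = inject_reply_context({"input": "hello"}, build_ws_payload("hello"))
```

The user's text in `agent_input["input"]` never contains the quoted words, so
anything stored from it carries only what the user actually wrote. When there
is no reply, the note is an empty string and the placeholder renders as nothing.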

This keeps Miku's words as a separate context note, never mixed into
the user's HumanMessage. Episodic memory only stores the user's actual
words. The fallback path (when Cat is unavailable) also uses a cleaner
format with explicit speaker labels.
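
As a rough sketch of the fallback path, the helpers below mirror the
200-character truncation and the labeled fallback format from the bot/bot.py
diff. The helper names `truncate_reply` and `build_fallback_prompt` are
hypothetical; in the actual change this logic is inline in `on_message`.

```python
from typing import Optional

MAX_REPLY_CHARS = 200  # same limit the diff applies to the replied-to message


def truncate_reply(content: str, limit: int = MAX_REPLY_CHARS) -> str:
    """Cut long replied-to messages and mark the cut with an ellipsis."""
    if len(content) > limit:
        return content[:limit] + "..."
    return content


def build_fallback_prompt(prompt: str, reply_context: Optional[str]) -> str:
    """Label both speakers explicitly when there is reply context."""
    if reply_context:
        return f"[Context: you (Miku) said: {reply_context}]\n[User says:] {prompt}"
    return prompt  # no reply: the prompt is passed through unchanged


print(build_fallback_prompt("Me too", truncate_reply("I love leeks!")))
# [Context: you (Miku) said: I love leeks!]
# [User says:] Me too
```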
2026-06-03 22:50:03 +03:00
parent 9d2c14fa0b
commit 486acb5c14
5 changed files with 39 additions and 5 deletions

@@ -284,8 +284,12 @@ async def on_message(message):
prompt = text # No cleanup — keep it raw
user_id = str(message.author.id)
+ reply_context = None # Will be passed as structured metadata to Cat pipeline
- # If user is replying to a specific message, add context marker
+ # If user is replying to a specific message, capture the context
+ # WITHOUT embedding it in the prompt text (that caused speaker confusion).
+ # Instead, it's passed as structured metadata — the Cat plugin injects it
+ # into the prompt as a clearly labeled context note, preserving speaker boundaries.
if message.reference:
try:
replied_msg = await message.channel.fetch_message(message.reference.message_id)
@@ -293,8 +297,7 @@ async def on_message(message):
if replied_msg.author == globals.client.user:
# Truncate the replied message to keep prompt manageable
replied_content = replied_msg.content[:200] + "..." if len(replied_msg.content) > 200 else replied_msg.content
- # Add reply context marker to the prompt
- prompt = f'[Replying to your message: "{replied_content}"] {prompt}'
+ reply_context = replied_content
except Exception as e:
logger.error(f"Failed to fetch replied message for context: {e}")
@@ -364,6 +367,7 @@ async def on_message(message):
author_name=author_name,
mood=current_mood,
response_type=response_type,
+ reply_context=reply_context,
)
if cat_result:
response, cat_full_prompt = cat_result
@@ -395,8 +399,11 @@ async def on_message(message):
# Fallback to direct LLM query if Cat didn't respond
if not response:
+ fallback_prompt = prompt
+ if reply_context:
+ fallback_prompt = f'[Context: you (Miku) said: {reply_context}]\n[User says:] {prompt}'
response = await query_llama(
- prompt,
+ fallback_prompt,
user_id=str(message.author.id),
guild_id=guild_id,
response_type=response_type,