koko210Serve 8c74ad5260 Initial commit: Miku Discord Bot

2025-12-07 17:15:09 +02:00

8.7 KiB

Conversation History System V2

Overview

The new conversation history system manages conversation context for all bot interactions in one place. It keys history by server or DM, stores metadata with each message, and produces the message list that is sent to the LLM.

Key Improvements

1. Per-Channel History (Was: Per-User Globally)

Servers: History is tracked per guild_id, so all users in a server share one conversation context ("channel" here means the history key for the whole server, not an individual Discord text channel)
DMs: History is tracked per user_id, so each DM has its own conversation thread
Benefit: Miku can follow multi-user conversations in servers and remember context across users

2. Rich Message Metadata

Each message stores:

author_name: Display name of the speaker
content: Message text
timestamp: When the message was sent
is_bot: Whether it's from Miku or a user
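
As an illustration of the four fields above, one stored message could be modelled as a small dataclass. This is only a sketch: the name `StoredMessage` is hypothetical, and the actual class in bot/utils/conversation_history.py may well use a dict or tuple instead.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class StoredMessage:
    """Hypothetical record type, for illustration only."""
    author_name: str   # display name of the speaker
    content: str       # message text
    is_bot: bool       # True when the message came from Miku
    # Filled with the current UTC time when the caller does not pass one.
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


msg = StoredMessage(author_name="Alice", content="Hello Miku!", is_bot=False)
```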

3. Intelligent Formatting

The system formats messages differently based on context:

Multi-user servers: "Alice: Hello!" format to distinguish speakers
DMs: Simple content without author prefix
LLM output: OpenAI-compatible {"role": "user"|"assistant", "content": "..."} format
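
The formatting rules above can be sketched roughly as follows. The function name `format_messages` and the `is_server` flag are assumptions made for this example; how the real `format_for_llm()` decides that a conversation is multi-user, and whether it prefixes Miku's own replies, is not specified here.

```python
from typing import Dict, List, Tuple

# (author_name, content, is_bot), the tuple shape returned by get_recent_messages()
Message = Tuple[str, str, bool]


def format_messages(messages: List[Message], is_server: bool) -> List[Dict[str, str]]:
    """Convert stored messages into OpenAI-style chat messages."""
    formatted = []
    for author_name, content, is_bot in messages:
        if is_bot:
            # Assumption: Miku's replies become assistant turns without a name prefix.
            formatted.append({"role": "assistant", "content": content})
        elif is_server:
            # Server context: prefix the speaker so the model can tell users apart.
            formatted.append({"role": "user", "content": f"{author_name}: {content}"})
        else:
            # DM context: only one human speaker, so no prefix is needed.
            formatted.append({"role": "user", "content": content})
    return formatted
```

With `is_server=True`, the tuple `("Alice", "Hello!", False)` becomes `{"role": "user", "content": "Alice: Hello!"}`; with `is_server=False` the content is just `"Hello!"`.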

4. Automatic Filtering

Empty messages automatically skipped
Messages truncated to 500 characters to prevent context overflow
Vision analysis context preserved inline
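
A minimal sketch of the first two rules, assuming that whitespace-only text counts as empty and that truncation is a plain cut with no ellipsis marker. Neither detail is confirmed above, and `clean_content` is a hypothetical helper name.

```python
from typing import Optional

MAX_CONTENT_LENGTH = 500  # character limit stated above


def clean_content(content: Optional[str]) -> Optional[str]:
    """Return the text to keep, or None if the message should be skipped."""
    if content is None or not content.strip():
        return None  # empty (or, by assumption, whitespace-only) message
    return content[:MAX_CONTENT_LENGTH]  # plain cut at the limit
```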

5. Backward Compatibility

Still writes to globals.conversation_history for legacy code
query_llama() keeps the same user_id parameter
Autonomous functions work without modification

Architecture

Core Class: `ConversationHistory`

Located in: bot/utils/conversation_history.py

from utils.conversation_history import conversation_history

# Add a message
conversation_history.add_message(
    channel_id="123456789",      # guild_id or user_id
    author_name="Alice",          # Display name
    content="Hello Miku!",        # Message text
    is_bot=False                  # True if from Miku
)

# Get recent messages
messages = conversation_history.get_recent_messages("123456789", max_messages=8)
# Returns: [(author, content, is_bot), ...]

# Format for LLM
llm_messages = conversation_history.format_for_llm("123456789", max_messages=8)
# Returns: [{"role": "user", "content": "Alice: Hello!"}, ...]

# Get statistics
stats = conversation_history.get_channel_stats("123456789")
# Returns: {"total_messages": 10, "bot_messages": 5, "user_messages": 5}
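
To make the return shapes above concrete, here is a small in-memory stand-in that mimics the same three calls. It is not the class from bot/utils/conversation_history.py: the name `MiniConversationHistory`, the `max_stored` cap, and the storage layout are assumptions, and filtering and truncation are left out for brevity.

```python
from collections import defaultdict, deque
from datetime import datetime, timezone


class MiniConversationHistory:
    """Illustrative stand-in, not the real ConversationHistory class."""

    def __init__(self, max_stored=50):
        # One bounded deque per channel_id; max_stored is an assumed cap.
        self._channels = defaultdict(lambda: deque(maxlen=max_stored))

    def add_message(self, channel_id, author_name, content, is_bot=False):
        self._channels[str(channel_id)].append({
            "author_name": author_name,
            "content": content,
            "timestamp": datetime.now(timezone.utc),
            "is_bot": is_bot,
        })

    def get_recent_messages(self, channel_id, max_messages=8):
        stored = list(self._channels.get(str(channel_id), ()))
        # Guard against max_messages <= 0, since stored[-0:] would return everything.
        recent = stored[-max_messages:] if max_messages > 0 else []
        return [(m["author_name"], m["content"], m["is_bot"]) for m in recent]

    def get_channel_stats(self, channel_id):
        stored = self._channels.get(str(channel_id), ())
        bot = sum(1 for m in stored if m["is_bot"])
        return {
            "total_messages": len(stored),
            "bot_messages": bot,
            "user_messages": len(stored) - bot,
        }


history = MiniConversationHistory()
history.add_message("123456789", "Alice", "Hello Miku!", is_bot=False)
history.add_message("123456789", "Miku", "Hi Alice!", is_bot=True)
stats = history.get_channel_stats("123456789")
```

A bounded `deque` per key is just one simple way to keep memory use flat; the real class may store and prune messages differently.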

Usage in `query_llama()`

Updated Signature

async def query_llama(
    user_prompt,
    user_id,                  # For DMs: actual user ID; For servers: can be anything
    guild_id=None,            # Server ID (None for DMs)
    response_type="dm_response",
    model=None,
    author_name=None          # NEW: Display name for multi-user context
):

Channel ID Logic

channel_id = str(guild_id) if guild_id else str(user_id)

Server messages: channel_id = guild_id → All server users share history
DM messages: channel_id = user_id → Each DM has separate history

Example Calls

Server message:

response = await query_llama(
    user_prompt="What's the weather?",
    user_id=str(message.author.id),
    guild_id=message.guild.id,           # Server context
    response_type="server_response",
    author_name=message.author.display_name  # "Alice"
)
# History saved to channel_id=guild_id

DM message:

response = await query_llama(
    user_prompt="Tell me a joke",
    user_id=str(message.author.id),
    guild_id=None,                       # No server
    response_type="dm_response",
    author_name=message.author.display_name
)
# History saved to channel_id=user_id

Autonomous message:

message = await query_llama(
    user_prompt="Say something fun!",
    user_id=f"miku-autonomous-{guild_id}",  # Consistent ID
    guild_id=guild_id,                      # Server context
    response_type="autonomous_general"
)
# History saved to channel_id=guild_id

Image/Video Analysis

Updated `rephrase_as_miku()`

async def rephrase_as_miku(
    vision_output,
    user_prompt,
    guild_id=None,
    user_id=None,        # NEW: Actual user ID
    author_name=None     # NEW: Display name
):

How It Works

Vision analysis injected into history:

conversation_history.add_message(
    channel_id=channel_id,
    author_name="Vision System",
    content=f"[Image/Video Analysis: {vision_output}]",
    is_bot=False
)

Follow-up questions remember the image:
- User sends image → Vision analysis added to history
- User asks "What color is the car?" → Miku sees the vision analysis in history
- User asks "Who made this meme?" → Still has vision context

Example Flow

[USER] Bob: *sends meme.gif*
[Vision System]: [Image/Video Analysis: A cat wearing sunglasses with text "deal with it"]
[BOT] Miku: Haha, that's a classic meme! The cat looks so cool! 😎
[USER] Bob: Who made this meme?
[BOT] Miku: The "Deal With It" meme originated from...

Migration from Old System

What Changed

| Old System | New System |
| --- | --- |
| `globals.conversation_history[user_id]` | `conversation_history.add_message(channel_id, ...)` |
| Per-user globally | Per-server or per-DM |
| `[(user_msg, bot_msg), ...]` tuples | Rich metadata with author, timestamp, role |
| Manual filtering in `llm.py` | Automatic filtering in `ConversationHistory` |
| Image analysis used `user_id="image_analysis"` | Uses actual user's channel_id |
| Reply feature added `("", message)` tuples | No manual reply handling needed |

Backward Compatibility

The new system still writes to globals.conversation_history for any code that might depend on it:

# In llm.py after getting LLM response
globals.conversation_history[user_id].append((user_prompt, reply))

This ensures existing code doesn't break during migration.
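
The dual write can be pictured with the sketch below. Both dictionaries are plain stand-ins for the real `globals.conversation_history` and `ConversationHistory` objects, and `record_exchange` is a hypothetical helper, not a function in llm.py.

```python
from collections import defaultdict

# Stand-ins for illustration only.
legacy_history = defaultdict(list)   # user_id -> [(user_prompt, reply), ...]
channel_history = defaultdict(list)  # channel_id -> [(author_name, content, is_bot), ...]


def record_exchange(user_id, guild_id, author_name, user_prompt, reply):
    """Store one prompt/reply pair in both the new and the legacy structure."""
    # Same key rule as in query_llama(): guild for servers, user for DMs.
    channel_id = str(guild_id) if guild_id else str(user_id)
    channel_history[channel_id].append((author_name, user_prompt, False))
    channel_history[channel_id].append(("Miku", reply, True))
    # Legacy write: still keyed by user_id, still a (prompt, reply) tuple.
    legacy_history[user_id].append((user_prompt, reply))


record_exchange("42", 1001, "Alice", "What's the weather?", "Sunny!")  # server message
record_exchange("42", None, "Alice", "Tell me a joke", "Knock knock!")  # DM message
```

In this sketch the two exchanges land under separate keys in the new store (`"1001"` and `"42"`), while the legacy list for user `"42"` receives both, as it did under the old per-user scheme.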

Testing

Run the test suite:

cd /home/koko210Serve/docker/ollama-discord/bot
python test_conversation_history.py

Tests cover:

✅ Adding messages to server channels
✅ Adding messages to DM channels
✅ Formatting for LLM (OpenAI messages)
✅ Empty message filtering
✅ Message truncation (500 char limit)
✅ Channel statistics

Benefits

1. Context Preservation

Multi-user conversations tracked properly
Image/video descriptions persist across follow-up questions
No more lost context when using Discord reply feature

2. Token Efficiency

Automatic truncation prevents context overflow
Empty messages filtered out
Configurable message limits (default: 8 messages)

3. Better Multi-User Support

Server conversations include author names: "Alice: Hello!"
Miku understands who said what
Enables natural group chat dynamics

4. Debugging & Analytics

Rich metadata for each message
Channel statistics (total, bot, user message counts)
Timestamp tracking for future features

5. Maintainability

Single source of truth for conversation history
Clean API: add_message(), get_recent_messages(), format_for_llm()
Centralized filtering and formatting logic

Future Enhancements

Possible improvements:

Persistent storage (save history to disk/database)
Conversation summarization for very long threads
Per-user preferences (some users want more/less context)
Automatic context pruning based on relevance
Export conversation history for analysis
Integration with dm_interaction_analyzer

Code Locations

| File | Changes |
| --- | --- |
| `bot/utils/conversation_history.py` | NEW - Core history management class |
| `bot/utils/llm.py` | Updated to use new system, added `author_name` parameter |
| `bot/bot.py` | Pass `author_name` to `query_llama()`, removed reply pollution |
| `bot/utils/image_handling.py` | `rephrase_as_miku()` accepts `user_id` and `author_name` |
| `bot/utils/autonomous_v1_legacy.py` | No changes needed (already guild-based) |
| `bot/test_conversation_history.py` | NEW - Test suite |

Summary

The new conversation history system provides:

✅ Per-channel tracking (server-wide or DM-specific)
✅ Rich metadata (author, timestamp, role)
✅ Intelligent formatting (with author names in servers)
✅ Automatic filtering (empty messages, truncation)
✅ Image/video context (vision analysis persists)
✅ Backward compatibility (legacy code still works)
✅ Clean API (simple, testable functions)

This solves the original problems:

❌ ~~Video descriptions lost~~ → ✅ Now preserved in channel history
❌ ~~Reply feature polluted history~~ → ✅ No manual reply handling
❌ ~~Image analysis separate user_id~~ → ✅ Uses actual channel_id
❌ ~~Autonomous actions broke history~~ → ✅ Guild-based IDs work naturally
