
Chat Interface Feature Documentation

Overview

A new "Chat with LLM" tab has been added to the Miku bot Web UI, allowing you to chat directly with the language models with full streaming support (similar to ChatGPT).

Features

1. Model Selection

  • 💬 Text Model (Fast): Chat with the text-based LLM for quick conversations
  • 👁️ Vision Model (Images): Use the vision model to analyze and discuss images

2. System Prompt Options

  • Use Miku Personality: Attach the standard Miku personality system prompt
    • Text model: Gets the full Miku character prompt (same as query_llama)
    • Vision model: Gets a simplified Miku-themed image analysis prompt
  • Raw LLM (No Prompt): Chat directly with the base LLM without any personality
    • Great for testing raw model responses
    • No character constraints

3. Real-time Streaming

  • Messages stream in character by character, as in ChatGPT
  • Shows typing indicator while waiting for response
  • Smooth, responsive interface

4. Vision Model Support

  • Upload images when using the vision model
  • Image preview before sending
  • Analyze images with Miku's personality or raw vision capabilities

5. Chat Management

  • Clear chat history button
  • Timestamps on all messages
  • Color-coded messages (user vs assistant)
  • Auto-scroll to latest message
  • Keyboard shortcut: Ctrl+Enter to send messages

Technical Implementation

Backend (api.py)

New Endpoint: POST /chat/stream

# Accepts:
{
  "message": "Your chat message",
  "model_type": "text" | "vision",
  "use_system_prompt": true | false,
  "image_data": "base64_encoded_image" (optional, for vision model)
}

# Returns: Server-Sent Events (SSE) stream
data: {"content": "streamed text chunk"}
data: {"done": true}
data: {"error": "error message"}

Key Features:

  • Uses Server-Sent Events (SSE) for streaming
  • Supports both TEXT_MODEL and VISION_MODEL from globals
  • Dynamically switches system prompts based on configuration
  • Integrates with llama.cpp's streaming API
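
The actual implementation lives in bot/api.py; what follows is only a minimal sketch of the general shape, assuming a llama.cpp-style /completion endpoint that streams data: {...} lines. Prompt assembly and vision handling are elided, and names like event_stream are illustrative, not the bot's actual code:

import json
from typing import Optional

import aiohttp
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

app = FastAPI()
LLAMA_URL = "http://localhost:8080"  # assumption: llama.cpp server address


class ChatMessage(BaseModel):
    message: str
    model_type: str = "text"          # "text" or "vision"
    use_system_prompt: bool = True
    image_data: Optional[str] = None  # base64 image, vision model only


@app.post("/chat/stream")
async def chat_stream(msg: ChatMessage):
    async def event_stream():
        # Prompt assembly (Miku personality, vision handling) elided here.
        payload = {"prompt": msg.message, "stream": True}
        try:
            async with aiohttp.ClientSession() as session:
                async with session.post(f"{LLAMA_URL}/completion",
                                        json=payload) as resp:
                    # llama.cpp emits SSE lines of the form: data: {...}
                    async for raw in resp.content:
                        line = raw.decode("utf-8").strip()
                        if not line.startswith("data: "):
                            continue
                        chunk = json.loads(line[len("data: "):])
                        if chunk.get("content"):
                            yield "data: " + json.dumps(
                                {"content": chunk["content"]}) + "\n\n"
                        if chunk.get("stop"):
                            break
            yield 'data: {"done": true}\n\n'
        except Exception as exc:
            yield "data: " + json.dumps({"error": str(exc)}) + "\n\n"

    return StreamingResponse(event_stream(), media_type="text/event-stream")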

Frontend (index.html)

New Tab: "💬 Chat with LLM"

Located in the main navigation tabs (tab6)

Components:

  1. Configuration Panel

    • Radio buttons for model selection
    • Radio buttons for system prompt toggle
    • Image upload section (shows/hides based on model)
    • Clear chat history button
  2. Chat Messages Container

    • Scrollable message history
    • Animated message appearance
    • Typing indicator during streaming
    • Color-coded messages with timestamps
  3. Input Area

    • Multi-line text input
    • Send button with loading state
    • Keyboard shortcuts

JavaScript Functions:

  • sendChatMessage(): Handles message sending and streaming reception
  • toggleChatImageUpload(): Shows/hides image upload for vision model
  • addChatMessage(): Adds messages to chat display
  • showTypingIndicator() / hideTypingIndicator(): Typing animation
  • clearChatHistory(): Clears all messages
  • handleChatKeyPress(): Keyboard shortcuts

Usage Guide

Basic Text Chat with Miku

  1. Go to "💬 Chat with LLM" tab
  2. Ensure "💬 Text Model" is selected
  3. Ensure "Use Miku Personality" is selected
  4. Type your message and click "📤 Send" (or press Ctrl+Enter)
  5. Watch as Miku's response streams in real time!

Raw LLM Testing

  1. Select "💬 Text Model"
  2. Select "Raw LLM (No Prompt)"
  3. Chat directly with the base language model without personality constraints

Vision Model Chat

  1. Select "👁️ Vision Model"
  2. Click "Upload Image" and select an image
  3. Type a message about the image (e.g., "What do you see in this image?")
  4. Click "📤 Send"
  5. The vision model will analyze the image and respond

Vision Model with Miku Personality

  1. Select "👁️ Vision Model"
  2. Keep "Use Miku Personality" selected
  3. Upload an image
  4. Miku will analyze and comment on the image with her cheerful personality!

System Prompts

Text Model (with Miku personality)

Uses the same comprehensive system prompt as query_llama():

  • Full Miku character context
  • Current mood integration
  • Character consistency rules
  • Natural conversation guidelines

Vision Model (with Miku personality)

Simplified prompt optimized for image analysis:

You are Hatsune Miku analyzing an image. Describe what you see naturally 
and enthusiastically as Miku would. Be detailed but conversational. 
React to what you see with Miku's cheerful, playful personality.

No System Prompt

Both models respond without personality constraints when this option is selected.
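
As a rough illustration of how the toggle can map onto prompts, here is a hypothetical helper; select_system_prompt, build_miku_prompt, and VISION_MIKU_PROMPT are assumed names for this sketch, not the bot's actual identifiers:

from typing import Optional

# The simplified vision prompt quoted above.
VISION_MIKU_PROMPT = (
    "You are Hatsune Miku analyzing an image. Describe what you see "
    "naturally and enthusiastically as Miku would. Be detailed but "
    "conversational. React to what you see with Miku's cheerful, "
    "playful personality."
)


def build_miku_prompt() -> str:
    # Assumption: the full character prompt is read from a context file the
    # bot already ships; query_llama() assembles a richer version of this.
    with open("miku_prompt.txt", encoding="utf-8") as f:
        return f.read()


def select_system_prompt(model_type: str, use_system_prompt: bool) -> Optional[str]:
    """Return the system prompt for a request, or None for raw-LLM mode."""
    if not use_system_prompt:
        return None                # Raw LLM: no personality constraints
    if model_type == "vision":
        return VISION_MIKU_PROMPT  # simplified image-analysis prompt
    return build_miku_prompt()     # full character prompt, as in query_llama()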

Streaming Technology

The interface uses Server-Sent Events (SSE) for real-time streaming:

  • Backend sends chunked responses from llama.cpp
  • Frontend receives and displays chunks as they arrive
  • Smooth, ChatGPT-like experience
  • Works with both text and vision models

UI/UX Features

Message Styling

  • User messages: Green accent, right-aligned styling
  • Assistant messages: Blue accent, left-aligned styling
  • Error messages: Red accent with error icon
  • Fade-in animation: Smooth appearance for new messages

Responsive Design

  • Chat container scrolls automatically
  • Image preview for vision model
  • Loading states on buttons
  • Typing indicators
  • Custom scrollbar styling

Keyboard Shortcuts

  • Ctrl+Enter: Send message quickly
  • Tab: Navigate between input fields

Configuration Options

All settings are preserved during the chat session:

  • Model type (text/vision)
  • System prompt toggle (Miku/Raw)
  • Uploaded image (for vision model)

Settings do NOT persist after page refresh (fresh session each time).

Error Handling

The interface handles various errors gracefully:

  • Connection failures
  • Model errors
  • Invalid image files
  • Empty messages
  • Timeout issues

All errors are displayed in the chat with clear error messages.

Performance Considerations

Text Model

  • Fast responses (typically 1-3 seconds)
  • Streaming starts almost immediately
  • Low latency

Vision Model

  • Slower due to image processing
  • First token may take 3-10 seconds
  • Streaming continues once started
  • Image is sent inline as base64 (simple to handle, though ~33% larger than the raw bytes)
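
For scripting against the API, image_data is just the base64 encoding of the raw file bytes; a minimal sketch (the file name is illustrative):

import base64

with open("photo.jpg", "rb") as f:  # any JPEG/PNG/GIF
    image_data = base64.b64encode(f.read()).decode("ascii")
# image_data now goes in the request body's "image_data" field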

Development Notes

File Changes

  1. bot/api.py

    • Added from fastapi.responses import StreamingResponse
    • Added ChatMessage Pydantic model
    • Added POST /chat/stream endpoint with SSE support
  2. bot/static/index.html

    • Added tab6 button in navigation
    • Added complete chat interface HTML
    • Added CSS styles for chat messages and animations
    • Added JavaScript functions for chat functionality

Dependencies

  • Uses existing aiohttp for HTTP streaming
  • Uses existing globals.TEXT_MODEL and globals.VISION_MODEL
  • Uses existing globals.LLAMA_URL for llama.cpp connection
  • No new dependencies required!

Future Enhancements (Ideas)

Potential improvements for future versions:

  • Save/load chat sessions
  • Export chat history to file
  • Multi-user chat history (separate sessions per user)
  • Temperature and max_tokens controls
  • Model selection dropdown (if multiple models available)
  • Token count display
  • Voice input support
  • Markdown rendering in responses
  • Code syntax highlighting
  • Copy message button
  • Regenerate response button

Troubleshooting

"No response received from LLM"

  • Check if llama.cpp server is running
  • Verify LLAMA_URL in globals is correct
  • Check bot logs for connection errors

"Failed to read image file"

  • Ensure the image is in a supported format (JPEG, PNG, GIF)
  • Check file size (large images may cause issues)
  • Try a different image

Streaming not working

  • Check browser console for JavaScript errors
  • Verify SSE is not blocked by proxy/firewall
  • Try refreshing the page

Model not responding

  • Check if correct model is loaded in llama.cpp
  • Verify model type matches what's configured
  • Check llama.cpp logs for errors

API Reference

POST /chat/stream

Request Body:

{
  "message": "string",          // Required: User's message
  "model_type": "text|vision",  // Required: Which model to use
  "use_system_prompt": boolean, // Required: Whether to add system prompt
  "image_data": "string|null"   // Optional: Base64 image for vision model
}

Response:

Content-Type: text/event-stream

data: {"content": "Hello"}
data: {"content": " there"}
data: {"content": "!"}
data: {"done": true}

Error Response:

data: {"error": "Error message here"}

Conclusion

The Chat Interface provides a powerful, user-friendly way to:

  • Test LLM responses interactively
  • Experiment with different prompting strategies
  • Analyze images with vision models
  • Chat with Miku's personality in real time
  • Debug and understand model behavior

All with a smooth, modern streaming interface that feels like ChatGPT! 🎉