# Chat Interface Feature Documentation

## Overview

A new **"Chat with LLM"** tab has been added to the Miku bot Web UI, allowing you to chat directly with the language models with full streaming support (similar to ChatGPT).

## Features

### 1. Model Selection

- **💬 Text Model (Fast)**: Chat with the text-based LLM for quick conversations
- **👁️ Vision Model (Images)**: Use the vision model to analyze and discuss images

### 2. System Prompt Options

- **✅ Use Miku Personality**: Attach the standard Miku personality system prompt
  - Text model: gets the full Miku character prompt (same as `query_llama`)
  - Vision model: gets a simplified Miku-themed image analysis prompt
- **❌ Raw LLM (No Prompt)**: Chat directly with the base LLM without any personality
  - Great for testing raw model responses
  - No character constraints

### 3. Real-time Streaming

- Responses stream in character by character, like ChatGPT
- Shows a typing indicator while waiting for a response
- Smooth, responsive interface

### 4. Vision Model Support

- Upload images when using the vision model
- Image preview before sending
- Analyze images with Miku's personality or raw vision capabilities

### 5. Chat Management

- Clear chat history button
- Timestamps on all messages
- Color-coded messages (user vs. assistant)
- Auto-scroll to the latest message
- Keyboard shortcut: **Ctrl+Enter** to send messages

## Technical Implementation

### Backend (api.py)

#### New Endpoint: `POST /chat/stream`

```python
# Accepts:
{
    "message": "Your chat message",
    "model_type": "text" | "vision",
    "use_system_prompt": true | false,
    "image_data": "base64_encoded_image"  # optional, for vision model
}

# Returns: Server-Sent Events (SSE) stream
data: {"content": "streamed text chunk"}
data: {"done": true}
data: {"error": "error message"}
```

**Key Features:**

- Uses Server-Sent Events (SSE) for streaming
- Supports both `TEXT_MODEL` and `VISION_MODEL` from globals
- Dynamically switches system prompts based on configuration
- Integrates with llama.cpp's streaming API
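
The SSE framing shown above is simple to produce: each event is one `data:` line carrying a JSON payload, followed by a blank line. A minimal sketch of that server-side framing (not the actual `api.py` code, which wraps a generator like this in FastAPI's `StreamingResponse` with `media_type="text/event-stream"`):

```python
import json
from typing import Iterator


def format_sse(payload: dict) -> str:
    """Serialize one payload as a Server-Sent Events message:
    a 'data:' line followed by a blank separator line."""
    return f"data: {json.dumps(payload)}\n\n"


def sse_stream(chunks: Iterator[str]) -> Iterator[str]:
    """Wrap raw text chunks from the model in the SSE protocol used by
    /chat/stream, ending with a done marker (or an error event)."""
    try:
        for chunk in chunks:
            yield format_sse({"content": chunk})
        yield format_sse({"done": True})
    except Exception as exc:
        yield format_sse({"error": str(exc)})
```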

### Frontend (index.html)

#### New Tab: "💬 Chat with LLM"

Located in the main navigation tabs (tab6).

**Components:**

1. **Configuration Panel**
   - Radio buttons for model selection
   - Radio buttons for system prompt toggle
   - Image upload section (shows/hides based on model)
   - Clear chat history button
2. **Chat Messages Container**
   - Scrollable message history
   - Animated message appearance
   - Typing indicator during streaming
   - Color-coded messages with timestamps
3. **Input Area**
   - Multi-line text input
   - Send button with loading state
   - Keyboard shortcuts

**JavaScript Functions:**

- `sendChatMessage()`: Handles message sending and streaming reception
- `toggleChatImageUpload()`: Shows/hides image upload for vision model
- `addChatMessage()`: Adds messages to chat display
- `showTypingIndicator()` / `hideTypingIndicator()`: Typing animation
- `clearChatHistory()`: Clears all messages
- `handleChatKeyPress()`: Keyboard shortcuts
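
The receiving side of `sendChatMessage()` is JavaScript in `index.html`, but the core logic is language-agnostic: split the stream on `data:` lines, append each `content` chunk, stop at `done`, and surface `error`. A Python sketch of that accumulation logic:

```python
import json


def accumulate_sse(transcript: str) -> str:
    """Replay a complete SSE transcript and return the assembled assistant
    message, mirroring what the frontend does chunk by chunk."""
    parts = []
    for line in transcript.splitlines():
        if not line.startswith("data: "):
            continue  # skip blank separator lines
        event = json.loads(line[len("data: "):])
        if event.get("error"):
            raise RuntimeError(event["error"])
        if event.get("done"):
            break
        parts.append(event.get("content", ""))
    return "".join(parts)
```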

## Usage Guide

### Basic Text Chat with Miku

1. Go to "💬 Chat with LLM" tab
2. Ensure "💬 Text Model" is selected
3. Ensure "✅ Use Miku Personality" is selected
4. Type your message and click "📤 Send" (or press Ctrl+Enter)
5. Watch as Miku's response streams in real-time!

### Raw LLM Testing

1. Select "💬 Text Model"
2. Select "❌ Raw LLM (No Prompt)"
3. Chat directly with the base language model without personality constraints

### Vision Model Chat

1. Select "👁️ Vision Model"
2. Click "Upload Image" and select an image
3. Type a message about the image (e.g., "What do you see in this image?")
4. Click "📤 Send"
5. The vision model will analyze the image and respond
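
When an image is uploaded, it travels in the request's `image_data` field as a base64 string. The encoding step (done by the browser on upload) and the server-side decoding are equivalent to:

```python
import base64


def encode_image(image_bytes: bytes) -> str:
    """Encode raw image bytes as the base64 string expected in the
    request's image_data field."""
    return base64.b64encode(image_bytes).decode("ascii")


def decode_image(image_data: str) -> bytes:
    """Reverse step, performed server-side before handing the image
    to the vision model."""
    return base64.b64decode(image_data)
```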

### Vision Model with Miku Personality

1. Select "👁️ Vision Model"
2. Keep "✅ Use Miku Personality" selected
3. Upload an image
4. Miku will analyze and comment on the image with her cheerful personality!

## System Prompts

### Text Model (with Miku personality)

Uses the same comprehensive system prompt as `query_llama()`:

- Full Miku character context
- Current mood integration
- Character consistency rules
- Natural conversation guidelines

### Vision Model (with Miku personality)

Simplified prompt optimized for image analysis:

```
You are Hatsune Miku analyzing an image. Describe what you see naturally
and enthusiastically as Miku would. Be detailed but conversational.
React to what you see with Miku's cheerful, playful personality.
```

### No System Prompt

Both models respond without personality constraints when this option is selected.
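
The prompt selection described above reduces to a small lookup on `(model_type, use_system_prompt)`. A sketch, with the prompt constants abbreviated and hypothetical (the full text-model prompt lives in the bot code alongside `query_llama()`):

```python
from typing import Optional

# Illustrative stand-ins for the real prompts defined in the bot code.
MIKU_TEXT_PROMPT = "You are Hatsune Miku, a cheerful virtual singer..."  # abbreviated
MIKU_VISION_PROMPT = (
    "You are Hatsune Miku analyzing an image. Describe what you see naturally "
    "and enthusiastically as Miku would."
)


def select_system_prompt(model_type: str, use_system_prompt: bool) -> Optional[str]:
    """Return the system prompt for a /chat/stream request,
    or None for raw-LLM mode."""
    if not use_system_prompt:
        return None
    return MIKU_VISION_PROMPT if model_type == "vision" else MIKU_TEXT_PROMPT
```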

## Streaming Technology

The interface uses **Server-Sent Events (SSE)** for real-time streaming:

- Backend relays chunked responses from llama.cpp
- Frontend receives and displays chunks as they arrive
- Smooth, ChatGPT-like experience
- Works with both text and vision models

## UI/UX Features

### Message Styling

- **User messages**: Green accent, right-aligned feel
- **Assistant messages**: Blue accent, left-aligned feel
- **Error messages**: Red accent with an error icon
- **Fade-in animation**: Smooth appearance for new messages

### Responsive Design

- Chat container scrolls automatically
- Image preview for vision model
- Loading states on buttons
- Typing indicators
- Custom scrollbar styling

### Keyboard Shortcuts

- **Ctrl+Enter**: Send message quickly
- **Tab**: Navigate between input fields

## Configuration Options

All settings are preserved during the chat session:

- Model type (text/vision)
- System prompt toggle (Miku/Raw)
- Uploaded image (for vision model)

Settings do NOT persist after a page refresh (fresh session each time).

## Error Handling

The interface handles various errors gracefully:

- Connection failures
- Model errors
- Invalid image files
- Empty messages
- Timeout issues

All errors are displayed in the chat with clear error messages.

## Performance Considerations

### Text Model

- Fast responses (typically 1-3 seconds)
- Streaming starts almost immediately
- Low latency

### Vision Model

- Slower due to image processing
- First token may take 3-10 seconds
- Streaming continues smoothly once started
- Image is sent as a base64 string (simple to transport, though roughly 33% larger than the raw bytes)

## Development Notes

### File Changes

1. **`bot/api.py`**
   - Added `from fastapi.responses import StreamingResponse`
   - Added `ChatMessage` Pydantic model
   - Added `POST /chat/stream` endpoint with SSE support
2. **`bot/static/index.html`**
   - Added tab6 button in navigation
   - Added complete chat interface HTML
   - Added CSS styles for chat messages and animations
   - Added JavaScript functions for chat functionality

### Dependencies

- Uses existing `aiohttp` for HTTP streaming
- Uses existing `globals.TEXT_MODEL` and `globals.VISION_MODEL`
- Uses existing `globals.LLAMA_URL` for the llama.cpp connection
- No new dependencies required!
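
For reference, a streaming llama.cpp request is an ordinary JSON POST with a stream flag. A hedged sketch of how the endpoint might assemble that payload (field names follow llama.cpp's `/completion` server API; the prompt template and `n_predict` cap are illustrative, not the bot's real values):

```python
from typing import Optional


def build_completion_payload(message: str, system_prompt: Optional[str]) -> dict:
    """Assemble the JSON body for a streaming llama.cpp /completion call.
    The prompt template below is illustrative; the bot's real template
    lives in api.py."""
    if system_prompt is None:
        prompt = message
    else:
        prompt = f"{system_prompt}\n\nUser: {message}\nAssistant:"
    return {
        "prompt": prompt,
        "stream": True,    # ask llama.cpp to stream tokens as they are generated
        "n_predict": 512,  # illustrative generation cap
    }
```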

## Future Enhancements (Ideas)

Potential improvements for future versions:

- [ ] Save/load chat sessions
- [ ] Export chat history to file
- [ ] Multi-user chat history (separate sessions per user)
- [ ] Temperature and max_tokens controls
- [ ] Model selection dropdown (if multiple models are available)
- [ ] Token count display
- [ ] Voice input support
- [ ] Markdown rendering in responses
- [ ] Code syntax highlighting
- [ ] Copy message button
- [ ] Regenerate response button

## Troubleshooting

### "No response received from LLM"

- Check that the llama.cpp server is running
- Verify that `LLAMA_URL` in globals is correct
- Check the bot logs for connection errors

### "Failed to read image file"

- Ensure the image is a valid format (JPEG, PNG, GIF)
- Check the file size (large images may cause issues)
- Try a different image

### Streaming not working

- Check the browser console for JavaScript errors
- Verify that SSE is not blocked by a proxy/firewall
- Try refreshing the page

### Model not responding

- Check that the correct model is loaded in llama.cpp
- Verify the model type matches what's configured
- Check the llama.cpp logs for errors

## API Reference

### POST /chat/stream

**Request Body:**

```json
{
    "message": "string",           // Required: user's message
    "model_type": "text|vision",   // Required: which model to use
    "use_system_prompt": boolean,  // Required: whether to add a system prompt
    "image_data": "string|null"    // Optional: base64 image for the vision model
}
```

**Response:**

```
Content-Type: text/event-stream

data: {"content": "Hello"}
data: {"content": " there"}
data: {"content": "!"}
data: {"done": true}
```

**Error Response:**

```
data: {"error": "Error message here"}
```
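
Server-side, the fields above are validated by the `ChatMessage` Pydantic model in `bot/api.py`. An equivalent validation sketch using only the standard library (the class name and error messages here are illustrative, not the actual model):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChatRequest:
    """Stand-in for the ChatMessage Pydantic model, enforcing the
    /chat/stream request schema described above."""
    message: str
    model_type: str
    use_system_prompt: bool
    image_data: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.message.strip():
            raise ValueError("message must be non-empty")
        if self.model_type not in ("text", "vision"):
            raise ValueError("model_type must be 'text' or 'vision'")
        if self.image_data is not None and self.model_type != "vision":
            raise ValueError("image_data is only valid for the vision model")
```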

## Conclusion

The Chat Interface provides a powerful, user-friendly way to:

- Test LLM responses interactively
- Experiment with different prompting strategies
- Analyze images with vision models
- Chat with Miku's personality in real-time
- Debug and understand model behavior

All with a smooth, modern streaming interface that feels like ChatGPT! 🎉