# Chat Interface Feature Documentation

## Overview

A new **"Chat with LLM"** tab has been added to the Miku bot Web UI, allowing you to chat directly with the language models with full streaming support (similar to ChatGPT).

## Features

### 1. Model Selection

- **💬 Text Model (Fast)**: Chat with the text-based LLM for quick conversations
- **👁️ Vision Model (Images)**: Use the vision model to analyze and discuss images

### 2. System Prompt Options

- **✅ Use Miku Personality**: Attach the standard Miku personality system prompt
  - Text model: gets the full Miku character prompt (same as `query_llama`)
  - Vision model: gets a simplified Miku-themed image analysis prompt
- **❌ Raw LLM (No Prompt)**: Chat directly with the base LLM without any personality
  - Great for testing raw model responses
  - No character constraints

### 3. Real-time Streaming

- Responses stream in character by character, like ChatGPT
- Shows a typing indicator while waiting for a response
- Smooth, responsive interface

### 4. Vision Model Support

- Upload images when using the vision model
- Image preview before sending
- Analyze images with Miku's personality or raw vision capabilities

### 5. Chat Management

- Clear chat history button
- Timestamps on all messages
- Color-coded messages (user vs. assistant)
- Auto-scroll to the latest message
- Keyboard shortcut: **Ctrl+Enter** to send messages

## Technical Implementation

### Backend (api.py)

#### New Endpoint: `POST /chat/stream`

```python
# Accepts:
{
    "message": "Your chat message",
    "model_type": "text" | "vision",
    "use_system_prompt": true | false,
    "image_data": "base64_encoded_image"  # optional, for vision model
}

# Returns: Server-Sent Events (SSE) stream
data: {"content": "streamed text chunk"}
data: {"done": true}
data: {"error": "error message"}
```

**Key Features:**

- Uses Server-Sent Events (SSE) for streaming
- Supports both `TEXT_MODEL` and `VISION_MODEL` from globals
- Dynamically switches system prompts based on configuration
- Integrates with llama.cpp's streaming API
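
The SSE framing shown above is simple to produce: each event is one `data:` line carrying a JSON payload, followed by a blank line. A minimal sketch of that server-side framing (not the actual `api.py` code, which wraps a generator like this in FastAPI's `StreamingResponse` with `media_type="text/event-stream"`):

```python
import json
from typing import Iterator


def format_sse(payload: dict) -> str:
    """Serialize one payload as a Server-Sent Events message:
    a 'data:' line followed by a blank separator line."""
    return f"data: {json.dumps(payload)}\n\n"


def sse_stream(chunks: Iterator[str]) -> Iterator[str]:
    """Wrap raw text chunks from the model in the SSE protocol used by
    /chat/stream, ending with a done marker (or an error event)."""
    try:
        for chunk in chunks:
            yield format_sse({"content": chunk})
        yield format_sse({"done": True})
    except Exception as exc:
        yield format_sse({"error": str(exc)})
```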

### Frontend (index.html)

#### New Tab: "💬 Chat with LLM"

Located in the main navigation tabs (tab6).

**Components:**

1. **Configuration Panel**
   - Radio buttons for model selection
   - Radio buttons for system prompt toggle
   - Image upload section (shows/hides based on model)
   - Clear chat history button
2. **Chat Messages Container**
   - Scrollable message history
   - Animated message appearance
   - Typing indicator during streaming
   - Color-coded messages with timestamps
3. **Input Area**
   - Multi-line text input
   - Send button with loading state
   - Keyboard shortcuts

**JavaScript Functions:**

- `sendChatMessage()`: Handles message sending and streaming reception
- `toggleChatImageUpload()`: Shows/hides image upload for vision model
- `addChatMessage()`: Adds messages to chat display
- `showTypingIndicator()` / `hideTypingIndicator()`: Typing animation
- `clearChatHistory()`: Clears all messages
- `handleChatKeyPress()`: Keyboard shortcuts
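
The receiving side of `sendChatMessage()` is JavaScript in `index.html`, but the core logic is language-agnostic: split the stream on `data:` lines, append each `content` chunk, stop at `done`, and surface `error`. A Python sketch of that accumulation logic:

```python
import json


def accumulate_sse(transcript: str) -> str:
    """Replay a complete SSE transcript and return the assembled assistant
    message, mirroring what the frontend does chunk by chunk."""
    parts = []
    for line in transcript.splitlines():
        if not line.startswith("data: "):
            continue  # skip blank separator lines
        event = json.loads(line[len("data: "):])
        if event.get("error"):
            raise RuntimeError(event["error"])
        if event.get("done"):
            break
        parts.append(event.get("content", ""))
    return "".join(parts)
```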

## Usage Guide

### Basic Text Chat with Miku

1. Go to "💬 Chat with LLM" tab
2. Ensure "💬 Text Model" is selected
3. Ensure "✅ Use Miku Personality" is selected
4. Type your message and click "📤 Send" (or press Ctrl+Enter)
5. Watch as Miku's response streams in real-time!

### Raw LLM Testing

1. Select "💬 Text Model"
2. Select "❌ Raw LLM (No Prompt)"
3. Chat directly with the base language model without personality constraints

### Vision Model Chat

1. Select "👁️ Vision Model"
2. Click "Upload Image" and select an image
3. Type a message about the image (e.g., "What do you see in this image?")
4. Click "📤 Send"
5. The vision model will analyze the image and respond
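
When an image is uploaded, it travels in the request's `image_data` field as a base64 string. The encoding step (done by the browser on upload) and the server-side decoding are equivalent to:

```python
import base64


def encode_image(image_bytes: bytes) -> str:
    """Encode raw image bytes as the base64 string expected in the
    request's image_data field."""
    return base64.b64encode(image_bytes).decode("ascii")


def decode_image(image_data: str) -> bytes:
    """Reverse step, performed server-side before handing the image
    to the vision model."""
    return base64.b64decode(image_data)
```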

### Vision Model with Miku Personality

1. Select "👁️ Vision Model"
2. Keep "✅ Use Miku Personality" selected
3. Upload an image
4. Miku will analyze and comment on the image with her cheerful personality!

## System Prompts

### Text Model (with Miku personality)

Uses the same comprehensive system prompt as `query_llama()`:

- Full Miku character context
- Current mood integration
- Character consistency rules
- Natural conversation guidelines

### Vision Model (with Miku personality)

Simplified prompt optimized for image analysis:

```
You are Hatsune Miku analyzing an image. Describe what you see naturally
and enthusiastically as Miku would. Be detailed but conversational.
React to what you see with Miku's cheerful, playful personality.
```

### No System Prompt

Both models respond without personality constraints when this option is selected.
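
The prompt selection described above reduces to a small lookup on `(model_type, use_system_prompt)`. A sketch, with the prompt constants abbreviated and hypothetical (the full text-model prompt lives in the bot code alongside `query_llama()`):

```python
from typing import Optional

# Illustrative stand-ins for the real prompts defined in the bot code.
MIKU_TEXT_PROMPT = "You are Hatsune Miku, a cheerful virtual singer..."  # abbreviated
MIKU_VISION_PROMPT = (
    "You are Hatsune Miku analyzing an image. Describe what you see naturally "
    "and enthusiastically as Miku would."
)


def select_system_prompt(model_type: str, use_system_prompt: bool) -> Optional[str]:
    """Return the system prompt for a /chat/stream request,
    or None for raw-LLM mode."""
    if not use_system_prompt:
        return None
    return MIKU_VISION_PROMPT if model_type == "vision" else MIKU_TEXT_PROMPT
```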

## Streaming Technology

The interface uses **Server-Sent Events (SSE)** for real-time streaming:

- Backend relays chunked responses from llama.cpp
- Frontend receives and displays chunks as they arrive
- Smooth, ChatGPT-like experience
- Works with both text and vision models

## UI/UX Features

### Message Styling

- **User messages**: Green accent, right-aligned feel
- **Assistant messages**: Blue accent, left-aligned feel
- **Error messages**: Red accent with an error icon
- **Fade-in animation**: Smooth appearance for new messages

### Responsive Design

- Chat container scrolls automatically
- Image preview for vision model
- Loading states on buttons
- Typing indicators
- Custom scrollbar styling

### Keyboard Shortcuts

- **Ctrl+Enter**: Send message quickly
- **Tab**: Navigate between input fields

## Configuration Options

All settings are preserved during the chat session:

- Model type (text/vision)
- System prompt toggle (Miku/Raw)
- Uploaded image (for vision model)

Settings do NOT persist after a page refresh (fresh session each time).

## Error Handling

The interface handles various errors gracefully:

- Connection failures
- Model errors
- Invalid image files
- Empty messages
- Timeout issues

All errors are displayed in the chat with clear error messages.

## Performance Considerations

### Text Model

- Fast responses (typically 1-3 seconds)
- Streaming starts almost immediately
- Low latency

### Vision Model

- Slower due to image processing
- First token may take 3-10 seconds
- Streaming continues smoothly once started
- Image is sent as a base64 string (simple to transport, though roughly 33% larger than the raw bytes)

## Development Notes

### File Changes

1. **`bot/api.py`**
   - Added `from fastapi.responses import StreamingResponse`
   - Added `ChatMessage` Pydantic model
   - Added `POST /chat/stream` endpoint with SSE support
2. **`bot/static/index.html`**
   - Added tab6 button in navigation
   - Added complete chat interface HTML
   - Added CSS styles for chat messages and animations
   - Added JavaScript functions for chat functionality

### Dependencies

- Uses existing `aiohttp` for HTTP streaming
- Uses existing `globals.TEXT_MODEL` and `globals.VISION_MODEL`
- Uses existing `globals.LLAMA_URL` for the llama.cpp connection
- No new dependencies required!
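
For reference, a streaming llama.cpp request is an ordinary JSON POST with a stream flag. A hedged sketch of how the endpoint might assemble that payload (field names follow llama.cpp's `/completion` server API; the prompt template and `n_predict` cap are illustrative, not the bot's real values):

```python
from typing import Optional


def build_completion_payload(message: str, system_prompt: Optional[str]) -> dict:
    """Assemble the JSON body for a streaming llama.cpp /completion call.
    The prompt template below is illustrative; the bot's real template
    lives in api.py."""
    if system_prompt is None:
        prompt = message
    else:
        prompt = f"{system_prompt}\n\nUser: {message}\nAssistant:"
    return {
        "prompt": prompt,
        "stream": True,    # ask llama.cpp to stream tokens as they are generated
        "n_predict": 512,  # illustrative generation cap
    }
```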

## Future Enhancements (Ideas)

Potential improvements for future versions:

- [ ] Save/load chat sessions
- [ ] Export chat history to file
- [ ] Multi-user chat history (separate sessions per user)
- [ ] Temperature and max_tokens controls
- [ ] Model selection dropdown (if multiple models are available)
- [ ] Token count display
- [ ] Voice input support
- [ ] Markdown rendering in responses
- [ ] Code syntax highlighting
- [ ] Copy message button
- [ ] Regenerate response button

## Troubleshooting

### "No response received from LLM"

- Check that the llama.cpp server is running
- Verify that `LLAMA_URL` in globals is correct
- Check the bot logs for connection errors

### "Failed to read image file"

- Ensure the image is a valid format (JPEG, PNG, GIF)
- Check the file size (large images may cause issues)
- Try a different image

### Streaming not working

- Check the browser console for JavaScript errors
- Verify that SSE is not blocked by a proxy/firewall
- Try refreshing the page

### Model not responding

- Check that the correct model is loaded in llama.cpp
- Verify the model type matches what's configured
- Check the llama.cpp logs for errors

## API Reference

### POST /chat/stream

**Request Body:**

```json
{
    "message": "string",           // Required: user's message
    "model_type": "text|vision",   // Required: which model to use
    "use_system_prompt": boolean,  // Required: whether to add a system prompt
    "image_data": "string|null"    // Optional: base64 image for the vision model
}
```

**Response:**

```
Content-Type: text/event-stream

data: {"content": "Hello"}
data: {"content": " there"}
data: {"content": "!"}
data: {"done": true}
```

**Error Response:**

```
data: {"error": "Error message here"}
```
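
Server-side, the fields above are validated by the `ChatMessage` Pydantic model in `bot/api.py`. An equivalent validation sketch using only the standard library (the class name and error messages here are illustrative, not the actual model):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChatRequest:
    """Stand-in for the ChatMessage Pydantic model, enforcing the
    /chat/stream request schema described above."""
    message: str
    model_type: str
    use_system_prompt: bool
    image_data: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.message.strip():
            raise ValueError("message must be non-empty")
        if self.model_type not in ("text", "vision"):
            raise ValueError("model_type must be 'text' or 'vision'")
        if self.image_data is not None and self.model_type != "vision":
            raise ValueError("image_data is only valid for the vision model")
```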

## Conclusion

The Chat Interface provides a powerful, user-friendly way to:

- Test LLM responses interactively
- Experiment with different prompting strategies
- Analyze images with vision models
- Chat with Miku's personality in real-time
- Debug and understand model behavior

All with a smooth, modern streaming interface that feels like ChatGPT! 🎉