Files
miku-discord/ERROR_HANDLING_SYSTEM.md

132 lines
4.7 KiB
Markdown

# Error Handling System
## Overview
The Miku bot now includes a comprehensive error handling system that catches errors from the llama-swap containers (both NVIDIA and AMD) and provides user-friendly responses while notifying the bot administrator.
## Features
### 1. Error Detection
The system automatically detects various types of errors including:
- HTTP error codes (502, 500, 503, etc.)
- Connection errors (refused, timeout, failed)
- LLM server errors
- Timeout errors
- Generic error messages
### 2. User-Friendly Responses
When an error is detected, instead of showing technical error messages like "Error 502" or "Sorry, there was an error", Miku will respond with:
> **"Someone tell Koko-nii there is a problem with my AI."**
This keeps Miku in character and provides a better user experience.
### 3. Administrator Notifications
When an error occurs, a webhook notification is automatically sent to Discord with:
- **Error Message**: The full error text from the container
- **Context Information**:
- User who triggered the error
- Channel/Server where the error occurred
- User's prompt that caused the error
- Exception type (if applicable)
- Full traceback (if applicable)
- **Mention**: Automatically mentions Koko-nii for immediate attention
### 4. Conversation History Protection
Error messages are NOT saved to conversation history, preventing errors from polluting the context for future interactions.
## Implementation Details
### Files Modified
1. **`bot/utils/error_handler.py`** (NEW)
- Core error detection and webhook notification logic
- `is_error_response()`: Detects error messages using regex patterns
- `handle_llm_error()`: Handles exceptions from the LLM
- `handle_response_error()`: Handles error responses from the LLM
- `send_error_webhook()`: Sends formatted error notifications
2. **`bot/utils/llm.py`**
- Integrated error handling into `query_llama()` function
- Catches all exceptions and HTTP errors
- Filters responses to detect error messages
- Prevents error messages from being saved to history
### Webhook URL
```
https://discord.com/api/webhooks/1462216811293708522/4kdGenpxZFsP0z3VBgebYENODKmcRrmEzoIwCN81jCirnAxuU2YvxGgwGCNBb6TInA9Z
```
## Error Detection Patterns
The system detects errors using the following patterns:
- `Error: XXX` or `Error XXX` (with HTTP status codes)
- `XXX Error` format
- "Sorry, there was an error"
- "Sorry, the response took too long"
- Connection-related errors (refused, timeout, failed)
- Server errors (service unavailable, internal server error, bad gateway)
- HTTP status codes >= 400
## Coverage
The error handler is automatically applied to:
- ✅ Direct messages to Miku
- ✅ Server messages mentioning Miku
- ✅ Autonomous messages (general, engaging users, tweets)
- ✅ Conversation joining
- ✅ All responses using `query_llama()`
- ✅ Both NVIDIA and AMD GPU containers
## Testing
A test suite is included in `bot/test_error_handler.py` that validates the error detection logic with 26 test cases covering:
- Various error message formats
- Normal responses (should NOT be detected as errors)
- HTTP status codes
- Edge cases
Run tests with:
```bash
cd /home/koko210Serve/docker/miku-discord/bot
python test_error_handler.py
```
## Example Scenarios
### Scenario 1: llama-swap Container Down
**User**: "Hi Miku!"
**Without Error Handler**: "Error: 502"
**With Error Handler**: "Someone tell Koko-nii there is a problem with my AI."
**Webhook Notification**: Sent with full error details
### Scenario 2: Connection Timeout
**User**: "Tell me a story"
**Without Error Handler**: "Sorry, the response took too long. Please try again."
**With Error Handler**: "Someone tell Koko-nii there is a problem with my AI."
**Webhook Notification**: Sent with timeout exception details
### Scenario 3: LLM Server Error
**User**: "How are you?"
**Without Error Handler**: "Error: Internal server error"
**With Error Handler**: "Someone tell Koko-nii there is a problem with my AI."
**Webhook Notification**: Sent with HTTP 500 error details
## Benefits
1. **Better User Experience**: Users see a friendly, in-character message instead of technical errors
2. **Immediate Notifications**: Administrator is notified immediately via Discord webhook
3. **Detailed Context**: Full error information is provided for debugging
4. **Clean History**: Errors don't pollute conversation history
5. **Consistent Handling**: All error types are handled uniformly
6. **Container Agnostic**: Works with both NVIDIA and AMD containers
## Future Enhancements
Potential improvements:
- Add retry logic for transient errors
- Track error frequency to detect systemic issues
- Automatic container restart if errors persist
- Error categorization (transient vs. critical)
- Rate limiting on webhook notifications to prevent spam