# Voice Chat Context System

## Implementation Complete ✅

Added comprehensive voice chat context to give Miku awareness of the conversation environment.

---

## Features

### 1. Voice-Aware System Prompt
Miku now knows she's in a voice chat and adjusts her behavior:
- ✅ Aware she's speaking via TTS
- ✅ Knows who she's talking to (user names included)
- ✅ Understands responses will be spoken aloud
- ✅ Instructed to keep responses short (1-3 sentences)
- ✅ **CRITICAL: Instructed to only use English** (TTS can't handle Japanese well)

### 2. Conversation History (Last 8 Exchanges)
- Stores last 16 messages (8 user + 8 assistant)
- Maintains context across multiple voice interactions
- Automatically trimmed to keep memory manageable
- Each message includes username for multi-user context

### 3. Personality Integration
- Loads `miku_lore.txt` - Her background, personality, likes/dislikes
- Loads `miku_prompt.txt` - Core personality instructions
- Combines with voice-specific instructions
- Maintains character consistency

### 4. Reduced Log Spam
- Set the voice_recv logger to CRITICAL level
- Suppresses routine CryptoError and RTCP packet log messages
- Only CRITICAL-level messages from this logger are shown, so its warnings and errors are hidden too
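
A minimal sketch of the logger change, using only the standard `logging` module. The logger name `discord.ext.voice_recv` and the helper `quiet_voice_recv` are assumptions for illustration; use whichever logger name appears in front of the CryptoError lines in your own log output.

```python
import logging

# Assumed logger name; adjust it to match the logger that emits the spam.
VOICE_RECV_LOGGER = "discord.ext.voice_recv"


def quiet_voice_recv(name: str = VOICE_RECV_LOGGER) -> logging.Logger:
    """Raise the logger's threshold so only CRITICAL records pass."""
    voice_logger = logging.getLogger(name)
    voice_logger.setLevel(logging.CRITICAL)
    return voice_logger


quiet_voice_recv()
```

Child loggers that have no level of their own inherit this threshold, so setting it once on the parent name is usually enough.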

---

## System Prompt Structure

```
[miku_prompt.txt content]

[miku_lore.txt content]

VOICE CHAT CONTEXT:
- You are currently in a voice channel speaking with {user.name} and others
- Your responses will be spoken aloud via text-to-speech
- Keep responses SHORT and CONVERSATIONAL (1-3 sentences max)
- Speak naturally as if having a real-time voice conversation
- IMPORTANT: Only respond in ENGLISH! The TTS system cannot handle Japanese or other languages well.
- Be expressive and use casual language, but stay in character as Miku

Remember: This is a live voice conversation, so be concise and engaging!
```
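
The structure above could be assembled roughly as follows. This is a sketch, not the code in `voice_manager.py`: the names `build_voice_system_prompt` and `load_text` are hypothetical, and `user_name` stands in for the `{user.name}` placeholder.

```python
from pathlib import Path

# Voice-specific instructions, copied from the structure shown above.
VOICE_CONTEXT_TEMPLATE = """VOICE CHAT CONTEXT:
- You are currently in a voice channel speaking with {user_name} and others
- Your responses will be spoken aloud via text-to-speech
- Keep responses SHORT and CONVERSATIONAL (1-3 sentences max)
- Speak naturally as if having a real-time voice conversation
- IMPORTANT: Only respond in ENGLISH! The TTS system cannot handle Japanese or other languages well.
- Be expressive and use casual language, but stay in character as Miku

Remember: This is a live voice conversation, so be concise and engaging!"""


def load_text(path: Path) -> str:
    """Read a personality file as UTF-8 text (hypothetical helper)."""
    return path.read_text(encoding="utf-8")


def build_voice_system_prompt(prompt_text: str, lore_text: str, user_name: str) -> str:
    """Join the prompt, the lore, and the voice context with blank lines."""
    voice_context = VOICE_CONTEXT_TEMPLATE.format(user_name=user_name)
    return "\n\n".join([prompt_text.strip(), lore_text.strip(), voice_context])
```

Typical use would be `build_voice_system_prompt(load_text(Path("miku_prompt.txt")), load_text(Path("miku_lore.txt")), "koko210")`, with the paths adjusted to wherever the files live.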

---

## Conversation Flow

```
User speaks → STT transcribes → Add to history
                                      ↓
                              [System Prompt]
                              [Last 8 exchanges]
                              [Current user message]
                                      ↓
                                  LLM generates
                                      ↓
                              Add response to history
                                      ↓
                              Stream to TTS → Speak
```
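
One way to read the flow above in code, as a sketch under stated assumptions: the transcribed message is appended to the history first, so the request is the system prompt followed by the history, which then ends with the current user message. The function names `build_messages` and `record_response` are hypothetical, and trimming is left out here.

```python
from typing import Dict, List

Message = Dict[str, str]


def build_messages(system_prompt: str, history: List[Message],
                   username: str, text: str) -> List[Message]:
    """Add the transcribed message to history and return the LLM request messages."""
    # Prefix the username so the model can tell speakers apart.
    history.append({"role": "user", "content": f"{username}: {text}"})
    return [{"role": "system", "content": system_prompt}, *history]


def record_response(history: List[Message], reply: str) -> None:
    """Store the generated reply so the next turn can refer back to it."""
    history.append({"role": "assistant", "content": reply})
```

The returned list is a new object, so later appends to `history` do not change a request that has already been built.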

---

## Message History Format

```python
conversation_history = [
    {"role": "user", "content": "koko210: Hey Miku, how are you?"},
    {"role": "assistant", "content": "Hey koko210! I'm doing great, thanks for asking!"},
    {"role": "user", "content": "koko210: Can you sing something?"},
    {"role": "assistant", "content": "I'd love to! What song would you like to hear?"},
    # ... up to 16 messages total (8 exchanges)
]
```

---

## Configuration

### Conversation History Limit
**Current**: 16 messages (8 exchanges)

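To adjust, edit the trim check in `voice_manager.py`. The limit is counted in messages, so use twice the number of exchanges in both places: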
```python
# Keep only last 8 exchanges (16 messages = 8 user + 8 assistant)
if len(self.conversation_history) > 16:
    self.conversation_history = self.conversation_history[-16:]
```

**Recommendations**:
- **8 exchanges** (16 messages): Good balance (current setting)
- **12 exchanges** (24 messages): More context, slightly more tokens per request
- **4 exchanges** (8 messages): Minimal context, fewer tokens and faster responses
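
If the limit is changed often, the trim can be expressed in exchanges rather than messages. This is a sketch only; `MAX_EXCHANGES` and `trim_history` are hypothetical names, not part of `voice_manager.py`.

```python
from typing import Dict, List

MAX_EXCHANGES = 8  # hypothetical constant; one exchange = user + assistant message


def trim_history(history: List[Dict[str, str]],
                 max_exchanges: int = MAX_EXCHANGES) -> List[Dict[str, str]]:
    """Return the most recent `max_exchanges` exchanges of the history."""
    max_messages = max_exchanges * 2
    if len(history) <= max_messages:
        return history
    # Drop the oldest messages and keep the tail.
    return history[len(history) - max_messages:]
```

Inside the session this would be used as `self.conversation_history = trim_history(self.conversation_history)`.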

### Response Length
**Current**: max_tokens=200

To adjust:
```python
payload = {
    "max_tokens": 200  # Change this
}
```

---

## Language Enforcement

### Why English-Only?
The RVC TTS system is trained on English audio and struggles with:
- Japanese characters (even though Miku is Japanese!)
- Special characters
- Mixed language text
- Non-English phonetics

### Implementation
The system prompt explicitly tells Miku:
> **IMPORTANT: Only respond in ENGLISH! The TTS system cannot handle Japanese or other languages well.**

This is reinforced in every voice chat interaction.

---

## Testing

### Test 1: Basic Conversation
```
User: "Hey Miku!"
Miku: "Hi there! Great to hear from you!" (should be in English)
User: "How are you doing?"
Miku: "I'm doing wonderful! How about you?" (remembers previous exchange)
```

### Test 2: Context Retention
Have a multi-turn conversation and verify Miku remembers:
- Previous topics discussed
- User names
- Conversation flow

### Test 3: Response Length
Verify responses are:
- Short (1-3 sentences)
- Conversational
- Not truncated mid-sentence
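
These checks can be roughly automated while testing. The helper below is a heuristic sketch, not part of the bot: `check_voice_response` is a hypothetical name, and splitting on `.`, `!` and `?` will miscount abbreviations such as "e.g.".

```python
import re
from typing import Dict, Union


def check_voice_response(text: str, max_sentences: int = 3) -> Dict[str, Union[int, bool]]:
    """Report the sentence count and whether the reply ends on sentence punctuation."""
    stripped = text.strip()
    # Split wherever whitespace follows sentence-ending punctuation.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", stripped) if s]
    return {
        "sentence_count": len(sentences),
        "within_limit": 1 <= len(sentences) <= max_sentences,
        "ends_cleanly": stripped.endswith((".", "!", "?")),
    }
```

A reply with `ends_cleanly` set to `False` may have hit the `max_tokens` limit before finishing its last sentence.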

### Test 4: Language Enforcement
Try asking in Japanese or requesting a Japanese response:
- Miku should politely respond in English
- Should explain she needs to use English for voice chat

---

## Monitoring

### Check Conversation History
```python
# Add debug logging to voice_manager.py to see the history
logger.debug("Conversation history: %s", self.conversation_history)
```

### Check System Prompt
```bash
docker exec miku-bot cat /app/miku_prompt.txt
docker exec miku-bot cat /app/miku_lore.txt
```

### Monitor Responses
```bash
docker logs -f miku-bot | grep "Voice response complete"
```

---

## Files Modified

1. **bot/bot.py**
   - Changed voice_recv logger level from WARNING to CRITICAL
   - Suppresses CryptoError spam

2. **bot/utils/voice_manager.py**
   - Added `conversation_history` to `VoiceSession.__init__()`
   - Updated `_generate_voice_response()` to load lore files
   - Built comprehensive voice-aware system prompt
   - Implemented conversation history tracking (last 8 exchanges)
   - Added English-only instruction
   - Saves both user and assistant messages to history

---

## Benefits

✅ **Better Context**: Miku remembers previous exchanges  
✅ **Cleaner Logs**: No more CryptoError spam  
✅ **Natural Responses**: Knows she's in voice chat, responds appropriately  
✅ **Language Consistency**: Enforces English for TTS compatibility  
✅ **Personality Intact**: Still loads lore and personality files  
✅ **User Awareness**: Knows who she's talking to  

---

## Next Steps

1. **Test thoroughly** with multi-turn conversations
2. **Adjust history length** if needed (currently 8 exchanges)
3. **Fine-tune response length** based on TTS performance
4. **Add conversation reset** command if needed (e.g., `!miku reset`)
5. **Consider adding** conversation summaries for very long sessions

---

**Status**: ✅ **DEPLOYED AND READY FOR TESTING**

Miku now has full context awareness in voice chat with personality, conversation history, and language enforcement!
