# Japanese Language Mode Implementation

## Overview

Successfully implemented a **Japanese language mode** for Miku that allows toggling between English and Japanese text output using the **Llama 3.1 Swallow model**.

## Architecture

### Files Modified/Created

#### 1. **New Japanese Context Files** ✅

- `bot/miku_prompt_jp.txt` - Japanese version with language instruction appended
- `bot/miku_lore_jp.txt` - Japanese character lore (English content + note)
- `bot/miku_lyrics_jp.txt` - Japanese song lyrics (English content + note)

**Approach:** Rather than translating all prompts to Japanese, we:

- Keep English context to help the model understand Miku's personality
- **Append a critical instruction**: "Please respond entirely in Japanese (日本語) for all messages."
- Rely on Swallow's strong Japanese capabilities to understand English instructions and respond in Japanese

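To illustrate the approach (not the actual tooling; the `_jp.txt` files may simply be maintained by hand), a small script could derive the Japanese-mode prompt from the English one by appending the instruction:

```python
# Hypothetical helper: derive bot/miku_prompt_jp.txt from the English prompt.
LANGUAGE_INSTRUCTION = (
    "\n\nPlease respond entirely in Japanese (日本語) for all messages."
)


def build_japanese_prompt(src: str = "bot/miku_prompt.txt",
                          dst: str = "bot/miku_prompt_jp.txt") -> None:
    """Copy the English prompt and append the Japanese-mode instruction."""
    with open(src, "r", encoding="utf-8") as f:
        english_prompt = f.read()
    with open(dst, "w", encoding="utf-8") as f:
        f.write(english_prompt.rstrip() + LANGUAGE_INSTRUCTION + "\n")


if __name__ == "__main__":
    build_japanese_prompt()
```
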
#### 2. **globals.py** ✅

Added:

```python
JAPANESE_TEXT_MODEL = os.getenv("JAPANESE_TEXT_MODEL", "swallow")  # Llama 3.1 Swallow model
LANGUAGE_MODE = "english"  # Can be "english" or "japanese"
```

#### 3. **utils/context_manager.py** ✅

Added functions:

- `get_japanese_miku_prompt()` - Loads Japanese prompt
- `get_japanese_miku_lore()` - Loads Japanese lore
- `get_japanese_miku_lyrics()` - Loads Japanese lyrics

Updated existing functions:

- `get_complete_context()` - Now checks `globals.LANGUAGE_MODE` to return English or Japanese context
- `get_context_for_response_type()` - Now checks language mode for both English and Japanese paths

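A minimal sketch of this selection logic is shown below. The file names match the list in section 1, but the `_load_text()` helper and the way the pieces are joined are assumptions; the real functions may differ.

```python
# utils/context_manager.py - illustrative sketch, not the verbatim implementation
import globals  # the project's globals.py (section 2), which holds LANGUAGE_MODE


def _load_text(path: str) -> str:
    """Assumed helper: read a context file, returning '' if it is missing."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return ""


def get_japanese_miku_prompt() -> str:
    return _load_text("bot/miku_prompt_jp.txt")


def get_japanese_miku_lore() -> str:
    return _load_text("bot/miku_lore_jp.txt")


def get_japanese_miku_lyrics() -> str:
    return _load_text("bot/miku_lyrics_jp.txt")


def get_complete_context() -> str:
    """Combine prompt, lore and lyrics, honouring globals.LANGUAGE_MODE."""
    if globals.LANGUAGE_MODE == "japanese":
        parts = [
            get_japanese_miku_prompt(),
            get_japanese_miku_lore(),
            get_japanese_miku_lyrics(),
        ]
    else:
        parts = [
            _load_text("bot/miku_prompt.txt"),
            _load_text("bot/miku_lore.txt"),
            _load_text("bot/miku_lyrics.txt"),
        ]
    return "\n\n".join(p for p in parts if p)
```
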
#### 4. **utils/llm.py** ✅

Updated the model selection logic in `query_llama()`:

```python
# Model selection logic now:
if model is None:
    if evil_mode:
        model = globals.EVIL_TEXT_MODEL  # DarkIdol
    elif globals.LANGUAGE_MODE == "japanese":
        model = globals.JAPANESE_TEXT_MODEL  # Swallow
    else:
        model = globals.TEXT_MODEL  # Default (llama3.1)
```

#### 5. **api.py** ✅

Added three new API endpoints:

**GET `/language`** - Get current language status

```json
{
  "language_mode": "english",
  "available_languages": ["english", "japanese"],
  "current_model": "llama3.1"
}
```

**POST `/language/toggle`** - Toggle between English and Japanese

```json
{
  "status": "ok",
  "language_mode": "japanese",
  "model_now_using": "swallow",
  "message": "Miku is now speaking in JAPANESE!"
}
```

**POST `/language/set?language=japanese`** - Set specific language

```json
{
  "status": "ok",
  "language_mode": "japanese",
  "model_now_using": "swallow",
  "message": "Miku is now speaking in JAPANESE!"
}
```

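The endpoint bodies are not shown above. The sketch below illustrates one way they could be wired up, assuming `api.py` is a FastAPI app and that the language-to-model mapping mirrors the selection logic in `utils/llm.py`; the framework choice and helper names are assumptions, not the actual implementation.

```python
# Illustrative FastAPI sketch of the three language endpoints - the real
# api.py may differ in framework, naming and response details.
from fastapi import FastAPI, HTTPException

import globals  # the project's globals.py holding LANGUAGE_MODE and model names

app = FastAPI()


def _current_model() -> str:
    """Model that text requests would use right now (ignoring evil mode)."""
    if globals.LANGUAGE_MODE == "japanese":
        return globals.JAPANESE_TEXT_MODEL
    return globals.TEXT_MODEL


@app.get("/language")
async def get_language():
    return {
        "language_mode": globals.LANGUAGE_MODE,
        "available_languages": ["english", "japanese"],
        "current_model": _current_model(),
    }


@app.post("/language/toggle")
async def toggle_language():
    globals.LANGUAGE_MODE = "japanese" if globals.LANGUAGE_MODE == "english" else "english"
    return {
        "status": "ok",
        "language_mode": globals.LANGUAGE_MODE,
        "model_now_using": _current_model(),
        "message": f"Miku is now speaking in {globals.LANGUAGE_MODE.upper()}!",
    }


@app.post("/language/set")
async def set_language(language: str):
    if language not in ("english", "japanese"):
        raise HTTPException(status_code=400, detail="language must be 'english' or 'japanese'")
    globals.LANGUAGE_MODE = language
    return {
        "status": "ok",
        "language_mode": globals.LANGUAGE_MODE,
        "model_now_using": _current_model(),
        "message": f"Miku is now speaking in {language.upper()}!",
    }
```
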
## How It Works

### Flow Diagram

```
User Request
    ↓
query_llama() called
    ↓
Check LANGUAGE_MODE global
    ↓
If Japanese:
  - Load miku_prompt_jp.txt (with "respond in Japanese" instruction)
  - Use Swallow model
  - Model receives English context + Japanese instruction
    ↓
If English:
  - Load miku_prompt.txt (normal English prompts)
  - Use default TEXT_MODEL
    ↓
Generate response in appropriate language
```

## Design Decisions

### 1. **No Full Translation Needed** ✅

Instead of translating all context files to Japanese, we:

- Keep English prompts/lore (helps the model understand Miku's core personality)
- Add a **language instruction** at the end of the prompt
- Rely on Swallow's ability to understand English instructions and respond in Japanese

**Benefits:**

- Minimal effort (no translation maintenance)
- Model still understands Miku's complete personality
- Easy to expand to other languages later

### 2. **Model Switching** ✅

The Swallow model is automatically selected when Japanese mode is active:

- English mode: Uses whatever TEXT_MODEL is configured (default: llama3.1)
- Japanese mode: Automatically switches to Swallow
- Evil mode: Always uses DarkIdol (evil mode takes priority)

### 3. **Context Inheritance** ✅

Japanese context files include metadata noting they're for Japanese mode:

```
**NOTE FOR JAPANESE MODE: This context is provided in English to help the language model understand Miku's character. Respond entirely in Japanese (日本語).**
```

## Testing

### Quick Test

1. Check current language:

   ```bash
   curl http://localhost:8000/language
   ```

2. Toggle to Japanese:

   ```bash
   curl -X POST http://localhost:8000/language/toggle
   ```

3. Send a message to Miku - she should respond in Japanese!

4. Toggle back to English:

   ```bash
   curl -X POST http://localhost:8000/language/toggle
   ```

### Full Workflow Test

1. Start with English mode (default)
2. Send message → Miku responds in English
3. Toggle to Japanese mode
4. Send message → Miku responds in Japanese using Swallow
5. Toggle back to English
6. Send message → Miku responds in English again

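The language-toggling half of this workflow can be scripted. Below is a hedged sketch using the third-party `requests` package; the message steps (2, 4 and 6) still happen manually in Discord, and the base URL is assumed to match the Quick Test above.

```python
# Hypothetical helper for the workflow above: flips the language via the API
# and reports which model will serve the next message.
import requests

BASE_URL = "http://localhost:8000"


def show_status() -> None:
    status = requests.get(f"{BASE_URL}/language").json()
    print(f"mode={status['language_mode']}  model={status['current_model']}")


def toggle() -> None:
    result = requests.post(f"{BASE_URL}/language/toggle").json()
    print(result["message"], f"(model: {result['model_now_using']})")


if __name__ == "__main__":
    show_status()  # step 1: confirm English mode is active
    toggle()       # step 3: switch to Japanese (send a Discord message to verify step 4)
    toggle()       # step 5: switch back to English
```
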
## Compatibility

- ✅ Works with existing mood system
- ✅ Works with evil mode (evil mode takes priority)
- ✅ Works with bipolar mode
- ✅ Works with conversation history
- ✅ Works with server-specific configurations
- ✅ Works with vision model (vision stays on NVIDIA, text can use Swallow)

## Future Enhancements

1. **Per-Server Language Settings** - Store language mode in `servers_config.json`
2. **Per-Channel Language** - Different channels could have different languages
3. **Language-Specific Moods** - Japanese moods with different descriptions
4. **Auto-Detection** - Detect user's language and auto-switch modes
5. **Translation Variants** - Create actual Japanese prompt files with proper translations

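For enhancement 1, a per-server lookup might read the stored mode and fall back to the global setting. The sketch below is purely hypothetical: the `language_mode` key in `servers_config.json` and the helper name do not exist yet.

```python
# Hypothetical per-server lookup for Future Enhancement 1 - not implemented.
import json

import globals  # the project's globals.py with the global LANGUAGE_MODE default


def get_language_for_server(server_id: str,
                            config_path: str = "servers_config.json") -> str:
    """Return the server's language_mode, falling back to the global setting."""
    try:
        with open(config_path, "r", encoding="utf-8") as f:
            servers = json.load(f)
    except FileNotFoundError:
        return globals.LANGUAGE_MODE
    return servers.get(server_id, {}).get("language_mode", globals.LANGUAGE_MODE)
```
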
## Notes

- The Swallow model must be available in llama-swap as a model named "swallow"
- The model will load/unload automatically via llama-swap
- Conversation history is language-agnostic - it stores both English and Japanese messages
- Evil mode takes priority - if both evil mode and Japanese mode are enabled, evil mode's model selection wins (though you could enhance this if needed)