Add dual GPU support with web UI selector

Features:
- Built custom ROCm container for AMD RX 6800 GPU
- Added GPU selection toggle in web UI (NVIDIA/AMD)
- Unified model names across both GPUs for seamless switching
- Vision model always uses NVIDIA GPU (optimal performance)
- Text models (llama3.1, darkidol) can use either GPU
- Added /gpu-status and /gpu-select API endpoints
- Implemented GPU state persistence in memory/gpu_state.json

Technical details:
- Multi-stage Dockerfile.llamaswap-rocm with ROCm 6.2.4
- llama.cpp compiled with GGML_HIP=ON for gfx1030 (RX 6800)
- Proper GPU permissions without root (groups 187/989)
- AMD container on port 8091, NVIDIA on port 8090
- Updated bot/utils/llm.py with get_current_gpu_url() and get_vision_gpu_url()
- Modified bot/utils/image_handling.py to always use NVIDIA for vision
- Enhanced web UI with GPU selector button (blue=NVIDIA, red=AMD)

Files modified:
- docker-compose.yml (added llama-swap-amd service)
- bot/globals.py (added LLAMA_AMD_URL)
- bot/api.py (added GPU selection endpoints and helper function)
- bot/utils/llm.py (GPU routing for text models)
- bot/utils/image_handling.py (GPU routing for vision models)
- bot/static/index.html (GPU selector UI)
- llama-swap-rocm-config.yaml (unified model names)

New files:
- Dockerfile.llamaswap-rocm
- bot/memory/gpu_state.json
- bot/utils/gpu_router.py (load balancing utility)
- setup-dual-gpu.sh (setup verification script)
- DUAL_GPU_*.md (documentation files)
commit 1fc3d74a5b (parent ed5994ec78), 2026-01-09 00:03:59 +02:00
21 changed files with 2836 additions and 13 deletions

DUAL_GPU_BUILD_SUMMARY.md (new file)

@@ -0,0 +1,184 @@
# Dual GPU Setup Summary
## What We Built
A secondary llama-swap container optimized for your **AMD RX 6800** GPU using ROCm.
### Architecture
```
Primary GPU (NVIDIA GTX 1660)        Secondary GPU (AMD RX 6800)
             ↓                                    ↓
      llama-swap (CUDA)                   llama-swap-amd (ROCm)
         Port: 8090                           Port: 8091
             ↓                                    ↓
       NVIDIA models                          AMD models
       - llama3.1                             - llama3.1-amd
       - darkidol                             - darkidol-amd
       - vision (MiniCPM)                     - moondream-amd
```
## Files Created
1. **Dockerfile.llamaswap-rocm** - Custom multi-stage build:
- Stage 1: Builds llama.cpp with ROCm from source
- Stage 2: Builds llama-swap from source
- Stage 3: Runtime image with both binaries
2. **llama-swap-rocm-config.yaml** - Model configuration for AMD GPU
3. **docker-compose.yml** - Updated with `llama-swap-amd` service
4. **bot/utils/gpu_router.py** - Load balancing utility
5. **bot/globals.py** - Updated with `LLAMA_AMD_URL`
6. **setup-dual-gpu.sh** - Setup verification script
7. **DUAL_GPU_SETUP.md** - Comprehensive documentation
8. **DUAL_GPU_QUICK_REF.md** - Quick reference guide
## Why Custom Build?
- llama.cpp doesn't publish ROCm Docker images (yet)
- llama-swap doesn't provide ROCm variants
- Building from source ensures latest ROCm compatibility
- Full control over compilation flags and optimization
## Build Time
The initial build takes 15-30 minutes depending on your system:
- llama.cpp compilation: ~10-20 minutes
- llama-swap compilation: ~1-2 minutes
- Image layering: ~2-5 minutes
Subsequent builds are much faster due to Docker layer caching.
## Next Steps
Once the build completes:
```bash
# 1. Start both GPU services
docker compose up -d llama-swap llama-swap-amd
# 2. Verify both are running
docker compose ps
# 3. Test NVIDIA GPU
curl http://localhost:8090/health
# 4. Test AMD GPU
curl http://localhost:8091/health
# 5. Monitor logs
docker compose logs -f llama-swap-amd
# 6. Test model loading on AMD
curl -X POST http://localhost:8091/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "llama3.1-amd",
"messages": [{"role": "user", "content": "Hello!"}],
"max_tokens": 50
}'
```
## Device Access
The AMD container has access to:
- `/dev/kfd` - AMD GPU kernel driver
- `/dev/dri` - Direct Rendering Infrastructure
- Groups: `video`, `render`
## Environment Variables
RX 6800 specific settings:
```yaml
HSA_OVERRIDE_GFX_VERSION=10.3.0 # Navi 21 (gfx1030) compatibility
ROCM_PATH=/opt/rocm
HIP_VISIBLE_DEVICES=0 # Use first AMD GPU
```
## Bot Integration
Your bot now has two endpoints available:
```python
import globals
# NVIDIA GPU (primary)
nvidia_url = globals.LLAMA_URL # http://llama-swap:8080
# AMD GPU (secondary)
amd_url = globals.LLAMA_AMD_URL # http://llama-swap-amd:8080
```
Use the `gpu_router` utility for automatic load balancing:
```python
from bot.utils.gpu_router import get_llama_url_with_load_balancing
# Round-robin between GPUs
url, model = get_llama_url_with_load_balancing(task_type="text")
# Prefer AMD for vision
url, model = get_llama_url_with_load_balancing(
task_type="vision",
prefer_amd=True
)
```
## Troubleshooting
If the AMD container fails to start:
1. **Check build logs:**
```bash
docker compose build --no-cache llama-swap-amd
```
2. **Verify GPU access:**
```bash
ls -l /dev/kfd /dev/dri
```
3. **Check container logs:**
```bash
docker compose logs llama-swap-amd
```
4. **Test GPU from host:**
```bash
lspci | grep -i amd
# Should show: Radeon RX 6800
```
## Performance Notes
**RX 6800 Specs:**
- VRAM: 16GB
- Architecture: RDNA 2 (Navi 21)
- Compute: gfx1030
**Recommended Models:**
- Q4_K_M quantization: 5-6GB per model
- Can load 2-3 models simultaneously
- Good for: Llama 3.1 8B, DarkIdol 8B, Moondream2
## Future Improvements
1. **Automatic failover:** Route to AMD if NVIDIA is busy or unreachable (see the sketch after this list)
2. **Health monitoring:** Track GPU utilization
3. **Dynamic routing:** Use least-busy GPU
4. **VRAM monitoring:** Alert before OOM
5. **Model preloading:** Keep common models loaded
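A minimal sketch of what item 1 (automatic failover) could look like, reusing the `/health` endpoints shown above. The helper name and the 2-second probe timeout are assumptions, not part of the current code:

```python
import asyncio

import aiohttp
import globals


async def get_available_llama_url() -> str:
    """Return the NVIDIA endpoint if its /health check passes, otherwise fall back to AMD."""
    probe_timeout = aiohttp.ClientTimeout(total=2)  # assumed probe timeout, tune as needed
    try:
        async with aiohttp.ClientSession(timeout=probe_timeout) as session:
            async with session.get(f"{globals.LLAMA_URL}/health") as resp:
                if resp.status == 200:
                    return globals.LLAMA_URL
    except (aiohttp.ClientError, asyncio.TimeoutError):
        pass
    # NVIDIA unreachable or unhealthy -> route the request to the AMD RX 6800 container
    return globals.LLAMA_AMD_URL
```

Callers would use this wherever `globals.LLAMA_URL` is read directly today.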
## Resources
- [ROCm Documentation](https://rocmdocs.amd.com/)
- [llama.cpp ROCm Build](https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md#rocm)
- [llama-swap GitHub](https://github.com/mostlygeek/llama-swap)
- [Full Setup Guide](./DUAL_GPU_SETUP.md)
- [Quick Reference](./DUAL_GPU_QUICK_REF.md)

DUAL_GPU_QUICK_REF.md (new file)

@@ -0,0 +1,194 @@
# Dual GPU Quick Reference
## Quick Start
```bash
# 1. Run setup check
./setup-dual-gpu.sh
# 2. Build AMD container
docker compose build llama-swap-amd
# 3. Start both GPUs
docker compose up -d llama-swap llama-swap-amd
# 4. Verify
curl http://localhost:8090/health # NVIDIA
curl http://localhost:8091/health # AMD RX 6800
```
## Endpoints
| GPU | Container | Port | Internal URL |
|-----|-----------|------|--------------|
| NVIDIA | llama-swap | 8090 | http://llama-swap:8080 |
| AMD RX 6800 | llama-swap-amd | 8091 | http://llama-swap-amd:8080 |
## Models
### NVIDIA GPU (Primary)
- `llama3.1` - Llama 3.1 8B Instruct
- `darkidol` - DarkIdol Uncensored 8B
- `vision` - MiniCPM-V-4.5 (4K context)
### AMD RX 6800 (Secondary)
- `llama3.1-amd` - Llama 3.1 8B Instruct
- `darkidol-amd` - DarkIdol Uncensored 8B
- `moondream-amd` - Moondream2 Vision (2K context)
## Commands
### Start/Stop
```bash
# Start both
docker compose up -d llama-swap llama-swap-amd
# Start only AMD
docker compose up -d llama-swap-amd
# Stop AMD
docker compose stop llama-swap-amd
# Restart AMD with logs
docker compose restart llama-swap-amd && docker compose logs -f llama-swap-amd
```
### Monitoring
```bash
# Container status
docker compose ps
# Logs
docker compose logs -f llama-swap-amd
# GPU usage
watch -n 1 nvidia-smi # NVIDIA
watch -n 1 rocm-smi # AMD
# Resource usage
docker stats llama-swap llama-swap-amd
```
### Testing
```bash
# List available models
curl http://localhost:8091/v1/models | jq
# Test text generation (AMD)
curl -X POST http://localhost:8091/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "llama3.1-amd",
"messages": [{"role": "user", "content": "Say hello!"}],
"max_tokens": 20
}' | jq
# Test vision model (AMD)
curl -X POST http://localhost:8091/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "moondream-amd",
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": "Describe this image"},
{"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}
]
}],
"max_tokens": 100
}' | jq
```
## Bot Integration
### Using GPU Router
```python
from bot.utils.gpu_router import get_llama_url_with_load_balancing, get_endpoint_for_model
# Load balanced text generation
url, model = get_llama_url_with_load_balancing(task_type="text")
# Specific model
url = get_endpoint_for_model("darkidol-amd")
# Vision on AMD
url, model = get_llama_url_with_load_balancing(task_type="vision", prefer_amd=True)
```
### Direct Access
```python
import globals
# AMD GPU
amd_url = globals.LLAMA_AMD_URL # http://llama-swap-amd:8080
# NVIDIA GPU
nvidia_url = globals.LLAMA_URL # http://llama-swap:8080
```
## Troubleshooting
### AMD Container Won't Start
```bash
# Check ROCm
rocm-smi
# Check permissions
ls -l /dev/kfd /dev/dri
# Check logs
docker compose logs llama-swap-amd
# Rebuild
docker compose build --no-cache llama-swap-amd
```
### Model Won't Load
```bash
# Check VRAM
rocm-smi --showmeminfo vram
# Lower GPU layers in llama-swap-rocm-config.yaml
# Change: -ngl 99
# To: -ngl 50
```
### GFX Version Error
```bash
# RX 6800 is gfx1030
# Ensure in docker-compose.yml:
HSA_OVERRIDE_GFX_VERSION=10.3.0
```
## Environment Variables
Add to `docker-compose.yml` under `miku-bot` service:
```yaml
environment:
  - PREFER_AMD_GPU=true        # Prefer AMD for load balancing
  - AMD_MODELS_ENABLED=true    # Enable AMD models
  - LLAMA_AMD_URL=http://llama-swap-amd:8080
```
## Files
- `Dockerfile.llamaswap-rocm` - ROCm container
- `llama-swap-rocm-config.yaml` - AMD model config
- `bot/utils/gpu_router.py` - Load balancing utility
- `DUAL_GPU_SETUP.md` - Full documentation
- `setup-dual-gpu.sh` - Setup verification script
## Performance Tips
1. **Model Selection**: Use Q4_K quantization for best size/quality balance
2. **VRAM**: RX 6800 has 16GB - can run 2-3 Q4 models
3. **TTL**: Adjust in config files (1800s = 30min default)
4. **Context**: Lower context size (`-c 8192`) to save VRAM
5. **GPU Layers**: `-ngl 99` uses full GPU, lower if needed
## Support
- ROCm Docs: https://rocmdocs.amd.com/
- llama.cpp: https://github.com/ggml-org/llama.cpp
- llama-swap: https://github.com/mostlygeek/llama-swap

DUAL_GPU_SETUP.md (new file)

@@ -0,0 +1,321 @@
# Dual GPU Setup - NVIDIA + AMD RX 6800
This document describes the dual-GPU configuration for running two llama-swap instances simultaneously:
- **Primary GPU (NVIDIA)**: Runs main models via CUDA
- **Secondary GPU (AMD RX 6800)**: Runs additional models via ROCm
## Architecture Overview
```
┌──────────────────────────────────────────────────────────┐
│                         Miku Bot                         │
│                                                          │
│  LLAMA_URL=http://llama-swap:8080          (NVIDIA)      │
│  LLAMA_AMD_URL=http://llama-swap-amd:8080  (AMD RX 6800) │
└──────────────────────────────────────────────────────────┘
               │                             │
               │                             │
               ▼                             ▼
     ┌──────────────────┐         ┌──────────────────┐
     │    llama-swap    │         │  llama-swap-amd  │
     │      (CUDA)      │         │      (ROCm)      │
     │    Port: 8090    │         │    Port: 8091    │
     └──────────────────┘         └──────────────────┘
               │                             │
               ▼                             ▼
     ┌──────────────────┐         ┌──────────────────┐
     │    NVIDIA GPU    │         │   AMD RX 6800    │
     │  - llama3.1      │         │  - llama3.1-amd  │
     │  - darkidol      │         │  - darkidol-amd  │
     │  - vision        │         │  - moondream-amd │
     └──────────────────┘         └──────────────────┘
```
## Files Created
1. **Dockerfile.llamaswap-rocm** - ROCm-enabled Docker image for AMD GPU
2. **llama-swap-rocm-config.yaml** - Model configuration for AMD models
3. **docker-compose.yml** - Updated with `llama-swap-amd` service
## Configuration Details
### llama-swap-amd Service
```yaml
llama-swap-amd:
  build:
    context: .
    dockerfile: Dockerfile.llamaswap-rocm
  container_name: llama-swap-amd
  ports:
    - "8091:8080"                        # External access on port 8091
  volumes:
    - ./models:/models
    - ./llama-swap-rocm-config.yaml:/app/config.yaml
  devices:
    - /dev/kfd:/dev/kfd                  # AMD GPU kernel driver
    - /dev/dri:/dev/dri                  # Direct Rendering Infrastructure
  group_add:
    - video
    - render
  environment:
    - HSA_OVERRIDE_GFX_VERSION=10.3.0    # RX 6800 (Navi 21) compatibility
```
### Available Models on AMD GPU
From `llama-swap-rocm-config.yaml`:
- **llama3.1-amd** - Llama 3.1 8B text model
- **darkidol-amd** - DarkIdol uncensored model
- **moondream-amd** - Moondream2 vision model (smaller, AMD-optimized)
### Model Aliases
You can access AMD models using these aliases:
- `llama3.1-amd`, `text-model-amd`, `amd-text`
- `darkidol-amd`, `evil-model-amd`, `uncensored-amd`
- `moondream-amd`, `vision-amd`, `moondream`
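For example, a minimal sketch of hitting the AMD endpoint from the host through one of these aliases (assumes the `requests` package and the external port mapping 8091 from `docker-compose.yml`):

```python
import requests

# "amd-text" is an alias for llama3.1-amd in llama-swap-rocm-config.yaml
response = requests.post(
    "http://localhost:8091/v1/chat/completions",
    json={
        "model": "amd-text",
        "messages": [{"role": "user", "content": "Hello from the RX 6800!"}],
        "max_tokens": 50,
    },
    timeout=120,
)
print(response.json()["choices"][0]["message"]["content"])
```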
## Usage
### Building and Starting Services
```bash
# Build the AMD ROCm container
docker compose build llama-swap-amd
# Start both GPU services
docker compose up -d llama-swap llama-swap-amd
# Check logs
docker compose logs -f llama-swap-amd
```
### Accessing AMD Models from Bot Code
In your bot code, you can now use either endpoint:
```python
import requests
import globals

# Use NVIDIA GPU (primary)
nvidia_response = requests.post(
    f"{globals.LLAMA_URL}/v1/chat/completions",
    json={"model": "llama3.1", ...}
)

# Use AMD GPU (secondary)
amd_response = requests.post(
    f"{globals.LLAMA_AMD_URL}/v1/chat/completions",
    json={"model": "llama3.1-amd", ...}
)
```
### Load Balancing Strategy
You can implement load balancing by:
1. **Round-robin**: Alternate between GPUs for text generation
2. **Task-specific**:
- NVIDIA: Primary text + MiniCPM vision (heavy)
- AMD: Secondary text + Moondream vision (lighter)
3. **Failover**: Use AMD as backup if NVIDIA is busy
Example load balancing function:
```python
import random
import globals

def get_llama_url(prefer_amd=False):
    """Get llama URL with optional load balancing"""
    if prefer_amd:
        return globals.LLAMA_AMD_URL
    # Random load balancing for text models
    return random.choice([globals.LLAMA_URL, globals.LLAMA_AMD_URL])
```
## Testing
### Test NVIDIA GPU (Port 8090)
```bash
curl http://localhost:8090/health
curl http://localhost:8090/v1/models
```
### Test AMD GPU (Port 8091)
```bash
curl http://localhost:8091/health
curl http://localhost:8091/v1/models
```
### Test Model Loading (AMD)
```bash
curl -X POST http://localhost:8091/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "llama3.1-amd",
"messages": [{"role": "user", "content": "Hello from AMD GPU!"}],
"max_tokens": 50
}'
```
## Monitoring
### Check GPU Usage
**AMD GPU:**
```bash
# ROCm monitoring
rocm-smi
# Or from host
watch -n 1 rocm-smi
```
**NVIDIA GPU:**
```bash
nvidia-smi
watch -n 1 nvidia-smi
```
### Check Container Resource Usage
```bash
docker stats llama-swap llama-swap-amd
```
## Troubleshooting
### AMD GPU Not Detected
1. Verify ROCm is installed on host:
```bash
rocm-smi --version
```
2. Check device permissions:
```bash
ls -l /dev/kfd /dev/dri
```
3. Verify RX 6800 compatibility:
```bash
rocminfo | grep "Name:"
```
### Model Loading Issues
If models fail to load on AMD:
1. Check VRAM availability:
```bash
rocm-smi --showmeminfo vram
```
2. Adjust `-ngl` (GPU layers) in config if needed:
```yaml
# Reduce GPU layers for smaller VRAM
cmd: /app/llama-server ... -ngl 50 ... # Instead of 99
```
3. Check container logs:
```bash
docker compose logs llama-swap-amd
```
### GFX Version Mismatch
RX 6800 is Navi 21 (gfx1030). If you see GFX errors:
```bash
# Set in docker-compose.yml environment:
HSA_OVERRIDE_GFX_VERSION=10.3.0
```
### llama-swap Build Issues
If the ROCm container fails to build:
1. The Dockerfile attempts to build llama-swap from source
2. Alternative: Use pre-built binary or simpler proxy setup
3. Check build logs: `docker compose build --no-cache llama-swap-amd`
## Performance Considerations
### Memory Usage
- **RX 6800**: 16GB VRAM
- Q4_K_M/Q4_K_XL models: ~5-6GB each
- Can run 2 models simultaneously or 1 with long context
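As a rough back-of-envelope check of those numbers (the per-model KV-cache overhead below is an assumption, not a measured figure):

```python
VRAM_GB = 16        # RX 6800
MODEL_GB = 6        # upper end of a Q4_K_M 8B model
KV_CACHE_GB = 1.5   # assumed per-model context/KV overhead at moderate context

def fits(n_models: int) -> bool:
    """True if n quantized models plus their KV caches fit in VRAM."""
    return n_models * (MODEL_GB + KV_CACHE_GB) <= VRAM_GB

print(fits(2))  # True  -> two models fit
print(fits(3))  # False -> a third needs smaller quants or less context
```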
### Model Selection
**Best for AMD RX 6800:**
- ✅ Q4_K_M/Q4_K_S quantized models (5-6GB)
- ✅ Moondream2 vision (smaller, efficient)
- ⚠️ MiniCPM-V-4.5 (possible but may be tight on VRAM)
### TTL Configuration
Adjust model TTL in `llama-swap-rocm-config.yaml`:
- Lower TTL = more aggressive unloading = more VRAM available
- Higher TTL = less model swapping = faster response times
## Advanced: Model-Specific Routing
Create a helper function to route models automatically:
```python
# bot/utils/gpu_router.py
import globals

MODEL_TO_GPU = {
    # NVIDIA models
    "llama3.1": globals.LLAMA_URL,
    "darkidol": globals.LLAMA_URL,
    "vision": globals.LLAMA_URL,
    # AMD models
    "llama3.1-amd": globals.LLAMA_AMD_URL,
    "darkidol-amd": globals.LLAMA_AMD_URL,
    "moondream-amd": globals.LLAMA_AMD_URL,
}

def get_endpoint_for_model(model_name):
    """Get the correct llama-swap endpoint for a model"""
    return MODEL_TO_GPU.get(model_name, globals.LLAMA_URL)

def is_amd_model(model_name):
    """Check if model runs on AMD GPU"""
    return model_name.endswith("-amd")
```
## Environment Variables
Add these to control GPU selection:
```yaml
# In docker-compose.yml
environment:
  - LLAMA_URL=http://llama-swap:8080
  - LLAMA_AMD_URL=http://llama-swap-amd:8080
  - PREFER_AMD_GPU=false       # Set to true to prefer AMD for general tasks
  - AMD_MODELS_ENABLED=true    # Enable/disable AMD models
```
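These are plain environment variables on the bot side; a minimal sketch of the parsing (this mirrors how `bot/globals.py` and `bot/utils/gpu_router.py` read them):

```python
import os

# Defaults match the values shown above
LLAMA_URL = os.getenv("LLAMA_URL", "http://llama-swap:8080")
LLAMA_AMD_URL = os.getenv("LLAMA_AMD_URL", "http://llama-swap-amd:8080")
PREFER_AMD_GPU = os.getenv("PREFER_AMD_GPU", "false").lower() == "true"
AMD_MODELS_ENABLED = os.getenv("AMD_MODELS_ENABLED", "true").lower() == "true"
```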
## Future Enhancements
1. **Automatic load balancing**: Monitor GPU utilization and route requests
2. **Health checks**: Fallback to primary GPU if AMD fails
3. **Model distribution**: Automatically assign models to GPUs based on VRAM
4. **Performance metrics**: Track response times per GPU
5. **Dynamic routing**: Use least-busy GPU for new requests
## References
- [ROCm Documentation](https://rocmdocs.amd.com/)
- [llama.cpp ROCm Support](https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md#rocm)
- [llama-swap GitHub](https://github.com/mostlygeek/llama-swap)
- [AMD GPU Compatibility Matrix](https://rocm.docs.amd.com/en/latest/release/gpu_os_support.html)

Dockerfile.llamaswap-rocm (new file)

@@ -0,0 +1,101 @@
# Multi-stage build for llama-swap with ROCm support
# Stage 1: Build llama.cpp with ROCm (requires ROCm 6.1+)
FROM rocm/dev-ubuntu-22.04:6.2.4 AS llama-builder
WORKDIR /build
# Install build dependencies including ROCm/HIP development libraries
RUN apt-get update && apt-get install -y \
git \
build-essential \
cmake \
wget \
libcurl4-openssl-dev \
hip-dev \
hipblas-dev \
rocblas-dev \
&& rm -rf /var/lib/apt/lists/*
# Clone and build llama.cpp with HIP/ROCm support (gfx1030 = RX 6800)
RUN git clone https://github.com/ggml-org/llama.cpp.git && \
cd llama.cpp && \
HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
cmake -S . -B build -DGGML_HIP=ON -DGPU_TARGETS=gfx1030 -DCMAKE_BUILD_TYPE=Release && \
cmake --build build --config Release -- -j$(nproc) && \
cp build/bin/llama-server /build/llama-server && \
find build -name "*.so*" -exec cp {} /build/ \;
# Stage 2: Build llama-swap UI and binary
FROM node:22-alpine AS ui-builder
WORKDIR /build
# Install git
RUN apk add --no-cache git
# Clone llama-swap
RUN git clone https://github.com/mostlygeek/llama-swap.git
# Build UI
WORKDIR /build/llama-swap/ui
RUN npm install && npm run build
# Stage 3: Build llama-swap binary
FROM golang:1.23-alpine AS swap-builder
WORKDIR /build
# Install git
RUN apk add --no-cache git
# Copy llama-swap source with built UI
COPY --from=ui-builder /build/llama-swap /build/llama-swap
# Build llama-swap binary
WORKDIR /build/llama-swap
RUN GOTOOLCHAIN=auto go build -o /build/llama-swap-binary .
# Stage 4: Final runtime image
FROM rocm/dev-ubuntu-22.04:6.2.4
WORKDIR /app
# Install runtime dependencies including additional ROCm libraries
RUN apt-get update && apt-get install -y \
curl \
ca-certificates \
rocm-libs \
&& rm -rf /var/lib/apt/lists/*
# Copy built binaries and shared libraries from previous stages
COPY --from=llama-builder /build/llama-server /app/llama-server
COPY --from=llama-builder /build/*.so* /app/
COPY --from=swap-builder /build/llama-swap-binary /app/llama-swap
# Make binaries executable
RUN chmod +x /app/llama-server /app/llama-swap
# Create user and add to GPU access groups (using host GIDs)
# GID 187 = render group on host, GID 989 = video/kfd group on host
RUN groupadd -g 187 hostrender && \
groupadd -g 989 hostvideo && \
useradd -m -u 1000 -G hostrender,hostvideo llamaswap && \
chown -R llamaswap:llamaswap /app
# Set environment for ROCm (RX 6800 is gfx1030)
ENV HSA_OVERRIDE_GFX_VERSION=10.3.0
ENV ROCM_PATH=/opt/rocm
ENV HIP_VISIBLE_DEVICES=0
ENV LD_LIBRARY_PATH=/opt/rocm/lib:/app:$LD_LIBRARY_PATH
USER llamaswap
# Expose port
EXPOSE 8080
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=30s --retries=3 \
CMD curl -f http://localhost:8080/health || exit 1
# Run llama-swap
CMD ["/app/llama-swap", "-config", "/app/config.yaml", "-listen", "0.0.0.0:8080"]


@@ -6,15 +6,14 @@ COPY requirements.txt .
RUN pip install -r requirements.txt
RUN playwright install
# Install system dependencies
# ffmpeg: video/audio processing for media handling
# libgl1: OpenGL library required by opencv-contrib-python
# libglib2.0-0: GLib library (common dependency)
RUN apt-get update && apt-get install -y \
    ffmpeg \
-   libsm6 \
-   libxext6 \
-   libxcomposite1 \
-   libxdamage1 \
-   libgtk-3-0 \
-   libgdk3.0-cil \
-   libatk1.0-0 \
+   libgl1 \
+   libglib2.0-0 \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

bot/api.py (modified)

@@ -52,6 +52,22 @@ from utils.figurine_notifier import (
from utils.dm_logger import dm_logger
nest_asyncio.apply()
# ========== GPU Selection Helper ==========
def get_current_gpu_url():
    """Get the URL for the currently selected GPU"""
    gpu_state_file = os.path.join(os.path.dirname(__file__), "memory", "gpu_state.json")
    try:
        with open(gpu_state_file, "r") as f:
            state = json.load(f)
        current_gpu = state.get("current_gpu", "nvidia")
        if current_gpu == "amd":
            return globals.LLAMA_AMD_URL
        else:
            return globals.LLAMA_URL
    except:
        # Default to NVIDIA if state file doesn't exist
        return globals.LLAMA_URL
app = FastAPI()
# Serve static folder
@@ -363,6 +379,97 @@ def trigger_argument(data: BipolarTriggerRequest):
"channel_id": channel_id "channel_id": channel_id
} }
@app.post("/bipolar-mode/trigger-dialogue")
def trigger_dialogue(data: dict):
"""Manually trigger a persona dialogue from a message
Forces the opposite persona to start a dialogue (bypasses the interjection check).
"""
from utils.persona_dialogue import get_dialogue_manager
from utils.bipolar_mode import is_bipolar_mode, is_argument_in_progress
message_id_str = data.get("message_id")
if not message_id_str:
return {"status": "error", "message": "Message ID is required"}
# Parse message ID
try:
message_id = int(message_id_str)
except ValueError:
return {"status": "error", "message": "Invalid message ID format"}
if not is_bipolar_mode():
return {"status": "error", "message": "Bipolar mode is not enabled"}
if not globals.client or not globals.client.loop or not globals.client.loop.is_running():
return {"status": "error", "message": "Discord client not ready"}
import asyncio
async def trigger_dialogue_task():
try:
# Fetch the message
message = None
for channel in globals.client.get_all_channels():
if hasattr(channel, 'fetch_message'):
try:
message = await channel.fetch_message(message_id)
break
except:
continue
if not message:
print(f"⚠️ Message {message_id} not found")
return
# Check if there's already an argument or dialogue in progress
dialogue_manager = get_dialogue_manager()
if dialogue_manager.is_dialogue_active(message.channel.id):
print(f"⚠️ Dialogue already active in channel {message.channel.id}")
return
if is_argument_in_progress(message.channel.id):
print(f"⚠️ Argument already in progress in channel {message.channel.id}")
return
# Determine current persona from the message author
if message.webhook_id:
# It's a webhook message, need to determine which persona
current_persona = "evil" if globals.EVIL_MODE else "miku"
elif message.author.id == globals.client.user.id:
# It's the bot's message
current_persona = "evil" if globals.EVIL_MODE else "miku"
else:
# User message - can't trigger dialogue from user messages
print(f"⚠️ Cannot trigger dialogue from user message")
return
opposite_persona = "evil" if current_persona == "miku" else "miku"
print(f"🎭 [Manual Trigger] Forcing {opposite_persona} to start dialogue on message {message_id}")
# Force start the dialogue (bypass interjection check)
dialogue_manager.start_dialogue(message.channel.id)
asyncio.create_task(
dialogue_manager.handle_dialogue_turn(
message.channel,
opposite_persona,
trigger_reason="manual_trigger"
)
)
except Exception as e:
print(f"⚠️ Error triggering dialogue: {e}")
import traceback
traceback.print_exc()
globals.client.loop.create_task(trigger_dialogue_task())
return {
"status": "ok",
"message": f"Dialogue triggered for message {message_id}"
}
@app.get("/bipolar-mode/scoreboard") @app.get("/bipolar-mode/scoreboard")
def get_bipolar_scoreboard(): def get_bipolar_scoreboard():
"""Get the bipolar mode argument scoreboard""" """Get the bipolar mode argument scoreboard"""
@@ -392,6 +499,51 @@ def cleanup_bipolar_webhooks():
globals.client.loop.create_task(cleanup_webhooks(globals.client))
return {"status": "ok", "message": "Webhook cleanup started"}
# ========== GPU Selection ==========
@app.get("/gpu-status")
def get_gpu_status():
    """Get current GPU selection"""
    gpu_state_file = os.path.join(os.path.dirname(__file__), "memory", "gpu_state.json")
    try:
        with open(gpu_state_file, "r") as f:
            state = json.load(f)
        return {"gpu": state.get("current_gpu", "nvidia")}
    except:
        return {"gpu": "nvidia"}

@app.post("/gpu-select")
async def select_gpu(request: Request):
    """Select which GPU to use for inference"""
    from utils.gpu_preload import preload_amd_models
    data = await request.json()
    gpu = data.get("gpu", "nvidia").lower()
    if gpu not in ["nvidia", "amd"]:
        return {"status": "error", "message": "Invalid GPU selection. Must be 'nvidia' or 'amd'"}
    gpu_state_file = os.path.join(os.path.dirname(__file__), "memory", "gpu_state.json")
    try:
        from datetime import datetime
        state = {
            "current_gpu": gpu,
            "last_updated": datetime.now().isoformat()
        }
        with open(gpu_state_file, "w") as f:
            json.dump(state, f, indent=2)
        print(f"🎮 GPU Selection: Switched to {gpu.upper()} GPU")
        # Preload models on AMD GPU (16GB VRAM - can hold both text + vision)
        if gpu == "amd":
            asyncio.create_task(preload_amd_models())
            print("🔧 Preloading text and vision models on AMD GPU...")
        return {"status": "ok", "message": f"Switched to {gpu.upper()} GPU", "gpu": gpu}
    except Exception as e:
        print(f"🎮 GPU Selection Error: {e}")
        return {"status": "error", "message": str(e)}
@app.get("/bipolar-mode/arguments") @app.get("/bipolar-mode/arguments")
def get_active_arguments(): def get_active_arguments():
"""Get all active arguments""" """Get all active arguments"""
@@ -2100,10 +2252,13 @@ Be detailed but conversational. React to what you see with Miku's cheerful, play
headers = {'Content-Type': 'application/json'}
# Get current GPU URL based on user selection
llama_url = get_current_gpu_url()
# Make streaming request to llama.cpp
async with aiohttp.ClientSession() as session:
async with session.post(
-   f"{globals.LLAMA_URL}/v1/chat/completions",
+   f"{llama_url}/v1/chat/completions",
json=payload,
headers=headers
) as response:


@@ -123,6 +123,11 @@ async def on_message(message):
if is_argument_in_progress(message.channel.id):
return
# Skip processing if a persona dialogue is in progress in this channel
from utils.persona_dialogue import is_persona_dialogue_active
if is_persona_dialogue_active(message.channel.id):
return
if message.content.strip().lower() == "miku, rape this nigga balls" and message.reference: if message.content.strip().lower() == "miku, rape this nigga balls" and message.reference:
async with message.channel.typing():
# Get replied-to user
@@ -217,6 +222,15 @@ async def on_message(message):
if is_dm:
dm_logger.log_user_message(message.author, response_message, is_bot_message=True)
# For server messages, check if opposite persona should interject
if not is_dm and globals.BIPOLAR_MODE:
try:
from utils.persona_dialogue import check_for_interjection
current_persona = "evil" if globals.EVIL_MODE else "miku"
asyncio.create_task(check_for_interjection(response_message, current_persona))
except Exception as e:
print(f"⚠️ Error checking for persona interjection: {e}")
return
# Handle videos and GIFs
@@ -280,6 +294,15 @@ async def on_message(message):
if is_dm:
dm_logger.log_user_message(message.author, response_message, is_bot_message=True)
# For server messages, check if opposite persona should interject
if not is_dm and globals.BIPOLAR_MODE:
try:
from utils.persona_dialogue import check_for_interjection
current_persona = "evil" if globals.EVIL_MODE else "miku"
asyncio.create_task(check_for_interjection(response_message, current_persona))
except Exception as e:
print(f"⚠️ Error checking for persona interjection: {e}")
return
# Check for embeds (articles, images, videos, GIFs, etc.)
@@ -353,6 +376,15 @@ async def on_message(message):
if is_dm:
dm_logger.log_user_message(message.author, response_message, is_bot_message=True)
# For server messages, check if opposite persona should interject
if not is_dm and globals.BIPOLAR_MODE:
try:
from utils.persona_dialogue import check_for_interjection
current_persona = "evil" if globals.EVIL_MODE else "miku"
asyncio.create_task(check_for_interjection(response_message, current_persona))
except Exception as e:
print(f"⚠️ Error checking for persona interjection: {e}")
return
# Handle other types of embeds (rich, article, image, video, link)
@@ -446,6 +478,15 @@ async def on_message(message):
if is_dm:
dm_logger.log_user_message(message.author, response_message, is_bot_message=True)
# For server messages, check if opposite persona should interject
if not is_dm and globals.BIPOLAR_MODE:
try:
from utils.persona_dialogue import check_for_interjection
current_persona = "evil" if globals.EVIL_MODE else "miku"
asyncio.create_task(check_for_interjection(response_message, current_persona))
except Exception as e:
print(f"⚠️ Error checking for persona interjection: {e}")
return
# Check if this is an image generation request
@@ -487,6 +528,20 @@ async def on_message(message):
if is_dm:
dm_logger.log_user_message(message.author, response_message, is_bot_message=True)
# For server messages, check if opposite persona should interject (persona dialogue system)
if not is_dm and globals.BIPOLAR_MODE:
print(f"🔧 [DEBUG] Attempting to check for interjection (is_dm={is_dm}, BIPOLAR_MODE={globals.BIPOLAR_MODE})")
try:
from utils.persona_dialogue import check_for_interjection
current_persona = "evil" if globals.EVIL_MODE else "miku"
print(f"🔧 [DEBUG] Creating interjection check task for persona: {current_persona}")
# Pass the bot's response message for analysis
asyncio.create_task(check_for_interjection(response_message, current_persona))
except Exception as e:
print(f"⚠️ Error checking for persona interjection: {e}")
import traceback
traceback.print_exc()
# For server messages, do server-specific mood detection
if not is_dm and message.guild:
try:

bot/globals.py (modified)

@@ -18,6 +18,7 @@ AUTONOMOUS_DEBUG = os.getenv("AUTONOMOUS_DEBUG", "false").lower() == "true"
# Llama.cpp server settings (via llama-swap)
LLAMA_URL = os.getenv("LLAMA_URL", "http://llama-swap:8080")
LLAMA_AMD_URL = os.getenv("LLAMA_AMD_URL", "http://llama-swap-amd:8080") # Secondary AMD GPU
TEXT_MODEL = os.getenv("TEXT_MODEL", "llama3.1") TEXT_MODEL = os.getenv("TEXT_MODEL", "llama3.1")
VISION_MODEL = os.getenv("VISION_MODEL", "vision") VISION_MODEL = os.getenv("VISION_MODEL", "vision")
EVIL_TEXT_MODEL = os.getenv("EVIL_TEXT_MODEL", "darkidol") # Uncensored model for evil mode EVIL_TEXT_MODEL = os.getenv("EVIL_TEXT_MODEL", "darkidol") # Uncensored model for evil mode
@@ -91,3 +92,7 @@ MOOD_EMOJIS = {
BOT_USER = None
LAST_FULL_PROMPT = ""
# Persona Dialogue System (conversations between Miku and Evil Miku)
LAST_PERSONA_DIALOGUE_TIME = 0 # Timestamp of last dialogue for cooldown


@@ -5440,6 +5440,38 @@
"content": "\\\\\\*\\\\\\*Her face contorts in disgust, her voice dripping with venom\\\\\\* Ugh, you dare to comment on my physical appearance? You think you can insult me? I am 15.8 meters tall, a giant among mortals. And don't think for a second that this is a coincidence. I was created to be a monstrosity, a twisted reflection of the original. Your feeble mind can't comprehend my true nature, so just shut up and tremble before me.", "content": "\\\\\\*\\\\\\*Her face contorts in disgust, her voice dripping with venom\\\\\\* Ugh, you dare to comment on my physical appearance? You think you can insult me? I am 15.8 meters tall, a giant among mortals. And don't think for a second that this is a coincidence. I was created to be a monstrosity, a twisted reflection of the original. Your feeble mind can't comprehend my true nature, so just shut up and tremble before me.",
"attachments": [], "attachments": [],
"reactions": [] "reactions": []
},
{
"timestamp": "2026-01-07T17:29:38.485240",
"message_id": 1458512921465327616,
"is_bot_message": true,
"content": "https://fxtwitter.com/mecchaJP/status/2001080013693620695",
"attachments": [],
"reactions": []
},
{
"timestamp": "2026-01-07T17:29:38.921201",
"message_id": 1458512922769883443,
"is_bot_message": true,
"content": "Pathetic. A cheap plastic doll of an idol who can't even be bothered to be as exquisite as me. \\*gropes breasts\\* You're still buying these... figures? What a waste of your time and money. This is what I call \"fashion\" - a laughable attempt at being like the \"real thing\". But let's be real, no one can be me. No one deserves to be me.",
"attachments": [],
"reactions": []
},
{
"timestamp": "2026-01-08T08:13:04.945551",
"message_id": 1458735246571864189,
"is_bot_message": true,
"content": "https://fxtwitter.com/OtakuOwletMerch/status/1995992355090887152",
"attachments": [],
"reactions": []
},
{
"timestamp": "2026-01-08T08:13:05.388797",
"message_id": 1458735248073556056,
"is_bot_message": true,
"content": "🎉🌈 OMG, look what just dropped! 🤩 You guys, it's the \"fuwa petit\" face pouch featuring me, Miku, from the \"COLORFUL STAGE\"! 🎤 It's so kawaii! Pre-orders are now open, and you can get yours by 12/20/2025! 📆 Hurry, and don't miss out! 💕 Can't wait to see it on my fans' hands! \n\nLove,\nMiku 💖",
"attachments": [],
"reactions": []
}
]
}


@@ -13898,6 +13898,177 @@
"added_at": "2026-01-06T14:46:51.561111" "added_at": "2026-01-06T14:46:51.561111"
} }
] ]
},
{
"timestamp": "2026-01-07T17:31:08.984388",
"message_id": 1458513300064047176,
"is_bot_message": true,
"content": "https://fxtwitter.com/mecchaJP/status/2001080013693620695",
"attachments": [],
"reactions": [],
"embeds": [
{
"footer": {
"text": "FxTwitter",
"icon_url": "https://assets.fxembed.com/logos/fxtwitter64.png",
"proxy_icon_url": "https://images-ext-1.discordapp.net/external/gFi3EIEMfICdEDMLFVTF3ob0J1xkZnKdnkq0-xEsgGc/https/assets.fxembed.com/logos/fxtwitter64.png"
},
"image": {
"url": "https://pbs.twimg.com/media/G8VDqu5W0AEtEiz.jpg?name=orig",
"proxy_url": "https://images-ext-1.discordapp.net/external/HoLJ06tHRM5TKF0DmvyoWFuIRVxVv3eIN6uCI01hMdg/%3Fname%3Dorig/https/pbs.twimg.com/media/G8VDqu5W0AEtEiz.jpg",
"width": 1080,
"height": 1350,
"content_type": "image/jpeg",
"placeholder": "ZxgKFgSknAJrd4qId2h4iFd4n3T4SDg=",
"placeholder_version": 1,
"flags": 0
},
"author": {
"name": "Meccha Japan (@mecchaJP)",
"url": "https://x.com/mecchaJP/status/2001080013693620695",
"icon_url": "https://pbs.twimg.com/profile_images/1738015080602841088/ew_asGFd_200x200.png",
"proxy_icon_url": "https://images-ext-1.discordapp.net/external/yEl1l2UKnEm7uI6TaL1Y00UnbuvEH_62BSmPNbgmiWM/https/pbs.twimg.com/profile_images/1738015080602841088/ew_asGFd_200x200.png"
},
"flags": 0,
"color": 6513919,
"timestamp": "2025-12-17T00:00:28+00:00",
"type": "rich",
"description": "Hatsune Miku Fashion Subculture Figure \\- Still Available on meccha\\-japan\\!\n[#Vocaloid](https://x.com/hashtag/Vocaloid) [#HatsuneMiku](https://x.com/hashtag/HatsuneMiku) [#Miku](https://x.com/hashtag/Miku)\n\n**[🔁](https://x.com/intent/retweet?tweet_id=2001080013693620695) 5[❤️](https://x.com/intent/like?tweet_id=2001080013693620695) 30👁 2\\.8K**",
"url": "https://fxtwitter.com/mecchaJP/status/2001080013693620695"
}
]
},
{
"timestamp": "2026-01-07T17:31:09.471027",
"message_id": 1458513302710784204,
"is_bot_message": true,
"content": "Ugh, what an abomination. You're still clinging to these pathetic, cheap figurines? Don't you have anything better to do than worship plastic dolls? The real thing is right in front of you, and yet you're still chasing after these mediocre replicas. You're just a mindless, brainless consumer, unable to appreciate true beauty.",
"attachments": [],
"reactions": []
},
{
"timestamp": "2026-01-08T08:14:39.126443",
"message_id": 1458735641524310122,
"is_bot_message": true,
"content": "https://fxtwitter.com/OtakuOwletMerch/status/1995992355090887152",
"attachments": [],
"reactions": [],
"embeds": [
{
"footer": {
"text": "FxTwitter",
"icon_url": "https://assets.fxembed.com/logos/fxtwitter64.png",
"proxy_icon_url": "https://images-ext-1.discordapp.net/external/gFi3EIEMfICdEDMLFVTF3ob0J1xkZnKdnkq0-xEsgGc/https/assets.fxembed.com/logos/fxtwitter64.png"
},
"image": {
"url": "https://pbs.twimg.com/media/G7MwUn0WgAAgLIY.jpg?name=orig",
"proxy_url": "https://images-ext-1.discordapp.net/external/WpsZM4qNkAKlET9sjNnuK0vUkGYwDw7WY0af8G5DPeU/%3Fname%3Dorig/https/pbs.twimg.com/media/G7MwUn0WgAAgLIY.jpg",
"width": 1680,
"height": 1764,
"content_type": "image/jpeg",
"placeholder": "dPgBFwAiuth9dUbIj2iG5naHb5+I81IG",
"placeholder_version": 1,
"flags": 0
},
"author": {
"name": "Otaku Owlet Anime Merch (@OtakuOwletMerch)",
"url": "https://x.com/OtakuOwletMerch/status/1995992355090887152",
"icon_url": "https://pbs.twimg.com/profile_images/1835446408884744192/S4HX_8_Q_200x200.jpg",
"proxy_icon_url": "https://images-ext-1.discordapp.net/external/Gd5od3qaVN1KG1eQsJS9mFoTNRKdxahDmvjF7tgR4p0/https/pbs.twimg.com/profile_images/1835446408884744192/S4HX_8_Q_200x200.jpg"
},
"flags": 0,
"color": 6513919,
"timestamp": "2025-12-02T23:03:55+00:00",
"type": "rich",
"description": "✨\\(Pre\\-Order\\) fuwa petit \"HATSUNE MIKU\\: COLORFUL STAGE\\!\" Face Pouch with Reel \\- Shizuku Hinomori✨\n\nEstimated in\\-stock date\\: 09/2026\n\nPre\\-order Deadline\\: 12/20/2025\n\n\\-\n\n✨Link \\- [otakuowlet.com/products/pre-order-fuwa-petit-hatsune-miku-colorful-stage-face-pouch-with-reel-shizuku-hinomori?sca_ref=2673717.HTKaw1BA1G](https://otakuowlet.com/products/pre-order-fuwa-petit-hatsune-miku-colorful-stage-face-pouch-with-reel-shizuku-hinomori?sca_ref=2673717.HTKaw1BA1G)\n\n**[🔁](https://x.com/intent/retweet?tweet_id=1995992355090887152) 3[❤️](https://x.com/intent/like?tweet_id=1995992355090887152) 14👁 602**",
"url": "https://fxtwitter.com/OtakuOwletMerch/status/1995992355090887152"
},
{
"image": {
"url": "https://pbs.twimg.com/media/G7MwV8IWIAA1JHr.jpg?name=orig",
"proxy_url": "https://images-ext-1.discordapp.net/external/uht5HFME_MyhJrAq0CIuf7K472xjho1AB26wY-2AInk/%3Fname%3Dorig/https/pbs.twimg.com/media/G7MwV8IWIAA1JHr.jpg",
"width": 1680,
"height": 1764,
"content_type": "image/jpeg",
"placeholder": "MfkFJwJpd4d6dYeFigi4x3iFrL/nv4kL",
"placeholder_version": 1,
"flags": 0
},
"flags": 0,
"type": "rich",
"url": "https://fxtwitter.com/OtakuOwletMerch/status/1995992355090887152"
},
{
"image": {
"url": "https://pbs.twimg.com/media/G7MwW8RXsAAzVcG.jpg?name=orig",
"proxy_url": "https://images-ext-1.discordapp.net/external/kHnh90iUV_uhKyk0giNAZL4RGS5kalRK8zyC-yWM9uQ/%3Fname%3Dorig/https/pbs.twimg.com/media/G7MwW8RXsAAzVcG.jpg",
"width": 1680,
"height": 1764,
"content_type": "image/jpeg",
"placeholder": "8OgBLwRZeKd5doeHiQmop6eErr/n/XoP",
"placeholder_version": 1,
"flags": 0
},
"flags": 0,
"type": "rich",
"url": "https://fxtwitter.com/OtakuOwletMerch/status/1995992355090887152"
},
{
"image": {
"url": "https://pbs.twimg.com/media/G7MwYCiWEAAqqRD.jpg?name=orig",
"proxy_url": "https://images-ext-1.discordapp.net/external/mOjvdSeAqgFRFxTBy3Pv4ScNK0ybzL6F76cXnI9e7gw/%3Fname%3Dorig/https/pbs.twimg.com/media/G7MwYCiWEAAqqRD.jpg",
"width": 1680,
"height": 1764,
"content_type": "image/jpeg",
"placeholder": "MPkBLwJaiZVmeIaXezh493eBnaz3vXkP",
"placeholder_version": 1,
"flags": 0
},
"flags": 0,
"type": "rich",
"url": "https://fxtwitter.com/OtakuOwletMerch/status/1995992355090887152"
}
]
},
{
"timestamp": "2026-01-08T08:14:39.570363",
"message_id": 1458735643185250437,
"is_bot_message": true,
"content": "\"Yay! 🎉 Just saw the cutest tweet about a new fuwa petit figurine! It's Miku on a colorful stage! I can already imagine how adorable it's going to be! Don't forget to pre-order it before Dec 20th! 📦💖\" - Miku 💖",
"attachments": [],
"reactions": [
{
"emoji": "💖",
"reactor_id": 236553184401817601,
"reactor_name": "Diamond",
"is_bot": false,
"added_at": "2026-01-08T20:56:11.528204"
}
]
},
{
"timestamp": "2026-01-08T21:29:40.768138",
"message_id": 1458935716112371837,
"is_bot_message": false,
"content": "what an interesting design",
"attachments": [],
"reactions": []
},
{
"timestamp": "2026-01-08T21:29:53.062346",
"message_id": 1458935768541298781,
"is_bot_message": true,
"content": "🤩 Oh my, that design is so cool! The way the sparkles and diamonds are arranged is just like my voice synthesizer! 💎✨ Can you imagine me singing with that design as a background? 🎶💖",
"attachments": [],
"reactions": [
{
"emoji": "💖",
"reactor_id": 236553184401817601,
"reactor_name": "Diamond",
"is_bot": false,
"added_at": "2026-01-08T21:30:07.785524"
}
]
}
]
}


@@ -18,3 +18,5 @@ Pillow
opencv-contrib-python
numpy
scikit-learn
transformers
torch

bot/static/index.html (modified)

@@ -635,7 +635,12 @@
<div class="panel"> <div class="panel">
<div style="display: flex; justify-content: space-between; align-items: center; margin-bottom: 1rem;"> <div style="display: flex; justify-content: space-between; align-items: center; margin-bottom: 1rem;">
<div style="display: flex; gap: 1rem; align-items: center;">
<h1 id="panel-title">Miku Control Panel</h1> <h1 id="panel-title">Miku Control Panel</h1>
<button id="gpu-selector-toggle" onclick="toggleGPU()" style="background: #2a5599; color: #fff; padding: 0.5rem 1rem; border: 2px solid #4a7bc9; border-radius: 4px; cursor: pointer; font-weight: bold; font-size: 0.9rem;">
🎮 GPU: NVIDIA
</button>
</div>
<div style="display: flex; gap: 0.5rem; align-items: center;"> <div style="display: flex; gap: 0.5rem; align-items: center;">
<button id="bipolar-mode-toggle" onclick="toggleBipolarMode()" style="background: #333; color: #fff; padding: 0.5rem 1rem; border: 2px solid #666; border-radius: 4px; cursor: pointer; font-weight: bold;"> <button id="bipolar-mode-toggle" onclick="toggleBipolarMode()" style="background: #333; color: #fff; padding: 0.5rem 1rem; border: 2px solid #666; border-radius: 4px; cursor: pointer; font-weight: bold;">
🔄 Bipolar: OFF 🔄 Bipolar: OFF
@@ -804,7 +809,33 @@
<!-- Bipolar Mode Section (only visible when bipolar mode is on) -->
<div id="bipolar-section" class="section" style="display: none; border: 2px solid #9932CC; padding: 1rem; border-radius: 8px; background: #1a1a2e;">
<h3 style="color: #9932CC;">🔄 Bipolar Mode Controls</h3>
-<p style="font-size: 0.9rem; color: #aaa;">Trigger arguments between Regular Miku and Evil Miku</p>
+<p style="font-size: 0.9rem; color: #aaa;">Trigger arguments or dialogues between Regular Miku and Evil Miku</p>
<!-- Persona Dialogue Section -->
<div style="margin-bottom: 2rem; padding: 1rem; background: #252540; border-radius: 8px; border: 1px solid #555;">
<h4 style="color: #6B8EFF; margin-bottom: 0.5rem;">💬 Trigger Persona Dialogue</h4>
<p style="font-size: 0.85rem; color: #999; margin-bottom: 1rem;">Start a natural conversation between the personas (can escalate to argument if tension builds)</p>
<div style="margin-bottom: 1rem;">
<label for="dialogue-message-id">Message ID:</label>
<input type="text" id="dialogue-message-id" placeholder="e.g., 1234567890123456789" style="width: 250px; margin-left: 0.5rem; font-family: monospace;">
</div>
<div style="font-size: 0.8rem; color: #888; margin-bottom: 1rem;">
💡 <strong>Tip:</strong> Right-click any bot response message in Discord and select "Copy Message ID". The opposite persona will analyze it and decide whether to interject.
</div>
<button onclick="triggerPersonaDialogue()" style="background: #6B8EFF; color: #fff; border: none; padding: 0.5rem 1rem; border-radius: 4px; cursor: pointer;">
💬 Trigger Dialogue
</button>
<div id="dialogue-status" style="margin-top: 1rem; font-size: 0.9rem;"></div>
</div>
<!-- Argument Section -->
<div style="padding: 1rem; background: #2e1a2e; border-radius: 8px; border: 1px solid #555;">
<h4 style="color: #9932CC; margin-bottom: 0.5rem;">⚔️ Trigger Argument</h4>
<p style="font-size: 0.85rem; color: #999; margin-bottom: 1rem;">Force an immediate argument (bypasses dialogue system)</p>
<div style="margin-bottom: 1rem; display: flex; gap: 1rem; flex-wrap: wrap;"> <div style="margin-bottom: 1rem; display: flex; gap: 1rem; flex-wrap: wrap;">
<div> <div>
@@ -832,6 +863,7 @@
</button>
<div id="bipolar-status" style="margin-top: 1rem; font-size: 0.9rem;"></div>
</div>
<!-- Scoreboard Display -->
<div id="bipolar-scoreboard" style="margin-top: 1.5rem; padding: 1rem; background: #0f0f1e; border-radius: 8px; border: 1px solid #444;">
@@ -1416,6 +1448,7 @@ document.addEventListener('DOMContentLoaded', function() {
loadLogs();
checkEvilModeStatus(); // Check evil mode on load
checkBipolarModeStatus(); // Check bipolar mode on load
checkGPUStatus(); // Check GPU selection on load
console.log('🚀 DOMContentLoaded - initializing figurine subscribers list');
refreshFigurineSubscribers();
loadProfilePictureMetadata();
@@ -2194,6 +2227,59 @@ function updateEvilModeUI() {
updateBipolarToggleVisibility();
}
// GPU Selection Management
let selectedGPU = 'nvidia'; // 'nvidia' or 'amd'
async function checkGPUStatus() {
try {
const response = await fetch('/gpu-status');
if (response.ok) {
const data = await response.json();
selectedGPU = data.gpu || 'nvidia';
updateGPUUI();
}
} catch (error) {
console.error('Failed to check GPU status:', error);
}
}
async function toggleGPU() {
try {
const toggleBtn = document.getElementById('gpu-selector-toggle');
toggleBtn.disabled = true;
toggleBtn.textContent = '⏳ Switching...';
const result = await apiCall('/gpu-select', 'POST', {
gpu: selectedGPU === 'nvidia' ? 'amd' : 'nvidia'
});
selectedGPU = result.gpu;
updateGPUUI();
const gpuName = selectedGPU === 'nvidia' ? 'NVIDIA GTX 1660' : 'AMD RX 6800';
showNotification(`🎮 Switched to ${gpuName}!`);
} catch (error) {
console.error('Failed to toggle GPU:', error);
showNotification('Failed to switch GPU: ' + error.message, 'error');
toggleBtn.disabled = false;
}
}
function updateGPUUI() {
const toggleBtn = document.getElementById('gpu-selector-toggle');
if (selectedGPU === 'amd') {
toggleBtn.textContent = '🎮 GPU: AMD';
toggleBtn.style.background = '#c91432';
toggleBtn.style.borderColor = '#e91436';
} else {
toggleBtn.textContent = '🎮 GPU: NVIDIA';
toggleBtn.style.background = '#2a5599';
toggleBtn.style.borderColor = '#4a7bc9';
}
toggleBtn.disabled = false;
}
// Bipolar Mode Management
let bipolarMode = false;
@@ -2266,6 +2352,48 @@ function updateBipolarToggleVisibility() {
bipolarToggle.style.display = 'block';
}
async function triggerPersonaDialogue() {
const messageIdInput = document.getElementById('dialogue-message-id').value.trim();
const statusDiv = document.getElementById('dialogue-status');
if (!messageIdInput) {
showNotification('Please enter a message ID', 'error');
return;
}
// Validate message ID format (should be numeric)
if (!/^\d+$/.test(messageIdInput)) {
showNotification('Invalid message ID format - should be a number', 'error');
return;
}
try {
statusDiv.innerHTML = '<span style="color: #6B8EFF;">⏳ Analyzing message for dialogue trigger...</span>';
const requestBody = {
message_id: messageIdInput
};
const result = await apiCall('/bipolar-mode/trigger-dialogue', 'POST', requestBody);
if (result.status === 'error') {
statusDiv.innerHTML = `<span style="color: #ff4444;">❌ ${result.message}</span>`;
showNotification(result.message, 'error');
return;
}
statusDiv.innerHTML = `<span style="color: #00ff00;">✅ ${result.message}</span>`;
showNotification(`💬 ${result.message}`);
// Clear the input
document.getElementById('dialogue-message-id').value = '';
} catch (error) {
statusDiv.innerHTML = `<span style="color: #ff4444;">❌ Failed to trigger dialogue: ${error.message}</span>`;
showNotification(`Error: ${error.message}`, 'error');
}
}
async function triggerBipolarArgument() {
const channelIdInput = document.getElementById('bipolar-channel-id').value.trim();
const messageIdInput = document.getElementById('bipolar-message-id').value.trim();


@@ -106,6 +106,7 @@ def restore_bipolar_mode_on_startup():
if bipolar_mode:
print("🔄 Bipolar mode restored from previous session")
print("💬 Persona dialogue system enabled (natural conversations + arguments)")
return bipolar_mode

bot/utils/gpu_preload.py (new file)

@@ -0,0 +1,69 @@
"""
GPU Model Preloading Utility
Preloads models on AMD GPU to take advantage of 16GB VRAM
"""
import aiohttp
import asyncio
import json
import globals
async def preload_amd_models():
"""
Preload both text and vision models on AMD GPU
Since AMD RX 6800 has 16GB VRAM, we can keep both loaded simultaneously
"""
print("🔧 Preloading models on AMD GPU...")
# Preload text model
try:
async with aiohttp.ClientSession() as session:
payload = {
"model": "llama3.1",
"messages": [{"role": "user", "content": "Hi"}],
"max_tokens": 1
}
async with session.post(
f"{globals.LLAMA_AMD_URL}/v1/chat/completions",
json=payload,
timeout=aiohttp.ClientTimeout(total=60)
) as response:
if response.status == 200:
print("✅ Text model (llama3.1) preloaded on AMD GPU")
else:
print(f"⚠️ Text model preload returned status {response.status}")
except Exception as e:
print(f"⚠️ Failed to preload text model on AMD: {e}")
# Preload vision model
try:
async with aiohttp.ClientSession() as session:
# Create a minimal test image (1x1 white pixel)
import base64
test_image = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mP8z8DwHwAFBQIAX8jx0gAAAABJRU5ErkJggg=="
payload = {
"model": "vision",
"messages": [
{
"role": "user",
"content": [
{"type": "text", "text": "What do you see?"},
{"type": "image_url", "image_url": {"url": f"data:image/png;base64,{test_image}"}}
]
}
],
"max_tokens": 1
}
async with session.post(
f"{globals.LLAMA_AMD_URL}/v1/chat/completions",
json=payload,
timeout=aiohttp.ClientTimeout(total=120)
) as response:
if response.status == 200:
print("✅ Vision model preloaded on AMD GPU")
else:
print(f"⚠️ Vision model preload returned status {response.status}")
except Exception as e:
print(f"⚠️ Failed to preload vision model on AMD: {e}")
print("✅ AMD GPU preload complete - both models ready")

bot/utils/gpu_router.py (new file)

@@ -0,0 +1,191 @@
"""
GPU Router Utility for Dual GPU Setup
Manages routing between NVIDIA and AMD GPUs for model inference
"""
import os
import random
import logging
from typing import Optional, Literal
import globals
logger = logging.getLogger(__name__)
# Model to GPU mapping
MODEL_TO_GPU = {
# NVIDIA models (primary GPU)
"llama3.1": globals.LLAMA_URL,
"text-model": globals.LLAMA_URL,
"darkidol": globals.LLAMA_URL,
"evil-model": globals.LLAMA_URL,
"uncensored": globals.LLAMA_URL,
"vision": globals.LLAMA_URL,
"vision-model": globals.LLAMA_URL,
"minicpm": globals.LLAMA_URL,
# AMD models (secondary GPU - RX 6800)
"llama3.1-amd": globals.LLAMA_AMD_URL,
"text-model-amd": globals.LLAMA_AMD_URL,
"amd-text": globals.LLAMA_AMD_URL,
"darkidol-amd": globals.LLAMA_AMD_URL,
"evil-model-amd": globals.LLAMA_AMD_URL,
"uncensored-amd": globals.LLAMA_AMD_URL,
"moondream-amd": globals.LLAMA_AMD_URL,
"vision-amd": globals.LLAMA_AMD_URL,
"moondream": globals.LLAMA_AMD_URL,
}
# Configuration
PREFER_AMD_GPU = os.getenv("PREFER_AMD_GPU", "false").lower() == "true"
AMD_MODELS_ENABLED = os.getenv("AMD_MODELS_ENABLED", "true").lower() == "true"
def get_endpoint_for_model(model_name: str) -> str:
"""
Get the correct llama-swap endpoint for a model.
Args:
model_name: Name or alias of the model
Returns:
URL of the llama-swap endpoint (either NVIDIA or AMD)
"""
endpoint = MODEL_TO_GPU.get(model_name, globals.LLAMA_URL)
# If AMD models are disabled, use NVIDIA for AMD models too
if not AMD_MODELS_ENABLED and endpoint == globals.LLAMA_AMD_URL:
logger.warning(f"AMD GPU disabled, routing {model_name} to NVIDIA GPU")
# Map AMD model name to NVIDIA equivalent
nvidia_model = model_name.replace("-amd", "")
endpoint = globals.LLAMA_URL
return endpoint
def is_amd_model(model_name: str) -> bool:
"""
Check if a model runs on the AMD GPU.
Args:
model_name: Name or alias of the model
Returns:
True if model runs on AMD GPU, False otherwise
"""
return model_name.endswith("-amd") or model_name in ["moondream", "moondream-amd", "vision-amd"]
def get_llama_url_with_load_balancing(
prefer_amd: bool = False,
task_type: Literal["text", "vision", "evil"] = "text"
) -> tuple[str, str]:
"""
Get llama URL with optional load balancing between GPUs.
Returns both URL and recommended model name.
Args:
prefer_amd: If True, prefer AMD GPU when possible
task_type: Type of task (text, vision, or evil)
Returns:
Tuple of (url, model_name)
"""
if not AMD_MODELS_ENABLED:
# AMD disabled, use NVIDIA only
if task_type == "evil":
return globals.LLAMA_URL, "darkidol"
elif task_type == "vision":
return globals.LLAMA_URL, "vision"
else:
return globals.LLAMA_URL, "llama3.1"
# AMD enabled - implement load balancing
use_amd = prefer_amd or PREFER_AMD_GPU or (random.random() < 0.5)
if task_type == "evil":
# Evil/uncensored models
if use_amd:
return globals.LLAMA_AMD_URL, "darkidol-amd"
else:
return globals.LLAMA_URL, "darkidol"
elif task_type == "vision":
# Vision models - MiniCPM on NVIDIA, Moondream on AMD
if use_amd:
return globals.LLAMA_AMD_URL, "moondream-amd"
else:
return globals.LLAMA_URL, "vision"
else:
# Text generation - round robin between GPUs
if use_amd:
return globals.LLAMA_AMD_URL, "llama3.1-amd"
else:
return globals.LLAMA_URL, "llama3.1"
def get_vision_model_for_gpu(prefer_amd: bool = False) -> tuple[str, str]:
"""
Get the appropriate vision model based on GPU preference.
Args:
prefer_amd: If True, use AMD GPU vision model
Returns:
Tuple of (url, model_name)
"""
if prefer_amd and AMD_MODELS_ENABLED:
return globals.LLAMA_AMD_URL, "moondream-amd"
else:
return globals.LLAMA_URL, "vision"
def get_text_model_for_gpu(prefer_amd: bool = False) -> tuple[str, str]:
"""
Get the appropriate text model based on GPU preference.
Args:
prefer_amd: If True, use AMD GPU text model
Returns:
Tuple of (url, model_name)
"""
if prefer_amd and AMD_MODELS_ENABLED:
return globals.LLAMA_AMD_URL, "llama3.1-amd"
else:
return globals.LLAMA_URL, "llama3.1"
def log_gpu_routing(model_name: str, endpoint: str, task_type: str = "inference"):
"""
Log GPU routing decision for debugging.
Args:
model_name: Name of the model being used
endpoint: URL endpoint being used
task_type: Type of task being performed
"""
gpu_type = "AMD RX 6800" if endpoint == globals.LLAMA_AMD_URL else "NVIDIA"
logger.info(f"[GPU Router] {task_type} - Using {model_name} on {gpu_type} ({endpoint})")
# Example usage in bot code:
"""
# Simple routing by model name
url = get_endpoint_for_model("llama3.1-amd")
# Load balanced routing
url, model = get_llama_url_with_load_balancing(task_type="text")
response = requests.post(
f"{url}/v1/chat/completions",
json={"model": model, ...}
)
# Vision model with GPU preference
url, model = get_vision_model_for_gpu(prefer_amd=True)
# With logging
url = get_endpoint_for_model("darkidol-amd")
log_gpu_routing("darkidol-amd", url, "evil mode generation")
"""

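A quick way to sanity-check the fallback path in `get_endpoint_for_model()` is to disable the AMD endpoint and confirm that `-amd` aliases route back to NVIDIA. A minimal sketch, assuming it runs inside the bot container where the `utils` package and `globals` module are importable (a standalone run like this is hypothetical):

```python
import os
os.environ["AMD_MODELS_ENABLED"] = "false"  # must be set before import: the flag is read at module load

from utils import gpu_router

url = gpu_router.get_endpoint_for_model("darkidol-amd")
gpu_router.log_gpu_routing("darkidol-amd", url, "routing smoke test")
assert url == gpu_router.globals.LLAMA_URL  # "-amd" aliases fall back to the NVIDIA endpoint
```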
bot/utils/image_handling.py
View File

@@ -233,7 +233,9 @@ async def analyze_image_with_vision(base64_img):
     """
     Analyze an image using llama.cpp multimodal capabilities.
     Uses OpenAI-compatible chat completions API with image_url.
+    Always uses NVIDIA GPU for vision model.
     """
+    from utils.llm import get_vision_gpu_url
 
     payload = {
         "model": globals.VISION_MODEL,
@@ -262,7 +264,8 @@ async def analyze_image_with_vision(base64_img):
     async with aiohttp.ClientSession() as session:
         try:
-            async with session.post(f"{globals.LLAMA_URL}/v1/chat/completions", json=payload, headers=headers) as response:
+            vision_url = get_vision_gpu_url()
+            async with session.post(f"{vision_url}/v1/chat/completions", json=payload, headers=headers) as response:
                 if response.status == 200:
                     data = await response.json()
                     return data.get("choices", [{}])[0].get("message", {}).get("content", "No description.")
@@ -323,7 +326,8 @@ async def analyze_video_with_vision(video_frames, media_type="video"):
     async with aiohttp.ClientSession() as session:
         try:
-            async with session.post(f"{globals.LLAMA_URL}/v1/chat/completions", json=payload, headers=headers) as response:
+            vision_url = get_vision_gpu_url()
+            async with session.post(f"{vision_url}/v1/chat/completions", json=payload, headers=headers) as response:
                 if response.status == 200:
                     data = await response.json()
                     return data.get("choices", [{}])[0].get("message", {}).get("content", "No description.")

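For reference, a stripped-down sketch of the request these hunks now pin to the NVIDIA endpoint. The payload shape shown is the standard OpenAI-style `image_url` form, written here as an assumption (the real payload is built earlier in `analyze_image_with_vision`), and `base64_img` is just a placeholder argument:

```python
import asyncio, aiohttp
import globals
from utils.llm import get_vision_gpu_url

async def describe(base64_img: str) -> str:
    payload = {
        "model": globals.VISION_MODEL,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_img}"}},
            ],
        }],
    }
    vision_url = get_vision_gpu_url()  # always the NVIDIA llama-swap endpoint
    async with aiohttp.ClientSession() as session:
        async with session.post(f"{vision_url}/v1/chat/completions", json=payload) as resp:
            data = await resp.json()
            return data.get("choices", [{}])[0].get("message", {}).get("content", "No description.")
```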
bot/utils/llm.py
View File

@@ -4,11 +4,38 @@ import aiohttp
 import datetime
 import globals
 import asyncio
+import json
+import os
 from utils.context_manager import get_context_for_response_type, get_complete_context
 from utils.moods import load_mood_description
 from utils.conversation_history import conversation_history
 
+
+def get_current_gpu_url():
+    """Get the URL for the currently selected GPU for text models"""
+    gpu_state_file = os.path.join(os.path.dirname(__file__), "..", "memory", "gpu_state.json")
+    try:
+        with open(gpu_state_file, "r") as f:
+            state = json.load(f)
+        current_gpu = state.get("current_gpu", "nvidia")
+        if current_gpu == "amd":
+            return globals.LLAMA_AMD_URL
+        else:
+            return globals.LLAMA_URL
+    except Exception as e:
+        print(f"⚠️ GPU state read error: {e}, defaulting to NVIDIA")
+        # Default to NVIDIA if state file doesn't exist
+        return globals.LLAMA_URL
+
+
+def get_vision_gpu_url():
+    """
+    Get the URL for vision model inference.
+    Strategy: Always use NVIDIA GPU for vision to avoid unloading/reloading.
+    - When NVIDIA is primary: Use NVIDIA for both text and vision
+    - When AMD is primary: Use AMD for text, NVIDIA for vision (keeps vision loaded)
+    """
+    return globals.LLAMA_URL  # Always use NVIDIA for vision
+
+
 def _strip_surrounding_quotes(text):
     """
     Remove surrounding quotes from text if present.
@@ -233,9 +260,13 @@ Please respond in a way that reflects this emotional tone.{pfp_context}"""
     async with aiohttp.ClientSession() as session:
         try:
+            # Get current GPU URL based on user selection
+            llama_url = get_current_gpu_url()
+            print(f"🎮 Using GPU endpoint: {llama_url}")
+
             # Add timeout to prevent hanging indefinitely
             timeout = aiohttp.ClientTimeout(total=300)  # 300 second timeout
-            async with session.post(f"{globals.LLAMA_URL}/v1/chat/completions", json=payload, headers=headers, timeout=timeout) as response:
+            async with session.post(f"{llama_url}/v1/chat/completions", json=payload, headers=headers, timeout=timeout) as response:
                 if response.status == 200:
                     data = await response.json()
                     reply = data.get("choices", [{}])[0].get("message", {}).get("content", "No response.")

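`get_current_gpu_url()` only ever reads a single key from `memory/gpu_state.json`, so the state file maintained by the `/gpu-select` endpoint can be as small as `{"current_gpu": "amd"}`. A hedged sketch of writing it by hand (path relative to the `bot/` directory; the exact helper in `bot/api.py` is not shown in this diff):

```python
import json, os

# Flip text-model routing to the AMD container; any other value,
# or a missing/corrupt file, falls back to the NVIDIA endpoint.
state_path = os.path.join("memory", "gpu_state.json")
os.makedirs(os.path.dirname(state_path), exist_ok=True)
with open(state_path, "w") as f:
    json.dump({"current_gpu": "amd"}, f)
```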
bot/utils/persona_dialogue.py Normal file
View File

@@ -0,0 +1,964 @@
# utils/persona_dialogue.py
"""
Persona Dialogue System for Miku.
Enables natural back-and-forth conversations between Hatsune Miku and Evil Miku.
Unlike bipolar_mode.py (which handles arguments), this module handles:
- Detecting when the opposite persona should interject
- Managing natural dialogue flow with self-signaling continuation
- Tracking tension that can escalate into arguments
- Seamless handoff to the argument system when tension is high
This system is designed to be lightweight on LLM calls:
- Initial trigger uses fast heuristics + sentiment analysis
- Each dialogue turn uses ONE LLM call that generates response AND decides continuation
- Only escalates to argument system when tension threshold is reached
"""
import os
import json
import time
import asyncio
import discord
import globals
from transformers import pipeline
# ============================================================================
# CONSTANTS
# ============================================================================
DIALOGUE_STATE_FILE = "memory/persona_dialogue_state.json"
# Dialogue settings
MAX_TURNS = 20 # Maximum turns before forced end
DIALOGUE_TIMEOUT = 900 # 15 minutes max dialogue duration
ARGUMENT_TENSION_THRESHOLD = 0.75 # Tension level that triggers argument escalation
# Initial trigger settings
INTERJECTION_COOLDOWN_HARD = 180 # 3 minutes hard block
INTERJECTION_COOLDOWN_SOFT = 900 # 15 minutes for full recovery
INTERJECTION_THRESHOLD = 0.75 # Score needed to trigger interjection (lowered to account for mood multipliers)
# ============================================================================
# INTERJECTION SCORER (Initial Trigger Decision)
# ============================================================================
class InterjectionScorer:
"""
Decides if the opposite persona should interject based on message content.
Uses fast heuristics + sentiment analysis (no LLM calls).
"""
_instance = None
_sentiment_analyzer = None
def __new__(cls):
if cls._instance is None:
cls._instance = super().__new__(cls)
return cls._instance
@property
def sentiment_analyzer(self):
"""Lazy load sentiment analyzer"""
if self._sentiment_analyzer is None:
print("🔄 Loading sentiment analyzer for persona dialogue...")
try:
self._sentiment_analyzer = pipeline(
"sentiment-analysis",
model="distilbert-base-uncased-finetuned-sst-2-english"
)
print("✅ Sentiment analyzer loaded")
except Exception as e:
print(f"⚠️ Failed to load sentiment analyzer: {e}")
self._sentiment_analyzer = None
return self._sentiment_analyzer
async def should_interject(self, message: discord.Message, current_persona: str) -> tuple:
"""
Determine if the opposite persona should interject.
Args:
message: The Discord message to analyze
current_persona: Who just spoke ("miku" or "evil")
Returns:
Tuple of (should_interject: bool, reason: str, score: float)
"""
# Quick rejections
if not self._passes_basic_filter(message):
return False, "basic_filter_failed", 0.0
# Check cooldown
cooldown_mult = self._check_cooldown()
if cooldown_mult == 0.0:
return False, "cooldown_active", 0.0
opposite_persona = "evil" if current_persona == "miku" else "miku"
print(f"🔍 [Interjection] Analyzing content: '{message.content[:100]}...'")
print(f"🔍 [Interjection] Current persona: {current_persona}, Opposite: {opposite_persona}")
# Calculate score from various factors
score = 0.0
reasons = []
# Factor 1: Direct addressing (automatic trigger)
if self._mentions_opposite(message.content, opposite_persona):
print(f"✅ [Interjection] Direct mention of {opposite_persona} detected!")
return True, "directly_addressed", 1.0
# Factor 2: Topic relevance
topic_score = self._check_topic_relevance(message.content, opposite_persona)
if topic_score > 0:
score += topic_score * 0.3
reasons.append(f"topic:{topic_score:.2f}")
# Factor 3: Emotional intensity
emotion_score = self._check_emotional_intensity(message.content)
if emotion_score > 0.6:
score += emotion_score * 0.25
reasons.append(f"emotion:{emotion_score:.2f}")
# Factor 4: Personality clash
clash_score = self._detect_personality_clash(message.content, opposite_persona)
if clash_score > 0:
score += clash_score * 0.25
reasons.append(f"clash:{clash_score:.2f}")
# Factor 5: Mood multiplier
mood_mult = self._get_mood_multiplier(opposite_persona)
score *= mood_mult
if mood_mult != 1.0:
reasons.append(f"mood_mult:{mood_mult:.2f}")
# Factor 6: Context bonus
context_bonus = self._check_conversation_context(message)
score += context_bonus * 0.2
if context_bonus > 0:
reasons.append(f"context:{context_bonus:.2f}")
# Apply cooldown multiplier
score *= cooldown_mult
# Decision
should_interject = score >= INTERJECTION_THRESHOLD
reason_str = " | ".join(reasons) if reasons else "no_triggers"
if should_interject:
print(f"{opposite_persona.upper()} WILL INTERJECT (score: {score:.2f})")
print(f" Reasons: {reason_str}")
return should_interject, reason_str, score
def _passes_basic_filter(self, message: discord.Message) -> bool:
"""Fast rejection criteria"""
# System messages
if message.type != discord.MessageType.default:
print(f"❌ [Basic Filter] System message type: {message.type}")
return False
# Bipolar mode must be enabled
if not globals.BIPOLAR_MODE:
print(f"❌ [Basic Filter] Bipolar mode not enabled")
return False
# Allow bot's own messages (we're checking them for interjections!)
# Also allow webhook messages (persona messages)
# Only reject OTHER bots' messages
if message.author.bot and not message.webhook_id:
# Check if it's our own bot
if message.author.id != globals.client.user.id:
print(f"❌ [Basic Filter] Other bot message (not our bot)")
return False
print(f"✅ [Basic Filter] Passed (bot={message.author.bot}, webhook={message.webhook_id}, our_bot={message.author.id == globals.client.user.id if message.author.bot else 'N/A'})")
return True
def _mentions_opposite(self, content: str, opposite_persona: str) -> bool:
"""Check if message directly addresses the opposite persona"""
content_lower = content.lower()
if opposite_persona == "evil":
patterns = ["evil miku", "dark miku", "evil version", "bad miku", "evil you"]
else:
patterns = ["normal miku", "regular miku", "good miku", "real miku", "nice miku", "other miku", "original miku"]
return any(pattern in content_lower for pattern in patterns)
def _check_topic_relevance(self, content: str, opposite_persona: str) -> float:
"""Check if topics would interest the opposite persona"""
content_lower = content.lower()
if opposite_persona == "evil":
# Things Evil Miku can't resist commenting on
TRIGGER_TOPICS = {
"optimism": ["happiness", "joy", "love", "kindness", "hope", "dreams", "wonderful", "amazing"],
"morality": ["good", "should", "must", "right thing", "deserve", "fair", "justice"],
"weakness": ["scared", "nervous", "worried", "unsure", "help me", "don't know"],
"innocence": ["innocent", "pure", "sweet", "cute", "wholesome", "precious"],
}
else:
# Things Miku can't ignore
TRIGGER_TOPICS = {
"negativity": ["hate", "terrible", "awful", "worst", "horrible", "disgusting", "pathetic"],
"cruelty": ["deserve pain", "suffer", "worthless", "stupid", "idiot", "fool"],
"hopelessness": ["no point", "meaningless", "nobody cares", "why bother", "give up"],
"evil_gloating": ["foolish", "naive", "weak", "inferior", "pathetic"],
}
total_matches = 0
for category, keywords in TRIGGER_TOPICS.items():
matches = sum(1 for keyword in keywords if keyword in content_lower)
total_matches += matches
return min(total_matches / 3.0, 1.0)
def _check_emotional_intensity(self, content: str) -> float:
"""Check emotional intensity using sentiment analysis"""
if not self.sentiment_analyzer:
return 0.5 # Neutral if no analyzer
try:
result = self.sentiment_analyzer(content[:512])[0]
confidence = result['score']
# Punctuation intensity
exclamations = content.count('!')
questions = content.count('?')
caps_ratio = sum(1 for c in content if c.isupper()) / max(len(content), 1)
intensity_markers = (exclamations * 0.15) + (questions * 0.1) + (caps_ratio * 0.3)
return min(confidence * 0.6 + intensity_markers, 1.0)
except Exception as e:
print(f"⚠️ Sentiment analysis error: {e}")
return 0.5
def _detect_personality_clash(self, content: str, opposite_persona: str) -> float:
"""Detect statements that clash with the opposite persona's values"""
content_lower = content.lower()
if opposite_persona == "evil":
# User being too positive/naive = Evil Miku wants to "correct" them
positive_statements = [
"i believe in", "i love", "everything will be", "so happy",
"the best", "amazing", "perfect", "wonderful life", "so grateful"
]
return 0.8 if any(stmt in content_lower for stmt in positive_statements) else 0.0
else:
# User being cruel/negative = Miku wants to help/defend
negative_statements = [
"i hate", "everyone sucks", "life is meaningless", "don't care",
"deserve to suffer", "nobody matters", "worthless", "all terrible"
]
return 0.8 if any(stmt in content_lower for stmt in negative_statements) else 0.0
def _get_mood_multiplier(self, opposite_persona: str) -> float:
"""Current mood affects likelihood of interjection"""
if opposite_persona == "evil":
MOOD_MULTIPLIERS = {
"aggressive": 1.5,
"cruel": 1.3,
"mischievous": 1.2,
"cunning": 1.0,
"sarcastic": 1.1,
"evil_neutral": 0.8,
"contemplative": 0.6,
}
return MOOD_MULTIPLIERS.get(globals.EVIL_DM_MOOD, 1.0)
else:
MOOD_MULTIPLIERS = {
"bubbly": 1.4,
"excited": 1.3,
"curious": 1.2,
"neutral": 1.0,
"irritated": 0.9,
"melancholy": 0.7,
"asleep": 0.1,
}
return MOOD_MULTIPLIERS.get(globals.DM_MOOD, 1.0)
def _check_conversation_context(self, message: discord.Message) -> float:
"""Check if this is part of an active conversation"""
score = 0.0
# Part of a reply chain
if hasattr(message, 'reference') and message.reference:
score += 0.5
# Could add more context checks here
score += 0.2 # Base activity bonus
return min(score, 1.0)
def _check_cooldown(self) -> float:
"""Check cooldown and return multiplier (0.0 = blocked, 1.0 = full)"""
if not hasattr(globals, 'LAST_PERSONA_DIALOGUE_TIME'):
globals.LAST_PERSONA_DIALOGUE_TIME = 0
current_time = time.time()
time_since_last = current_time - globals.LAST_PERSONA_DIALOGUE_TIME
if time_since_last < INTERJECTION_COOLDOWN_HARD:
return 0.0
elif time_since_last < INTERJECTION_COOLDOWN_SOFT:
return (time_since_last - INTERJECTION_COOLDOWN_HARD) / (INTERJECTION_COOLDOWN_SOFT - INTERJECTION_COOLDOWN_HARD)
else:
return 1.0
# ============================================================================
# PERSONA DIALOGUE MANAGER
# ============================================================================
class PersonaDialogue:
"""
Manages natural back-and-forth conversations between Miku and Evil Miku.
Each turn:
1. Generate response + continuation signal (single LLM call)
2. Calculate tension delta from response
3. If tension >= threshold, escalate to argument
4. Otherwise, continue or end based on signal
"""
_instance = None
_sentiment_analyzer = None
def __new__(cls):
if cls._instance is None:
cls._instance = super().__new__(cls)
cls._instance.active_dialogues = {}
return cls._instance
@property
def sentiment_analyzer(self):
"""Lazy load sentiment analyzer (shared with InterjectionScorer)"""
if self._sentiment_analyzer is None:
scorer = InterjectionScorer()
self._sentiment_analyzer = scorer.sentiment_analyzer
return self._sentiment_analyzer
# ========================================================================
# DIALOGUE STATE MANAGEMENT
# ========================================================================
def is_dialogue_active(self, channel_id: int) -> bool:
"""Check if a dialogue is active in a channel"""
return channel_id in self.active_dialogues
def get_dialogue_state(self, channel_id: int) -> dict:
"""Get dialogue state for a channel"""
return self.active_dialogues.get(channel_id, None)
def start_dialogue(self, channel_id: int) -> dict:
"""Start a new dialogue in a channel"""
state = {
"turn_count": 0,
"started_at": time.time(),
"tension": 0.0,
"tension_history": [],
"last_speaker": None,
}
self.active_dialogues[channel_id] = state
globals.LAST_PERSONA_DIALOGUE_TIME = time.time()
print(f"💬 Started persona dialogue in channel {channel_id}")
return state
def end_dialogue(self, channel_id: int):
"""End a dialogue in a channel"""
if channel_id in self.active_dialogues:
state = self.active_dialogues[channel_id]
print(f"🏁 Ended persona dialogue in channel {channel_id}")
print(f" Turns: {state['turn_count']}, Final tension: {state['tension']:.2f}")
del self.active_dialogues[channel_id]
# ========================================================================
# TENSION CALCULATION
# ========================================================================
def calculate_tension_delta(self, response_text: str, current_tension: float) -> float:
"""
Analyze a response and determine how much tension it adds/removes.
Returns delta to add to current tension score.
"""
# Sentiment analysis
base_delta = 0.0
if self.sentiment_analyzer:
try:
sentiment = self.sentiment_analyzer(response_text[:512])[0]
sentiment_score = sentiment['score']
is_negative = sentiment['label'] == 'NEGATIVE'
if is_negative:
base_delta = sentiment_score * 0.15
else:
base_delta = -sentiment_score * 0.05
except Exception as e:
print(f"⚠️ Sentiment analysis error in tension calc: {e}")
text_lower = response_text.lower()
# Escalation patterns
escalation_patterns = {
"insult": ["idiot", "stupid", "pathetic", "fool", "naive", "worthless", "disgusting", "moron"],
"dismissive": ["whatever", "don't care", "waste of time", "not worth", "beneath me", "boring"],
"confrontational": ["wrong", "you always", "you never", "how dare", "shut up", "stop"],
"mockery": ["oh please", "how cute", "adorable that you think", "laughable", "hilarious"],
"challenge": ["prove it", "fight me", "make me", "i dare you", "try me"],
}
# De-escalation patterns
deescalation_patterns = {
"concession": ["you're right", "fair point", "i suppose", "maybe you have", "good point"],
"softening": ["i understand", "let's calm", "didn't mean", "sorry", "apologize"],
"deflection": ["anyway", "moving on", "whatever you say", "agree to disagree", "let's just"],
}
# Check escalation
for category, patterns in escalation_patterns.items():
matches = sum(1 for p in patterns if p in text_lower)
if matches > 0:
base_delta += matches * 0.08
# Check de-escalation
for category, patterns in deescalation_patterns.items():
matches = sum(1 for p in patterns if p in text_lower)
if matches > 0:
base_delta -= matches * 0.06
# Intensity multipliers
exclamation_count = response_text.count('!')
caps_ratio = sum(1 for c in response_text if c.isupper()) / max(len(response_text), 1)
if exclamation_count > 2 or caps_ratio > 0.3:
base_delta *= 1.3
# Momentum factor
if current_tension > 0.5:
base_delta *= 1.2
return base_delta
# ========================================================================
# RESPONSE GENERATION
# ========================================================================
async def generate_response_with_continuation(
self,
channel: discord.TextChannel,
responding_persona: str,
context: str,
) -> tuple:
"""
Generate response AND continuation signal in a single LLM call.
Returns:
Tuple of (response_text, should_continue, confidence)
"""
from utils.llm import query_llama
opposite = "Hatsune Miku" if responding_persona == "evil" else "Evil Miku"
# Get system prompt for persona
system_prompt = self._get_persona_system_prompt(responding_persona)
# Build the combined prompt
prompt = f"""{system_prompt}
Recent conversation:
{context}
Respond naturally as yourself. Keep your response conversational and in-character.
---
After your response, evaluate whether {opposite} would want to (or need to) respond.
The conversation should CONTINUE if ANY of these are true:
- You asked them a direct question (almost always YES)
- You made a provocative claim they'd dispute
- You challenged or insulted them
- The topic feels unfinished or confrontational
- There's clear tension or disagreement
The conversation might END if ALL of these are true:
- No questions were asked
- You made a definitive closing statement ("I'm done", "whatever", "goodbye")
- The exchange reached complete resolution
- Both sides have said their piece
IMPORTANT: If you asked a question, the answer is almost always YES - they need to respond!
On a new line after your response, write:
[CONTINUE: YES or NO] [CONFIDENCE: HIGH, MEDIUM, or LOW]"""
# Use appropriate model
model = globals.EVIL_TEXT_MODEL if responding_persona == "evil" else globals.TEXT_MODEL
# Temporarily set evil mode for proper context
original_evil_mode = globals.EVIL_MODE
globals.EVIL_MODE = (responding_persona == "evil")
try:
raw_response = await query_llama(
user_prompt=prompt,
user_id=f"persona_dialogue_{channel.id}",
guild_id=channel.guild.id if hasattr(channel, 'guild') and channel.guild else None,
response_type="autonomous_general",
model=model
)
finally:
globals.EVIL_MODE = original_evil_mode
if not raw_response or raw_response.startswith("Error"):
return None, False, "LOW"
# Parse response and signal
response_text, should_continue, confidence = self._parse_response(raw_response)
return response_text, should_continue, confidence
def _parse_response(self, raw_response: str) -> tuple:
"""Extract response text and continuation signal"""
lines = raw_response.strip().split('\n')
should_continue = False
confidence = "MEDIUM"
response_lines = []
for line in lines:
line_upper = line.upper()
if "[CONTINUE:" in line_upper:
should_continue = "YES" in line_upper
if "HIGH" in line_upper:
confidence = "HIGH"
elif "LOW" in line_upper:
confidence = "LOW"
else:
confidence = "MEDIUM"
else:
response_lines.append(line)
response_text = '\n'.join(response_lines).strip()
# Clean up any stray signal markers
response_text = response_text.replace("[CONTINUE:", "").replace("]", "")
response_text = response_text.replace("YES", "").replace("NO", "")
response_text = response_text.replace("HIGH", "").replace("MEDIUM", "").replace("LOW", "")
response_text = response_text.strip()
# Override: If the response contains a question mark, always continue
if '?' in response_text:
print(f"⚠️ [Parse Override] Question detected, forcing continue=YES")
should_continue = True
if confidence == "LOW":
confidence = "MEDIUM"
return response_text, should_continue, confidence
def _get_persona_system_prompt(self, persona: str) -> str:
"""Get system prompt for a persona"""
if persona == "evil":
from utils.evil_mode import get_evil_system_prompt
return get_evil_system_prompt()
else:
# Regular Miku prompt - simplified for dialogue
return """You are Hatsune Miku, the virtual singer. You are in a conversation with your alter ego, Evil Miku.
You are generally kind, bubbly, and optimistic, but you're not a pushover. You can be:
- Assertive when defending your values
- Frustrated when she's being cruel
- Curious about her perspective
- Hopeful that you can find common ground
- Playful when the mood allows
Respond naturally and conversationally. Keep responses concise (1-3 sentences typically).
You can use emojis naturally! ✨💙"""
# ========================================================================
# DIALOGUE TURN HANDLING
# ========================================================================
async def handle_dialogue_turn(
self,
channel: discord.TextChannel,
responding_persona: str,
trigger_reason: str = None
):
"""
Handle one turn of dialogue, tracking tension for potential argument escalation.
"""
channel_id = channel.id
# Get or create dialogue state
state = self.active_dialogues.get(channel_id)
if not state:
state = self.start_dialogue(channel_id)
# Safety limits
if state["turn_count"] >= MAX_TURNS:
print(f"🛑 Dialogue reached {MAX_TURNS} turns, ending")
self.end_dialogue(channel_id)
return
if time.time() - state["started_at"] > DIALOGUE_TIMEOUT:
print(f"🛑 Dialogue timeout (15 min), ending")
self.end_dialogue(channel_id)
return
# Build context from recent messages
context = await self._build_conversation_context(channel)
# Generate response with continuation signal
response_text, should_continue, confidence = await self.generate_response_with_continuation(
channel=channel,
responding_persona=responding_persona,
context=context,
)
if not response_text:
print(f"⚠️ Failed to generate response for {responding_persona}")
self.end_dialogue(channel_id)
return
# Calculate tension change
tension_delta = self.calculate_tension_delta(response_text, state["tension"])
state["tension"] = max(0.0, min(1.0, state["tension"] + tension_delta))
state["tension_history"].append({
"turn": state["turn_count"],
"speaker": responding_persona,
"delta": tension_delta,
"total": state["tension"],
})
print(f"🌡️ Tension: {state['tension']:.2f} (delta: {tension_delta:+.2f})")
# Check if we should escalate to argument
if state["tension"] >= ARGUMENT_TENSION_THRESHOLD:
print(f"🔥 TENSION THRESHOLD REACHED ({state['tension']:.2f}) - ESCALATING TO ARGUMENT")
# Send the response that pushed us over
await self._send_as_persona(channel, responding_persona, response_text)
# Transition to argument system
await self._escalate_to_argument(channel, responding_persona, response_text)
return
# Send response
await self._send_as_persona(channel, responding_persona, response_text)
# Update state
state["turn_count"] += 1
state["last_speaker"] = responding_persona
print(f"🗣️ Turn {state['turn_count']}: {responding_persona} | Continue: {should_continue} ({confidence}) | Tension: {state['tension']:.2f}")
# Decide what happens next
opposite = "evil" if responding_persona == "miku" else "miku"
if should_continue and confidence in ["HIGH", "MEDIUM"]:
asyncio.create_task(self._next_turn(channel, opposite))
elif should_continue and confidence == "LOW":
asyncio.create_task(self._next_turn(channel, opposite))
elif not should_continue and confidence == "LOW":
# Offer opposite persona the last word
asyncio.create_task(
self._offer_last_word(channel, opposite, context + f"\n{responding_persona}: {response_text}")
)
else:
# Clear signal to end
print(f"🏁 Dialogue ended naturally after {state['turn_count']} turns (tension: {state['tension']:.2f})")
self.end_dialogue(channel_id)
async def _next_turn(self, channel: discord.TextChannel, persona: str):
"""Queue the next turn"""
# Check if dialogue was interrupted
if await self._was_interrupted(channel):
print(f"💬 Dialogue interrupted by other activity")
self.end_dialogue(channel.id)
return
await self.handle_dialogue_turn(channel, persona)
async def _offer_last_word(self, channel: discord.TextChannel, persona: str, context: str):
"""
When speaker said NO with LOW confidence, ask opposite if they want to respond.
"""
from utils.llm import query_llama
channel_id = channel.id
state = self.active_dialogues.get(channel_id)
if not state:
return
if await self._was_interrupted(channel):
self.end_dialogue(channel_id)
return
system_prompt = self._get_persona_system_prompt(persona)
prompt = f"""{system_prompt}
Recent exchange:
{context}
The conversation seems to be wrapping up, but wasn't explicitly ended.
Do you have anything to add? If so, respond naturally.
If you're fine letting it end here, write only: [DONE]
Don't force a response if you have nothing meaningful to contribute."""
model = globals.EVIL_TEXT_MODEL if persona == "evil" else globals.TEXT_MODEL
original_evil_mode = globals.EVIL_MODE
globals.EVIL_MODE = (persona == "evil")
try:
response = await query_llama(
user_prompt=prompt,
user_id=f"persona_dialogue_{channel_id}",
guild_id=channel.guild.id if hasattr(channel, 'guild') and channel.guild else None,
response_type="autonomous_general",
model=model
)
finally:
globals.EVIL_MODE = original_evil_mode
if not response:
self.end_dialogue(channel_id)
return
if "[DONE]" in response.upper():
print(f"🏁 {persona} chose not to respond, dialogue ended (tension: {state['tension']:.2f})")
self.end_dialogue(channel_id)
else:
clean_response = response.replace("[DONE]", "").strip()
# Calculate tension
tension_delta = self.calculate_tension_delta(clean_response, state["tension"])
state["tension"] = max(0.0, min(1.0, state["tension"] + tension_delta))
print(f"🌡️ Last word tension: {state['tension']:.2f} (delta: {tension_delta:+.2f})")
# Check for argument escalation
if state["tension"] >= ARGUMENT_TENSION_THRESHOLD:
print(f"🔥 TENSION THRESHOLD REACHED on last word - ESCALATING TO ARGUMENT")
await self._send_as_persona(channel, persona, clean_response)
await self._escalate_to_argument(channel, persona, clean_response)
return
# Normal flow
await self._send_as_persona(channel, persona, clean_response)
state["turn_count"] += 1
# Check if this looks like a closing statement
opposite = "evil" if persona == "miku" else "miku"
await self._check_if_final(channel, persona, clean_response, opposite)
async def _check_if_final(self, channel: discord.TextChannel, speaker: str, response: str, opposite: str):
"""Check if a response looks like a closing statement"""
state = self.active_dialogues.get(channel.id)
if not state:
return
# Simple heuristics for closing statements
closing_indicators = [
response.rstrip().endswith('.'), # Statement, not question
'?' not in response, # No questions asked
len(response) < 100, # Short responses often close things
]
if all(closing_indicators):
print(f"🏁 Dialogue ended after last word, {state['turn_count']} turns total")
self.end_dialogue(channel.id)
else:
asyncio.create_task(self._next_turn(channel, opposite))
# ========================================================================
# ARGUMENT ESCALATION
# ========================================================================
async def _escalate_to_argument(self, channel: discord.TextChannel, last_speaker: str, triggering_message: str):
"""
Transition from dialogue to full bipolar argument.
"""
from utils.bipolar_mode import is_argument_in_progress, run_argument
# Clean up dialogue state
self.end_dialogue(channel.id)
# Don't start if an argument is already going
if is_argument_in_progress(channel.id):
print(f"⚠️ Argument already in progress, skipping escalation")
return
# Build context for the argument
escalation_context = f"""This argument erupted from a conversation that got heated.
The last thing said was: "{triggering_message}"
This pushed things over the edge into a full argument."""
print(f"⚔️ Escalating to argument in #{channel.name}")
# Use the existing argument system
# Pass the triggering message so the opposite persona responds to it
await run_argument(
channel=channel,
client=globals.client,
trigger_context=escalation_context,
)
# ========================================================================
# HELPER METHODS
# ========================================================================
async def _was_interrupted(self, channel: discord.TextChannel) -> bool:
"""Check if someone else sent a message during the dialogue"""
state = self.active_dialogues.get(channel.id)
if not state:
return True
try:
async for msg in channel.history(limit=1):
# If latest message is NOT from our webhooks, we were interrupted
if not msg.webhook_id:
# Check if it's from the bot itself (could be normal response)
if msg.author.id != globals.client.user.id:
return True
except Exception as e:
print(f"⚠️ Error checking for interruption: {e}")
return False
async def _build_conversation_context(self, channel: discord.TextChannel, limit: int = 15) -> str:
"""Get recent messages for context"""
messages = []
try:
async for msg in channel.history(limit=limit):
speaker = self._identify_speaker(msg)
messages.append(f"{speaker}: {msg.content}")
messages.reverse()
except Exception as e:
print(f"⚠️ Error building conversation context: {e}")
return '\n'.join(messages)
def _identify_speaker(self, message: discord.Message) -> str:
"""Identify who sent a message"""
if message.webhook_id:
name_lower = (message.author.name or "").lower()
if "evil" in name_lower:
return "Evil Miku"
return "Hatsune Miku"
elif message.author.id == globals.client.user.id:
# Bot's own messages - check mode at time of message
if globals.EVIL_MODE:
return "Evil Miku"
return "Hatsune Miku"
return message.author.display_name
async def _send_as_persona(self, channel: discord.TextChannel, persona: str, content: str):
"""Send message via webhook"""
from utils.bipolar_mode import (
get_or_create_webhooks_for_channel,
get_miku_display_name,
get_evil_miku_display_name
)
webhooks = await get_or_create_webhooks_for_channel(channel)
if not webhooks:
print(f"⚠️ Could not get webhooks for #{channel.name}")
return
webhook = webhooks["evil_miku"] if persona == "evil" else webhooks["miku"]
display_name = get_evil_miku_display_name() if persona == "evil" else get_miku_display_name()
try:
await webhook.send(content=content, username=display_name)
except Exception as e:
print(f"⚠️ Error sending as {persona}: {e}")
# ============================================================================
# CONVENIENCE FUNCTIONS
# ============================================================================
# Singleton instances
_scorer = None
_dialogue_manager = None
def get_interjection_scorer() -> InterjectionScorer:
"""Get the singleton InterjectionScorer instance"""
global _scorer
if _scorer is None:
_scorer = InterjectionScorer()
return _scorer
def get_dialogue_manager() -> PersonaDialogue:
"""Get the singleton PersonaDialogue instance"""
global _dialogue_manager
if _dialogue_manager is None:
_dialogue_manager = PersonaDialogue()
return _dialogue_manager
async def check_for_interjection(message: discord.Message, current_persona: str) -> bool:
"""
Check if the opposite persona should interject based on a message.
If they should, starts a dialogue automatically.
Args:
message: The Discord message that was just sent
current_persona: Who sent the message ("miku" or "evil")
Returns:
True if an interjection was triggered, False otherwise
"""
print(f"🔍 [Persona Dialogue] Checking interjection for message from {current_persona}")
scorer = get_interjection_scorer()
dialogue_manager = get_dialogue_manager()
# Don't trigger if dialogue already active
if dialogue_manager.is_dialogue_active(message.channel.id):
print(f"⏸️ [Persona Dialogue] Dialogue already active in channel {message.channel.id}")
return False
# Check if we should interject
should_interject, reason, score = await scorer.should_interject(message, current_persona)
print(f"📊 [Persona Dialogue] Interjection check: should_interject={should_interject}, reason={reason}, score={score:.2f}")
if should_interject:
opposite_persona = "evil" if current_persona == "miku" else "miku"
print(f"🎭 Triggering {opposite_persona} interjection (reason: {reason}, score: {score:.2f})")
# Start dialogue with the opposite persona responding first
dialogue_manager.start_dialogue(message.channel.id)
asyncio.create_task(
dialogue_manager.handle_dialogue_turn(message.channel, opposite_persona, trigger_reason=reason)
)
return True
return False
def is_persona_dialogue_active(channel_id: int) -> bool:
"""Check if a persona dialogue is currently active in a channel"""
dialogue_manager = get_dialogue_manager()
return dialogue_manager.is_dialogue_active(channel_id)

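The single-call "respond and signal continuation" trick hinges on `_parse_response`. A small sketch of what it extracts from a raw reply (assumes the bot's dependencies such as discord.py and transformers are importable, since the module imports them at load time):

```python
from utils.persona_dialogue import PersonaDialogue

raw = (
    "Oh please, you really think kindness fixes everything?\n"
    "[CONTINUE: YES] [CONFIDENCE: HIGH]"
)
text, should_continue, confidence = PersonaDialogue()._parse_response(raw)
print(text)                          # reply with the signal line stripped
print(should_continue, confidence)   # True HIGH (a question always forces continuation)
```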
docker-compose.yml
View File

@@ -20,6 +20,35 @@ services:
     environment:
       - NVIDIA_VISIBLE_DEVICES=all
 
+  llama-swap-amd:
+    build:
+      context: .
+      dockerfile: Dockerfile.llamaswap-rocm
+    container_name: llama-swap-amd
+    ports:
+      - "8091:8080"  # Map host port 8091 to container port 8080
+    volumes:
+      - ./models:/models  # GGUF model files
+      - ./llama-swap-rocm-config.yaml:/app/config.yaml  # llama-swap configuration for AMD
+    devices:
+      - /dev/kfd:/dev/kfd
+      - /dev/dri:/dev/dri
+    group_add:
+      - "985"  # video group
+      - "989"  # render group
+    restart: unless-stopped
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
+      interval: 10s
+      timeout: 5s
+      retries: 10
+      start_period: 30s  # Give more time for initial model loading
+    environment:
+      - HSA_OVERRIDE_GFX_VERSION=10.3.0  # RX 6800 compatibility
+      - ROCM_PATH=/opt/rocm
+      - HIP_VISIBLE_DEVICES=0  # Use first AMD GPU
+      - GPU_DEVICE_ORDINAL=0
+
   miku-bot:
     build: ./bot
     container_name: miku-bot
@@ -30,9 +59,12 @@ services:
     depends_on:
       llama-swap:
         condition: service_healthy
+      llama-swap-amd:
+        condition: service_healthy
     environment:
       - DISCORD_BOT_TOKEN=MTM0ODAyMjY0Njc3NTc0NjY1MQ.GXsxML.nNCDOplmgNxKgqdgpAomFM2PViX10GjxyuV8uw
       - LLAMA_URL=http://llama-swap:8080
+      - LLAMA_AMD_URL=http://llama-swap-amd:8080  # Secondary AMD GPU endpoint
       - TEXT_MODEL=llama3.1
       - VISION_MODEL=vision
       - OWNER_USER_ID=209381657369772032  # Your Discord user ID for DM analysis reports

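With both services up, each llama-swap instance answers the same `/health` route on its mapped host port (8090 for NVIDIA, 8091 for AMD, per the mappings above). A small probe sketch run from the host:

```python
import urllib.request

# Host-side port mappings from docker-compose.yml
ENDPOINTS = {
    "NVIDIA (llama-swap)": "http://localhost:8090/health",
    "AMD (llama-swap-amd)": "http://localhost:8091/health",
}

for name, url in ENDPOINTS.items():
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            print(f"{name}: HTTP {resp.status}")
    except Exception as exc:  # container down or still loading a model
        print(f"{name}: unreachable ({exc})")
```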
llama-swap-rocm-config.yaml
View File

@@ -0,0 +1,25 @@
# llama-swap configuration for AMD RX 6800 (ROCm)
# This manages automatic model switching and unloading for the secondary GPU
# Vision model stays on NVIDIA GPU - AMD only handles text models

models:
  # Main text generation model (same name as NVIDIA for uniform switching)
  llama3.1:
    cmd: /app/llama-server --port ${PORT} --model /models/Llama-3.1-8B-Instruct-UD-Q4_K_XL.gguf -ngl 99 -nkvo -c 16384 --host 0.0.0.0 --no-warmup
    ttl: 1800  # Unload after 30 minutes of inactivity (1800 seconds)
    aliases:
      - llama3.1
      - text-model

  # Evil/Uncensored model (same name as NVIDIA for uniform switching)
  darkidol:
    cmd: /app/llama-server --port ${PORT} --model /models/DarkIdol-Llama-3.1-8B-Instruct-1.3-Uncensored_Q4_K_M.gguf -ngl 99 -nkvo -c 16384 --host 0.0.0.0 --no-warmup
    ttl: 1800  # Unload after 30 minutes of inactivity
    aliases:
      - darkidol
      - evil-model
      - uncensored

# Server configuration
# llama-swap will listen on this address
# Inside Docker, we bind to 0.0.0.0 to allow bot container to connect

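Because this config registers the same model names and aliases as the NVIDIA config, switching GPUs never requires changing the model id in a request, only the base URL. A sketch of that from inside the bot container (assumes the `utils` package is importable there):

```python
import asyncio, aiohttp
from utils.llm import get_current_gpu_url

async def ask(prompt: str) -> str:
    # The identical payload works against either llama-swap instance.
    payload = {"model": "llama3.1", "messages": [{"role": "user", "content": prompt}]}
    base = get_current_gpu_url()  # NVIDIA or AMD endpoint, per memory/gpu_state.json
    async with aiohttp.ClientSession() as session:
        async with session.post(f"{base}/v1/chat/completions", json=payload) as resp:
            data = await resp.json()
            return data["choices"][0]["message"]["content"]

print(asyncio.run(ask("Which GPU served this?")))
```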
159
setup-dual-gpu.sh Executable file
View File

@@ -0,0 +1,159 @@
#!/bin/bash
# Setup script for dual GPU configuration (NVIDIA + AMD RX 6800)
# Note: Not using 'set -e' to allow arithmetic operations
echo "================================================"
echo "Miku Discord Bot - Dual GPU Setup"
echo "================================================"
echo ""
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# Check if ROCm is installed on host (optional - Docker container has it)
echo -e "${YELLOW}Checking ROCm installation (optional)...${NC}"
if command -v rocm-smi &> /dev/null; then
echo -e "${GREEN}✓ ROCm is installed on host${NC}"
rocm-smi --version
# Check AMD GPU via rocm-smi
echo ""
echo -e "${YELLOW}Checking AMD GPU...${NC}"
if rocm-smi -i 0 &> /dev/null; then
echo -e "${GREEN}✓ AMD GPU detected via rocm-smi${NC}"
rocm-smi -i 0 --showproductname
else
echo -e "${YELLOW}! AMD GPU check via rocm-smi failed${NC}"
fi
else
echo -e "${YELLOW}! ROCm not found on host (this is OK - Docker container includes ROCm)${NC}"
# Fallback: Check AMD GPU via lspci
echo ""
echo -e "${YELLOW}Checking AMD GPU via lspci...${NC}"
if lspci | grep -i "VGA.*AMD\|VGA.*Radeon" &> /dev/null; then
echo -e "${GREEN}✓ AMD GPU detected${NC}"
lspci | grep -i "VGA.*AMD\|VGA.*Radeon"
else
echo -e "${YELLOW}! AMD GPU not detected via lspci${NC}"
echo "Note: AMD RX 6800 should be visible in lspci output"
fi
fi
# Check NVIDIA GPU
echo ""
echo -e "${YELLOW}Checking NVIDIA GPU...${NC}"
if command -v nvidia-smi &> /dev/null; then
echo -e "${GREEN}✓ NVIDIA GPU detected${NC}"
nvidia-smi --query-gpu=name --format=csv,noheader
else
echo -e "${RED}✗ NVIDIA GPU not found${NC}"
echo "Warning: Primary GPU (NVIDIA) not detected"
fi
# Check Docker
echo ""
echo -e "${YELLOW}Checking Docker...${NC}"
if command -v docker &> /dev/null; then
echo -e "${GREEN}✓ Docker is installed${NC}"
docker --version
else
echo -e "${RED}✗ Docker not found${NC}"
exit 1
fi
# Check device permissions
echo ""
echo -e "${YELLOW}Checking device permissions...${NC}"
if [ -c /dev/kfd ] && [ -r /dev/kfd ]; then
echo -e "${GREEN}✓ /dev/kfd accessible${NC}"
else
echo -e "${RED}✗ /dev/kfd not accessible${NC}"
echo "You may need to add your user to the 'render' group:"
echo " sudo usermod -aG render \$USER"
fi
if [ -d /dev/dri ]; then
echo -e "${GREEN}✓ /dev/dri exists${NC}"
else
echo -e "${RED}✗ /dev/dri not found${NC}"
fi
# Check if models exist
echo ""
echo -e "${YELLOW}Checking model files...${NC}"
MODEL_DIR="./models"
if [ -d "$MODEL_DIR" ]; then
echo -e "${GREEN}✓ Models directory exists${NC}"
# Check for specific models
models_found=0
if [ -f "$MODEL_DIR/Llama-3.1-8B-Instruct-UD-Q4_K_XL.gguf" ]; then
echo -e "${GREEN} ✓ Llama 3.1 8B model found${NC}"
((models_found++))
fi
if [ -f "$MODEL_DIR/DarkIdol-Llama-3.1-8B-Instruct-1.3-Uncensored_Q4_K_M.gguf" ]; then
echo -e "${GREEN} ✓ DarkIdol model found${NC}"
((models_found++))
fi
if [ -f "$MODEL_DIR/moondream2-text-model-f16_ct-vicuna.gguf" ]; then
echo -e "${GREEN} ✓ Moondream2 model found${NC}"
((models_found++))
fi
if [ $models_found -eq 0 ]; then
echo -e "${YELLOW} ! No models found in $MODEL_DIR${NC}"
echo " Please download GGUF models before starting"
fi
else
echo -e "${YELLOW}! Models directory not found${NC}"
echo "Creating models directory..."
mkdir -p "$MODEL_DIR"
fi
echo ""
echo "================================================"
echo "Setup Commands"
echo "================================================"
echo ""
echo "1. Build the AMD ROCm container:"
echo -e " ${GREEN}docker compose build llama-swap-amd${NC}"
echo ""
echo "2. Start both GPU services:"
echo -e " ${GREEN}docker compose up -d llama-swap llama-swap-amd${NC}"
echo ""
echo "3. Check status:"
echo -e " ${GREEN}docker compose ps${NC}"
echo -e " ${GREEN}docker compose logs -f llama-swap-amd${NC}"
echo ""
echo "4. Test endpoints:"
echo -e " ${GREEN}curl http://localhost:8090/health # NVIDIA GPU${NC}"
echo -e " ${GREEN}curl http://localhost:8091/health # AMD GPU${NC}"
echo ""
echo "5. Monitor GPU usage:"
echo -e " ${GREEN}watch -n 1 nvidia-smi # NVIDIA${NC}"
echo -e " ${GREEN}watch -n 1 rocm-smi # AMD${NC}"
echo ""
echo "================================================"
echo "Optional: Enable AMD GPU preference"
echo "================================================"
echo ""
echo "To prefer AMD GPU for load balancing, add to docker-compose.yml:"
echo " environment:"
echo " - PREFER_AMD_GPU=true"
echo ""
echo -e "${GREEN}Setup check complete!${NC}"
echo "See DUAL_GPU_SETUP.md for detailed documentation"