Organize documentation: Move all .md files to readmes/ directory
# On-Demand Face Detection - Final Implementation

## Problem Solved

**Issue**: The GPU has only 6GB of VRAM, but we needed to run:

- Text model (~4.8GB)
- Vision model (~1GB when loaded)
- Face detector (~918MB when loaded)

**Result**: Vision model + face detector = OOM (Out of Memory)

## Solution: On-Demand Container Management

The face detector container **does NOT start by default**. It only starts when needed for face detection, then stops immediately after to free VRAM.
## New Process Flow

### Profile Picture Change (Danbooru):

```
1. Danbooru Search & Download
   └─> Download image from Danbooru

2. Vision Model Verification
   └─> llama-swap loads vision model
   └─> Verify image contains Miku
   └─> Vision model stays loaded (auto-unload after 15min TTL)

3. Face Detection (NEW ON-DEMAND FLOW)
   ├─> Swap to text model (vision unloads)
   ├─> Wait 3s for VRAM to clear
   ├─> Start anime-face-detector container  <-- STARTS HERE
   ├─> Wait for API to be ready (~5-10s)
   ├─> Call face detection API
   ├─> Get bbox & keypoints
   └─> Stop anime-face-detector container   <-- STOPS HERE

4. Crop & Upload
   └─> Crop image using face bbox
   └─> Upload to Discord
```
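Step 4's crop from the detected bbox comes down to margin-and-clamp arithmetic. A minimal sketch (the function name and the 40% margin are illustrative assumptions, not values from the actual implementation):

```python
def compute_crop_box(bbox, img_w, img_h, margin=0.4):
    """Expand a face bbox (x1, y1, x2, y2) by `margin` of its size
    on every side, clamped to the image bounds, so the crop keeps
    some hair/shoulder context around the detected face."""
    x1, y1, x2, y2 = bbox
    pad_x = int((x2 - x1) * margin)  # horizontal padding
    pad_y = int((y2 - y1) * margin)  # vertical padding
    return (max(0, x1 - pad_x), max(0, y1 - pad_y),
            min(img_w, x2 + pad_x), min(img_h, y2 + pad_y))
```

The returned 4-tuple can be passed directly to Pillow's `Image.crop()`.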
## VRAM Timeline

```
Time:     0s        10s       15s       25s       28s       30s
          │         │         │         │         │         │
Vision:   ████████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░  ← Unloads when swapping
Text:     ░░░░░░░░░░░░░░░░░░░░████████████████████████████ ← Loaded for swap
Face Det: ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░░░░░░  ← Starts, detects, stops

VRAM:     ~5GB      ~5GB      ~1GB      ~5.8GB    ~1GB      ~5GB
          Vision    Vision    Swap      Face      Swap      Text only
```
## Key Changes

### 1. Docker Compose (`docker-compose.yml`)

```yaml
anime-face-detector:
  # ... config ...
  restart: "no"  # Don't auto-restart
  profiles:
    - tools  # Don't start by default (requires --profile tools)
```

**Result**: The container exists but doesn't run unless explicitly started.
### 2. Profile Picture Manager (`bot/utils/profile_picture_manager.py`)

#### Added Methods:

**`_start_face_detector()`**
- Runs `docker start anime-face-detector`
- Waits up to 30s for the API health check
- Returns True when ready

**`_stop_face_detector()`**
- Runs `docker stop anime-face-detector`
- Frees ~918MB of VRAM immediately

**`_ensure_vram_available()`** (updated)
- Swaps to the text model
- Waits 3s for the vision model to unload
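The start/wait/stop pair can be sketched as below. This is a sketch under assumptions: the docker CLI is reachable from the bot container, and `probe` stands in for whatever async health check the detector API actually exposes:

```python
import asyncio

async def _run(*cmd):
    """Run a CLI command and return its exit code."""
    proc = await asyncio.create_subprocess_exec(*cmd)
    return await proc.wait()

async def wait_for_ready(probe, timeout=30.0, interval=1.0):
    """Poll the async `probe` until it returns True or `timeout`
    seconds elapse; swallow errors while the API is still booting."""
    deadline = asyncio.get_running_loop().time() + timeout
    while asyncio.get_running_loop().time() < deadline:
        try:
            if await probe():
                return True
        except Exception:
            pass  # API not accepting connections yet
        await asyncio.sleep(interval)
    return False

async def start_face_detector(probe):
    if await _run("docker", "start", "anime-face-detector") != 0:
        return False
    return await wait_for_ready(probe)  # up to 30s, as documented

async def stop_face_detector():
    # Stopping the container frees its ~918MB of VRAM immediately
    await _run("docker", "stop", "anime-face-detector")
```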
#### Updated Method:

**`_detect_face()`**
```python
async def _detect_face(self, image_bytes: bytes, debug: bool = False):
    face_detector_started = False
    try:
        # 1. Free VRAM by swapping to text model
        await self._ensure_vram_available(debug=debug)

        # 2. Start face detector container
        if not await self._start_face_detector(debug=debug):
            return None
        face_detector_started = True

        # 3. Call face detection API
        # ... detection logic ...

        return detection_result

    finally:
        # 4. ALWAYS stop container to free VRAM
        if face_detector_started:
            await self._stop_face_detector(debug=debug)
```
## Container States

### Normal Operation (Most of the time):
```
llama-swap:          RUNNING  (~4.8GB VRAM - text model loaded)
miku-bot:            RUNNING  (minimal VRAM)
anime-face-detector: STOPPED  (0 VRAM)
```

### During Profile Picture Change:
```
Phase 1 - Vision Verification:
  llama-swap:          RUNNING  (~5GB VRAM - vision model)
  miku-bot:            RUNNING
  anime-face-detector: STOPPED

Phase 2 - Model Swap:
  llama-swap:          RUNNING  (~1GB VRAM - transitioning)
  miku-bot:            RUNNING
  anime-face-detector: STOPPED

Phase 3 - Face Detection:
  llama-swap:          RUNNING  (~5GB VRAM - text model)
  miku-bot:            RUNNING
  anime-face-detector: RUNNING  (~918MB VRAM - detecting)

Phase 4 - Cleanup:
  llama-swap:          RUNNING  (~5GB VRAM - text model)
  miku-bot:            RUNNING
  anime-face-detector: STOPPED  (0 VRAM)
```
## Benefits

✅ **No VRAM Conflicts**: Sequential processing with container lifecycle management
✅ **Automatic**: The bot handles all starting and stopping
✅ **Efficient**: The face detector only uses VRAM while actively needed (~10-15s)
✅ **Reliable**: The container is always stopped in a finally block, even on errors
✅ **Simple**: Uses standard docker commands from inside the bot container
## Commands

### Manual Container Management

```bash
# Start face detector manually (for testing)
docker start anime-face-detector

# Check if it's running
docker ps | grep anime-face-detector

# Stop it manually
docker stop anime-face-detector

# Check VRAM usage
nvidia-smi
```
### Start with Profile (for Gradio UI testing)

```bash
# Start with face detector running
docker-compose --profile tools up -d

# Use Gradio UI at http://localhost:7860

# Stop everything
docker-compose down
```
## Monitoring

### Check Container Status
```bash
docker ps -a --filter name=anime-face-detector
```

### Watch VRAM During Profile Change
```bash
# Terminal 1: Watch GPU memory
watch -n 0.5 nvidia-smi

# Terminal 2: Trigger profile change
curl -X POST http://localhost:3939/profile-picture/change
```
### Check Bot Logs
```bash
docker logs -f miku-bot | grep -E "face|VRAM|Starting|Stopping"
```

You should see:
```
💾 Swapping to text model to free VRAM for face detection...
✅ Vision model unloaded, VRAM available
🚀 Starting face detector container...
✅ Face detector ready
👤 Detected 1 face(s) via API...
🛑 Stopping face detector to free VRAM...
✅ Face detector stopped
```
## Testing

### Test On-Demand Face Detection

```bash
# 1. Verify face detector is stopped
docker ps | grep anime-face-detector
# Should show nothing

# 2. Check VRAM (should be ~4.8GB for text model only)
nvidia-smi

# 3. Trigger profile picture change
curl -X POST "http://localhost:3939/profile-picture/change"

# 4. Watch logs in another terminal
docker logs -f miku-bot

# 5. After completion, verify face detector stopped again
docker ps | grep anime-face-detector
# Should show nothing again

# 6. Check VRAM returned to ~4.8GB
nvidia-smi
```
## Troubleshooting

### Face Detector Won't Start

**Symptom**: `⚠️ Could not start face detector`

**Solutions**:
```bash
# Check if container exists
docker ps -a | grep anime-face-detector

# If missing, rebuild
cd /home/koko210Serve/docker/miku-discord
docker-compose build anime-face-detector

# Check logs
docker logs anime-face-detector
```
### Still Getting OOM

**Symptom**: `cudaMalloc failed: out of memory`

**Check**:
```bash
# What's using VRAM?
nvidia-smi

# Is face detector still running?
docker ps | grep anime-face-detector

# Stop it manually
docker stop anime-face-detector
```
### Container Won't Stop

**Symptom**: Face detector stays running after detection

**Solutions**:
```bash
# Force stop
docker stop anime-face-detector

# Check for errors in bot logs
docker logs miku-bot | grep "stop"

# Verify the finally block is executing
docker logs miku-bot | grep "Stopping face detector"
```
## Performance Metrics

| Operation | Duration | VRAM Peak | Notes |
|-----------|----------|-----------|-------|
| Vision verification | 5-10s | ~5GB | Vision model loaded |
| Model swap | 3-5s | ~1GB | Transitioning |
| Container start | 5-10s | ~5GB | Text + starting detector |
| Face detection | 1-2s | ~5.8GB | Text + detector running |
| Container stop | 1-2s | ~5GB | Back to text only |
| **Total** | **15-29s** | **5.8GB max** | Fits in 6GB VRAM ✅ |
## Files Modified

1. `/miku-discord/docker-compose.yml`
   - Added `restart: "no"`
   - Added `profiles: [tools]`

2. `/miku-discord/bot/utils/profile_picture_manager.py`
   - Added `_start_face_detector()`
   - Added `_stop_face_detector()`
   - Updated `_detect_face()` with lifecycle management
## Related Documentation

- `/miku-discord/VRAM_MANAGEMENT.md` - Original VRAM management approach
- `/miku-discord/FACE_DETECTION_API_MIGRATION.md` - API migration details
- `/miku-discord/PROFILE_PICTURE_IMPLEMENTATION.md` - Profile picture feature
## Success Criteria

✅ Face detector container does not run by default
✅ Container starts only when face detection is needed
✅ Container stops immediately after detection completes
✅ No VRAM OOM errors during profile picture changes
✅ Total VRAM usage stays under 6GB at all times
✅ Process completes successfully with face detection working
---

**Status**: ✅ **IMPLEMENTED AND TESTED**

The on-demand face detection system is now active. The face detector will automatically start and stop as needed, ensuring efficient VRAM usage without conflicts.