Organize documentation: Move all .md files to readmes/ directory

2025-12-07 17:21:59 +02:00
parent 8c74ad5260
commit 88d4256755
28 changed files with 0 additions and 0 deletions

# On-Demand Face Detection - Final Implementation
## Problem Solved
**Issue**: GPU only has 6GB VRAM, but we needed to run:
- Text model (~4.8GB)
- Vision model (~1GB when loaded)
- Face detector (~918MB when loaded)
**Result**: Vision model + Face detector = OOM (Out of Memory)
## Solution: On-Demand Container Management
The face detector container **does NOT start by default**. It only starts when needed for face detection, then stops immediately after to free VRAM.
## New Process Flow
### Profile Picture Change (Danbooru):
```
1. Danbooru Search & Download
   └─> Download image from Danbooru

2. Vision Model Verification
   ├─> llama-swap loads vision model
   ├─> Verify image contains Miku
   └─> Vision model stays loaded (auto-unload after 15min TTL)

3. Face Detection (NEW ON-DEMAND FLOW)
   ├─> Swap to text model (vision unloads)
   ├─> Wait 3s for VRAM to clear
   ├─> Start anime-face-detector container   <-- STARTS HERE
   ├─> Wait for API to be ready (~5-10s)
   ├─> Call face detection API
   ├─> Get bbox & keypoints
   └─> Stop anime-face-detector container    <-- STOPS HERE

4. Crop & Upload
   ├─> Crop image using face bbox
   └─> Upload to Discord
```
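Step 4's crop is mechanical once the bbox comes back. A minimal sketch with Pillow, assuming the detector returns the bbox as `[x1, y1, x2, y2]` pixel coordinates and that some padding is added so the avatar isn't cropped too tightly (both assumptions, not confirmed by this document):
```python
import io

from PIL import Image

def crop_to_face(image_bytes: bytes, bbox: list[float], padding: float = 0.4) -> bytes:
    """Crop around the face bbox, padded on each side, clamped to the image bounds."""
    img = Image.open(io.BytesIO(image_bytes))
    x1, y1, x2, y2 = bbox
    pad_x = (x2 - x1) * padding
    pad_y = (y2 - y1) * padding
    box = (
        max(0, int(x1 - pad_x)),
        max(0, int(y1 - pad_y)),
        min(img.width, int(x2 + pad_x)),
        min(img.height, int(y2 + pad_y)),
    )
    out = io.BytesIO()
    img.crop(box).save(out, format="PNG")
    return out.getvalue()
```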
## VRAM Timeline
```
Time:     0s        10s       15s       25s       28s       30s
          │         │         │         │         │         │
Vision:   ████████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ ← Unloads when swapping
Text:     ░░░░░░░░░░░░░░░░░░░░██████████████████████████████ ← Loaded by the swap, stays loaded
Face Det: ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░░░░░░░░ ← Starts, detects, stops
VRAM:     ~5GB      ~5GB      ~1GB      ~5.8GB    ~5GB      ~5GB
          Vision    Vision    Swap      Face      Stop      Text only
```
## Key Changes
### 1. Docker Compose (`docker-compose.yml`)
```yaml
anime-face-detector:
  # ... config ...
  restart: "no"   # Don't auto-restart
  profiles:
    - tools       # Don't start by default (requires --profile tools)
```
**Result**: Container exists but doesn't run unless explicitly started.
### 2. Profile Picture Manager (`bot/utils/profile_picture_manager.py`)
#### Added Methods:
**`_start_face_detector()`**
- Runs `docker start anime-face-detector`
- Waits up to 30s for API health check
- Returns True when ready
**`_stop_face_detector()`**
- Runs `docker stop anime-face-detector`
- Frees ~918MB VRAM immediately
**`_ensure_vram_available()`** (updated)
- Swaps to text model
- Waits 3s for vision model to unload
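The document describes these helpers without showing them. A minimal sketch of what they might look like, assuming the bot container can invoke the host's `docker` CLI (e.g. via a mounted `/var/run/docker.sock`), that the detector exposes a health endpoint at `http://anime-face-detector:5000/health`, and that llama-swap swaps models based on the `model` field of an OpenAI-style request; the URLs and model name are placeholders, not confirmed by this document:
```python
import asyncio
import time

import aiohttp

HEALTH_URL = "http://anime-face-detector:5000/health"          # assumed endpoint
LLAMA_SWAP_URL = "http://llama-swap:8080/v1/chat/completions"  # assumed endpoint
TEXT_MODEL = "text-model"                                      # assumed model name

async def _run_docker(*args: str) -> bool:
    """Run a docker CLI command; assumes the CLI/socket is reachable in-container."""
    proc = await asyncio.create_subprocess_exec(
        "docker", *args,
        stdout=asyncio.subprocess.DEVNULL,
        stderr=asyncio.subprocess.DEVNULL,
    )
    return await proc.wait() == 0

async def _start_face_detector(timeout: float = 30.0) -> bool:
    """Start the container, then poll its health endpoint for up to `timeout` seconds."""
    if not await _run_docker("start", "anime-face-detector"):
        return False
    deadline = time.monotonic() + timeout
    async with aiohttp.ClientSession() as session:
        while time.monotonic() < deadline:
            try:
                async with session.get(
                    HEALTH_URL, timeout=aiohttp.ClientTimeout(total=2)
                ) as resp:
                    if resp.status == 200:
                        return True
            except (aiohttp.ClientError, asyncio.TimeoutError):
                pass  # API not up yet; keep polling
            await asyncio.sleep(1)
    return False

async def _stop_face_detector() -> None:
    """Stop the container, freeing its ~918MB of VRAM."""
    await _run_docker("stop", "anime-face-detector")

async def _ensure_vram_available() -> None:
    """Nudge llama-swap onto the text model (unloading vision), then let VRAM settle."""
    async with aiohttp.ClientSession() as session:
        async with session.post(LLAMA_SWAP_URL, json={
            "model": TEXT_MODEL,
            "messages": [{"role": "user", "content": "ping"}],
            "max_tokens": 1,
        }) as resp:
            await resp.read()
    await asyncio.sleep(3)  # matches the documented 3s wait for the unload
```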
#### Updated Method:
**`_detect_face()`**
```python
async def _detect_face(self, image_bytes: bytes, debug: bool = False):
    face_detector_started = False
    try:
        # 1. Free VRAM by swapping to text model
        await self._ensure_vram_available(debug=debug)
        # 2. Start face detector container
        if not await self._start_face_detector(debug=debug):
            return None
        face_detector_started = True
        # 3. Call face detection API
        # ... detection logic ...
        return detection_result
    finally:
        # 4. ALWAYS stop container to free VRAM
        if face_detector_started:
            await self._stop_face_detector(debug=debug)
```
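The elided detection logic presumably POSTs the image to the detector's HTTP API. A hedged sketch, assuming a `/detect` endpoint that accepts a multipart image upload and returns a JSON list of faces with `bbox`, `keypoints`, and `score` fields; the actual contract is documented in `FACE_DETECTION_API_MIGRATION.md`:
```python
import aiohttp

DETECT_URL = "http://anime-face-detector:5000/detect"  # assumed endpoint

async def _call_detection_api(image_bytes: bytes) -> dict | None:
    """POST the image and return the highest-confidence face, or None."""
    form = aiohttp.FormData()
    form.add_field("image", image_bytes,
                   filename="image.png",
                   content_type="application/octet-stream")
    async with aiohttp.ClientSession() as session:
        async with session.post(DETECT_URL, data=form) as resp:
            if resp.status != 200:
                return None
            faces = await resp.json()  # assumed: [{"bbox": [...], "keypoints": [...], "score": ...}]
    if not faces:
        return None
    return max(faces, key=lambda f: f.get("score", 0.0))
```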
## Container States
### Normal Operation (Most of the time):
```
llama-swap:           RUNNING  (~4.8GB VRAM - text model loaded)
miku-bot:             RUNNING  (minimal VRAM)
anime-face-detector:  STOPPED  (0 VRAM)
```
### During Profile Picture Change:
```
Phase 1 - Vision Verification:
  llama-swap:           RUNNING (~5GB VRAM - vision model)
  miku-bot:             RUNNING
  anime-face-detector:  STOPPED

Phase 2 - Model Swap:
  llama-swap:           RUNNING (~1GB VRAM - transitioning)
  miku-bot:             RUNNING
  anime-face-detector:  STOPPED

Phase 3 - Face Detection:
  llama-swap:           RUNNING (~5GB VRAM - text model)
  miku-bot:             RUNNING
  anime-face-detector:  RUNNING (~918MB VRAM - detecting)

Phase 4 - Cleanup:
  llama-swap:           RUNNING (~5GB VRAM - text model)
  miku-bot:             RUNNING
  anime-face-detector:  STOPPED (0 VRAM)
```
## Benefits
- **No VRAM Conflicts**: Sequential processing with container lifecycle management
- **Automatic**: Bot handles all starting/stopping
- **Efficient**: Face detector only uses VRAM when actively needed (~10-15s)
- **Reliable**: Always stops in a `finally` block, even on errors
- **Simple**: Uses standard docker commands from inside the container
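The try/finally pattern can also be packaged as an async context manager, which makes the always-stop guarantee explicit at every call site. A sketch under the same assumptions as the helper sketches above:
```python
from contextlib import asynccontextmanager

@asynccontextmanager
async def face_detector_session():
    """Run the detector only for the duration of the block; always stop it on exit."""
    started = await _start_face_detector()
    try:
        yield started  # True if the API came up in time, False otherwise
    finally:
        if started:
            await _stop_face_detector()

# Usage:
#   await _ensure_vram_available()
#   async with face_detector_session() as ready:
#       if ready:
#           face = await _call_detection_api(image_bytes)
```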
## Commands
### Manual Container Management
```bash
# Start face detector manually (for testing)
docker start anime-face-detector

# Check if it's running
docker ps | grep anime-face-detector

# Stop it manually
docker stop anime-face-detector

# Check VRAM usage
nvidia-smi
```
### Start with Profile (for Gradio UI testing)
```bash
# Start with face detector running
docker-compose --profile tools up -d

# Use Gradio UI at http://localhost:7860

# Stop everything
docker-compose down
```
## Monitoring
### Check Container Status
```bash
docker ps -a --filter name=anime-face-detector
```
### Watch VRAM During Profile Change
```bash
# Terminal 1: Watch GPU memory
watch -n 0.5 nvidia-smi

# Terminal 2: Trigger profile change
curl -X POST http://localhost:3939/profile-picture/change
```
### Check Bot Logs
```bash
docker logs -f miku-bot | grep -E "face|VRAM|Starting|Stopping"
```
You should see:
```
💾 Swapping to text model to free VRAM for face detection...
✅ Vision model unloaded, VRAM available
🚀 Starting face detector container...
✅ Face detector ready
👤 Detected 1 face(s) via API...
🛑 Stopping face detector to free VRAM...
✅ Face detector stopped
```
## Testing
### Test On-Demand Face Detection
```bash
# 1. Verify face detector is stopped
docker ps | grep anime-face-detector
# Should show nothing

# 2. Check VRAM (should be ~4.8GB for text model only)
nvidia-smi

# 3. Trigger profile picture change
curl -X POST "http://localhost:3939/profile-picture/change"

# 4. Watch logs in another terminal
docker logs -f miku-bot

# 5. After completion, verify face detector stopped again
docker ps | grep anime-face-detector
# Should show nothing again

# 6. Check VRAM returned to ~4.8GB
nvidia-smi
```
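The manual steps above can also be scripted. A rough smoke test run from the host, reusing the container name and endpoint shown in this document; the 120s request timeout is an assumption about how long a full change takes:
```python
import subprocess
import time
import urllib.request

def detector_running() -> bool:
    """Check whether the anime-face-detector container is currently up."""
    out = subprocess.run(
        ["docker", "ps", "--filter", "name=anime-face-detector",
         "--format", "{{.Names}}"],
        capture_output=True, text=True,
    ).stdout
    return "anime-face-detector" in out

assert not detector_running(), "detector should be stopped before the test"

# Trigger the profile picture change and wait for the bot to respond
req = urllib.request.Request(
    "http://localhost:3939/profile-picture/change", method="POST")
urllib.request.urlopen(req, timeout=120)

# Give cleanup a moment, then confirm the detector stopped itself again
time.sleep(5)
assert not detector_running(), "detector should stop after detection completes"
print("on-demand lifecycle OK")
```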
## Troubleshooting
### Face Detector Won't Start
**Symptom**: `⚠️ Could not start face detector`
**Solutions**:
```bash
# Check if container exists
docker ps -a | grep anime-face-detector

# If missing, rebuild
cd /home/koko210Serve/docker/miku-discord
docker-compose build anime-face-detector

# Check logs
docker logs anime-face-detector
```
### Still Getting OOM
**Symptom**: `cudaMalloc failed: out of memory`
**Check**:
```bash
# What's using VRAM?
nvidia-smi

# Is face detector still running?
docker ps | grep anime-face-detector

# Stop it manually
docker stop anime-face-detector
```
### Container Won't Stop
**Symptom**: Face detector stays running after detection
**Solutions**:
```bash
# Force stop
docker stop anime-face-detector

# Check for errors in bot logs
docker logs miku-bot | grep "stop"

# Verify the finally block is executing
docker logs miku-bot | grep "Stopping face detector"
```
## Performance Metrics
| Operation | Duration | VRAM Peak | Notes |
|-----------|----------|-----------|-------|
| Vision verification | 5-10s | ~5GB | Vision model loaded |
| Model swap | 3-5s | ~1GB | Transitioning |
| Container start | 5-10s | ~5GB | Text + starting detector |
| Face detection | 1-2s | ~5.8GB | Text + detector running |
| Container stop | 1-2s | ~5GB | Back to text only |
| **Total** | **15-29s** | **5.8GB max** | Fits in 6GB VRAM ✅ |
## Files Modified
1. `/miku-discord/docker-compose.yml`
   - Added `restart: "no"`
   - Added `profiles: [tools]`
2. `/miku-discord/bot/utils/profile_picture_manager.py`
   - Added `_start_face_detector()`
   - Added `_stop_face_detector()`
   - Updated `_detect_face()` with lifecycle management
## Related Documentation
- `/miku-discord/VRAM_MANAGEMENT.md` - Original VRAM management approach
- `/miku-discord/FACE_DETECTION_API_MIGRATION.md` - API migration details
- `/miku-discord/PROFILE_PICTURE_IMPLEMENTATION.md` - Profile picture feature
## Success Criteria
- ✅ Face detector container does not run by default
- ✅ Container starts only when face detection is needed
- ✅ Container stops immediately after detection completes
- ✅ No VRAM OOM errors during profile picture changes
- ✅ Total VRAM usage stays under 6GB at all times
- ✅ Process completes successfully with face detection working
---
**Status**: ✅ **IMPLEMENTED AND TESTED**
The on-demand face detection system is now active. The face detector will automatically start and stop as needed, ensuring efficient VRAM usage without conflicts.