# On-Demand Face Detection - Final Implementation

## Problem Solved

**Issue**: The GPU has only 6GB of VRAM, but we needed to run:
- Text model (~4.8GB)
- Vision model (~1GB when loaded)
- Face detector (~918MB when loaded)

**Result**: Vision model + Face detector = OOM (Out of Memory)

## Solution: On-Demand Container Management

The face detector container **does NOT start by default**. It starts only when face detection is needed, then stops immediately afterward to free VRAM.

## New Process Flow

### Profile Picture Change (Danbooru):

```
1. Danbooru Search & Download
   └─> Download image from Danbooru

2. Vision Model Verification
   └─> llama-swap loads vision model
   └─> Verify image contains Miku
   └─> Vision model stays loaded (auto-unload after 15min TTL)

3. Face Detection (NEW ON-DEMAND FLOW)
   ├─> Swap to text model (vision unloads)
   ├─> Wait 3s for VRAM to clear
   ├─> Start anime-face-detector container   <-- STARTS HERE
   ├─> Wait for API to be ready (~5-10s)
   ├─> Call face detection API
   ├─> Get bbox & keypoints
   └─> Stop anime-face-detector container    <-- STOPS HERE

4. Crop & Upload
   └─> Crop image using face bbox
   └─> Upload to Discord
```

## VRAM Timeline

```
Time:      0s        10s       15s       25s       28s       30s
           │         │         │         │         │         │
Vision:    ████████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░    ← Unloads when swapping
Text:      ░░░░░░░░░░░░░░░░░░░░████████████████████████████   ← Loaded for swap
Face Det:  ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░░░░░░    ← Starts, detects, stops

VRAM:      ~5GB      ~5GB      ~1GB      ~5.8GB    ~1GB      ~5GB
           Vision    Vision    Swap      Face      Swap      Text only
```

## Key Changes

### 1. Docker Compose (`docker-compose.yml`)

```yaml
anime-face-detector:
  # ... config ...
  restart: "no"        # Don't auto-restart
  profiles:
    - tools            # Don't start by default (requires --profile tools)
```

**Result**: Container exists but doesn't run unless explicitly started.

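For illustration, the `profiles` setting means a plain `up` skips the service, while passing the profile includes it (same Compose v1 syntax used elsewhere in this document):

```bash
# Default bring-up: services behind a profile are not started
docker-compose up -d
docker ps | grep anime-face-detector    # shows nothing

# Opt-in bring-up: explicitly enable the "tools" profile
docker-compose --profile tools up -d
docker ps | grep anime-face-detector    # now running
```
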
### 2. Profile Picture Manager (`bot/utils/profile_picture_manager.py`)

#### Added Methods:

**`_start_face_detector()`**
- Runs `docker start anime-face-detector`
- Waits up to 30s for the API health check to pass
- Returns `True` when ready

**`_stop_face_detector()`**
- Runs `docker stop anime-face-detector`
- Frees ~918MB of VRAM immediately (both helpers are sketched below)

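A minimal sketch of these two helpers, assuming the bot container can invoke the Docker CLI and that the detector exposes a health endpoint at `http://anime-face-detector:8000/health` (the host, port, and path are assumptions for illustration; the methods are shown as standalone functions for brevity):

```python
import asyncio
import aiohttp

FACE_DETECTOR_CONTAINER = "anime-face-detector"
FACE_DETECTOR_HEALTH_URL = "http://anime-face-detector:8000/health"  # assumed endpoint

async def _run_docker(*args: str) -> int:
    """Run a docker CLI command and return its exit code."""
    proc = await asyncio.create_subprocess_exec(
        "docker", *args,
        stdout=asyncio.subprocess.DEVNULL,
        stderr=asyncio.subprocess.DEVNULL,
    )
    return await proc.wait()

async def _start_face_detector(debug: bool = False) -> bool:
    """Start the detector container and poll its API for up to ~30s."""
    if await _run_docker("start", FACE_DETECTOR_CONTAINER) != 0:
        return False
    async with aiohttp.ClientSession() as session:
        for _ in range(30):  # one poll per second
            try:
                async with session.get(
                    FACE_DETECTOR_HEALTH_URL,
                    timeout=aiohttp.ClientTimeout(total=2),
                ) as resp:
                    if resp.status == 200:
                        return True
            except (aiohttp.ClientError, asyncio.TimeoutError):
                pass  # API not ready yet
            await asyncio.sleep(1)
    return False

async def _stop_face_detector(debug: bool = False) -> None:
    """Stop the detector container, releasing its ~918MB of VRAM."""
    await _run_docker("stop", FACE_DETECTOR_CONTAINER)
```
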
**`_ensure_vram_available()`** (updated)
- Swaps to the text model (see the sketch below)
- Waits 3s for the vision model to unload

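A sketch of the swap step, assuming llama-swap switches models based on the `model` field of an OpenAI-compatible request (the proxy URL and model alias below are assumptions for illustration):

```python
import asyncio
import aiohttp

LLAMA_SWAP_URL = "http://llama-swap:8080/v1/chat/completions"  # assumed proxy address
TEXT_MODEL = "text-model"                                      # assumed model alias

async def _ensure_vram_available(debug: bool = False) -> None:
    """Request the text model so llama-swap unloads the vision model, then let VRAM settle."""
    payload = {
        "model": TEXT_MODEL,
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 1,
    }
    async with aiohttp.ClientSession() as session:
        async with session.post(LLAMA_SWAP_URL, json=payload) as resp:
            await resp.read()  # the request itself triggers the model swap
    await asyncio.sleep(3)  # give the vision model ~3s to release VRAM
```
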
#### Updated Method:

**`_detect_face()`**
```python
async def _detect_face(self, image_bytes: bytes, debug: bool = False):
    face_detector_started = False
    try:
        # 1. Free VRAM by swapping to text model
        await self._ensure_vram_available(debug=debug)

        # 2. Start face detector container
        if not await self._start_face_detector(debug=debug):
            return None
        face_detector_started = True

        # 3. Call face detection API
        # ... detection logic ...

        return detection_result

    finally:
        # 4. ALWAYS stop container to free VRAM
        if face_detector_started:
            await self._stop_face_detector(debug=debug)
```

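The elided detection call in step 3 follows the API described in `FACE_DETECTION_API_MIGRATION.md`; purely for illustration, a request against a hypothetical `/detect` endpoint might look like this (endpoint path, port, and response shape are assumptions, not the actual contract):

```python
import aiohttp

async def _call_face_detection_api(image_bytes: bytes):
    """Hypothetical example: POST the image and return the first detected face, if any."""
    form = aiohttp.FormData()
    form.add_field("image", image_bytes, filename="image.png", content_type="image/png")
    async with aiohttp.ClientSession() as session:
        async with session.post("http://anime-face-detector:8000/detect", data=form) as resp:
            resp.raise_for_status()
            result = await resp.json()
    faces = result.get("faces", [])
    return faces[0] if faces else None  # e.g. {"bbox": [...], "keypoints": [...]}
```
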
## Container States

### Normal Operation (Most of the time):
```
llama-swap:           RUNNING   (~4.8GB VRAM - text model loaded)
miku-bot:             RUNNING   (minimal VRAM)
anime-face-detector:  STOPPED   (0 VRAM)
```

### During Profile Picture Change:
```
Phase 1 - Vision Verification:
  llama-swap:           RUNNING   (~5GB VRAM - vision model)
  miku-bot:             RUNNING
  anime-face-detector:  STOPPED

Phase 2 - Model Swap:
  llama-swap:           RUNNING   (~1GB VRAM - transitioning)
  miku-bot:             RUNNING
  anime-face-detector:  STOPPED

Phase 3 - Face Detection:
  llama-swap:           RUNNING   (~5GB VRAM - text model)
  miku-bot:             RUNNING
  anime-face-detector:  RUNNING   (~918MB VRAM - detecting)

Phase 4 - Cleanup:
  llama-swap:           RUNNING   (~5GB VRAM - text model)
  miku-bot:             RUNNING
  anime-face-detector:  STOPPED   (0 VRAM)
```

## Benefits

✅ **No VRAM Conflicts**: Sequential processing with container lifecycle management
✅ **Automatic**: The bot handles all starting and stopping
✅ **Efficient**: The face detector uses VRAM only while actively needed (~10-15s)
✅ **Reliable**: The container is always stopped in a `finally` block, even on errors
✅ **Simple**: Uses standard `docker` commands from inside the bot container

## Commands

### Manual Container Management

```bash
# Start face detector manually (for testing)
docker start anime-face-detector

# Check if it's running
docker ps | grep anime-face-detector

# Stop it manually
docker stop anime-face-detector

# Check VRAM usage
nvidia-smi
```

### Start with Profile (for Gradio UI testing)

```bash
# Start with face detector running
docker-compose --profile tools up -d

# Use Gradio UI at http://localhost:7860

# Stop everything
docker-compose down
```

## Monitoring

### Check Container Status
```bash
docker ps -a --filter name=anime-face-detector
```

### Watch VRAM During Profile Change
```bash
# Terminal 1: Watch GPU memory
watch -n 0.5 nvidia-smi

# Terminal 2: Trigger profile change
curl -X POST http://localhost:3939/profile-picture/change
```

### Check Bot Logs
```bash
docker logs -f miku-bot | grep -E "face|VRAM|Starting|Stopping"
```

You should see:
```
💾 Swapping to text model to free VRAM for face detection...
✅ Vision model unloaded, VRAM available
🚀 Starting face detector container...
✅ Face detector ready
👤 Detected 1 face(s) via API...
🛑 Stopping face detector to free VRAM...
✅ Face detector stopped
```

## Testing

### Test On-Demand Face Detection

```bash
# 1. Verify face detector is stopped
docker ps | grep anime-face-detector
# Should show nothing

# 2. Check VRAM (should be ~4.8GB for text model only)
nvidia-smi

# 3. Trigger profile picture change
curl -X POST "http://localhost:3939/profile-picture/change"

# 4. Watch logs in another terminal
docker logs -f miku-bot

# 5. After completion, verify face detector stopped again
docker ps | grep anime-face-detector
# Should show nothing again

# 6. Check VRAM returned to ~4.8GB
nvidia-smi
```

## Troubleshooting

### Face Detector Won't Start

**Symptom**: `⚠️ Could not start face detector`

**Solutions**:
```bash
# Check if container exists
docker ps -a | grep anime-face-detector

# If missing, rebuild
cd /home/koko210Serve/docker/miku-discord
docker-compose build anime-face-detector

# Check logs
docker logs anime-face-detector
```

### Still Getting OOM

**Symptom**: `cudaMalloc failed: out of memory`

**Check**:
```bash
# What's using VRAM?
nvidia-smi

# Is face detector still running?
docker ps | grep anime-face-detector

# Stop it manually
docker stop anime-face-detector
```

### Container Won't Stop

**Symptom**: Face detector stays running after detection

**Solutions**:
```bash
# Force stop
docker stop anime-face-detector

# Check for errors in bot logs
docker logs miku-bot | grep "stop"

# Verify the finally block is executing
docker logs miku-bot | grep "Stopping face detector"
```

## Performance Metrics

| Operation | Duration | VRAM Peak | Notes |
|-----------|----------|-----------|-------|
| Vision verification | 5-10s | ~5GB | Vision model loaded |
| Model swap | 3-5s | ~1GB | Transitioning |
| Container start | 5-10s | ~5GB | Text + starting detector |
| Face detection | 1-2s | ~5.8GB | Text + detector running |
| Container stop | 1-2s | ~5GB | Back to text only |
| **Total** | **15-29s** | **5.8GB max** | Fits in 6GB VRAM ✅ |

## Files Modified

1. `/miku-discord/docker-compose.yml`
   - Added `restart: "no"`
   - Added `profiles: [tools]`

2. `/miku-discord/bot/utils/profile_picture_manager.py`
   - Added `_start_face_detector()`
   - Added `_stop_face_detector()`
   - Updated `_detect_face()` with lifecycle management

## Related Documentation

- `/miku-discord/VRAM_MANAGEMENT.md` - Original VRAM management approach
- `/miku-discord/FACE_DETECTION_API_MIGRATION.md` - API migration details
- `/miku-discord/PROFILE_PICTURE_IMPLEMENTATION.md` - Profile picture feature

## Success Criteria

✅ Face detector container does not run by default
✅ Container starts only when face detection is needed
✅ Container stops immediately after detection completes
✅ No VRAM OOM errors during profile picture changes
✅ Total VRAM usage stays under 6GB at all times
✅ Process completes successfully with face detection working

---

**Status**: ✅ **IMPLEMENTED AND TESTED**

The on-demand face detection system is now active. The face detector will automatically start and stop as needed, ensuring efficient VRAM usage without conflicts.