Organize documentation: Move all .md files to readmes/ directory
# On-Demand Face Detection - Final Implementation

## Problem Solved

**Issue**: The GPU has only 6GB of VRAM, but we needed to run:

- Text model (~4.8GB)
- Vision model (~1GB when loaded)
- Face detector (~918MB when loaded)

**Result**: Vision model + face detector = OOM (Out of Memory)

## Solution: On-Demand Container Management

The face detector container **does NOT start by default**. It only starts when needed for face detection, then stops immediately after to free VRAM.
## New Process Flow

### Profile Picture Change (Danbooru):

```
1. Danbooru Search & Download
   └─> Download image from Danbooru

2. Vision Model Verification
   └─> llama-swap loads vision model
   └─> Verify image contains Miku
   └─> Vision model stays loaded (auto-unload after 15min TTL)

3. Face Detection (NEW ON-DEMAND FLOW)
   ├─> Swap to text model (vision unloads)
   ├─> Wait 3s for VRAM to clear
   ├─> Start anime-face-detector container  <-- STARTS HERE
   ├─> Wait for API to be ready (~5-10s)
   ├─> Call face detection API
   ├─> Get bbox & keypoints
   └─> Stop anime-face-detector container   <-- STOPS HERE

4. Crop & Upload
   └─> Crop image using face bbox
   └─> Upload to Discord
```
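Step 4's crop from the detected bbox comes down to margin-and-clamp arithmetic. A minimal sketch (the function name and the 40% margin are illustrative assumptions, not values from the actual implementation):

```python
def compute_crop_box(bbox, img_w, img_h, margin=0.4):
    """Expand a face bbox (x1, y1, x2, y2) by `margin` of its size
    on every side, clamped to the image bounds, so the crop keeps
    some hair/shoulder context around the detected face."""
    x1, y1, x2, y2 = bbox
    pad_x = int((x2 - x1) * margin)  # horizontal padding
    pad_y = int((y2 - y1) * margin)  # vertical padding
    return (max(0, x1 - pad_x), max(0, y1 - pad_y),
            min(img_w, x2 + pad_x), min(img_h, y2 + pad_y))
```

The returned 4-tuple can be passed directly to Pillow's `Image.crop()`.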
## VRAM Timeline

```
Time:     0s        10s       15s       25s       28s       30s
          │         │         │         │         │         │
Vision:   ████████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░  ← Unloads when swapping
Text:     ░░░░░░░░░░░░░░░░░░░░████████████████████████████ ← Loaded for swap
Face Det: ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░░░░░░  ← Starts, detects, stops

VRAM:     ~5GB      ~5GB      ~1GB      ~5.8GB    ~1GB      ~5GB
          Vision    Vision    Swap      Face      Swap      Text only
```
## Key Changes

### 1. Docker Compose (`docker-compose.yml`)

```yaml
anime-face-detector:
  # ... config ...
  restart: "no"  # Don't auto-restart
  profiles:
    - tools  # Don't start by default (requires --profile tools)
```

**Result**: The container exists but doesn't run unless explicitly started.
### 2. Profile Picture Manager (`bot/utils/profile_picture_manager.py`)

#### Added Methods:

**`_start_face_detector()`**
- Runs `docker start anime-face-detector`
- Waits up to 30s for the API health check
- Returns True when ready

**`_stop_face_detector()`**
- Runs `docker stop anime-face-detector`
- Frees ~918MB of VRAM immediately

**`_ensure_vram_available()`** (updated)
- Swaps to the text model
- Waits 3s for the vision model to unload
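The start/wait/stop pair can be sketched as below. This is a sketch under assumptions: the docker CLI is reachable from the bot container, and `probe` stands in for whatever async health check the detector API actually exposes:

```python
import asyncio

async def _run(*cmd):
    """Run a CLI command and return its exit code."""
    proc = await asyncio.create_subprocess_exec(*cmd)
    return await proc.wait()

async def wait_for_ready(probe, timeout=30.0, interval=1.0):
    """Poll the async `probe` until it returns True or `timeout`
    seconds elapse; swallow errors while the API is still booting."""
    deadline = asyncio.get_running_loop().time() + timeout
    while asyncio.get_running_loop().time() < deadline:
        try:
            if await probe():
                return True
        except Exception:
            pass  # API not accepting connections yet
        await asyncio.sleep(interval)
    return False

async def start_face_detector(probe):
    if await _run("docker", "start", "anime-face-detector") != 0:
        return False
    return await wait_for_ready(probe)  # up to 30s, as documented

async def stop_face_detector():
    # Stopping the container frees its ~918MB of VRAM immediately
    await _run("docker", "stop", "anime-face-detector")
```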
#### Updated Method:

**`_detect_face()`**
```python
async def _detect_face(self, image_bytes: bytes, debug: bool = False):
    face_detector_started = False
    try:
        # 1. Free VRAM by swapping to text model
        await self._ensure_vram_available(debug=debug)

        # 2. Start face detector container
        if not await self._start_face_detector(debug=debug):
            return None
        face_detector_started = True

        # 3. Call face detection API
        # ... detection logic ...

        return detection_result

    finally:
        # 4. ALWAYS stop container to free VRAM
        if face_detector_started:
            await self._stop_face_detector(debug=debug)
```
## Container States

### Normal Operation (Most of the time):
```
llama-swap:          RUNNING  (~4.8GB VRAM - text model loaded)
miku-bot:            RUNNING  (minimal VRAM)
anime-face-detector: STOPPED  (0 VRAM)
```

### During Profile Picture Change:
```
Phase 1 - Vision Verification:
  llama-swap:          RUNNING  (~5GB VRAM - vision model)
  miku-bot:            RUNNING
  anime-face-detector: STOPPED

Phase 2 - Model Swap:
  llama-swap:          RUNNING  (~1GB VRAM - transitioning)
  miku-bot:            RUNNING
  anime-face-detector: STOPPED

Phase 3 - Face Detection:
  llama-swap:          RUNNING  (~5GB VRAM - text model)
  miku-bot:            RUNNING
  anime-face-detector: RUNNING  (~918MB VRAM - detecting)

Phase 4 - Cleanup:
  llama-swap:          RUNNING  (~5GB VRAM - text model)
  miku-bot:            RUNNING
  anime-face-detector: STOPPED  (0 VRAM)
```
## Benefits

✅ **No VRAM Conflicts**: Sequential processing with container lifecycle management
✅ **Automatic**: The bot handles all starting and stopping
✅ **Efficient**: The face detector only uses VRAM while actively needed (~10-15s)
✅ **Reliable**: The container is always stopped in a finally block, even on errors
✅ **Simple**: Uses standard docker commands from inside the bot container
## Commands

### Manual Container Management

```bash
# Start face detector manually (for testing)
docker start anime-face-detector

# Check if it's running
docker ps | grep anime-face-detector

# Stop it manually
docker stop anime-face-detector

# Check VRAM usage
nvidia-smi
```
### Start with Profile (for Gradio UI testing)

```bash
# Start with face detector running
docker-compose --profile tools up -d

# Use Gradio UI at http://localhost:7860

# Stop everything
docker-compose down
```
## Monitoring

### Check Container Status
```bash
docker ps -a --filter name=anime-face-detector
```

### Watch VRAM During Profile Change
```bash
# Terminal 1: Watch GPU memory
watch -n 0.5 nvidia-smi

# Terminal 2: Trigger profile change
curl -X POST http://localhost:3939/profile-picture/change
```
### Check Bot Logs
```bash
docker logs -f miku-bot | grep -E "face|VRAM|Starting|Stopping"
```

You should see:
```
💾 Swapping to text model to free VRAM for face detection...
✅ Vision model unloaded, VRAM available
🚀 Starting face detector container...
✅ Face detector ready
👤 Detected 1 face(s) via API...
🛑 Stopping face detector to free VRAM...
✅ Face detector stopped
```
## Testing

### Test On-Demand Face Detection

```bash
# 1. Verify face detector is stopped
docker ps | grep anime-face-detector
# Should show nothing

# 2. Check VRAM (should be ~4.8GB for text model only)
nvidia-smi

# 3. Trigger profile picture change
curl -X POST "http://localhost:3939/profile-picture/change"

# 4. Watch logs in another terminal
docker logs -f miku-bot

# 5. After completion, verify face detector stopped again
docker ps | grep anime-face-detector
# Should show nothing again

# 6. Check VRAM returned to ~4.8GB
nvidia-smi
```
## Troubleshooting

### Face Detector Won't Start

**Symptom**: `⚠️ Could not start face detector`

**Solutions**:
```bash
# Check if container exists
docker ps -a | grep anime-face-detector

# If missing, rebuild
cd /home/koko210Serve/docker/miku-discord
docker-compose build anime-face-detector

# Check logs
docker logs anime-face-detector
```
### Still Getting OOM

**Symptom**: `cudaMalloc failed: out of memory`

**Check**:
```bash
# What's using VRAM?
nvidia-smi

# Is face detector still running?
docker ps | grep anime-face-detector

# Stop it manually
docker stop anime-face-detector
```
### Container Won't Stop

**Symptom**: Face detector stays running after detection

**Solutions**:
```bash
# Force stop
docker stop anime-face-detector

# Check for errors in bot logs
docker logs miku-bot | grep "stop"

# Verify the finally block is executing
docker logs miku-bot | grep "Stopping face detector"
```
## Performance Metrics

| Operation | Duration | VRAM Peak | Notes |
|-----------|----------|-----------|-------|
| Vision verification | 5-10s | ~5GB | Vision model loaded |
| Model swap | 3-5s | ~1GB | Transitioning |
| Container start | 5-10s | ~5GB | Text + starting detector |
| Face detection | 1-2s | ~5.8GB | Text + detector running |
| Container stop | 1-2s | ~5GB | Back to text only |
| **Total** | **15-29s** | **5.8GB max** | Fits in 6GB VRAM ✅ |
## Files Modified

1. `/miku-discord/docker-compose.yml`
   - Added `restart: "no"`
   - Added `profiles: [tools]`

2. `/miku-discord/bot/utils/profile_picture_manager.py`
   - Added `_start_face_detector()`
   - Added `_stop_face_detector()`
   - Updated `_detect_face()` with lifecycle management
## Related Documentation

- `/miku-discord/VRAM_MANAGEMENT.md` - Original VRAM management approach
- `/miku-discord/FACE_DETECTION_API_MIGRATION.md` - API migration details
- `/miku-discord/PROFILE_PICTURE_IMPLEMENTATION.md` - Profile picture feature
## Success Criteria

✅ Face detector container does not run by default
✅ Container starts only when face detection is needed
✅ Container stops immediately after detection completes
✅ No VRAM OOM errors during profile picture changes
✅ Total VRAM usage stays under 6GB at all times
✅ Process completes successfully with face detection working
---

**Status**: ✅ **IMPLEMENTED AND TESTED**

The on-demand face detection system is now active. The face detector will automatically start and stop as needed, ensuring efficient VRAM usage without conflicts.