# On-Demand Face Detection - Final Implementation

## Problem Solved

**Issue:** The GPU only has 6GB VRAM, but we needed to run:

- Text model (~4.8GB)
- Vision model (~1GB when loaded)
- Face detector (~918MB when loaded)

**Result:** Vision model + Face detector = OOM (Out of Memory)

## Solution: On-Demand Container Management

The face detector container does NOT start by default. It starts only when needed for face detection, then stops immediately afterward to free VRAM.
## New Process Flow

### Profile Picture Change (Danbooru)

```
1. Danbooru Search & Download
   └─> Download image from Danbooru

2. Vision Model Verification
   └─> llama-swap loads vision model
   └─> Verify image contains Miku
   └─> Vision model stays loaded (auto-unload after 15min TTL)

3. Face Detection (NEW ON-DEMAND FLOW)
   ├─> Swap to text model (vision unloads)
   ├─> Wait 3s for VRAM to clear
   ├─> Start anime-face-detector container   <-- STARTS HERE
   ├─> Wait for API to be ready (~5-10s)
   ├─> Call face detection API
   ├─> Get bbox & keypoints
   └─> Stop anime-face-detector container    <-- STOPS HERE

4. Crop & Upload
   └─> Crop image using face bbox
   └─> Upload to Discord
```
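The four steps above can be sketched as one async pipeline. This is a hedged illustration with hypothetical step names, not the bot's actual code (the real logic lives in `bot/utils/profile_picture_manager.py`); the steps are injected as callables so the flow can be exercised without Danbooru, Docker, or a GPU:

```python
import asyncio

async def change_profile_picture(steps):
    """Run download -> verify -> detect -> crop/upload in sequence.

    `steps` is a dict of async callables (hypothetical names), injected
    so the flow can be tested without external services.
    """
    image = await steps["download"]()              # 1. Danbooru search & download
    if not await steps["verify"](image):           # 2. vision-model verification
        return None                                #    (not Miku -> give up)
    bbox = await steps["detect_face"](image)       # 3. on-demand face detection
    if bbox is None:                               #    (container start/stop inside)
        return None
    return await steps["crop_and_upload"](image, bbox)  # 4. crop & upload
```

Because face detection runs strictly after verification (and the vision model has been swapped out by then), the vision model and the detector are never resident in VRAM at the same time.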
## VRAM Timeline

```
Time:      0s        10s       15s       25s      28s     30s
           │         │         │         │        │       │
Vision:    ████████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░  ← Unloads when swapping
Text:      ░░░░░░░░░░░░░░░░░░░░████████████████████████████  ← Loaded for swap
Face Det:  ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████░░░░░░░░  ← Starts, detects, stops
VRAM:      ~5GB      ~5GB      ~1GB     ~5.8GB    ~1GB    ~5GB
           Vision    Vision    Swap     Face      Swap    Text only
```
## Key Changes

### 1. Docker Compose (`docker-compose.yml`)

```yaml
anime-face-detector:
  # ... config ...
  restart: "no"  # Don't auto-restart
  profiles:
    - tools      # Don't start by default (requires --profile tools)
```

**Result:** The container exists but doesn't run unless explicitly started.
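For context, a fuller service block might look like the sketch below. Only `restart` and `profiles` come from this doc; the build context, container name, and Gradio port mapping are assumptions about the project layout, not its actual config:

```yaml
anime-face-detector:
  build: ./anime-face-detector   # assumed build context
  container_name: anime-face-detector
  ports:
    - "7860:7860"                # assumed mapping for the Gradio UI
  restart: "no"                  # don't auto-restart
  profiles:
    - tools                      # excluded from a plain `docker-compose up`
```

With a `profiles` entry, Compose skips the service entirely unless `--profile tools` is passed, which is what lets the bot own the container's lifecycle via plain `docker start`/`docker stop`.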
### 2. Profile Picture Manager (`bot/utils/profile_picture_manager.py`)

**Added methods:**

`_start_face_detector()`
- Runs `docker start anime-face-detector`
- Waits up to 30s for the API health check
- Returns `True` when ready

`_stop_face_detector()`
- Runs `docker stop anime-face-detector`
- Frees ~918MB VRAM immediately

`_ensure_vram_available()` (updated)
- Swaps to text model
- Waits 3s for the vision model to unload
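A minimal sketch of what the start/stop helpers could look like. The command runner and health check are injected as callables here, which is an assumption made for testability; the real methods presumably shell out to `docker` and poll the detector's HTTP API directly:

```python
import asyncio

async def run_docker(*args):
    """Run a docker CLI command; True if it exited 0."""
    proc = await asyncio.create_subprocess_exec(
        "docker", *args,
        stdout=asyncio.subprocess.DEVNULL,
        stderr=asyncio.subprocess.DEVNULL,
    )
    return await proc.wait() == 0

async def start_face_detector(run_cmd, health_check, timeout=30.0, poll=1.0):
    """`docker start` the detector, then poll its API until ready."""
    if not await run_cmd("start", "anime-face-detector"):
        return False                 # container missing or docker error
    deadline = asyncio.get_running_loop().time() + timeout
    while asyncio.get_running_loop().time() < deadline:
        if await health_check():
            return True              # API answered: detector is ready
        await asyncio.sleep(poll)
    return False                     # gave up after `timeout` seconds

async def stop_face_detector(run_cmd):
    """`docker stop` frees the detector's ~918MB of VRAM."""
    return await run_cmd("stop", "anime-face-detector")
```

Returning `False` instead of raising keeps the caller's logic simple: a failed start just aborts face detection rather than crashing the whole profile-picture change.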
**Updated method:** `_detect_face()`

```python
async def _detect_face(self, image_bytes: bytes, debug: bool = False):
    face_detector_started = False
    try:
        # 1. Free VRAM by swapping to text model
        await self._ensure_vram_available(debug=debug)

        # 2. Start face detector container
        if not await self._start_face_detector(debug=debug):
            return None
        face_detector_started = True

        # 3. Call face detection API
        # ... detection logic ...
        return detection_result
    finally:
        # 4. ALWAYS stop the container to free VRAM
        if face_detector_started:
            await self._stop_face_detector(debug=debug)
```
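Where the snippet says `# ... detection logic ...`, the bot calls the detector's HTTP API and picks a face from the response. The response shape below (a `bbox` plus `score` per detection) is an assumption about that API, so treat this as a sketch of the selection step only:

```python
def pick_largest_face(detections, min_score=0.5):
    """Pick the biggest sufficiently-confident face from a list of
    {"bbox": [x1, y1, x2, y2], "score": float} detections (assumed shape)."""
    best, best_area = None, 0.0
    for det in detections:
        if det.get("score", 0.0) < min_score:
            continue  # skip low-confidence detections
        x1, y1, x2, y2 = det["bbox"]
        area = max(0.0, x2 - x1) * max(0.0, y2 - y1)
        if area > best_area:
            best, best_area = det, area
    return best  # None if nothing passed the threshold
```

Choosing the largest confident face gives the crop step a single stable bbox even when the detector returns several candidates.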
## Container States

**Normal operation (most of the time):**

```
llama-swap:           RUNNING  (~4.8GB VRAM - text model loaded)
miku-bot:             RUNNING  (minimal VRAM)
anime-face-detector:  STOPPED  (0 VRAM)
```

**During a profile picture change:**

```
Phase 1 - Vision Verification:
  llama-swap:           RUNNING  (~5GB VRAM - vision model)
  miku-bot:             RUNNING
  anime-face-detector:  STOPPED

Phase 2 - Model Swap:
  llama-swap:           RUNNING  (~1GB VRAM - transitioning)
  miku-bot:             RUNNING
  anime-face-detector:  STOPPED

Phase 3 - Face Detection:
  llama-swap:           RUNNING  (~5GB VRAM - text model)
  miku-bot:             RUNNING
  anime-face-detector:  RUNNING  (~918MB VRAM - detecting)

Phase 4 - Cleanup:
  llama-swap:           RUNNING  (~5GB VRAM - text model)
  miku-bot:             RUNNING
  anime-face-detector:  STOPPED  (0 VRAM)
```
## Benefits

- ✅ **No VRAM conflicts:** Sequential processing with container lifecycle management
- ✅ **Automatic:** The bot handles all starting/stopping
- ✅ **Efficient:** The face detector only uses VRAM when actively needed (~10-15s)
- ✅ **Reliable:** Always stops in a `finally` block, even on errors
- ✅ **Simple:** Uses standard `docker` commands from inside the container
## Commands

### Manual Container Management

```bash
# Start face detector manually (for testing)
docker start anime-face-detector

# Check if it's running
docker ps | grep anime-face-detector

# Stop it manually
docker stop anime-face-detector

# Check VRAM usage
nvidia-smi
```

### Start with Profile (for Gradio UI testing)

```bash
# Start with face detector running
docker-compose --profile tools up -d

# Use Gradio UI at http://localhost:7860

# Stop everything
docker-compose down
```
## Monitoring

### Check Container Status

```bash
docker ps -a --filter name=anime-face-detector
```

### Watch VRAM During Profile Change

```bash
# Terminal 1: Watch GPU memory
watch -n 0.5 nvidia-smi

# Terminal 2: Trigger profile change
curl -X POST http://localhost:3939/profile-picture/change
```
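Instead of eyeballing `watch nvidia-smi`, the same check can be scripted. `--query-gpu=memory.used --format=csv,noheader` is a standard `nvidia-smi` query mode, and a small parser makes the output usable from Python:

```python
import subprocess

def parse_used_mib(csv_output):
    """Parse lines like '4821 MiB' (one per GPU) into integer MiB values."""
    return [int(line.split()[0]) for line in csv_output.strip().splitlines()]

def gpu_memory_used_mib():
    """Return the used VRAM in MiB for each GPU on the host."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader"],
        text=True,
    )
    return parse_used_mib(out)
```

During a profile-picture change the sampled values should follow the VRAM timeline: around 5000 MiB during verification, a dip near 1000 MiB during the swap, and a peak near 5800 MiB while the detector is running.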
### Check Bot Logs

```bash
docker logs -f miku-bot | grep -E "face|VRAM|Starting|Stopping"
```

You should see:

```
💾 Swapping to text model to free VRAM for face detection...
✅ Vision model unloaded, VRAM available
🚀 Starting face detector container...
✅ Face detector ready
👤 Detected 1 face(s) via API...
🛑 Stopping face detector to free VRAM...
✅ Face detector stopped
```
## Testing

### Test On-Demand Face Detection

```bash
# 1. Verify face detector is stopped
docker ps | grep anime-face-detector
# Should show nothing

# 2. Check VRAM (should be ~4.8GB for text model only)
nvidia-smi

# 3. Trigger profile picture change
curl -X POST "http://localhost:3939/profile-picture/change"

# 4. Watch logs in another terminal
docker logs -f miku-bot

# 5. After completion, verify face detector stopped again
docker ps | grep anime-face-detector
# Should show nothing again

# 6. Check VRAM returned to ~4.8GB
nvidia-smi
```
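Steps 1 and 5 of this checklist can be automated. The helper below just checks `docker ps` output for the container name (a sketch; `ps_output` can be injected so the check is testable without Docker):

```python
import subprocess

def detector_running(ps_output=None, name="anime-face-detector"):
    """True if `docker ps` lists the face-detector container.

    By default this shells out to `docker ps --format {{.Names}}`;
    pass `ps_output` directly to test the parsing without Docker.
    """
    if ps_output is None:
        ps_output = subprocess.check_output(
            ["docker", "ps", "--format", "{{.Names}}"], text=True)
    return name in ps_output.split()
```

A test harness would assert `detector_running()` is `False`, trigger the profile change, and assert it is `False` again once the bot logs `Face detector stopped`.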
## Troubleshooting

### Face Detector Won't Start

**Symptom:** `⚠️ Could not start face detector`

**Solutions:**

```bash
# Check if container exists
docker ps -a | grep anime-face-detector

# If missing, rebuild
cd /home/koko210Serve/docker/miku-discord
docker-compose build anime-face-detector

# Check logs
docker logs anime-face-detector
```

### Still Getting OOM

**Symptom:** `cudaMalloc failed: out of memory`

**Check:**

```bash
# What's using VRAM?
nvidia-smi

# Is face detector still running?
docker ps | grep anime-face-detector

# Stop it manually
docker stop anime-face-detector
```

### Container Won't Stop

**Symptom:** Face detector stays running after detection

**Solutions:**

```bash
# Force stop
docker stop anime-face-detector

# Check for errors in bot logs
docker logs miku-bot | grep "stop"

# Verify the finally block is executing
docker logs miku-bot | grep "Stopping face detector"
```
## Performance Metrics
| Operation | Duration | VRAM Peak | Notes |
|---|---|---|---|
| Vision verification | 5-10s | ~5GB | Vision model loaded |
| Model swap | 3-5s | ~1GB | Transitioning |
| Container start | 5-10s | ~5GB | Text + starting detector |
| Face detection | 1-2s | ~5.8GB | Text + detector running |
| Container stop | 1-2s | ~5GB | Back to text only |
| Total | 15-29s | 5.8GB max | Fits in 6GB VRAM ✅ |
## Files Modified

- `/miku-discord/docker-compose.yml`
  - Added `restart: "no"`
  - Added `profiles: [tools]`
- `/miku-discord/bot/utils/profile_picture_manager.py`
  - Added `_start_face_detector()`
  - Added `_stop_face_detector()`
  - Updated `_detect_face()` with lifecycle management
## Related Documentation

- `/miku-discord/VRAM_MANAGEMENT.md` - Original VRAM management approach
- `/miku-discord/FACE_DETECTION_API_MIGRATION.md` - API migration details
- `/miku-discord/PROFILE_PICTURE_IMPLEMENTATION.md` - Profile picture feature
## Success Criteria

- ✅ Face detector container does not run by default
- ✅ Container starts only when face detection is needed
- ✅ Container stops immediately after detection completes
- ✅ No VRAM OOM errors during profile picture changes
- ✅ Total VRAM usage stays under 6GB at all times
- ✅ Process completes successfully with face detection working

**Status: ✅ IMPLEMENTED AND TESTED**

The on-demand face detection system is now active. The face detector will automatically start and stop as needed, ensuring efficient VRAM usage without conflicts.