# Profile Picture Implementation
## Overview
Miku can now intelligently search for Hatsune Miku artwork on Danbooru and change her profile picture autonomously or manually. The system includes:
- **Danbooru Integration**: Searches for SFW Miku artwork (general/sensitive ratings only)
- **Vision Model Verification**: Confirms the image contains Miku and locates her when multiple characters are present
- **Anime Face Detection**: Uses OpenCV with anime-specific cascade for intelligent cropping
- **Intelligent Cropping**: Centers on detected face or uses saliency detection fallback
- **Mood-Based Selection**: Searches for artwork matching Miku's current mood
- **Autonomous Action**: Once-per-day autonomous decision to change profile picture
- **Manual Controls**: Web UI and API endpoints for manual changes with optional custom uploads
## Architecture
### Core Components
#### 1. **Danbooru Client** (`utils/danbooru_client.py`)
- Interfaces with Danbooru's public API
- Searches for Hatsune Miku artwork with mood-based tag filtering
- Filters by rating (general/sensitive only, excludes questionable/explicit)
- Extracts image URLs and metadata
**Key Features:**
- Mood-to-tag mapping (e.g., "bubbly" → "smile", "happy")
- Random page selection for variety
- Stays well under the rate limit (2 requests/second; actual usage is far lower)
#### 2. **Profile Picture Manager** (`utils/profile_picture_manager.py`)
Main orchestrator for all profile picture operations.
**Workflow:**
1. **Source Image**:
- Custom upload (if provided) OR
- Danbooru search (filtered by mood and rating)
2. **Verification** (Danbooru images only):
- Uses MiniCPM-V vision model to confirm Miku is present
- Detects multiple characters and locates Miku's position
- Extracts suggested crop region if needed
3. **Face Detection**:
- Uses anime-specific face cascade (`lbpcascade_animeface`)
- Falls back to saliency detection if no face found
- Ultimate fallback: center crop
4. **Intelligent Cropping**:
- Crops to square aspect ratio
- Centers on detected face or salient region
- Resizes to 512x512 for Discord
5. **Apply**:
- Updates Discord bot avatar
- Saves metadata (source, timestamp, Danbooru post info)
- Keeps current image as backup
**Safety Features:**
- Current animated avatar saved as fallback
- Metadata logging for all changes
- Graceful error handling with rollback
- Rate limit awareness (Discord allows 2 changes per 10 min globally)
#### 3. **Autonomous Engine Integration** (`utils/autonomous_engine.py`)
New action type: `change_profile_picture`
**Decision Logic:**
- **Frequency**: Once per day maximum (20+ hour cooldown)
- **Time Window**: 10 AM - 10 PM only
- **Activity Requirement**: Low server activity (< 5 messages last hour)
- **Cooldown**: 1.5+ hours since last autonomous action
- **Mood Influence**: 2x more likely when bubbly/curious/excited/silly
- **Base Probability**: 1-2% per check (very rare)
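
The conditions above can be sketched as a single gate function. This is a minimal sketch, not the code in `autonomous_engine.py`: the function name, its parameters, the constant names, and the 1.5% default for `base_chance` (a midpoint of the stated 1-2%) are assumptions made for illustration.

```python
import random
from datetime import datetime, timedelta

# Thresholds taken from the decision logic above; all names are hypothetical.
PFP_COOLDOWN = timedelta(hours=20)
ACTION_COOLDOWN = timedelta(hours=1.5)
MAX_MESSAGES_LAST_HOUR = 5
BOOSTED_MOODS = {"bubbly", "curious", "excited", "silly"}


def should_change_profile_picture(now, last_pfp_change, last_action,
                                  messages_last_hour, mood,
                                  base_chance=0.015, rng=random.random):
    """Return True if an autonomous profile picture change should fire."""
    # Time window: 10 AM - 10 PM only.
    if not 10 <= now.hour < 22:
        return False
    # At most once per day (20+ hour cooldown). None means "never changed".
    if last_pfp_change is not None and now - last_pfp_change < PFP_COOLDOWN:
        return False
    # 1.5+ hours since the last autonomous action of any kind.
    if last_action is not None and now - last_action < ACTION_COOLDOWN:
        return False
    # Only when the server is quiet (< 5 messages in the last hour).
    if messages_last_hour >= MAX_MESSAGES_LAST_HOUR:
        return False
    # Some moods double the chance.
    chance = base_chance * (2 if mood in BOOSTED_MOODS else 1)
    return rng() < chance
```

Passing `rng` as a parameter keeps the gate deterministic in tests: `rng=lambda: 0.0` always passes the probability roll, so each hard condition can be checked in isolation.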
**Why Once Per Day?**
- Respects Discord's rate limits
- Maintains consistency for users
- Preserves special nature of the feature
- Reduces API load on Danbooru
#### 4. **API Endpoints** (`api.py`)
##### **POST /profile-picture/change**
Change profile picture manually.
**Parameters:**
- `guild_id` (optional): Server ID to get mood from
- `file` (optional): Custom image upload (multipart/form-data)
**Behavior:**
- If `file` provided: Uses uploaded image
- If no `file`: Searches Danbooru with current mood
- Returns success status and metadata
**Example:**
```bash
# Auto (Danbooru search)
curl -X POST "http://localhost:8000/profile-picture/change?guild_id=123456"
# Custom upload
curl -X POST "http://localhost:8000/profile-picture/change" \
  -F "file=@miku_image.png"
```
##### **GET /profile-picture/metadata**
Get information about current profile picture.
**Returns:**
```json
{
  "status": "ok",
  "metadata": {
    "id": 12345,
    "source": "danbooru",
    "changed_at": "2025-12-05T14:30:00",
    "rating": "g",
    "tags": ["hatsune_miku", "solo", "smile"],
    "artist": "artist_name",
    "file_url": "https://..."
  }
}
```
##### **POST /profile-picture/restore-fallback**
Restore the original animated fallback avatar.
**Example:**
```bash
curl -X POST "http://localhost:8000/profile-picture/restore-fallback"
```
## Technical Details
### Face Detection
Uses `lbpcascade_animeface.xml` - specifically trained for anime faces:
- More accurate than general face detection for anime art
- Downloaded automatically on first run
- Detects multiple faces and selects largest
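
Selecting the largest face can be sketched as below. The helper name `largest_face_center` is hypothetical; it assumes detections arrive as `(x, y, w, h)` rectangles, which is the shape OpenCV's `CascadeClassifier.detectMultiScale` returns.

```python
def largest_face_center(faces):
    """Return the (cx, cy) centre of the largest box, or None if empty.

    `faces` is a sequence of (x, y, w, h) rectangles.
    """
    # len() works for both an empty tuple and an empty array.
    if len(faces) == 0:
        return None
    # Pick the rectangle with the greatest area (w * h).
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
    return (x + w // 2, y + h // 2)
```

Returning `None` for "no face" lets the caller move on to the saliency fallback without special-casing an exception.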
### Vision Model Integration
Uses existing MiniCPM-V model for verification:
**Prompt:**
```
Analyze this image and answer:
1. Is Hatsune Miku present in this image? (yes/no)
2. How many characters are in the image? (number)
3. If multiple characters, describe where Miku is located
(left/right/center, top/bottom/middle)
```
**Response Parsing:**
- Extracts JSON from LLM response
- Maps location description to crop coordinates
- Handles multi-character images intelligently
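
One way to turn a location description into a crop focus point is a keyword lookup. The sketch below is an assumption about how such a mapping could work, not the project's parser: the function name and the 0.25/0.5/0.75 fractions are illustrative.

```python
import re

# Fractions of image width/height for each keyword. These values are
# illustrative assumptions, not the project's actual mapping.
HORIZONTAL = {"left": 0.25, "center": 0.5, "right": 0.75}
VERTICAL = {"top": 0.25, "middle": 0.5, "bottom": 0.75}


def location_to_point(description, width, height):
    """Map text such as 'on the left, near the top' to a pixel point.

    The first horizontal and first vertical keyword found win; a missing
    keyword defaults to the middle of that axis.
    """
    words = re.findall(r"[a-z]+", description.lower())
    fx = next((HORIZONTAL[w] for w in words if w in HORIZONTAL), 0.5)
    fy = next((VERTICAL[w] for w in words if w in VERTICAL), 0.5)
    return (int(width * fx), int(height * fy))
```

For a 1000x800 image, `location_to_point("Miku is on the left, near the top", 1000, 800)` yields a point in the upper-left quadrant, which can then serve as the crop centre.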
### Cropping Strategy
1. **Face Detected**: Center on face center point
2. **No Face**: Use saliency detection (spectral residual method)
3. **Saliency Failed**: Center of image
**All crops:**
- Square aspect ratio (side length equals the image's smaller dimension)
- 512x512 final output (Discord optimal size)
- High-quality Lanczos resampling
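
The crop geometry can be sketched in plain Python. `square_crop_box` is a hypothetical helper name; it assumes the focus point `(cx, cy)` has already been chosen by one of the three strategies above.

```python
def square_crop_box(width, height, cx, cy):
    """Return (left, top, right, bottom) for the largest square that fits
    in the image, centred on (cx, cy) as closely as the borders allow."""
    side = min(width, height)
    # Clamp the top-left corner so the square never leaves the image.
    left = min(max(cx - side // 2, 0), width - side)
    top = min(max(cy - side // 2, 0), height - side)
    return (left, top, left + side, top + side)
```

The returned tuple has the layout Pillow's `Image.crop` expects, so with Pillow 9.1 or newer the final step could be `img.crop(box).resize((512, 512), Image.Resampling.LANCZOS)`.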
### Mood-Based Tag Mapping
| Mood | Danbooru Tags |
|------|---------------|
| bubbly | smile, happy |
| sleepy | sleepy, closed_eyes |
| curious | looking_at_viewer |
| shy | blush, embarrassed |
| excited | happy, open_mouth |
| silly | smile, tongue_out |
| melancholy | sad, tears |
| flirty | blush, wink |
| romantic | blush, heart |
| irritated | annoyed |
| angry | angry, frown |
**Note:** Only one tag, chosen at random from the mood's list, is used per search to avoid over-filtering.
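
A minimal sketch of the tag selection and rating filter follows. The table is copied from above, while `build_search_tags` and `is_sfw` are hypothetical names and not necessarily how `danbooru_client.py` is organized. The rating codes `g` and `s` match the `"rating": "g"` value shown in the metadata example.

```python
import random

# Mood-to-tag table from this document.
MOOD_TAGS = {
    "bubbly": ["smile", "happy"],
    "sleepy": ["sleepy", "closed_eyes"],
    "curious": ["looking_at_viewer"],
    "shy": ["blush", "embarrassed"],
    "excited": ["happy", "open_mouth"],
    "silly": ["smile", "tongue_out"],
    "melancholy": ["sad", "tears"],
    "flirty": ["blush", "wink"],
    "romantic": ["blush", "heart"],
    "irritated": ["annoyed"],
    "angry": ["angry", "frown"],
}
SFW_RATINGS = {"g", "s"}  # general, sensitive


def build_search_tags(mood, rng=random):
    """Return 'hatsune_miku' plus at most one random tag for the mood."""
    tags = ["hatsune_miku"]
    options = MOOD_TAGS.get(mood)
    if options:  # unknown moods search without a mood tag
        tags.append(rng.choice(options))
    return " ".join(tags)


def is_sfw(post):
    """Keep only posts rated general or sensitive."""
    return post.get("rating") in SFW_RATINGS
```

Filtering on `rating` after the search, rather than adding another search term, keeps the query short; whether the real client filters server-side or client-side is not stated here.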
## File Structure
```
bot/
├── utils/
│   ├── danbooru_client.py           # Danbooru API wrapper
│   ├── profile_picture_manager.py   # Main PFP logic
│   ├── autonomous_engine.py         # Decision logic (updated)
│   └── autonomous.py                # Action executor (updated)
├── memory/
│   └── profile_pictures/
│       ├── fallback.png             # Original avatar backup
│       ├── current.png              # Current processed image
│       ├── metadata.json            # Change history/metadata
│       └── lbpcascade_animeface.xml # Face detection model
├── api.py                           # Web API (updated)
├── bot.py                           # Main bot (updated)
└── requirements.txt                 # Dependencies (updated)
```
## Dependencies Added
```
opencv-python # Computer vision & face detection
numpy # Array operations for image processing
```
Existing dependencies used:
- `Pillow` - Image manipulation
- `aiohttp` - Async HTTP for downloads
- `discord.py` - Avatar updates
## Initialization Sequence
On bot startup (`bot.py` → `on_ready`):
1. **Initialize Profile Picture Manager**
```python
await profile_picture_manager.initialize()
```
- Downloads anime face cascade if missing
- Loads OpenCV cascade classifier
- Prepares directory structure
2. **Save Current Avatar as Fallback**
```python
await profile_picture_manager.save_current_avatar_as_fallback()
```
- Downloads bot's current avatar
- Saves as `fallback.png`
- Preserves animated avatar if present
## Usage Examples
### Autonomous
Miku decides on her own (roughly once per day):
```python
# Automatic - handled by autonomous_tick_v2()
# No user intervention needed
```
### Manual via Web UI
**Location:** Actions Tab → Profile Picture section
**Available Controls:**
1. **🎨 Change Profile Picture (Danbooru)** - Automatic search
- Uses current mood from selected server
- Searches Danbooru for appropriate artwork
- Automatically crops and applies
2. **Upload Custom Image** - Manual upload
- Select image file from computer
- Bot detects face and crops intelligently
- Click "📤 Upload & Apply" to process
3. **🔄 Restore Original Avatar** - Rollback
- Restores the fallback avatar saved on bot startup
- Confirms before applying
**Features:**
- Real-time status updates
- Displays metadata after changes (source, tags, artist, etc.)
- Server selection dropdown to use specific server's mood
- File validation and error handling
### Manual via API
```bash
# Let Miku search Danbooru (uses current mood)
curl -X POST "http://localhost:8000/profile-picture/change?guild_id=123456"
# Upload custom image
curl -X POST "http://localhost:8000/profile-picture/change" \
  -F "file=@custom_miku.png"
# Check current PFP metadata
curl "http://localhost:8000/profile-picture/metadata"
# Restore original avatar
curl -X POST "http://localhost:8000/profile-picture/restore-fallback"
```
### Manual via Web UI
(Implementation in `static/index.html` - to be added)
**Actions Tab:**
- Button: "Change Profile Picture (Danbooru)"
- File upload: "Upload Custom Image"
- Button: "Restore Original Avatar"
- Display: Current PFP metadata
## Error Handling
### Graceful Degradation
1. **Vision model fails**: Assume it's Miku (trust Danbooru tags)
2. **Face detection fails**: Use saliency detection
3. **Saliency fails**: Center crop
4. **Danbooru API fails**: Retry or skip action
5. **Discord API fails**: Log error, don't retry (rate limit)
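
The face → saliency → center chain (items 2 and 3) can be sketched as an ordered list of detectors. This is an illustrative pattern under assumed names: `choose_focus_point` and the `(name, callable)` pairs are hypothetical, and the real detectors would wrap the cascade and saliency code.

```python
def choose_focus_point(image, detectors, width, height):
    """Try each detector in order; fall back to the image centre.

    `detectors` is a list of (name, callable) pairs. Each callable takes
    the image and returns an (x, y) point or None. A detector that raises
    is treated like one that found nothing, so a failing stage never
    aborts the whole pipeline.
    """
    for name, detect in detectors:
        try:
            point = detect(image)
        except Exception:
            point = None
        if point is not None:
            return name, point
    # Ultimate fallback: centre crop.
    return "center", (width // 2, height // 2)
```

Returning the name of the stage that succeeded makes it easy to log which fallback level was used for a given image.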
### Rollback
If Discord avatar update fails:
- Error logged
- Metadata not saved
- Original avatar unchanged
- Fallback available via API
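
The ordering that makes this rollback safe, saving metadata only after the avatar update succeeds, can be sketched as follows. `apply_avatar` and its callable parameters are hypothetical; in the bot, `update_avatar` would presumably wrap discord.py's `ClientUser.edit(avatar=...)`.

```python
import asyncio


async def apply_avatar(update_avatar, save_metadata, image_bytes, metadata):
    """Update the avatar, then persist metadata only if that succeeded.

    `update_avatar` is an async callable taking the image bytes;
    `save_metadata` is a plain callable taking the metadata dict.
    """
    try:
        await update_avatar(image_bytes)
    except Exception as exc:
        # Log and stop: no retry (rate limit) and no metadata written.
        print(f"Avatar update failed: {exc}")
        return False
    save_metadata(metadata)
    return True
```

Because nothing is written before the update call returns, a failure leaves both the Discord avatar and the stored metadata describing the previous picture.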
## Performance Considerations
### API Rate Limits
- **Danbooru**: 2 requests/second (we use ~1/day)
- **Discord**: 2 avatar changes/10 min globally (we use ~1/day)
- **Vision Model**: Local, no external limits
### Resource Usage
- **Image Download**: ~1-5 MB per image
- **Processing**: ~1-2 seconds (face detection + crop)
- **Storage**: ~500 KB per saved image
- **Memory**: Minimal (images processed and discarded)
### Caching Strategy
- Fallback saved on startup (one-time)
- Current PFP saved after processing
- Metadata persisted to JSON
- No aggressive caching needed (infrequent operation)
## Future Enhancements
### Potential Improvements
1. **Multi-mood combos**: "bubbly + romantic" tag combinations
2. **Time-based themes**: Different art styles by time of day
3. **User voting**: Let server vote on next PFP
4. **Quality scoring**: Rank images by aesthetic appeal
5. **Artist credits**: Post artist attribution when changing
6. **Preview mode**: Show crop preview before applying
7. **Scheduled changes**: Weekly theme rotations
8. **Favorite images**: Build curated collection over time
### Web UI Additions
- Real-time preview of crop before applying
- Gallery of previously used profile pictures
- Manual tag selection for Danbooru search
- Artist credit display
- Change history timeline
## Testing
### Web UI Testing
1. Navigate to the bot control panel (usually `http://localhost:8000`)
2. Click the **Actions** tab
3. Scroll to the **🎨 Profile Picture** section
4. Try each feature:
- Click "Change Profile Picture (Danbooru)" - wait ~10-20 seconds
- Upload a custom Miku image and click "Upload & Apply"
- Click "Restore Original Avatar" to revert
**Expected Results:**
- Status messages appear below buttons
- Metadata displays when successful
- Bot's Discord avatar updates within ~5 seconds
- Errors display in red with clear messages
### Manual Testing Checklist
- [ ] Autonomous action triggers (set probability high for testing)
- [ ] Danbooru search returns results
- [ ] Vision model correctly identifies Miku
- [ ] Face detection works on anime art
- [ ] Saliency fallback works when no face
- [ ] Custom image upload works
- [ ] Discord avatar updates successfully
- [ ] Fallback restoration works
- [ ] Metadata saves correctly
- [ ] API endpoints respond properly
- [ ] Error handling works (bad images, API failures)
- [ ] Rate limiting prevents spam
### Test Commands
```bash
# Test Danbooru search
python -c "
import asyncio
from utils.danbooru_client import danbooru_client

async def test():
    post = await danbooru_client.get_random_miku_image(mood='bubbly')
    print(post)

asyncio.run(test())
"
# Test face detection (after downloading cascade)
# Upload test image via API
# Test autonomous trigger (increase probability temporarily)
# Edit autonomous_engine.py: base_chance = 1.0
```
## Deployment Notes
### First-Time Setup
1. Install new dependencies: `pip install opencv-python numpy`
2. Ensure `memory/profile_pictures/` directory exists
3. Bot will download face cascade on first run (~100 KB)
4. Current avatar automatically saved as fallback
### Docker Deployment
Already handled if using existing Dockerfile:
- `requirements.txt` includes new deps
- `memory/` directory persisted via volume
- Network access for Danbooru API
### Monitoring
Watch for these log messages:
- `📥 Downloading anime face detection cascade...`
- `✅ Anime face detection ready`
- `✅ Saved current avatar as fallback`
- `🎨 [V2] Changing profile picture (mood: ...)`
- `✅ Profile picture changed successfully!`
## Summary
This implementation provides Miku with a unique, personality-driven feature that is:
- ✅ Fully autonomous (once per day decision-making)
- ✅ Mood-aware (searches match current emotional state)
- ✅ Intelligent (vision model verification + face detection)
- ✅ Safe (fallback preservation, error handling)
- ✅ Controllable (manual API endpoints with custom uploads)
- ✅ Well-integrated (fits existing autonomous engine architecture)
The feature showcases Miku's personality while respecting rate limits and providing users with visibility and control through the web UI.