# Embed Content Reading Feature
## Overview
Miku can now read and understand embedded content in Discord messages, including articles, images, videos, and other rich media that Discord automatically embeds when a link is shared.
## Supported Embed Types
### 1. **Article Embeds** (`rich`, `article`, `link`)
When you share a news article or blog post link, Discord automatically creates an embed with:
- **Title** - The article headline
- **Description** - A preview of the article content
- **Author** - The article author (if available)
- **Images** - Featured images or thumbnails
- **Custom Fields** - Additional metadata
Miku will:
- Extract and read the text content (title, description, fields)
- Analyze any embedded images
- Combine all this context to provide an informed response
### 2. **Image Embeds**
When links contain images that Discord auto-embeds:
- Miku downloads and analyzes the images using her vision model
- Provides descriptions and commentary based on what she sees
### 3. **Video Embeds**
For embedded videos from various platforms:
- Miku extracts multiple frames from the video
- Analyzes the visual content across frames
- Provides commentary on what's happening in the video
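
One simple way to pick "multiple frames" is to sample them evenly across the video. The sketch below is illustrative only: the function name `sample_frame_indices` and the default of six frames are assumptions, and the bot's actual frame-selection strategy is not described here.

```python
def sample_frame_indices(total_frames, num_samples=6):
    """Return evenly spaced frame indices (hypothetical helper).

    Each index is taken from the middle of one of `num_samples` equal
    segments, so the very first and last frames are not over-represented.
    """
    if total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= total_frames:
        # Short clip: use every frame.
        return list(range(total_frames))
    step = total_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]
```

The selected frames would then be passed to the vision model one by one or as a batch, depending on what the model supports.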
### 4. **Tenor GIF Embeds** (`gifv`)
Tenor GIF handling already existed and is now integrated with the rest of embed processing:
- Extracts frames from Tenor GIFs
- Analyzes the GIF content
- Provides playful responses about what's in the GIF
## How It Works
### Processing Flow
1. **Message Received** - User sends a message with an embedded link
2. **Embed Detection** - Miku detects the embed type
3. **Content Extraction**:
   - Text content (title, description, fields, footer)
   - Image URLs from embed
   - Video URLs from embed
4. **Media Analysis**:
   - Downloads and analyzes images with vision model
   - Extracts and analyzes video frames
5. **Context Building** - Combines all extracted content
6. **Response Generation** - Miku responds with full context awareness
### Example Scenario
```
User: @Miku what do you think about this?
[Discord embeds article: "Bulgaria arrests mayor over €200,000 fine"]
Miku sees:
- Embedded title: "Bulgaria arrests mayor over €200,000 fine"
- Embedded description: "Town mayor Blagomir Kotsev charged with..."
- Embedded image: [analyzes photo of the mayor]
Miku responds with context-aware commentary about the news
```
## Technical Implementation
### New Functions
**`extract_embed_content(embed)`** - In `utils/image_handling.py`
- Extracts text from title, description, author, fields, footer
- Collects image URLs from embed.image and embed.thumbnail
- Collects video URLs from embed.video
- Returns structured dictionary with all content
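
The following is a minimal sketch of what such an extractor could look like, not the actual code in `utils/image_handling.py`. It assumes the embed exposes discord.py-style attributes (`title`, `description`, `author.name`, `fields`, `footer.text`, `image.url`, `thumbnail.url`, `video.url`). The dictionary keys (`type`, `text`, `images`, `videos`) and the `By ...` author prefix are illustrative assumptions.

```python
from types import SimpleNamespace


def extract_embed_content(embed):
    """Collect text, image URLs, and video URLs from an embed-like object."""

    def attr(obj, name):
        # Tolerate both a missing parent object and a missing attribute.
        return getattr(obj, name, None) if obj is not None else None

    text_parts = []
    if attr(embed, "title"):
        text_parts.append(str(embed.title))
    author_name = attr(attr(embed, "author"), "name")
    if author_name:
        text_parts.append(f"By {author_name}")
    if attr(embed, "description"):
        text_parts.append(str(embed.description))
    for field in attr(embed, "fields") or []:
        name, value = attr(field, "name"), attr(field, "value")
        if name and value:
            text_parts.append(f"{name}: {value}")
    footer_text = attr(attr(embed, "footer"), "text")
    if footer_text:
        text_parts.append(str(footer_text))

    # Images come from embed.image and embed.thumbnail, in that order.
    candidates = (
        attr(attr(embed, "image"), "url"),
        attr(attr(embed, "thumbnail"), "url"),
    )
    images = [url for url in candidates if url]
    video_url = attr(attr(embed, "video"), "url")

    return {
        "type": attr(embed, "type"),
        "text": "\n".join(text_parts),
        "images": images,
        "videos": [video_url] if video_url else [],
    }


# Usage with a stand-in object instead of a real Discord embed.
example = SimpleNamespace(
    type="article",
    title="Headline",
    description="Preview text",
    image=SimpleNamespace(url="https://example.com/cover.png"),
)
content = extract_embed_content(example)
```

Because the helper only relies on attribute access, it can be exercised in tests with plain stand-in objects, as in the usage lines above.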
### Modified Bot Logic
**`on_message()`** - In `bot.py`
- Checks for embeds in messages
- Processes different embed types:
  - `gifv` - Tenor GIFs (existing functionality)
  - `rich`, `article`, `image`, `video`, `link` - NEW comprehensive handling
- Builds enhanced context with embed content
- Passes context to LLM for informed responses
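
The type-based branching described above can be sketched as a small routing function. The embed type strings come from this document. The function name `embed_handler_for` and the returned labels are hypothetical, and the real `on_message()` may structure the branching differently.

```python
# Embed types routed to the pre-existing Tenor GIF path.
GIF_TYPES = {"gifv"}
# Embed types routed to the newer general embed path.
RICH_TYPES = {"rich", "article", "image", "video", "link"}


def embed_handler_for(embed_type):
    """Return a label for the handler that should process this embed type.

    Returns None for unknown types so the caller can skip them.
    """
    if embed_type in GIF_TYPES:
        return "tenor_gif"
    if embed_type in RICH_TYPES:
        return "rich_embed"
    return None
```

Returning `None` for unrecognized types lets the message handler fall back to its normal text-only response.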
### Context Format
```
[Embedded content: <title and description>]
[Embedded image shows: <vision analysis>]
[Embedded video shows: <vision analysis>]
User message: <user's actual message>
```
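
As a rough illustration, a context string in the format above could be assembled like this. The function name `build_embed_context` and its parameters are assumptions made for the example and are not taken from `bot.py`.

```python
def build_embed_context(user_message, embed_text="",
                        image_descriptions=(), video_descriptions=()):
    """Assemble the prompt context from embed text and vision results."""
    lines = []
    if embed_text:
        lines.append(f"[Embedded content: {embed_text}]")
    for description in image_descriptions:
        lines.append(f"[Embedded image shows: {description}]")
    for description in video_descriptions:
        lines.append(f"[Embedded video shows: {description}]")
    # The user's own message always comes last.
    lines.append(f"User message: {user_message}")
    return "\n".join(lines)
```

Sections with no content are omitted, so a message without embeds reduces to the single `User message:` line.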
## Logging
New log indicators:
- `📰 Processing {type} embed` - Starting embed processing
- `🖼️ Processing image from embed: {url}` - Analyzing embedded image
- `🎬 Processing video from embed: {url}` - Analyzing embedded video
- `💬 Server embed response` - Responding with embed context
- `💌 DM embed response` - DM response with embed context
## Supported Platforms
Any platform whose links Discord embeds should work, including:
- ✅ News sites (BBC, Reuters, etc.)
- ✅ Social media (Twitter/X embeds, Instagram, etc.)
- ✅ YouTube videos
- ✅ Blogs and Medium articles
- ✅ Image hosting sites
- ✅ Tenor GIFs
- ✅ Many other platforms with OpenGraph metadata
## Limitations
- Embed text is truncated to 500 characters to keep context manageable
- Some platforms may block bot requests for media
- Very large videos may take time to process
- Paywalled content only shows the preview text Discord provides
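
The 500-character limit could be enforced with a helper along these lines. This is a sketch: the constant and function names are hypothetical, and appending an ellipsis within the limit is an assumption about how the truncation is marked.

```python
MAX_EMBED_TEXT_CHARS = 500  # limit stated in this document


def truncate_embed_text(text, limit=MAX_EMBED_TEXT_CHARS):
    """Cut text to at most `limit` characters, marking the cut with '...'."""
    if len(text) <= limit:
        return text
    if limit <= 3:
        # Too short to fit an ellipsis; hard-cut instead.
        return text[:limit]
    return text[: limit - 3].rstrip() + "..."
```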
## Server/DM Support
- ✅ Works in server channels
- ✅ Works in DMs
- Respects server-specific moods
- Uses DM mood for direct messages
- Logs DM interactions including embed content
## Privacy
- Only processes embeds when Miku is addressed (@mentioned or in DMs)
- Respects blocked user list for DMs
- Does not store embed content beyond conversation history
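
The addressing and block-list rules above amount to a small gate that runs before any embed is processed. The sketch below uses a hypothetical function name and parameters. How the bot actually detects mentions and stores blocked users is not specified here.

```python
def should_process_embeds(is_dm, bot_mentioned, author_id, blocked_user_ids):
    """Decide whether embeds in a message should be processed at all."""
    if is_dm:
        # DMs always address the bot, but blocked users are ignored.
        return author_id not in blocked_user_ids
    # In servers, only process embeds when the bot is @mentioned.
    return bot_mentioned
```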
## Future Enhancements
Potential improvements:
- Audio transcription from embedded audio/video
- PDF content extraction
- Twitter/X thread reading
- Better handling of code snippets in embeds
- Embed source credibility assessment