Organize documentation: Move all .md files to readmes/ directory
readmes/EMBED_CONTENT_FEATURE.md
# Embed Content Reading Feature

## Overview
Miku can now read and understand embedded content in Discord messages, including articles, images, videos, and other rich media that Discord embeds automatically when a link is shared.

## Supported Embed Types

### 1. **Article Embeds** (`rich`, `article`, `link`)
When you share a news article or blog post link, Discord automatically creates an embed with:
- **Title** - The article headline
- **Description** - A preview of the article content
- **Author** - The article author (if available)
- **Images** - Featured images or thumbnails
- **Custom Fields** - Additional metadata

Miku will:
- Extract and read the text content (title, description, fields)
- Analyze any embedded images
- Combine all this context to provide an informed response

### 2. **Image Embeds**
When links contain images that Discord auto-embeds:
- Miku downloads and analyzes the images using her vision model
- Provides descriptions and commentary based on what she sees

### 3. **Video Embeds**
For embedded videos from various platforms:
- Miku extracts multiple frames from the video
- Analyzes the visual content across frames
- Provides commentary on what's happening in the video

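
The document does not say how the "multiple frames" are chosen. One plausible approach, shown here purely as an assumption, is to sample frames at evenly spaced positions so the analysis covers the whole clip. The function name `sample_frame_indices` and the default of 6 samples are hypothetical and not taken from the bot's code.

```python
def sample_frame_indices(total_frames: int, num_samples: int = 6) -> list[int]:
    """Pick up to num_samples frame indices spread evenly across a video.

    Hypothetical helper: the real bot may select frames differently.
    """
    if total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= total_frames:
        # Short clip: just use every frame.
        return list(range(total_frames))
    # Split the video into equal segments and take the middle frame of each,
    # which avoids the very first and very last frames.
    step = total_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]


print(sample_frame_indices(100, 4))  # [12, 37, 62, 87]
```

Sampling segment midpoints rather than endpoints is a design choice for this sketch only; opening and closing frames are often blank or title cards, so they tend to be less informative.
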
### 4. **Tenor GIF Embeds** (`gifv`)
Already supported and now integrated:
- Extracts frames from Tenor GIFs
- Analyzes the GIF content
- Provides playful responses about what's in the GIF

## How It Works

### Processing Flow
1. **Message Received** - User sends a message with an embedded link
2. **Embed Detection** - Miku detects the embed type
3. **Content Extraction**:
   - Text content (title, description, fields, footer)
   - Image URLs from embed
   - Video URLs from embed
4. **Media Analysis**:
   - Downloads and analyzes images with vision model
   - Extracts and analyzes video frames
5. **Context Building** - Combines all extracted content
6. **Response Generation** - Miku responds with full context awareness

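
The media analysis step (step 4) can be sketched as below. This is a simplified, synchronous illustration, not the bot's actual code: `analyze_embed_media` and the injected `analyze_image` / `analyze_video` callables are hypothetical names, and a real discord.py handler would run inside a coroutine. The sketch assumes that a failed download should skip only that item instead of aborting the whole reply.

```python
from typing import Callable


def analyze_embed_media(
    image_urls: list[str],
    video_urls: list[str],
    analyze_image: Callable[[str], str],
    analyze_video: Callable[[str], str],
) -> dict:
    """Run media analysis and collect descriptions for context building."""
    results = {"images": [], "videos": [], "failed": []}
    for kind, urls, analyze in (
        ("images", image_urls, analyze_image),
        ("videos", video_urls, analyze_video),
    ):
        for url in urls:
            try:
                results[kind].append(analyze(url))
            except Exception:
                # For example, the host refused the bot's request.
                # Record the failure and keep going with the other media.
                results["failed"].append(url)
    return results
```

Passing the analyzers in as parameters keeps the sketch independent of any particular vision model and makes it easy to exercise with stub functions.
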
### Example Scenario
```
User: @Miku what do you think about this?
[Discord embeds article: "Bulgaria arrests mayor over €200,000 fine"]

Miku sees:
- Embedded title: "Bulgaria arrests mayor over €200,000 fine"
- Embedded description: "Town mayor Blagomir Kotsev charged with..."
- Embedded image: [analyzes photo of the mayor]

Miku responds with context-aware commentary about the news
```

## Technical Implementation

### New Functions
**`extract_embed_content(embed)`** - In `utils/image_handling.py`
- Extracts text from title, description, author, fields, footer
- Collects image URLs from `embed.image` and `embed.thumbnail`
- Collects video URLs from `embed.video`
- Returns a structured dictionary with all content

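
A minimal sketch of what such a function might look like follows. It is not the code from `utils/image_handling.py`: the dictionary keys (`text`, `images`, `videos`) and the way text parts are joined are assumptions. The sketch reads attributes defensively with `getattr`, so it works with any object shaped like a Discord embed; the usage example uses `SimpleNamespace` stand-ins instead of real embeds.

```python
from types import SimpleNamespace


def extract_embed_content(embed) -> dict:
    """Collect text and media URLs from an embed-like object (sketch)."""
    parts = []
    for value in (getattr(embed, "title", None), getattr(embed, "description", None)):
        if value:
            parts.append(str(value))

    author_name = getattr(getattr(embed, "author", None), "name", None)
    if author_name:
        parts.append(f"Author: {author_name}")

    for field in getattr(embed, "fields", None) or []:
        name = getattr(field, "name", None)
        value = getattr(field, "value", None)
        if name and value:
            parts.append(f"{name}: {value}")

    footer_text = getattr(getattr(embed, "footer", None), "text", None)
    if footer_text:
        parts.append(str(footer_text))

    def url_of(attr: str):
        # Returns None when the sub-object or its url is missing.
        return getattr(getattr(embed, attr, None), "url", None)

    return {
        "text": "\n".join(parts),  # assumed key names, not confirmed by the source
        "images": [u for u in (url_of("image"), url_of("thumbnail")) if u],
        "videos": [u for u in (url_of("video"),) if u],
    }


# Usage with a stand-in object instead of a real Discord embed:
demo = SimpleNamespace(
    title="Headline",
    description="Preview text",
    author=SimpleNamespace(name="Jane Doe"),
    fields=[],
    footer=None,
    image=SimpleNamespace(url="https://example.com/a.png"),
    thumbnail=None,
    video=None,
)
content = extract_embed_content(demo)
```
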
### Modified Bot Logic
**`on_message()`** - In `bot.py`
- Checks for embeds in messages
- Processes different embed types:
  - `gifv` - Tenor GIFs (existing functionality)
  - `rich`, `article`, `image`, `video`, `link` - NEW comprehensive handling
- Builds enhanced context with embed content
- Passes context to LLM for informed responses

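
The type-based routing described above could be expressed roughly as follows. The embed type strings come from the list above; the function name `classify_embed` and the returned labels are hypothetical and only illustrate the branching, not the actual structure of `on_message()`.

```python
# Embed types that go through the new comprehensive handling.
RICH_EMBED_TYPES = {"rich", "article", "image", "video", "link"}


def classify_embed(embed_type: str) -> str:
    """Map a Discord embed type to the handling path it would take (sketch)."""
    if embed_type == "gifv":
        return "tenor_gif"      # existing Tenor GIF path
    if embed_type in RICH_EMBED_TYPES:
        return "rich_content"   # new text + media extraction path
    return "ignored"            # any other type is left alone in this sketch


print(classify_embed("article"))  # rich_content
```
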
### Context Format
```
[Embedded content: <title and description>]
[Embedded image shows: <vision analysis>]
[Embedded video shows: <vision analysis>]

User message: <user's actual message>
```

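
Assembling a string in this format might look like the sketch below. The function name `build_context` and its parameters are assumptions for illustration; only the bracketed line layout is taken from the format above.

```python
def build_context(
    user_message: str,
    embed_text: str = "",
    image_descriptions: tuple = (),
    video_descriptions: tuple = (),
) -> str:
    """Combine embed text, vision results, and the user's message (sketch)."""
    lines = []
    if embed_text:
        lines.append(f"[Embedded content: {embed_text}]")
    lines.extend(f"[Embedded image shows: {d}]" for d in image_descriptions)
    lines.extend(f"[Embedded video shows: {d}]" for d in video_descriptions)
    if lines:
        # Blank line separates the embed context from the user's message.
        lines.append("")
    lines.append(f"User message: {user_message}")
    return "\n".join(lines)


print(build_context("what do you think?", "Some headline", ("a city square",)))
```

When a message has no embed content at all, this sketch falls back to the plain `User message:` line on its own.
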
## Logging
New log indicators:
- `📰 Processing {type} embed` - Starting embed processing
- `🖼️ Processing image from embed: {url}` - Analyzing embedded image
- `🎬 Processing video from embed: {url}` - Analyzing embedded video
- `💬 Server embed response` - Responding with embed context
- `💌 DM embed response` - DM response with embed context

## Supported Platforms
Any platform whose links Discord embeds should work:
- ✅ News sites (BBC, Reuters, etc.)
- ✅ Social media (Twitter/X embeds, Instagram, etc.)
- ✅ YouTube videos
- ✅ Blogs and Medium articles
- ✅ Image hosting sites
- ✅ Tenor GIFs
- ✅ Many other platforms with OpenGraph metadata

## Limitations
- Embed text is truncated to 500 characters to keep context manageable
- Some platforms may block bot requests for media
- Very large videos may take time to process
- Paywalled content only shows the preview text Discord provides

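
The 500-character limit could be applied with a helper along these lines. The constant and function names are hypothetical, and appending a `...` marker is an assumption of this sketch; the source only states that the text is truncated.

```python
MAX_EMBED_TEXT_CHARS = 500  # limit stated above


def truncate_embed_text(
    text: str, limit: int = MAX_EMBED_TEXT_CHARS, marker: str = "..."
) -> str:
    """Shorten text to at most `limit` characters, marking the cut (sketch)."""
    if len(text) <= limit:
        return text
    # Reserve room for the marker so the result never exceeds the limit.
    keep = max(limit - len(marker), 0)
    return (text[:keep].rstrip() + marker)[:limit]


shortened = truncate_embed_text("a" * 600)
print(len(shortened))  # 500
```
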
## Server/DM Support
- ✅ Works in server channels
- ✅ Works in DMs
- Respects server-specific moods
- Uses DM mood for direct messages
- Logs DM interactions including embed content

## Privacy
- Only processes embeds when Miku is addressed (@mentioned or in DMs)
- Respects blocked user list for DMs
- No storage of embed content beyond conversation history

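
The first two rules amount to a small gate in front of embed processing. The sketch below is an assumption about how such a check could be written; `should_process_embeds` and its parameters are hypothetical names, not part of the bot's documented API.

```python
def should_process_embeds(
    is_dm: bool,
    bot_mentioned: bool,
    author_id: int,
    blocked_dm_users: set[int],
) -> bool:
    """Decide whether embeds in a message should be read at all (sketch)."""
    if is_dm:
        # DMs always address the bot, unless the sender is on the block list.
        return author_id not in blocked_dm_users
    # In server channels, only act when the bot was explicitly @mentioned.
    return bot_mentioned


print(should_process_embeds(False, False, 1, set()))  # False
```
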
## Future Enhancements
Potential improvements:
- Audio transcription from embedded audio/video
- PDF content extraction
- Twitter/X thread reading
- Better handling of code snippets in embeds
- Embed source credibility assessment