koko210Serve 88d4256755 Organize documentation: Move all .md files to readmes/ directory

2025-12-07 17:21:59 +02:00

4.6 KiB

Embed Content Reading Feature

Overview

Miku can now read and understand embedded content in Discord messages, including articles, images, videos, and other rich media that Discord embeds automatically when a link is shared.

Supported Embed Types

1. Article Embeds (`rich`, `article`, `link`)

When you share a news article or blog post link, Discord automatically creates an embed with:

Title - The article headline
Description - A preview of the article content
Author - The article author (if available)
Images - Featured images or thumbnails
Custom Fields - Additional metadata

Miku will:

Extract and read the text content (title, description, author, fields, footer)
Analyze any embedded images
Combine all this context to provide an informed response

2. Image Embeds

When links contain images that Discord auto-embeds:

Miku downloads and analyzes the images using her vision model
Provides descriptions and commentary based on what she sees

3. Video Embeds

For embedded videos from various platforms:

Miku extracts multiple frames from the video
Analyzes the visual content across frames
Provides commentary on what's happening in the video
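One simple way to pick "multiple frames" is to sample evenly spaced positions across the video. The sketch below only illustrates that idea; the function name `pick_frame_indices` and the default of four samples are assumptions, not the values the bot actually uses.

```python
def pick_frame_indices(total_frames, num_samples=4):
    """Return evenly spaced frame indices, one from the middle of each segment.

    Hypothetical helper: the real frame count and spacing are not documented.
    """
    if total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= total_frames:
        # Short clip: just use every frame.
        return list(range(total_frames))
    step = total_frames / num_samples
    # Take the midpoint of each of the num_samples equal segments.
    return [int(step * i + step / 2) for i in range(num_samples)]


indices = pick_frame_indices(100, num_samples=4)
```

Sampling segment midpoints rather than starting at frame 0 avoids the black or title frames that often open and close a video.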

4. Tenor GIF Embeds (`gifv`)

Tenor GIFs were already supported and now run through the same embed handling:

Extracts frames from Tenor GIFs
Analyzes the GIF content
Provides playful responses about what's in the GIF

How It Works

Processing Flow

1. Message Received - User sends a message with an embedded link
2. Embed Detection - Miku detects the embed type
3. Content Extraction:
   - Text content (title, description, fields, footer)
   - Image URLs from embed
   - Video URLs from embed
4. Media Analysis:
   - Downloads and analyzes images with vision model
   - Extracts and analyzes video frames
5. Context Building - Combines all extracted content
6. Response Generation - Miku responds with full context awareness

Example Scenario

User: @Miku what do you think about this?
[Discord embeds article: "Bulgaria arrests mayor over €200,000 fine"]

Miku sees:
- Embedded title: "Bulgaria arrests mayor over €200,000 fine"
- Embedded description: "Town mayor Blagomir Kotsev charged with..."
- Embedded image: [analyzes photo of the mayor]

Miku responds with context-aware commentary about the news

Technical Implementation

New Functions

extract_embed_content(embed) - In utils/image_handling.py

Extracts text from title, description, author, fields, footer
Collects image URLs from embed.image and embed.thumbnail
Collects video URLs from embed.video
Returns structured dictionary with all content
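The behaviour described above could look roughly like the following. This is a minimal sketch, not the code in utils/image_handling.py: the dictionary keys (`text`, `images`, `videos`) and the `Author:` prefix are assumptions, and the embed is treated as a duck-typed object so the example runs without discord.py installed.

```python
from types import SimpleNamespace


def extract_embed_content(embed):
    """Collect text, image URLs, and video URLs from an embed-like object."""
    text_parts = []

    if getattr(embed, "title", None):
        text_parts.append(embed.title)
    if getattr(embed, "description", None):
        text_parts.append(embed.description)

    author = getattr(embed, "author", None)
    if getattr(author, "name", None):
        text_parts.append(f"Author: {author.name}")

    for field in getattr(embed, "fields", None) or []:
        text_parts.append(f"{field.name}: {field.value}")

    footer = getattr(embed, "footer", None)
    if getattr(footer, "text", None):
        text_parts.append(footer.text)

    # Image URLs come from embed.image and embed.thumbnail.
    images = []
    for attr in ("image", "thumbnail"):
        url = getattr(getattr(embed, attr, None), "url", None)
        if url:
            images.append(url)

    # Video URLs come from embed.video.
    videos = []
    video_url = getattr(getattr(embed, "video", None), "url", None)
    if video_url:
        videos.append(video_url)

    return {"text": "\n".join(text_parts), "images": images, "videos": videos}


# Stand-in for a Discord embed, using the article from the example scenario.
example = SimpleNamespace(
    title="Bulgaria arrests mayor over €200,000 fine",
    description="Town mayor Blagomir Kotsev charged with...",
    image=SimpleNamespace(url="https://example.com/photo.jpg"),
)
content = extract_embed_content(example)
```

Using `getattr` with a default keeps the function tolerant of embeds that lack an author, footer, or media.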

Modified Bot Logic

on_message() - In bot.py

Checks for embeds in messages
Processes different embed types:
- gifv - Tenor GIFs (existing functionality)
- rich, article, image, video, link - NEW comprehensive handling
Builds enhanced context with embed content
Passes context to LLM for informed responses
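The type check in on_message() can be pictured as a small routing step. The sketch below is illustrative only: `classify_embed`, `route_embeds`, and the route labels are hypothetical names, and only the embed type strings come from the list above.

```python
from types import SimpleNamespace

# Embed type strings as listed above.
GIF_TYPES = {"gifv"}
RICH_TYPES = {"rich", "article", "image", "video", "link"}


def classify_embed(embed):
    """Map an embed's type to a handler label (labels are assumptions)."""
    embed_type = getattr(embed, "type", None)
    if embed_type in GIF_TYPES:
        return "tenor_gif"
    if embed_type in RICH_TYPES:
        return "rich_content"
    return "unsupported"


def route_embeds(embeds):
    """Group a message's embeds by the handler that should process them."""
    routed = {"tenor_gif": [], "rich_content": [], "unsupported": []}
    for embed in embeds:
        routed[classify_embed(embed)].append(embed)
    return routed


# Stand-in embeds; a real message would supply message.embeds.
routed = route_embeds([SimpleNamespace(type="gifv"), SimpleNamespace(type="article")])
```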

Context Format

[Embedded content: <title and description>]
[Embedded image shows: <vision analysis>]
[Embedded video shows: <vision analysis>]

User message: <user's actual message>
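A prompt in this format can be assembled with a few string joins. The helper below is a sketch; `build_embed_context` and its parameters are hypothetical names, and only the bracketed labels are taken from the format above.

```python
def build_embed_context(user_message, embed_text=None,
                        image_descriptions=(), video_descriptions=()):
    """Assemble the context block that precedes the user's message."""
    lines = []
    if embed_text:
        lines.append(f"[Embedded content: {embed_text}]")
    for description in image_descriptions:
        lines.append(f"[Embedded image shows: {description}]")
    for description in video_descriptions:
        lines.append(f"[Embedded video shows: {description}]")
    if lines:
        # Blank line separates embed context from the user's own words.
        lines.append("")
    lines.append(f"User message: {user_message}")
    return "\n".join(lines)


prompt = build_embed_context(
    "what do you think about this?",
    embed_text="Bulgaria arrests mayor over €200,000 fine",
    image_descriptions=["a man in a suit speaking at a podium"],
)
```

When a message has no embeds, the helper falls back to the plain `User message:` line, so the same code path can serve both cases.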

Logging

New log indicators:

📰 Processing {type} embed - Starting embed processing
🖼️ Processing image from embed: {url} - Analyzing embedded image
🎬 Processing video from embed: {url} - Analyzing embedded video
💬 Server embed response - Responding with embed context
💌 DM embed response - DM response with embed context

Supported Platforms

Any platform whose links Discord turns into embeds should work:

✅ News sites (BBC, Reuters, etc.)
✅ Social media (Twitter/X embeds, Instagram, etc.)
✅ YouTube videos
✅ Blogs and Medium articles
✅ Image hosting sites
✅ Tenor GIFs
✅ Many other platforms with OpenGraph metadata

Limitations

Embed text is truncated to 500 characters to keep context manageable
Some platforms may block bot requests for media
Very large videos may take time to process
Paywalled content only shows the preview text Discord provides
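The 500-character cap can be applied with a short helper like the one below. This is a sketch under assumptions: the name `truncate_embed_text` and the trailing `...` marker are illustrative, and here the marker is counted inside the limit, which may differ from the bot's actual behaviour.

```python
MAX_EMBED_TEXT_CHARS = 500  # limit stated above


def truncate_embed_text(text, limit=MAX_EMBED_TEXT_CHARS, suffix="..."):
    """Cut text to at most `limit` characters, marking the cut with `suffix`."""
    if len(text) <= limit:
        return text
    # Reserve room for the suffix so the result never exceeds the limit.
    return text[: limit - len(suffix)].rstrip() + suffix


short = truncate_embed_text("Town mayor charged")
long = truncate_embed_text("a" * 600)
```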

Server/DM Support

✅ Works in server channels
✅ Works in DMs
Respects server-specific moods
Uses DM mood for direct messages
Logs DM interactions including embed content

Privacy

Only processes embeds when Miku is addressed (@mentioned or in DMs)
Respects blocked user list for DMs
No storage of embed content beyond conversation history
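The first two rules amount to a gate that runs before any embed is downloaded or analyzed. A minimal sketch follows, with a hypothetical function name and parameters; the real bot would derive these values from the incoming message and its own blocked-user list.

```python
def should_process_embeds(is_dm, bot_mentioned, author_id, blocked_user_ids):
    """Decide whether embeds in a message should be processed at all."""
    if is_dm:
        # DMs always address Miku, but blocked users are ignored.
        return author_id not in blocked_user_ids
    # In servers, only process embeds when Miku is @mentioned.
    return bot_mentioned


blocked = {1001}
allowed = should_process_embeds(
    is_dm=True, bot_mentioned=False, author_id=42, blocked_user_ids=blocked
)
```

Checking this before any media work keeps embeds from unaddressed messages out of the vision model.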

Future Enhancements

Potential improvements:

Audio transcription from embedded audio/video
PDF content extraction
Twitter/X thread reading
Better handling of code snippets in embeds
Embed source credibility assessment
