# Testing Embed Content Reading

## Test Cases

### Test 1: News Article with Image
**What to do:** Send a news article link to Miku
```
@Miku what do you think about this?
https://www.bbc.com/news/articles/example
```

**Expected behavior:**
- Miku reads the article title and description
- Analyzes the embedded image
- Provides commentary based on both text and image

**Log output:**
```
📰 Processing article embed
🖼️ Processing image from embed: [url]
💬 Server embed response to [user]
```

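
As a rough illustration of the first expected step, the sketch below pulls the readable text and image URLs out of an embed. It assumes the embed is available as a plain dict shaped like Discord's embed object (`title`, `description`, `fields`, `image`, `thumbnail`). The helper name `extract_embed_content` is hypothetical and is not taken from Miku's code.

```python
from typing import Any


def extract_embed_content(embed: dict[str, Any]) -> dict[str, Any]:
    """Collect text parts and image URLs from a Discord-style embed dict.

    Hypothetical helper for illustration; the bot's real code may differ.
    """
    text_parts = []
    for key in ("title", "description"):
        value = embed.get(key)
        if value:
            text_parts.append(value.strip())

    # Each field is a {"name": ..., "value": ...} pair in Discord's embed object.
    for field in embed.get("fields", []):
        name, value = field.get("name", ""), field.get("value", "")
        if name and value:
            text_parts.append(f"{name}: {value}")

    # An embed may carry a full-size image, a thumbnail, both, or neither.
    image_urls = []
    for key in ("image", "thumbnail"):
        url = (embed.get(key) or {}).get("url")
        if url:
            image_urls.append(url)

    return {"text": "\n".join(text_parts), "image_urls": image_urls}


# Example input with made-up values.
article = {
    "type": "article",
    "title": "Example headline",
    "description": "Short summary of the story.",
    "image": {"url": "https://example.com/photo.jpg"},
}
content = extract_embed_content(article)
print(content["text"])
print(content["image_urls"])
```
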
---

### Test 2: YouTube Video
**What to do:** Share a YouTube link
```
@Miku check this out
https://www.youtube.com/watch?v=example
```

**Expected behavior:**
- Miku reads video title and description from embed
- May analyze thumbnail image
- Responds with context about the video

---

### Test 3: Twitter/X Post
**What to do:** Share a tweet link
```
@Miku thoughts?
https://twitter.com/user/status/123456789
```

**Expected behavior:**
- Reads tweet text from embed
- Analyzes any images in the embed
- Provides response based on tweet content

---

### Test 4: Tenor GIF (via /gif command or link)
**What to do:** Use Discord's GIF picker or share a Tenor link
```
@Miku what's happening here?
[shares Tenor GIF via Discord]
```

**Expected behavior:**
- Extracts GIF URL from Tenor embed
- Converts GIF to MP4
- Extracts 6 frames
- Analyzes the animation
- Responds with a description of what's in the GIF

**Log output:**
```
🎭 Processing Tenor GIF from embed
🔄 Converting Tenor GIF to MP4 for processing...
✅ Tenor GIF converted to MP4
📹 Extracted 6 frames from Tenor GIF
💬 Server Tenor GIF response to [user]
```

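
The guide states that 6 frames are extracted but not how they are chosen. One plausible approach, shown here purely as an assumption, is to sample evenly across the clip so the frames cover the whole animation. The function name `sample_frame_indices` is hypothetical.

```python
def sample_frame_indices(total_frames: int, count: int = 6) -> list[int]:
    """Pick up to `count` evenly spaced frame indices from a clip.

    Assumption: even spacing. The bot's actual selection rule is not documented here.
    """
    if total_frames <= 0 or count <= 0:
        return []
    if total_frames <= count:
        # Short clip: use every frame.
        return list(range(total_frames))
    step = total_frames / count
    # Take the frame at the middle of each of the `count` equal segments.
    return [int(step * i + step / 2) for i in range(count)]


print(sample_frame_indices(60))  # a 60-frame clip
print(sample_frame_indices(4))   # a clip shorter than the frame budget
```
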
---

### Test 5: Image Link (direct)
**What to do:** Share a direct image link that Discord embeds
```
@Miku what is this?
https://example.com/image.jpg
```

**Expected behavior:**
- Detects image embed
- Downloads and analyzes image
- Provides description

---

### Test 6: Article WITHOUT Image
**What to do:** Share an article that has only a text preview
```
@Miku summarize this
https://example.com/text-article
```

**Expected behavior:**
- Reads title, description, and any fields
- Responds based on text content alone

---

## Real Test Example

Based on your screenshot, you shared:
**URL:** https://www.vesti.bg/bulgaria/sreshu-200-000-leva-puskat-pod-garancija-kmeta-na-varna-blagomir-kocev-snimki-6245207

**What Miku saw:**
- **Embed Type:** article/rich
- **Title:** "Срещу 200 000 лева пускат под гаранция к..." (English: "Released on bail of 200,000 leva...")
- **Description:** "Окръжният съд във Варна определи парична гаранция от 200 000 лв. на кмета на Варна Благомир Коцев..." (English: "The Varna District Court set bail of 200,000 leva for the mayor of Varna, Blagomir Kotsev...")
- **Image:** Photo of the mayor
- **Source:** Vesti.bg

**What Miku did:**
1. Extracted Bulgarian text from embed
2. Analyzed the photo of the person
3. Combined context: article about mayor + image analysis
4. Generated response with full understanding

---

## Monitoring in Real-Time

To watch Miku process embeds live:
```bash
docker logs -f ollama-discord-bot-1 | grep -E "Processing|embed|Embedded"
```

---

## Edge Cases Handled

### Multiple Embeds in One Message
- Processes the first compatible embed
- Returns after processing (prevents spam)

### Embed with Both Text and Media
- Extracts all text content
- Processes all images and videos
- Combines everything into comprehensive context

### Empty or Invalid Embeds
- Checks `has_content` flag
- Skips the embed if it has no extractable content
- Continues to the next embed or to normal processing

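
The skip-and-continue behavior can be sketched as follows. The guide only mentions a `has_content` flag, so the way it is computed here (any text or any media URL) is an assumption, and `first_usable_embed` is a hypothetical name. Embeds are again modeled as plain dicts shaped like Discord's embed object.

```python
from typing import Any, Optional


def has_content(embed: dict[str, Any]) -> bool:
    """Return True if the embed carries any text or media worth processing."""
    has_text = bool(
        embed.get("title") or embed.get("description") or embed.get("fields")
    )
    has_media = any(
        (embed.get(key) or {}).get("url") for key in ("image", "thumbnail", "video")
    )
    return has_text or has_media


def first_usable_embed(embeds: list[dict[str, Any]]) -> Optional[dict[str, Any]]:
    """Return the first embed with content, or None to fall back to normal processing."""
    for embed in embeds:
        if has_content(embed):
            return embed  # stop at the first match: one reply per message
    return None


embeds = [{}, {"title": "Second embed"}, {"title": "Third embed"}]
print(first_usable_embed(embeds))
```

Returning `None` signals that no embed was usable, which corresponds to continuing with normal message processing.
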
### Large Embed Content
- Truncates text to 500 characters
- Processes up to 6 video frames
- Keeps context manageable for LLM

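
A minimal sketch of the 500-character limit is shown below. Whether the bot marks the cut with an ellipsis is not stated in this guide, so that detail and the name `truncate_text` are assumptions.

```python
MAX_EMBED_TEXT = 500  # character limit stated in this guide


def truncate_text(text: str, limit: int = MAX_EMBED_TEXT) -> str:
    """Cut text to at most `limit` characters, marking the cut with '...'."""
    if len(text) <= limit:
        return text
    # Reserve three characters for the ellipsis so the result never exceeds `limit`.
    return text[: limit - 3].rstrip() + "..."


long_description = "a" * 600
print(len(truncate_text(long_description)))
print(truncate_text("short text"))
```
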
---

## Comparison: Before vs After

### Before
```
User: @Miku what about this? [shares article]
Miku: *sees only "what about this?"*
Miku: "About what? I don't see anything specific..."
```

### After
```
User: @Miku what about this? [shares article]
Miku: *sees article title, description, and image*
Miku: *provides informed commentary about the actual article*
```

This matches exactly what you see in your screenshot! The Bulgarian news article about the mayor was properly read and understood by Miku.