Files

koko210Serve 9d2c14fa0b Fix vision pipeline: ffmpeg removal by autoremove, increase vision timeout, reduce frame count, add Discord activity awareness

- bot/Dockerfile: Add ffmpeg to reinstall line after apt-get autoremove
  (autoremove was sweeping up ffmpeg as 'no longer needed' after playwright install)
- bot/utils/image_handling.py: Increase video analysis timeout 120s→300s, 6→3 for Tenor GIFs (GTX 1660 VRAM constraint)
- bot/utils/activities.py: Add _activity_changed_at timestamp tracking,
  get_current_activity_label() and get_current_activity_fresh() with 30-min decay
- bot/utils/cat_client.py: Pass current Discord activity to Cheshire Cat pipeline
- bot/utils/llm.py: Inject current Discord activity into system prompt
- cat-plugins/*: Forward Discord activity through working_memory to personality plugins
- bot/persona/*/preamble.txt: Add Discord status usage guidelines for character prompts
- llama-swap-rocm-config.yaml: Add qwen3.5 model entry for ComfyUI prompt generation
- AGENTS.md: New project documentation file

2026-05-27 01:18:12 +03:00

4.1 KiB

Raw Blame History

AGENTS.md

Language & runtime

Python 3.11 (main bot). There is no root package.json or TypeScript — do not apply Node/TS tooling.
uno-online/ is a secondary Node.js project; miku-app/ is Android/Kotlin. Both shelved features for now.

Commands

# Build and run all core services (bot, STT, llama-swap, Cheshire Cat, Qdrant)
docker compose up -d

# Run with face-detector (requires NVIDIA GPU)
docker compose --profile tools up -d

# Run only the bot (implies dependencies are already up)
docker compose up -d miku-bot

# View bot logs
docker compose logs -f miku-bot

# Rebuild bot after code changes
docker compose down miku-bot && docker compose build miku-bot && docker compose up -d miku-bot

Config

config.yaml: app settings (model names, URLs, ports, feature flags).
.env: secrets only (DISCORD_BOT_TOKEN, OWNER_USER_ID, ERROR_WEBHOOK_URL).
Config is loaded by bot/config.py (Pydantic) and bot/globals.py (bare os.getenv). Both sources matter — check both when tracing config usage.
Runtime config overrides are persisted to bot/memory/config_runtime.yaml via the API.

Architecture

Discord <-> bot/bot.py (discord.py)
               ├── on_message -> Cheshire Cat pipeline -> memory-augmented LLM response
               ├── utils/llm.py -> llama-swap (HTTP proxy) -> llama.cpp (NVIDIA or AMD GPU)
               ├── utils/voice_manager.py -> STT WebSocket (port 8766) and audio playback
               ├── FastAPI (port 3939, daemon thread) -> 22 route modules in bot/routes/
               ├── APScheduler (background tasks in globals.py)
               └── utils/autonomous_engine.py -> proactive message decisions (Autonomous V2)

The FastAPI server runs in a daemon thread inside the Discord bot process — no separate process.
bot/globals.py holds mutable global state (scheduler, env vars, discord.Client). Module-level mutations are pervasive; be careful with import order.
llama-swap is a llama.cpp HTTP proxy with TTL-based model swapping. Two configs: llama-swap-config.yaml (NVIDIA) and llama-swap-rocm-config.yaml (AMD).

Models (via llama-swap)

Model key	Purpose
`llama3.1`	Primary text model
`darkidol`	Uncensored model (evil mode)
`vision`	MiniCPM-V (image understanding)
`swallow`	Japanese text model
`rocinante`	12B model (AMD GPU only)
`qwen3.5`	ComfyUI prompt generation (AMD GPU only)

Testing & linting

No formal test framework and no linting/formatting config. Ad-hoc scripts live in tests/ and bot/tests/.
Run ad-hoc tests however you want; there is no standard command.

Web UI color scheme (bot/static/)

Base: #121212 body, #000 log panel, #1e1e1e code blocks, #2a2a2a cards
Text: #fff primary, #ccc labels, #888 muted, #0f0 log info
Primary accent: #61dafb (headings, links, assistant messages, active elements)
Success: #4CAF50 (active tabs, user messages, enabled toggles)
Error: #f44336 (chat errors), #ff6b6b (error logs)
Warning: #ffd93d (warning logs)
Bot message: #2196F3 (left border)
Danger/evil: #ff4444 (overrides all accents when body.evil-mode is set)
Bipolar: #9932CC (toggle active)
Blocked: #ff9800 (blocked user cards)
Evil mode toggles body.evil-mode class which replaces all #61dafb and #4CAF50 with #ff4444.

Key gotchas

bot/memory/ contains persisted JSON state files and is gitignored. Do not expect these to exist in a fresh clone.
.env is gitignored; copy .env.example to .env and fill in real tokens.
Changes to bot/moods/ or bot/persona/ text files take effect at runtime (loaded on demand), no rebuild needed.
Playwright browsers must be installed in the Docker image (bot/Dockerfile does this via setup_uno_playwright.sh).
Voice features require discord-ext-voice-recv and PyNaCl — if voice fails, check these are installed.
The miku-voice Docker network is declared as external — it must exist before docker compose up.

4.1 KiB Raw Blame History

AGENTS.md

Language & runtime

Commands

Config

Architecture

Models (via llama-swap)

Testing & linting

Web UI color scheme (bot/static/)

Key gotchas

4.1 KiB

Raw Blame History