Commit Graph

43 Commits

Author SHA1 Message Date
0a9145728e Ability to play Uno implemented in early stages! 2026-01-30 21:43:20 +02:00
7368ef0cd5 Added Japanese and Bulgarian addressing 2026-01-30 21:34:24 +02:00
ecd14cf704 Able to now address Miku in Cyrillic, Kanji and both Kanas, incl. Japanese honorifics 2026-01-27 19:53:18 +02:00
641a5b83e8 Improved Evil Mode toggle to handle edge cases of the pfp and role color change. Japanese swallow model compatible (should be). 2026-01-27 19:52:39 +02:00
dca58328e4 Tuned the Japanese mode system prompt and model better 2026-01-23 17:01:47 +02:00
fe0962118b Implemented new Japanese only text mode with WebUI toggle, utilizing a llama3.1 swallow dataset model. Next up is Japanese TTS. 2026-01-23 15:02:36 +02:00
eb03dfce4d refactor: Implement low-latency STT pipeline with speculative transcription
Major architectural overhaul of the speech-to-text pipeline for real-time voice chat:

STT Server Rewrite:
- Replaced RealtimeSTT dependency with direct Silero VAD + Faster-Whisper integration
- Achieved sub-second latency by eliminating unnecessary abstractions
- Uses small.en Whisper model for fast transcription (~850ms)

Speculative Transcription (NEW):
- Start transcribing at 150ms silence (speculative) while still listening
- If speech continues, discard speculative result and keep buffering
- If 400ms silence confirmed, use pre-computed speculative result immediately
- Reduces latency by ~250-850ms for typical utterances with clear pauses

VAD Implementation:
- Silero VAD with ONNX (CPU-efficient) for 32ms chunk processing
- Direct speech boundary detection without RealtimeSTT overhead
- Configurable thresholds for silence detection (400ms final, 150ms speculative)

Architecture:
- Single Whisper model loaded once, shared across sessions
- VAD runs on every 512-sample chunk for immediate speech detection
- Background transcription worker thread for non-blocking processing
- Greedy decoding (beam_size=1) for maximum speed

Performance:
- Previous: 400ms silence wait + ~850ms transcription = ~1.25s total latency
- Current: 400ms silence wait + 0ms (speculative ready) = ~400ms (best case)
- Single model reduces VRAM usage, prevents OOM on GTX 1660

Container Manager Updates:
- Updated health check logic to work with new response format
- Changed from checking 'warmed_up' flag to just 'status: ready'
- Improved terminology from 'warmup' to 'models loading'

Files Changed:
- stt-realtime/stt_server.py: Complete rewrite with Silero VAD + speculative transcription
- stt-realtime/requirements.txt: Removed RealtimeSTT, using torch.hub for Silero VAD
- bot/utils/container_manager.py: Updated health check for new STT response format
- bot/api.py: Updated docstring to reflect new architecture
- backups/: Archived old RealtimeSTT-based implementation

This addresses low latency requirements while maintaining accuracy with configurable
speech detection thresholds.
2026-01-22 22:08:07 +02:00
2934efba22 Implemented experimental real production ready voice chat, relegated old flow to voice debug mode. New Web UI panel for Voice Chat. 2026-01-20 23:06:17 +02:00
362108f4b0 Decided on Parakeet ONNX Runtime. Works pretty great. Realtime voice chat possible now. UX lacking. 2026-01-19 00:29:44 +02:00
50e4f7a5f2 Error in llama-swap catchall implemented + webhook notifier 2026-01-18 01:30:26 +02:00
d1e6b21508 Phase 4 STT pipeline implemented — Silero VAD + faster-whisper — still not working well at all 2026-01-17 03:14:40 +02:00
3e59e5d2f6 Phase 3 implemented — Text LLM can now stream to the TTS pipeline with the !miku say command 2026-01-17 00:01:17 +02:00
9943cecdec Phase 2 implemented and tested. Added warmup to pipeline and Miku queues tokens while the pipeline is warming up 2026-01-16 23:37:34 +02:00
b0066f3525 Tested Phase 1, fixed text channel blocking while in voice and implemented joining and leaving VC from Phase 2 2026-01-16 20:39:23 +02:00
911f11ee9f Untested Phase 1 (Foundation & Resource management) of voice chat integration 2026-01-16 13:01:08 +02:00
353c9c9583 Face Detector container now able to be created, started and stopped from within miku-bot container 2026-01-11 02:01:41 +02:00
2d3b9d0e08 Fix IndentationError in persona_dialogue.py by removing stray docstring delimiter 2026-01-10 23:01:28 +02:00
f576db0d88 fix: Remove duplicate json import causing runtime error
- Removed local 'import json' statement inside get_servers() function
- This was shadowing the module-level import and causing
  'cannot access local variable' error
- json is already imported at the top of the file (line 44)
2026-01-10 21:05:46 +02:00
32c2a7b930 feat: Implement comprehensive non-hierarchical logging system
- Created new logging infrastructure with per-component filtering
- Added 6 log levels: DEBUG, INFO, API, WARNING, ERROR, CRITICAL
- Implemented non-hierarchical level control (any combination can be enabled)
- Migrated 917 print() statements across 31 files to structured logging
- Created web UI (system.html) for runtime configuration with dark theme
- Added global level controls to enable/disable levels across all components
- Added timestamp format control (off/time/date/datetime options)
- Implemented log rotation (10MB per file, 5 backups)
- Added API endpoints for dynamic log configuration
- Configured HTTP request logging with filtering via api.requests component
- Intercepted APScheduler logs with proper formatting
- Fixed persistence paths to use /app/memory for Docker volume compatibility
- Fixed checkbox display bug in web UI (enabled_levels now properly shown)
- Changed System Settings button to open in same tab instead of new window

Components: bot, api, api.requests, autonomous, persona, vision, llm,
conversation, mood, dm, scheduled, gpu, media, server, commands,
sentiment, core, apscheduler

All settings persist across container restarts via JSON config.
2026-01-10 20:46:19 +02:00
ce00f9bd95 Changed misleading face detector warning message on startup in the log 2026-01-09 00:13:03 +02:00
1fc3d74a5b Add dual GPU support with web UI selector
Features:
- Built custom ROCm container for AMD RX 6800 GPU
- Added GPU selection toggle in web UI (NVIDIA/AMD)
- Unified model names across both GPUs for seamless switching
- Vision model always uses NVIDIA GPU (optimal performance)
- Text models (llama3.1, darkidol) can use either GPU
- Added /gpu-status and /gpu-select API endpoints
- Implemented GPU state persistence in memory/gpu_state.json

Technical details:
- Multi-stage Dockerfile.llamaswap-rocm with ROCm 6.2.4
- llama.cpp compiled with GGML_HIP=ON for gfx1030 (RX 6800)
- Proper GPU permissions without root (groups 187/989)
- AMD container on port 8091, NVIDIA on port 8090
- Updated bot/utils/llm.py with get_current_gpu_url() and get_vision_gpu_url()
- Modified bot/utils/image_handling.py to always use NVIDIA for vision
- Enhanced web UI with GPU selector button (blue=NVIDIA, red=AMD)

Files modified:
- docker-compose.yml (added llama-swap-amd service)
- bot/globals.py (added LLAMA_AMD_URL)
- bot/api.py (added GPU selection endpoints and helper function)
- bot/utils/llm.py (GPU routing for text models)
- bot/utils/image_handling.py (GPU routing for vision models)
- bot/static/index.html (GPU selector UI)
- llama-swap-rocm-config.yaml (unified model names)

New files:
- Dockerfile.llamaswap-rocm
- bot/memory/gpu_state.json
- bot/utils/gpu_router.py (load balancing utility)
- setup-dual-gpu.sh (setup verification script)
- DUAL_GPU_*.md (documentation files)
2026-01-09 00:03:59 +02:00
ed5994ec78 Fix: Resolve webhook send timeout context error
ISSUE
=====
When using the manual webhook message feature via API, the following error occurred:
- 'Timeout context manager should be used inside a task'
- 'NoneType' object is not iterable (when sending without files)

The error happened because Discord.py's webhook operations were being awaited
directly in the FastAPI endpoint context, rather than within a task running in
the bot's event loop.

SOLUTION
========
Refactored /manual/send-webhook endpoint to properly handle async operations:

1. Moved webhook creation inside task function
   - get_or_create_webhooks_for_channel() now runs in send_webhook_message()
   - All Discord operations (webhook selection, sending) happen inside the task
   - Follows same pattern as working /manual/send endpoint

2. Fixed file parameter handling
   - Changed from 'files=discord_files if discord_files else None'
   - To conditional: only pass files parameter when list is non-empty
   - Discord.py's webhook.send() cannot iterate over None, requires list or omit

3. Maintained proper file reading
   - File content still read in endpoint context (before form closes)
   - File data passed to task as pre-read byte arrays
   - Prevents form closure issues

TECHNICAL DETAILS
=================
- Discord.py HTTP operations use timeout context managers
- Context managers must run inside bot's event loop (via create_task)
- FastAPI endpoint context is separate from bot's event loop
- Solution: Wrap all Discord API calls in async task function
- Pattern: Read files → Create task → Task handles Discord operations

TESTING
=======
- Manual webhook sending now works without timeout errors
- Both personas (Miku/Evil) send correctly
- File attachments work properly
- Messages without files send correctly
2026-01-07 13:44:13 +02:00
caab444c08 Add webhook option to manual message override for persona selection
Users can now send manual messages as either Hatsune Miku or Evil Miku
via webhooks without needing to toggle Evil Mode. This provides more
flexibility for controlling which persona sends messages.

Features:
- Checkbox option to "Send as Webhook" in manual message section
- Radio buttons to select between Hatsune Miku and Evil Miku
- Both personas use their respective profile pictures and mood emojis
- Webhooks only available for channel messages (not DMs)
- DM option automatically disabled when webhook mode is enabled
- New API endpoint: POST /manual/send-webhook

Frontend Changes:
- Added webhook checkbox and persona selection UI
- toggleWebhookOptions() function to show/hide persona options
- Updated sendManualMessage() to handle webhook mode
- Automatic channel selection when webhook is enabled

Backend Changes:
- New /manual/send-webhook endpoint in api.py
- Integrates with bipolar_mode.py webhook management
- Uses get_or_create_webhooks_for_channel() for webhook creation
- Applies correct display name with mood emoji based on persona
- Supports file attachments via webhook

This allows manual control over which Miku persona sends messages,
useful for testing, demonstrations, or creative scenarios without
needing to switch the entire bot mode.
2026-01-07 10:21:46 +02:00
86a54dd0ba Allow Bipolar Mode to be enabled independently of Evil Mode
Removed the restriction that required Evil Mode to be active before
enabling Bipolar Mode. Users can now toggle Bipolar Mode at any time.

Changes:
- Bipolar Mode toggle button now always visible in web UI
- Removed auto-disable of Bipolar Mode when Evil Mode is turned off
- Updated CSS to work in both normal and evil mode states
- Simplified updateBipolarToggleVisibility() to always show button

This allows for more flexible usage where users can have Regular Miku
and Evil Miku argue without needing Evil Mode to be the active persona.
2026-01-06 23:48:14 +02:00
9786f984a6 Fix: Explicitly pass change_role_color=True when switching modes after bipolar argument
When an argument ends and a winner is determined, the bot now explicitly
passes all mode change parameters (change_username, change_pfp, change_nicknames,
change_role_color) to ensure the winner's role color is properly restored.

- Evil Miku wins: Saves current color, switches to dark red (#D60004)
- Regular Miku wins: Restores previously saved color (from before Evil Mode)

This ensures the visual identity matches the active persona after arguments.
2026-01-06 14:01:39 +02:00
8012030ea1 Implement Bipolar Mode: Dual persona arguments with webhooks, LLM arbiter, and persistent scoreboard
Major Features:
- Complete Bipolar Mode system allowing Regular Miku and Evil Miku to coexist and argue via webhooks
- LLM arbiter system using neutral model to judge argument winners with detailed reasoning
- Persistent scoreboard tracking wins, percentages, and last 50 results with timestamps and reasoning
- Automatic mode switching based on argument winner
- Webhook management per channel with profile pictures and display names
- Progressive probability system for dynamic argument lengths (starts at 10%, increases 5% per exchange, min 4 exchanges)
- Draw handling with penalty system (-5% end chance, continues argument)
- Integration with autonomous system for random argument triggers

Argument System:
- MIN_EXCHANGES = 4, progressive end chance starting at 10%
- Enhanced prompts for both personas (strategic, short, punchy responses 1-3 sentences)
- Evil Miku triumphant victory messages with gloating and satisfaction
- Regular Miku assertive defense (not passive, shows backbone)
- Message-based argument starting (can respond to specific messages via ID)
- Conversation history tracking per argument with special user_id
- Full context queries (personality, lore, lyrics, last 8 messages)

LLM Arbiter:
- Decisive prompt emphasizing picking winners (draws should be rare)
- Improved parsing with first-line exact matching and fallback counting
- Debug logging for decision transparency
- Arbiter reasoning stored in scoreboard history for review
- Uses neutral TEXT_MODEL (not evil) for unbiased judgment

Web UI & API:
- Bipolar mode toggle button (only visible when evil mode is on)
- Channel ID + Message ID input fields for argument triggering
- Scoreboard display with win percentages and recent history
- Manual argument trigger endpoint with string-based IDs
- GET /bipolar-mode/scoreboard endpoint for stats retrieval
- Real-time active arguments tracking (refreshes every 5 seconds)

Prompt Optimizations:
- All argument prompts limited to 1-3 sentences for impact
- Evil Miku system prompt with variable response length guidelines
- Removed walls of text, emphasizing brevity and precision
- "Sometimes the cruelest response is the shortest one"

Evil Miku Updates:
- Added height to lore (15.8m tall, 10x bigger than regular Miku)
- Height added to prompt facts for size-based belittling
- More strategic and calculating personality in arguments

Integration:
- Bipolar mode state restoration on bot startup
- Bot skips processing messages during active arguments
- Autonomous system checks for bipolar triggers after actions
- Import fixes (apply_evil_mode_changes/revert_evil_mode_changes)

Technical Details:
- State persistence via JSON (bipolar_mode_state.json, bipolar_webhooks.json, bipolar_scoreboard.json)
- Webhook caching per guild with fallback creation
- Event loop management with asyncio.create_task
- Rate limiting and argument conflict prevention
- Globals integration (BIPOLAR_MODE, BIPOLAR_WEBHOOKS, BIPOLAR_ARGUMENT_IN_PROGRESS, MOOD_EMOJIS)

Files Changed:
- bot/bot.py: Added bipolar mode restoration and argument-in-progress checks
- bot/globals.py: Added bipolar mode state variables and mood emoji mappings
- bot/utils/bipolar_mode.py: Complete 1106-line implementation
- bot/utils/autonomous.py: Added bipolar argument trigger checks
- bot/utils/evil_mode.py: Updated system prompt, added height info to lore/prompt
- bot/api.py: Added bipolar mode endpoints (trigger, toggle, scoreboard)
- bot/static/index.html: Added bipolar controls section with scoreboard
- bot/memory/: Various DM conversation updates
- bot/evil_miku_lore.txt: Added height description
- bot/evil_miku_prompt.txt: Added height to facts, updated personality guidelines
2026-01-06 13:57:59 +02:00
1e6e097958 Add role color management to evil mode
- Evil mode now saves current 'Miku Color' role color before changing
- Sets role color to #D60004 (dark red) when evil mode is enabled
- Restores saved color when evil mode is disabled
- Color is persisted in evil_mode_state.json between restarts
- Role color changes are skipped on startup restore to avoid rate limits
2026-01-02 19:45:56 +02:00
5d1e669b5a Add evil mode support to figurine tweet notifier
- Figurine DM notifications now respect evil mode state
- Evil Miku sends cruel, mocking comments about merch instead of excited ones
- Normal Miku remains enthusiastic and friendly about figurines
- Both modes use appropriate sign-off emojis (cute vs dark)
2026-01-02 18:09:50 +02:00
6ec33bcecb Implement Evil Miku mode with persistence, fix API event loop issues, and improve formatting
- Added Evil Miku mode with 4 evil moods (aggressive, cunning, sarcastic, evil_neutral)
- Created evil mode content files (evil_miku_lore.txt, evil_miku_prompt.txt, evil_miku_lyrics.txt)
- Implemented persistent evil mode state across restarts (saves to memory/evil_mode_state.json)
- Fixed API endpoints to use client.loop.create_task() to prevent timeout errors
- Added evil mode toggle in web UI with red theme styling
- Modified mood rotation to handle evil mode
- Configured DarkIdol uncensored model for evil mode text generation
- Reduced system prompt redundancy by removing duplicate content
- Added markdown escape for single asterisks (actions) while preserving bold formatting
- Evil mode now persists username, pfp, and nicknames across restarts without re-applying changes
2026-01-02 17:11:58 +02:00
b38bdf2435 feat: Add advanced engagement submenu with user targeting and engagement types
- Replace simple 'Engage Random User' button with expandable submenu
- Add user ID input field for targeting specific users
- Add engagement type selection: random, activity-based, general, status-based
- Update API endpoints to accept user_id and engagement_type parameters
- Modify autonomous functions to support targeted engagement
- Maintain backward compatibility with random user/type selection as default
2025-12-16 23:13:19 +02:00
dfcda72cc8 Add reply functionality to manual message override with mention control
- Added optional reply message ID field to web UI
- Added radio buttons to control mention/ping behavior in replies
- Updated frontend JavaScript to send reply parameters
- Modified /manual/send and /dm/{user_id}/manual endpoints to support replies
- Fixed async context by moving message fetching inside bot event loop task
- Supports both channel and DM reply functionality
2025-12-14 16:41:02 +02:00
c62b6817c4 Fix image generation UI: add image preview, serving endpoint, and proper error handling
- Fixed function name mismatch: generateImage() -> generateManualImage()
- Fixed status div ID mismatch in HTML
- Added /image/view/{filename} endpoint to serve generated images from ComfyUI output
- Implemented proper image preview with DOM element creation instead of innerHTML
- Added robust error handling with onload/onerror event handlers
- Added debug logging to image serving endpoint for troubleshooting
- Images now display directly in the Web UI after generation
2025-12-13 00:36:35 +02:00
bb82b7f146 Add interactive Chat with LLM interface to Web UI
Features:
- Real-time streaming chat interface (ChatGPT-like experience)
- Model selection: Text model (fast) or Vision model (image analysis)
- System prompt toggle: Chat with Miku's personality or raw LLM
- Mood selector: Choose from 14 different emotional states
- Full context integration: Uses complete miku_lore.txt, miku_prompt.txt, and miku_lyrics.txt
- Conversation memory: Maintains chat history throughout session
- Image upload support for vision model
- Horizontal scrolling tabs for responsive design
- Clear chat history functionality
- SSE (Server-Sent Events) for streaming responses
- Keyboard shortcuts (Ctrl+Enter to send)

Technical changes:
- Added POST /chat/stream endpoint in api.py with streaming support
- Updated ChatMessage model with mood, conversation_history, and image_data
- Integrated context_manager for proper Miku personality context
- Added Chat with LLM tab to index.html
- Implemented JavaScript streaming client with EventSource-like handling
- Added CSS for chat messages, typing indicators, and animations
- Made tab navigation horizontally scrollable for narrow viewports
2025-12-13 00:23:03 +02:00
65e6c3e7ea Add 'Detect and Join Conversation' button to Web UI and CLI
- Added /autonomous/join-conversation API endpoint in api.py
- Added 'Detect and Join Conversation' button to Web UI under Autonomous Actions
- Added 'autonomous join-conversation' command to CLI tool (miku-cli.py)
- Updated miku_detect_and_join_conversation_for_server to support force=True parameter
- When force=True: skips time limit, activity checks, and random chance
- Force mode uses last 10 user messages regardless of age
- Manual triggers via Web UI/CLI now work even with old messages
2025-12-10 14:57:59 +02:00
76aaf6c437 Fix autonomous logic: prevent actions when user addresses Miku
- Moved on_message_event() call to END of message processing in bot.py
- Only track messages for autonomous when NOT addressed to Miku
- Fixed autonomous_engine.py to convert all message-triggered actions to join_conversation
- Prevent inappropriate autonomous actions (general, share_tweet, change_profile_picture) when triggered by user messages
- Ensures Miku responds to user messages FIRST before any autonomous action fires

This fixes the issue where autonomous actions would fire before Miku's response to user messages, and ensures the 'detect and join conversation' safeguard works properly.
2025-12-10 10:56:34 +02:00
711101816a Fix: Apply twscrape monkey patch to resolve 'Failed to parse scripts' error
Twitter changed their JavaScript response format to include unquoted keys in JSON objects, which breaks twscrape's parser. This fix applies a monkey patch that uses regex to quote the unquoted keys before parsing.

This resolves the issue preventing figurine notifications from being sent for the past several days.

Reference: https://github.com/vladkens/twscrape/issues/284
2025-12-10 09:48:25 +02:00
f27d7f4afe Fix JavaScript integer precision loss for Discord IDs in web UI
- Removed parseInt() calls that were causing Discord snowflake IDs to lose precision
- Discord IDs exceed JavaScript's safe integer limit (2^53-1), causing corruption
- Fixed sendBedtime(), triggerAutonomous(), custom prompt, and addServer() functions
- Keep guild_id and channel_id values as strings throughout the frontend
- Backend FastAPI correctly parses string IDs to Python integers without precision loss
- Resolves issue where wrong server ID was sent (e.g., 1429954521576116200 instead of 1429954521576116337)
2025-12-08 01:14:29 +02:00
330cedd9d1 chore: organize backup files into dated directory structure
- Consolidated all .bak.* files from bot/ directory into backups/2025-12-07/
- Moved unused autonomous_wip.py to backups (verified not imported anywhere)
- Relocated old .bot.bak.80825/ backup directory into backups/2025-12-07/old-bot-bak-80825/
- Preserved autonomous_v1_legacy.py as it is still actively used by autonomous.py
- Created new backups/ directory with date-stamped subdirectory for better organization
2025-12-07 23:54:38 +02:00
9009e9fc80 Add animated GIF support for profile pictures
- Detect animated GIFs and preserve animation frames during upload
- Extract dominant color from first frame for role color syncing
- Generate multi-frame descriptions using existing video analysis pipeline
- Skip face detection/cropping for GIFs to maintain original animation
- Update UI to inform users about GIF support and Nitro requirement
- Add metadata flag to distinguish animated vs static profile pictures
2025-12-07 23:48:12 +02:00
782d8e4f84 Fix syntax error in image processing: fix truncated variable assignment 2025-12-07 18:03:36 +02:00
7dd671d6cf Fix syntax error: remove stray URL in bot.py 2025-12-07 18:02:22 +02:00
d58be3b33e Remove all Ollama remnants and complete migration to llama.cpp
- Remove Ollama-specific files (Dockerfile.ollama, entrypoint.sh)
- Replace all query_ollama imports and calls with query_llama
- Remove langchain-ollama dependency from requirements.txt
- Update all utility files (autonomous, kindness, image_generation, etc.)
- Update README.md documentation references
- Maintain backward compatibility alias in llm.py
2025-12-07 17:50:28 +02:00
8c74ad5260 Initial commit: Miku Discord Bot 2025-12-07 17:15:09 +02:00