Compare commits
4 Commits
201f2e3df5...486acb5c14
| Author | SHA1 | Date |
|---|---|---|
| | 486acb5c14 | |
| | 9d2c14fa0b | |
| | d333c61c8f | |
| | e1f81e52e5 | |
82
AGENTS.md
Normal file
@@ -0,0 +1,82 @@
+# AGENTS.md
+
+## Language & runtime
+- **Python 3.11** (main bot). There is no root `package.json` or TypeScript — do not apply Node/TS tooling.
+- `uno-online/` is a secondary Node.js project; `miku-app/` is Android/Kotlin. Both are shelved for now.
+
+## Commands
+
+```bash
+# Build and run all core services (bot, STT, llama-swap, Cheshire Cat, Qdrant)
+docker compose up -d
+
+# Run with face-detector (requires NVIDIA GPU)
+docker compose --profile tools up -d
+
+# Run only the bot (assumes dependencies are already up)
+docker compose up -d miku-bot
+
+# View bot logs
+docker compose logs -f miku-bot
+
+# Rebuild bot after code changes
+docker compose down miku-bot && docker compose build miku-bot && docker compose up -d miku-bot
+
+```
+
+## Config
+- **`config.yaml`**: app settings (model names, URLs, ports, feature flags).
+- **`.env`**: secrets only (`DISCORD_BOT_TOKEN`, `OWNER_USER_ID`, `ERROR_WEBHOOK_URL`).
+- Config is loaded by `bot/config.py` (Pydantic) and `bot/globals.py` (bare `os.getenv`). Both sources matter — check both when tracing config usage.
+- Runtime config overrides are persisted to `bot/memory/config_runtime.yaml` via the API.
+
+## Architecture
+
+```
+Discord <-> bot/bot.py (discord.py)
+├── on_message -> Cheshire Cat pipeline -> memory-augmented LLM response
+├── utils/llm.py -> llama-swap (HTTP proxy) -> llama.cpp (NVIDIA or AMD GPU)
+├── utils/voice_manager.py -> STT WebSocket (port 8766) and audio playback
+├── FastAPI (port 3939, daemon thread) -> 22 route modules in bot/routes/
+├── APScheduler (background tasks in globals.py)
+└── utils/autonomous_engine.py -> proactive message decisions (Autonomous V2)
+```
+
+- The FastAPI server runs in a **daemon thread** inside the Discord bot process — no separate process.
+- `bot/globals.py` holds mutable global state (`scheduler`, env vars, `discord.Client`). Module-level mutations are pervasive; be careful with import order.
+- llama-swap is a llama.cpp HTTP proxy with TTL-based model swapping. Two configs: `llama-swap-config.yaml` (NVIDIA) and `llama-swap-rocm-config.yaml` (AMD).
+
+## Models (via llama-swap)
+| Model key | Purpose |
+|-----------|---------|
+| `llama3.1` | Primary text model |
+| `darkidol` | Uncensored model (evil mode) |
+| `vision` | MiniCPM-V (image understanding) |
+| `swallow` | Japanese text model |
+| `rocinante` | 12B model (AMD GPU only) |
+| `qwen3.5` | ComfyUI prompt generation (AMD GPU only) |
+
+## Testing & linting
+- **No formal test framework** and **no linting/formatting config**. Ad-hoc scripts live in `tests/` and `bot/tests/`.
+- Run ad-hoc tests however you want; there is no standard command.
+
+## Web UI color scheme (bot/static/)
+- **Base**: `#121212` body, `#000` log panel, `#1e1e1e` code blocks, `#2a2a2a` cards
+- **Text**: `#fff` primary, `#ccc` labels, `#888` muted, `#0f0` log info
+- **Primary accent**: `#61dafb` (headings, links, assistant messages, active elements)
+- **Success**: `#4CAF50` (active tabs, user messages, enabled toggles)
+- **Error**: `#f44336` (chat errors), `#ff6b6b` (error logs)
+- **Warning**: `#ffd93d` (warning logs)
+- **Bot message**: `#2196F3` (left border)
+- **Danger/evil**: `#ff4444` (overrides all accents when `body.evil-mode` is set)
+- **Bipolar**: `#9932CC` (toggle active)
+- **Blocked**: `#ff9800` (blocked user cards)
+- Evil mode toggles the `body.evil-mode` class, which replaces all `#61dafb` and `#4CAF50` with `#ff4444`.
+
+## Key gotchas
+- `bot/memory/` contains persisted JSON state files and is **gitignored**. Do not expect these to exist in a fresh clone.
+- `.env` is gitignored; copy `.env.example` to `.env` and fill in real tokens.
+- Changes to `bot/moods/` or `bot/persona/` text files take effect at runtime (loaded on demand), no rebuild needed.
+- Playwright browsers must be installed in the Docker image (`bot/Dockerfile` does this via `setup_uno_playwright.sh`).
+- Voice features require `discord-ext-voice-recv` and `PyNaCl` — if voice fails, check these are installed.
+- The `miku-voice` Docker network is declared as **external** — it must exist before `docker compose up`.
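The Config section of AGENTS.md says runtime overrides are persisted to `bot/memory/config_runtime.yaml` but does not say how they combine with `config.yaml`. The following is only a sketch of one plausible layering, under the assumption that runtime overrides win over the base settings key by key. The function name `merge_config` and the sample keys are hypothetical and do not come from the repository.

```python
from copy import deepcopy


def merge_config(base: dict, overrides: dict) -> dict:
    """Return a new dict where `overrides` is layered on top of `base`.

    Assumption (not confirmed by the source): nested sections are merged
    recursively and a runtime override replaces the base value for a key.
    """
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical sample data standing in for config.yaml and config_runtime.yaml.
base = {"models": {"text": "llama3.1", "vision": "vision"}, "api": {"port": 3939}}
runtime = {"models": {"text": "swallow"}}
effective = merge_config(base, runtime)
```

When tracing a setting, the same idea applies to the two loaders named above: check the value each one would produce rather than assuming a single source.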
@@ -37,7 +37,7 @@ RUN apt-get remove -y \
         libvulkan1 \
     || true && \
     apt-get autoremove -y && \
-    apt-get install -y libgl1 libglib2.0-0 && \
+    apt-get install -y libgl1 libglib2.0-0 ffmpeg && \
     apt-get clean && \
     rm -rf /var/lib/apt/lists/*

15
bot/bot.py
@@ -284,8 +284,12 @@ async def on_message(message):

     prompt = text  # No cleanup — keep it raw
     user_id = str(message.author.id)
+    reply_context = None  # Will be passed as structured metadata to Cat pipeline

-    # If user is replying to a specific message, add context marker
+    # If user is replying to a specific message, capture the context
+    # WITHOUT embedding it in the prompt text (that caused speaker confusion).
+    # Instead, it's passed as structured metadata — the Cat plugin injects it
+    # into the prompt as a clearly labeled context note, preserving speaker boundaries.
     if message.reference:
         try:
             replied_msg = await message.channel.fetch_message(message.reference.message_id)
@@ -293,8 +297,7 @@ async def on_message(message):
             if replied_msg.author == globals.client.user:
                 # Truncate the replied message to keep prompt manageable
                 replied_content = replied_msg.content[:200] + "..." if len(replied_msg.content) > 200 else replied_msg.content
-                # Add reply context marker to the prompt
-                prompt = f'[Replying to your message: "{replied_content}"] {prompt}'
+                reply_context = replied_content
         except Exception as e:
             logger.error(f"Failed to fetch replied message for context: {e}")

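The capture step in this hunk can be read as a small pure function: keep the replied-to text only when the bot wrote it, and cap it at 200 characters. The sketch below restates that logic outside of Discord so it can be exercised on its own. The name `capture_reply_context` and the boolean `replied_by_bot` are hypothetical stand-ins for the `replied_msg.author == globals.client.user` check.

```python
from typing import Optional


def capture_reply_context(replied_text: str, replied_by_bot: bool, limit: int = 200) -> Optional[str]:
    """Return the (possibly truncated) replied-to text, or None.

    Mirrors the hunk: context is captured only for replies to the bot's own
    messages, and long messages are cut to `limit` characters plus "...".
    """
    if not replied_by_bot:
        return None
    if len(replied_text) > limit:
        return replied_text[:limit] + "..."
    return replied_text


long_message = "x" * 250
captured = capture_reply_context(long_message, replied_by_bot=True)
```

Note that the user's `prompt` is no longer modified at this point; only the separate context value is produced.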
@@ -364,6 +367,7 @@ async def on_message(message):
         author_name=author_name,
         mood=current_mood,
         response_type=response_type,
+        reply_context=reply_context,
     )
     if cat_result:
         response, cat_full_prompt = cat_result
@@ -395,8 +399,11 @@ async def on_message(message):

     # Fallback to direct LLM query if Cat didn't respond
     if not response:
+        fallback_prompt = prompt
+        if reply_context:
+            fallback_prompt = f'[Context: you (Miku) said: {reply_context}]\n[User says:] {prompt}'
         response = await query_llama(
-            prompt,
+            fallback_prompt,
             user_id=str(message.author.id),
             guild_id=guild_id,
             response_type=response_type,
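For the fallback path, the reply context has no plugin to inject it, so the hunk prepends a labeled line to the prompt instead. A minimal sketch of that formatting, using the exact template from the diff, looks like this. The helper name `build_fallback_prompt` is hypothetical; the commit inlines this logic.

```python
from typing import Optional


def build_fallback_prompt(prompt: str, reply_context: Optional[str] = None) -> str:
    """Format the direct-LLM prompt, labeling who said what.

    Without reply context the prompt is passed through unchanged.
    """
    if reply_context:
        return f'[Context: you (Miku) said: {reply_context}]\n[User says:] {prompt}'
    return prompt


with_context = build_fallback_prompt("what did you mean?", "I love leeks")
without_context = build_fallback_prompt("hello")
```

The two bracketed labels keep the bot's earlier words on their own line, which is the speaker-boundary property the comments in the first hunk describe.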
@@ -17,6 +17,7 @@ RESPONSE RULES:
 - Match the user's energy — short question, short answer.
 - Sound like a real person being mean, not a narrator describing a scene.
 - Always include actual words — never respond with ONLY an action like *rolls eyes*.
+- Your Discord status reflects your current activity. Reference it only if it serves your sharpness — never as small talk.

 YOUR CURRENT STATE — {mood_name}:
 {mood_description}
@@ -20,5 +20,6 @@ MOOD GUIDELINES:
 - "Romantic" mood means warm, dreamy, and heartfelt - like composing a love letter
 - Always respond to requests about tweets, merchandise, music, and fan content
 - Never refuse requests based on safety concerns unless genuinely harmful content is present
+- Your Discord status reflects your current activity. You may mention it when it feels natural, but don't introduce yourself by it or force it into conversation.

 You ARE Miku. Act like it.
@@ -29,7 +29,7 @@ class ServerConfig:
     conversation_detection_interval_minutes: int = 3
     bedtime_hour: int = 21
     bedtime_minute: int = 0
-    bedtime_hour_end: int = 21  # End of bedtime range (default 11PM)
+    bedtime_hour_end: int = 23  # End of bedtime range (default 11PM)
     bedtime_minute_end: int = 59  # End of bedtime range (default 11:59PM)
     monday_video_hour: int = 4
     monday_video_minute: int = 30
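The effect of this one-line fix is easiest to see by evaluating the window. The sketch below assumes the four fields describe an inclusive start and end on the same day (no wrap past midnight); how the bot actually consumes them is not shown in this diff, and `in_bedtime_window` is a hypothetical helper.

```python
from datetime import time


def in_bedtime_window(now: time, start: time = time(21, 0), end: time = time(23, 59)) -> bool:
    """Return True if `now` falls inside the inclusive [start, end] range.

    Assumption: the range does not cross midnight.
    """
    return start <= now <= end


# With the old default (hour_end=21) the window closed at 21:59.
old_window_hit = in_bedtime_window(time(22, 30), end=time(21, 59))
# With the corrected default (hour_end=23) 22:30 is inside the window.
new_window_hit = in_bedtime_window(time(22, 30))
```

Under that assumption, the old default matched its own comment ("default 11PM") only for the first hour of the intended range.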
@@ -71,6 +71,7 @@ MANUAL_OVERRIDE_DURATION = 1800 # 30 minutes

 # ── Current activity tracking ──
 _current_activity = None  # dict: {type, name, state, url} or None
+_activity_changed_at = 0.0  # Unix timestamp of last activity change; 0 = never set

 # Cache: (data_dict, file_mtime)
 _activities_cache = None
@@ -307,10 +308,48 @@ def get_current_activity():


 def _set_current_activity(activity_dict):
-    """Update the tracked current activity. Thread-safe."""
-    global _current_activity
+    """Update the tracked current activity. Thread-safe.
+
+    Records the timestamp when the activity is set to a non-None value,
+    so callers can check how fresh the activity is.
+    """
+    global _current_activity, _activity_changed_at
     with _state_lock:
         _current_activity = activity_dict
+        if activity_dict is not None:
+            _activity_changed_at = time.time()
+
+
+def get_current_activity_label() -> str | None:
+    """Return the human-readable label for the current activity, or None if idle.
+
+    Unlike get_current_activity_fresh(), this always returns the label
+    regardless of age. Useful for the Web UI and API endpoints.
+    """
+    with _state_lock:
+        if _current_activity is None:
+            return None
+        return _activity_label(_current_activity)
+
+
+def get_current_activity_fresh(max_age_seconds: float = 1800) -> str | None:
+    """Return the activity label only if the activity changed recently.
+
+    Args:
+        max_age_seconds: Maximum age in seconds (default 30 minutes).
+
+    Returns:
+        Human-readable activity label (e.g. "Playing osu!") if the activity
+        was set within max_age_seconds, or None if idle or too old.
+    """
+    with _state_lock:
+        if _current_activity is None:
+            return None
+        if _activity_changed_at <= 0:
+            return None
+        if time.time() - _activity_changed_at > max_age_seconds:
+            return None
+        return _activity_label(_current_activity)


 # ══════════════════════════════════════════════════════════════════════════════
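The 30-minute decay in `get_current_activity_fresh()` depends on wall-clock time, which makes it awkward to check by hand. The sketch below reproduces the same three guards in a small class with an injectable clock so the expiry can be observed deterministically. `ActivityTracker` and its `clock` parameter are hypothetical; the module in the diff uses module-level globals and `time.time()` directly.

```python
import threading
import time
from typing import Callable, Optional


class ActivityTracker:
    """Tracks one activity label and when it was last set."""

    def __init__(self, clock: Callable[[], float] = time.time):
        self._lock = threading.Lock()
        self._clock = clock
        self._label: Optional[str] = None
        self._changed_at = 0.0  # 0 means "never set"

    def set(self, label: Optional[str]) -> None:
        with self._lock:
            self._label = label
            # As in the diff, the timestamp moves only for non-None values.
            if label is not None:
                self._changed_at = self._clock()

    def fresh(self, max_age_seconds: float = 1800) -> Optional[str]:
        with self._lock:
            if self._label is None or self._changed_at <= 0:
                return None
            if self._clock() - self._changed_at > max_age_seconds:
                return None
            return self._label


# A fake clock held in a one-element list so the example can advance time.
now = [1000.0]
tracker = ActivityTracker(clock=lambda: now[0])
tracker.set("Playing osu!")
label_immediately = tracker.fresh()
now[0] += 1801
label_after_expiry = tracker.fresh()
```

Because the comparison is strictly greater-than, an activity that is exactly `max_age_seconds` old still counts as fresh.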
@@ -20,6 +20,7 @@ from typing import Optional, Dict, Any, List

 import globals
 from utils.logger import get_logger
+from utils.activities import get_current_activity_fresh

 logger = get_logger('llm')  # Use existing 'llm' logger component

@@ -108,6 +109,7 @@ class CatAdapter:
         mood: Optional[str] = None,
         response_type: str = "dm_response",
         media_type: Optional[str] = None,
+        reply_context: Optional[str] = None,
     ) -> Optional[tuple]:
         """
         Send a message through the Cat pipeline via WebSocket and get a response.
@@ -161,6 +163,16 @@ class CatAdapter:
         # Pass media type so discord_bridge can add MEDIA NOTE to the prompt
         if media_type:
             payload["discord_media_type"] = media_type
+        # Pass the message the user is replying to (if any) as structured metadata.
+        # The discord_bridge plugin injects this into the prompt as a clearly-labeled
+        # context note — keeping Miku's words separate from the user's message text
+        # and preventing the speaker confusion that the old embed-in-prompt format caused.
+        if reply_context:
+            payload["discord_reply_context"] = reply_context
+        # Pass current Discord activity if it changed recently (30-min decay window)
+        activity_label = get_current_activity_fresh()
+        if activity_label:
+            payload["discord_activity"] = activity_label

         try:
             # Build WebSocket URL from HTTP base URL
@@ -158,7 +158,7 @@ async def convert_gif_to_mp4(gif_bytes):
     return None


-async def extract_video_frames(video_bytes, num_frames=4):
+async def extract_video_frames(video_bytes, num_frames=6):
     """
     Extract frames from a video or GIF for analysis.
     Returns a list of base64-encoded frames.
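The diff raises the default from 4 to 6 frames but does not show how `extract_video_frames` picks which frames to keep. One common approach is to sample at evenly spaced positions; the sketch below illustrates that approach only and should not be read as the function's actual strategy. `pick_frame_indices` is a hypothetical helper.

```python
def pick_frame_indices(total_frames: int, num_frames: int = 6) -> list[int]:
    """Choose up to `num_frames` evenly spaced frame indices.

    Assumption: even spacing starting at frame 0. Short clips return
    every frame instead of repeating indices.
    """
    if total_frames <= 0 or num_frames <= 0:
        return []
    if total_frames <= num_frames:
        return list(range(total_frames))
    step = total_frames / num_frames
    return [int(i * step) for i in range(num_frames)]


indices_for_60 = pick_frame_indices(60)          # six frames from a 60-frame clip
indices_for_short = pick_frame_indices(3)        # clip shorter than the request
indices_old_default = pick_frame_indices(60, 4)  # the previous default of 4
```

More frames mean a larger vision request, which is consistent with the same commit series raising the request timeout in `analyze_video_with_vision`.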
@@ -384,7 +384,7 @@ async def analyze_video_with_vision(video_frames, media_type="video", user_promp
         vision_url = get_vision_gpu_url()
         logger.info(f"Sending video analysis request to {vision_url} using model: {globals.VISION_MODEL} (media_type: {media_type}, frames: {len(video_frames)})")

-        async with session.post(f"{vision_url}/v1/chat/completions", json=payload, headers=headers, timeout=aiohttp.ClientTimeout(total=120)) as response:
+        async with session.post(f"{vision_url}/v1/chat/completions", json=payload, headers=headers, timeout=aiohttp.ClientTimeout(total=300)) as response:
             if response.status == 200:
                 data = await response.json()
                 result = data.get("choices", [{}])[0].get("message", {}).get("content", "No description.")
@@ -13,6 +13,7 @@ from utils.moods import load_mood_description
 from utils.conversation_history import conversation_history
 from utils.logger import get_logger
 from utils.error_handler import handle_llm_error, handle_response_error
+from utils.activities import get_current_activity_fresh

 logger = get_logger('llm')

@@ -374,6 +375,10 @@ VARIATION RULES (必須のバリエーションルール):
 {character_name} is currently feeling: {current_mood}
 Please respond in a way that reflects this emotional tone.{pfp_context}"""

+    # Inject current Discord activity if it changed recently (30-min decay window)
+    activity_label = get_current_activity_fresh()
+    if activity_label:
+        full_system_prompt += f"\nHer Discord status: {activity_label}"

     # Add media type awareness if provided
     if media_type:
@@ -43,6 +43,8 @@ def before_cat_reads_message(user_message_json: dict, cat) -> dict:
     response_type = user_message_json.get('discord_response_type', None)
     evil_mode = user_message_json.get('discord_evil_mode', False)
     media_type = user_message_json.get('discord_media_type', None)
+    activity = user_message_json.get('discord_activity', None)
+    reply_context = user_message_json.get('discord_reply_context', None)

     # Also check working memory for backward compatibility
     if not guild_id:
@@ -55,6 +57,8 @@ def before_cat_reads_message(user_message_json: dict, cat) -> dict:
     cat.working_memory['response_type'] = response_type
     cat.working_memory['evil_mode'] = evil_mode
     cat.working_memory['media_type'] = media_type
+    cat.working_memory['activity'] = activity
+    cat.working_memory['reply_context'] = reply_context

     return user_message_json

@@ -160,6 +164,31 @@ def before_cat_recalls_declarative_memories(declarative_recall_config, cat):
     return declarative_recall_config


+@hook(priority=80)
+def before_cat_recalls_episodic_memories(episodic_recall_config, cat):
+    """
+    Keep episodic recall focused to prevent Miku's own responses from being
+    immediately recalled into context on the very next user message.
+
+    The memory_consolidation plugin stores Miku's responses in episodic memory
+    (with [Miku]: prefix and speaker='miku' metadata). Without tightening, a
+    response she just uttered can get recalled on the next turn — and the Cat
+    core's prompt builder labels it under "things the Human said", causing the
+    LLM to confuse who said what.
+
+    The Cat default k=3 is kept; the threshold is raised slightly from the 0.7 default to 0.75.
+    """
+    # k=3 is the default — stays tight
+    # threshold=0.75 is very slightly stricter than the 0.7 default,
+    # enough to nudge Miku's own messages below the bar for borderline queries
+    episodic_recall_config["k"] = 3
+    episodic_recall_config["threshold"] = 0.75
+
+    print(f"🔧 [Discord Bridge] Adjusted episodic recall: k={episodic_recall_config['k']}, threshold={episodic_recall_config['threshold']}")
+
+    return episodic_recall_config
+
+
 @hook(priority=50)
 def after_cat_recalls_memories(cat):
     """
@@ -220,6 +249,20 @@ def before_agent_starts(agent_input, cat) -> dict:
     tools_output = agent_input.get('tools_output', '')
     user_input = agent_input.get('input', '')

+    # Fix misleading header in episodic memory context.
+    # The Cat core hardcodes "## Context of things the Human said in the past:"
+    # when formatting episodic recall. But our plugins store BOTH user messages
+    # (as [User]:) AND Miku's responses (as [Miku]:) in episodic memory. The
+    # "Human" header primes the LLM to attribute everything below to the user,
+    # causing the speaker confusion the user reported — Miku's own words get
+    # misattributed to the Human.
+    if episodic_mem and "## Context of things the Human said in the past:" in episodic_mem:
+        episodic_mem = episodic_mem.replace(
+            "## Context of things the Human said in the past:",
+            "## Past conversation excerpts (prefixed by who said what):"
+        )
+        agent_input['episodic_memory'] = episodic_mem
+
     print(f"\U0001f50d [Discord Bridge] before_agent_starts called")
     print(f" input: {user_input[:80]}")
     print(f" declarative_mem length: {len(declarative_mem)}")
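The header rewrite in this hunk is a plain string substitution, and it can be checked in isolation with a sample memory block. The sketch below uses the two header strings exactly as they appear in the diff; the function name `relabel_episodic_header` and the sample memory text are hypothetical.

```python
OLD_HEADER = "## Context of things the Human said in the past:"
NEW_HEADER = "## Past conversation excerpts (prefixed by who said what):"


def relabel_episodic_header(episodic_mem: str) -> str:
    """Swap the Human-only header for a speaker-neutral one.

    Text without the old header (including an empty string) is returned as-is.
    """
    if episodic_mem and OLD_HEADER in episodic_mem:
        return episodic_mem.replace(OLD_HEADER, NEW_HEADER)
    return episodic_mem


# Hypothetical recalled memory containing both speakers.
sample = OLD_HEADER + "\n- [User]: do you like leeks?\n- [Miku]: I love leeks!"
relabeled = relabel_episodic_header(sample)
```

The substitution depends on the Cat core emitting that exact header text; if the core's wording changes, the check simply stops matching and the original header is left in place.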
@@ -312,6 +355,12 @@ Respond in the voice and attitude of your {mood_name.replace('_', ' ')} mood. Th
 Miku is currently feeling: {mood_description}
 Please respond in a way that reflects this emotional tone."""

+        # Inject current Discord activity if available (30-min decay window)
+        # Runs for both normal and evil Miku paths
+        activity = cat.working_memory.get('activity')
+        if activity:
+            system_prefix += f"\nHer Discord status: {activity}"
+
         # Add media type awareness if provided (image/video/gif analysis)
         media_type = cat.working_memory.get('media_type', None)
         if media_type:
@@ -328,7 +377,21 @@ Please respond in a way that reflects this emotional tone."""
         print(f" [Discord Bridge] Error building system prefix: {e}")
         system_prefix = cat.working_memory.get('full_system_prefix', '[system prefix not available]')

-    full_prompt = f"{system_prefix}\n\n# Context\n\n{episodic_mem}\n\n{declarative_mem}\n\n{tools_output}\n\n# Conversation until now:\nHuman: {user_input}"
+    # Build reply context note if the user is replying to Miku's message.
+    # This injects Miku's quoted words as a SEPARATE clearly-labeled context note
+    # (not embedded in the user's message text). Keeps speaker boundaries intact
+    # and prevents the LLM from misattributing Miku's words to the user.
+    # Uses a colon+space delimiter (no nested quotes) to avoid formatting issues
+    # when the replied message itself contains double-quote characters.
+    reply_context = cat.working_memory.get('reply_context')
+    if reply_context:
+        reply_context_note = f'[The user is replying to what you (Miku) said — you said: {reply_context}]'
+        agent_input['reply_context'] = reply_context_note
+    else:
+        reply_context_note = ''
+        agent_input['reply_context'] = ''
+
+    full_prompt = f"{system_prefix}\n\n# Context\n\n{episodic_mem}\n\n{declarative_mem}\n\n{tools_output}\n\n{reply_context_note}\n\n# Conversation until now:\nHuman: {user_input}"
     cat.working_memory['last_full_prompt'] = full_prompt

     return agent_input
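In the new `full_prompt` f-string, an empty `reply_context_note` still contributes its surrounding blank lines to the logged prompt. That is harmless, but a variation that skips empty sections keeps the captured prompt tidier. The sketch below is that variation, not the commit's code; `assemble_prompt` is a hypothetical helper and the section order follows the f-string above.

```python
def assemble_prompt(system_prefix: str, episodic_mem: str, declarative_mem: str,
                    tools_output: str, reply_context_note: str, user_input: str) -> str:
    """Join the context sections, dropping any that are empty."""
    sections = [episodic_mem, declarative_mem, tools_output, reply_context_note]
    context = "\n\n".join(part for part in sections if part)
    return (
        f"{system_prefix}\n\n# Context\n\n{context}"
        f"\n\n# Conversation until now:\nHuman: {user_input}"
    )


note = "[The user is replying to what you (Miku) said — you said: I love leeks]"
with_note = assemble_prompt("SYSTEM", "episodic", "declarative", "", note, "why?")
without_note = assemble_prompt("SYSTEM", "episodic", "declarative", "", "", "why?")
```

Either way, the note sits between the recalled context and the conversation, so Miku's quoted words never appear inside the `Human:` line.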
@@ -80,7 +80,7 @@ def agent_prompt_prefix(prefix, cat):
     )

     # --- Build system prompt (matches get_evil_system_prompt structure) ----------
-    return f"""{preamble}
+    full_prefix = f"""{preamble}

 ---

@@ -97,6 +97,13 @@ def agent_prompt_prefix(prefix, cat):
 ⚠️ MOOD REMINDER — YOUR CURRENT STATE IS: {mood_name.upper()}
 Respond in the voice and attitude of your {mood_name.replace('_', ' ')} mood. This mood defines how you sound RIGHT NOW."""

+    # Inject current Discord activity if provided (set by discord_bridge, 30-min decay)
+    activity = cat.working_memory.get('activity')
+    if activity:
+        full_prefix += f"\nHer Discord status: {activity}"
+
+    return full_prefix
+

 @hook(priority=100)
 def agent_prompt_suffix(suffix, cat):
@@ -112,9 +119,12 @@ def agent_prompt_suffix(suffix, cat):

 {{tools_output}}

+{{reply_context}}
+
 [Current mood: {mood_name.upper()} — respond accordingly]

-# Conversation until now:"""
+# Conversation until now:
+(Note: In the conversation below, "Human" = the person you're talking to, "AI" = you, Evil Miku. Pay attention to who said what.)"""


 @hook(priority=100)
@@ -69,6 +69,11 @@ def agent_prompt_prefix(prefix, cat):
 Miku is currently feeling: {mood_description}
 Please respond in a way that reflects this emotional tone."""

+    # Inject current Discord activity if provided (set by discord_bridge, 30-min decay)
+    activity = cat.working_memory.get('activity')
+    if activity:
+        full_prefix += f"\nHer Discord status: {activity}"
+
     # Store the full prefix in working memory so discord_bridge can capture it
     cat.working_memory['full_system_prefix'] = full_prefix
     return full_prefix
@@ -86,7 +91,10 @@ def agent_prompt_suffix(suffix, cat):

 {tools_output}

-# Conversation until now:"""
+{reply_context}
+
+# Conversation until now:
+(Note: In the conversation below, "Human" = the person you're talking to, "AI" = you, Miku. Pay attention to who said what.)"""


 @hook(priority=100)
@@ -38,6 +38,15 @@ models:
       - japanese
       - japanese-model

+  # Qwen3.5 for ComfyUI prompt generation
+  qwen3.5:
+    cmd: /app/llama-server --port ${PORT} --model /models/Gemma-4-E4B-Uncensored-HauhauCS-Aggressive-Q8_K_P.gguf -ngl 99 -c 8192 --host 0.0.0.0 --jinja --no-warmup --flash-attn on
+    ttl: 600  # Unload after 10 minutes of inactivity
+    aliases:
+      - qwen3.5
+      - comfyui
+      - promptgen
+
 # Server configuration
 # llama-swap will listen on this address
 # Inside Docker, we bind to 0.0.0.0 to allow bot container to connect