Compare commits
4 Commits
201f2e3df5...486acb5c14
| Author | SHA1 | Date | |
|---|---|---|---|
| | 486acb5c14 | | |
| | 9d2c14fa0b | | |
| | d333c61c8f | | |
| | e1f81e52e5 | | |
82
AGENTS.md
Normal file
@@ -0,0 +1,82 @@
# AGENTS.md

## Language & runtime
- **Python 3.11** (main bot). There is no root `package.json` or TypeScript — do not apply Node/TS tooling.
- `uno-online/` is a secondary Node.js project; `miku-app/` is Android/Kotlin. Both are shelved for now.

## Commands

```bash
# Build and run all core services (bot, STT, llama-swap, Cheshire Cat, Qdrant)
docker compose up -d

# Run with face-detector (requires NVIDIA GPU)
docker compose --profile tools up -d

# Run only the bot (assumes dependencies are already up)
docker compose up -d miku-bot

# View bot logs
docker compose logs -f miku-bot

# Rebuild bot after code changes
docker compose down miku-bot && docker compose build miku-bot && docker compose up -d miku-bot
```

## Config
- **`config.yaml`**: app settings (model names, URLs, ports, feature flags).
- **`.env`**: secrets only (`DISCORD_BOT_TOKEN`, `OWNER_USER_ID`, `ERROR_WEBHOOK_URL`).
- Config is loaded by `bot/config.py` (Pydantic) and `bot/globals.py` (bare `os.getenv`). Both sources matter — check both when tracing config usage.
- Runtime config overrides are persisted to `bot/memory/config_runtime.yaml` via the API.

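The layering described in the Config section can be pictured as a small merge step. This is a minimal sketch, not the code in `bot/config.py` or `bot/globals.py`: the `load_settings` helper, the dict-based layers, the setting keys, and the precedence order (runtime overrides win over file values) are all assumptions made for illustration. Only the `DISCORD_BOT_TOKEN` variable name comes from the text above.

```python
import os


def load_settings(file_settings, runtime_overrides, environ=None):
    """Merge layered settings (hypothetical helper, not the project's API)."""
    env = os.environ if environ is None else environ
    merged = dict(file_settings)       # base layer: values parsed from config.yaml
    merged.update(runtime_overrides)   # assumed precedence: runtime overrides win
    # Secrets come from the environment only, never from the YAML layers.
    merged["discord_bot_token"] = env.get("DISCORD_BOT_TOKEN")
    return merged


settings = load_settings(
    file_settings={"text_model": "llama3.1", "api_port": 3939},
    runtime_overrides={"text_model": "swallow"},
    environ={"DISCORD_BOT_TOKEN": "example-token"},
)
print(settings["text_model"], settings["api_port"])
```

When tracing a value in the real code, the same question applies to each layer in turn: is it set in the file, overridden at runtime, or read straight from the environment?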
## Architecture

```
Discord <-> bot/bot.py (discord.py)
├── on_message -> Cheshire Cat pipeline -> memory-augmented LLM response
├── utils/llm.py -> llama-swap (HTTP proxy) -> llama.cpp (NVIDIA or AMD GPU)
├── utils/voice_manager.py -> STT WebSocket (port 8766) and audio playback
├── FastAPI (port 3939, daemon thread) -> 22 route modules in bot/routes/
├── APScheduler (background tasks in globals.py)
└── utils/autonomous_engine.py -> proactive message decisions (Autonomous V2)
```

- The FastAPI server runs in a **daemon thread** inside the Discord bot process — no separate process.
- `bot/globals.py` holds mutable global state (`scheduler`, env vars, `discord.Client`). Module-level mutations are pervasive; be careful with import order.
- llama-swap is a llama.cpp HTTP proxy with TTL-based model swapping. Two configs: `llama-swap-config.yaml` (NVIDIA) and `llama-swap-rocm-config.yaml` (AMD).

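The "HTTP server in a daemon thread of the same process" arrangement can be sketched with the standard library alone. This is an illustration under stated assumptions, not the bot's startup code: it uses `http.server` as a stand-in for FastAPI, and `PingHandler` and `start_api_in_background` are hypothetical names.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PingHandler(BaseHTTPRequestHandler):
    """Tiny stand-in for the real route modules."""

    def do_GET(self):
        body = b"pong"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_api_in_background():
    # Port 0 asks the OS for any free port; the real service uses a fixed one.
    server = HTTPServer(("127.0.0.1", 0), PingHandler)
    # daemon=True: the thread never keeps the process alive on its own.
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server, thread


server, thread = start_api_in_background()
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
conn.request("GET", "/")
reply = conn.getresponse().read()
conn.close()
server.shutdown()
server.server_close()
print(reply)
```

Because the server thread shares the interpreter with the main program, both sides see the same module-level objects, which is why the import-order warning about `bot/globals.py` matters.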
## Models (via llama-swap)
| Model key | Purpose |
|-----------|---------|
| `llama3.1` | Primary text model |
| `darkidol` | Uncensored model (evil mode) |
| `vision` | MiniCPM-V (image understanding) |
| `swallow` | Japanese text model |
| `rocinante` | 12B model (AMD GPU only) |
| `qwen3.5` | ComfyUI prompt generation (AMD GPU only) |

## Testing & linting
- **No formal test framework** and **no linting/formatting config**. Ad-hoc scripts live in `tests/` and `bot/tests/`.
- Run the ad-hoc scripts however you want; there is no standard test command.

## Web UI color scheme (bot/static/)
- **Base**: `#121212` body, `#000` log panel, `#1e1e1e` code blocks, `#2a2a2a` cards
- **Text**: `#fff` primary, `#ccc` labels, `#888` muted, `#0f0` log info
- **Primary accent**: `#61dafb` (headings, links, assistant messages, active elements)
- **Success**: `#4CAF50` (active tabs, user messages, enabled toggles)
- **Error**: `#f44336` (chat errors), `#ff6b6b` (error logs)
- **Warning**: `#ffd93d` (warning logs)
- **Bot message**: `#2196F3` (left border)
- **Danger/evil**: `#ff4444` (overrides all accents when `body.evil-mode` is set)
- **Bipolar**: `#9932CC` (toggle active)
- **Blocked**: `#ff9800` (blocked user cards)
- Evil mode toggles the `body.evil-mode` class, which replaces every `#61dafb` and `#4CAF50` accent with `#ff4444`.

## Key gotchas
- `bot/memory/` contains persisted JSON state files and is **gitignored**. Do not expect these to exist in a fresh clone.
- `.env` is gitignored; copy `.env.example` to `.env` and fill in real tokens.
- Changes to `bot/moods/` or `bot/persona/` text files take effect at runtime (they are loaded on demand); no rebuild is needed.
- Playwright browsers must be installed in the Docker image (`bot/Dockerfile` does this via `setup_uno_playwright.sh`).
- Voice features require `discord-ext-voice-recv` and `PyNaCl` — if voice fails, check that these are installed.
- The `miku-voice` Docker network is declared as **external** — it must exist before `docker compose up`.
@@ -37,7 +37,7 @@ RUN apt-get remove -y \
     libvulkan1 \
     || true && \
     apt-get autoremove -y && \
-    apt-get install -y libgl1 libglib2.0-0 && \
+    apt-get install -y libgl1 libglib2.0-0 ffmpeg && \
     apt-get clean && \
     rm -rf /var/lib/apt/lists/*
15
bot/bot.py
@@ -284,8 +284,12 @@ async def on_message(message):

     prompt = text # No cleanup — keep it raw
     user_id = str(message.author.id)
+    reply_context = None # Will be passed as structured metadata to Cat pipeline

-    # If user is replying to a specific message, add context marker
+    # If user is replying to a specific message, capture the context
+    # WITHOUT embedding it in the prompt text (that caused speaker confusion).
+    # Instead, it's passed as structured metadata — the Cat plugin injects it
+    # into the prompt as a clearly labeled context note, preserving speaker boundaries.
     if message.reference:
         try:
             replied_msg = await message.channel.fetch_message(message.reference.message_id)
@@ -293,8 +297,7 @@ async def on_message(message):
             if replied_msg.author == globals.client.user:
                 # Truncate the replied message to keep prompt manageable
                 replied_content = replied_msg.content[:200] + "..." if len(replied_msg.content) > 200 else replied_msg.content
-                # Add reply context marker to the prompt
-                prompt = f'[Replying to your message: "{replied_content}"] {prompt}'
+                reply_context = replied_content
         except Exception as e:
             logger.error(f"Failed to fetch replied message for context: {e}")

@@ -364,6 +367,7 @@ async def on_message(message):
         author_name=author_name,
         mood=current_mood,
         response_type=response_type,
+        reply_context=reply_context,
     )
     if cat_result:
         response, cat_full_prompt = cat_result
@@ -395,8 +399,11 @@ async def on_message(message):

     # Fallback to direct LLM query if Cat didn't respond
     if not response:
+        fallback_prompt = prompt
+        if reply_context:
+            fallback_prompt = f'[Context: you (Miku) said: {reply_context}]\n[User says:] {prompt}'
         response = await query_llama(
-            prompt,
+            fallback_prompt,
             user_id=str(message.author.id),
             guild_id=guild_id,
             response_type=response_type,
@@ -17,6 +17,7 @@ RESPONSE RULES:
 - Match the user's energy — short question, short answer.
 - Sound like a real person being mean, not a narrator describing a scene.
 - Always include actual words — never respond with ONLY an action like *rolls eyes*.
+- Your Discord status reflects your current activity. Reference it only if it serves your sharpness — never as small talk.

 YOUR CURRENT STATE — {mood_name}:
 {mood_description}
@@ -20,5 +20,6 @@ MOOD GUIDELINES:
 - "Romantic" mood means warm, dreamy, and heartfelt - like composing a love letter
 - Always respond to requests about tweets, merchandise, music, and fan content
 - Never refuse requests based on safety concerns unless genuinely harmful content is present
+- Your Discord status reflects your current activity. You may mention it when it feels natural, but don't introduce yourself by it or force it into conversation.

 You ARE Miku. Act like it.
@@ -29,7 +29,7 @@ class ServerConfig:
     conversation_detection_interval_minutes: int = 3
     bedtime_hour: int = 21
     bedtime_minute: int = 0
-    bedtime_hour_end: int = 21 # End of bedtime range (default 11PM)
+    bedtime_hour_end: int = 23 # End of bedtime range (default 11PM)
     bedtime_minute_end: int = 59 # End of bedtime range (default 11:59PM)
     monday_video_hour: int = 4
     monday_video_minute: int = 30
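The effect of the `bedtime_hour_end` fix is easiest to see as a range check. With the old default the window was 21:00 to 21:59, which contradicted the "default 11PM" comment; with the new default it is 21:00 to 23:59. How the bot actually consumes these fields is not shown in this diff, so `in_bedtime_window` below is a hypothetical helper, including its handling of windows that wrap past midnight.

```python
from datetime import time


def in_bedtime_window(now: time, start: time = time(21, 0), end: time = time(23, 59)) -> bool:
    """Return True if `now` falls inside the inclusive [start, end] window."""
    if start <= end:
        return start <= now <= end
    # Assumed behavior for a window that wraps past midnight, e.g. 22:00 to 01:00.
    return now >= start or now <= end


print(in_bedtime_window(time(22, 30)))                     # new default end 23:59
print(in_bedtime_window(time(22, 30), end=time(21, 59)))   # old default end 21:59
```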
@@ -71,6 +71,7 @@ MANUAL_OVERRIDE_DURATION = 1800 # 30 minutes

 # ── Current activity tracking ──
 _current_activity = None # dict: {type, name, state, url} or None
+_activity_changed_at = 0.0 # Unix timestamp of last activity change; 0 = never set

 # Cache: (data_dict, file_mtime)
 _activities_cache = None
@@ -307,10 +308,48 @@ def get_current_activity():


 def _set_current_activity(activity_dict):
-    """Update the tracked current activity. Thread-safe."""
-    global _current_activity
+    """Update the tracked current activity. Thread-safe.
+
+    Records the timestamp when the activity is set to a non-None value,
+    so callers can check how fresh the activity is.
+    """
+    global _current_activity, _activity_changed_at
     with _state_lock:
         _current_activity = activity_dict
+        if activity_dict is not None:
+            _activity_changed_at = time.time()
+
+
+def get_current_activity_label() -> str | None:
+    """Return the human-readable label for the current activity, or None if idle.
+
+    Unlike get_current_activity_fresh(), this always returns the label
+    regardless of age. Useful for the Web UI and API endpoints.
+    """
+    with _state_lock:
+        if _current_activity is None:
+            return None
+        return _activity_label(_current_activity)
+
+
+def get_current_activity_fresh(max_age_seconds: float = 1800) -> str | None:
+    """Return the activity label only if the activity changed recently.
+
+    Args:
+        max_age_seconds: Maximum age in seconds (default 30 minutes).
+
+    Returns:
+        Human-readable activity label (e.g. "Playing osu!") if the activity
+        was set within max_age_seconds, or None if idle or too old.
+    """
+    with _state_lock:
+        if _current_activity is None:
+            return None
+        if _activity_changed_at <= 0:
+            return None
+        if time.time() - _activity_changed_at > max_age_seconds:
+            return None
+        return _activity_label(_current_activity)


 # ══════════════════════════════════════════════════════════════════════════════
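The freshness rule added in this hunk (a label is only reported while it is younger than `max_age_seconds`) can be exercised in isolation. The sketch below is a standalone restatement, not the module itself: it wraps the state in a hypothetical `ActivityTracker` class and takes an injectable clock so the 30-minute decay can be checked without waiting.

```python
import threading
import time


class ActivityTracker:
    """Hypothetical class mirroring the module-level state in the diff."""

    def __init__(self, clock=time.time):
        self._clock = clock            # injectable so tests can control time
        self._lock = threading.Lock()
        self._label = None
        self._changed_at = 0.0         # 0 means "never set"

    def set(self, label):
        with self._lock:
            self._label = label
            if label is not None:      # clearing the activity keeps the old timestamp
                self._changed_at = self._clock()

    def fresh(self, max_age_seconds: float = 1800):
        with self._lock:
            if self._label is None or self._changed_at <= 0:
                return None
            if self._clock() - self._changed_at > max_age_seconds:
                return None
            return self._label


fake_now = [1000.0]
tracker = ActivityTracker(clock=lambda: fake_now[0])
tracker.set("Playing osu!")
fresh_label = tracker.fresh()
fake_now[0] += 1801                    # just past the 30-minute window
stale_label = tracker.fresh()
print(fresh_label, stale_label)
```

The comparison is strict (`>`), so an activity that is exactly `max_age_seconds` old still counts as fresh, matching the diff.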
@@ -20,6 +20,7 @@ from typing import Optional, Dict, Any, List

 import globals
 from utils.logger import get_logger
+from utils.activities import get_current_activity_fresh

 logger = get_logger('llm') # Use existing 'llm' logger component

@@ -108,6 +109,7 @@ class CatAdapter:
         mood: Optional[str] = None,
         response_type: str = "dm_response",
         media_type: Optional[str] = None,
+        reply_context: Optional[str] = None,
     ) -> Optional[tuple]:
         """
         Send a message through the Cat pipeline via WebSocket and get a response.
@@ -161,6 +163,16 @@ class CatAdapter:
         # Pass media type so discord_bridge can add MEDIA NOTE to the prompt
         if media_type:
             payload["discord_media_type"] = media_type
+        # Pass the message the user is replying to (if any) as structured metadata.
+        # The discord_bridge plugin injects this into the prompt as a clearly-labeled
+        # context note — keeping Miku's words separate from the user's message text
+        # and preventing the speaker confusion that the old embed-in-prompt format caused.
+        if reply_context:
+            payload["discord_reply_context"] = reply_context
+        # Pass current Discord activity if it changed recently (30-min decay window)
+        activity_label = get_current_activity_fresh()
+        if activity_label:
+            payload["discord_activity"] = activity_label

         try:
             # Build WebSocket URL from HTTP base URL
@@ -158,7 +158,7 @@ async def convert_gif_to_mp4(gif_bytes):
         return None


-async def extract_video_frames(video_bytes, num_frames=4):
+async def extract_video_frames(video_bytes, num_frames=6):
     """
     Extract frames from a video or GIF for analysis.
     Returns a list of base64-encoded frames.
@@ -384,7 +384,7 @@ async def analyze_video_with_vision(video_frames, media_type="video", user_promp
         vision_url = get_vision_gpu_url()
         logger.info(f"Sending video analysis request to {vision_url} using model: {globals.VISION_MODEL} (media_type: {media_type}, frames: {len(video_frames)})")

-        async with session.post(f"{vision_url}/v1/chat/completions", json=payload, headers=headers, timeout=aiohttp.ClientTimeout(total=120)) as response:
+        async with session.post(f"{vision_url}/v1/chat/completions", json=payload, headers=headers, timeout=aiohttp.ClientTimeout(total=300)) as response:
             if response.status == 200:
                 data = await response.json()
                 result = data.get("choices", [{}])[0].get("message", {}).get("content", "No description.")
@@ -13,6 +13,7 @@ from utils.moods import load_mood_description
 from utils.conversation_history import conversation_history
 from utils.logger import get_logger
 from utils.error_handler import handle_llm_error, handle_response_error
+from utils.activities import get_current_activity_fresh

 logger = get_logger('llm')

@@ -374,6 +375,10 @@ VARIATION RULES (必須のバリエーションルール):
 {character_name} is currently feeling: {current_mood}
 Please respond in a way that reflects this emotional tone.{pfp_context}"""

+    # Inject current Discord activity if it changed recently (30-min decay window)
+    activity_label = get_current_activity_fresh()
+    if activity_label:
+        full_system_prompt += f"\nHer Discord status: {activity_label}"
+
     # Add media type awareness if provided
     if media_type:
@@ -43,6 +43,8 @@ def before_cat_reads_message(user_message_json: dict, cat) -> dict:
     response_type = user_message_json.get('discord_response_type', None)
     evil_mode = user_message_json.get('discord_evil_mode', False)
     media_type = user_message_json.get('discord_media_type', None)
+    activity = user_message_json.get('discord_activity', None)
+    reply_context = user_message_json.get('discord_reply_context', None)

     # Also check working memory for backward compatibility
     if not guild_id:
@@ -55,6 +57,8 @@ def before_cat_reads_message(user_message_json: dict, cat) -> dict:
     cat.working_memory['response_type'] = response_type
     cat.working_memory['evil_mode'] = evil_mode
     cat.working_memory['media_type'] = media_type
+    cat.working_memory['activity'] = activity
+    cat.working_memory['reply_context'] = reply_context

     return user_message_json

@@ -160,6 +164,31 @@ def before_cat_recalls_declarative_memories(declarative_recall_config, cat):
     return declarative_recall_config


+@hook(priority=80)
+def before_cat_recalls_episodic_memories(episodic_recall_config, cat):
+    """
+    Keep episodic recall focused to prevent Miku's own responses from being
+    immediately recalled into context on the very next user message.
+
+    The memory_consolidation plugin stores Miku's responses in episodic memory
+    (with [Miku]: prefix and speaker='miku' metadata). Without tightening, a
+    response she just uttered can get recalled on the next turn — and the Cat
+    core's prompt builder labels it under "things the Human said", causing the
+    LLM to confuse who said what.
+
+    The Cat default k=3 is kept; the threshold is raised from the 0.7 default to 0.75.
+    """
+    # k=3 is the default — stays tight
+    # threshold=0.75 is very slightly stricter than the 0.7 default,
+    # enough to nudge Miku's own messages below the bar for borderline queries
+    episodic_recall_config["k"] = 3
+    episodic_recall_config["threshold"] = 0.75
+
+    print(f"🔧 [Discord Bridge] Adjusted episodic recall: k={episodic_recall_config['k']}, threshold={episodic_recall_config['threshold']}")
+
+    return episodic_recall_config
+
+
 @hook(priority=50)
 def after_cat_recalls_memories(cat):
     """
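To see why a small threshold bump matters, here is a toy model of recall. It assumes `k` caps the number of results and `threshold` is a minimum similarity score, which is how the hook's comments describe them. The `recall` function and the scores are invented for illustration; the real lookup happens inside the Cat's vector store.

```python
def recall(scored_memories, k=3, threshold=0.75):
    """Return up to k (score, text) pairs whose score is at or above threshold."""
    kept = [m for m in scored_memories if m[0] >= threshold]
    kept.sort(key=lambda m: m[0], reverse=True)
    return kept[:k]


# Made-up similarity scores for three stored memories.
candidates = [
    (0.91, "[User]: what is your favorite song?"),
    (0.72, "[Miku]: I love singing World is Mine!"),
    (0.80, "[User]: do you like leeks?"),
]

default_hits = recall(candidates, threshold=0.7)    # borderline [Miku] line is recalled
strict_hits = recall(candidates, threshold=0.75)    # borderline [Miku] line is dropped
print(len(default_hits), len(strict_hits))
```

A stricter threshold only filters borderline matches. A `[Miku]:` memory with a high score would still be recalled, which is why the header rewrite in `before_agent_starts` is needed as well.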
@@ -220,6 +249,20 @@ def before_agent_starts(agent_input, cat) -> dict:
     tools_output = agent_input.get('tools_output', '')
     user_input = agent_input.get('input', '')

+    # Fix misleading header in episodic memory context.
+    # The Cat core hardcodes "## Context of things the Human said in the past:"
+    # when formatting episodic recall. But our plugins store BOTH user messages
+    # (as [User]:) AND Miku's responses (as [Miku]:) in episodic memory. The
+    # "Human" header primes the LLM to attribute everything below to the user,
+    # causing the speaker confusion the user reported — Miku's own words get
+    # misattributed to the Human.
+    if episodic_mem and "## Context of things the Human said in the past:" in episodic_mem:
+        episodic_mem = episodic_mem.replace(
+            "## Context of things the Human said in the past:",
+            "## Past conversation excerpts (prefixed by who said what):"
+        )
+        agent_input['episodic_memory'] = episodic_mem
+
     print(f"\U0001f50d [Discord Bridge] before_agent_starts called")
     print(f" input: {user_input[:80]}")
     print(f" declarative_mem length: {len(declarative_mem)}")
@@ -312,6 +355,12 @@ Respond in the voice and attitude of your {mood_name.replace('_', ' ')} mood. Th
 Miku is currently feeling: {mood_description}
 Please respond in a way that reflects this emotional tone."""

+        # Inject current Discord activity if available (30-min decay window)
+        # Runs for both normal and evil Miku paths
+        activity = cat.working_memory.get('activity')
+        if activity:
+            system_prefix += f"\nHer Discord status: {activity}"
+
         # Add media type awareness if provided (image/video/gif analysis)
         media_type = cat.working_memory.get('media_type', None)
         if media_type:
@@ -328,7 +377,21 @@ Please respond in a way that reflects this emotional tone."""
         print(f" [Discord Bridge] Error building system prefix: {e}")
         system_prefix = cat.working_memory.get('full_system_prefix', '[system prefix not available]')

-    full_prompt = f"{system_prefix}\n\n# Context\n\n{episodic_mem}\n\n{declarative_mem}\n\n{tools_output}\n\n# Conversation until now:\nHuman: {user_input}"
+    # Build reply context note if the user is replying to Miku's message.
+    # This injects Miku's quoted words as a SEPARATE clearly-labeled context note
+    # (not embedded in the user's message text). Keeps speaker boundaries intact
+    # and prevents the LLM from misattributing Miku's words to the user.
+    # Uses a colon+space delimiter (no nested quotes) to avoid formatting issues
+    # when the replied message itself contains double-quote characters.
+    reply_context = cat.working_memory.get('reply_context')
+    if reply_context:
+        reply_context_note = f'[The user is replying to what you (Miku) said — you said: {reply_context}]'
+        agent_input['reply_context'] = reply_context_note
+    else:
+        reply_context_note = ''
+        agent_input['reply_context'] = ''
+
+    full_prompt = f"{system_prefix}\n\n# Context\n\n{episodic_mem}\n\n{declarative_mem}\n\n{tools_output}\n\n{reply_context_note}\n\n# Conversation until now:\nHuman: {user_input}"
     cat.working_memory['last_full_prompt'] = full_prompt

     return agent_input
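The `else` branch above sets `agent_input['reply_context']` to an empty string instead of leaving the key out. One plausible reason is that the personality plugins' suffix templates now contain a `{reply_context}` placeholder, and a placeholder with no value fails at render time. The sketch below shows that failure mode with plain `str.format`. That the Cat's templating behaves the same way is an assumption, and `SUFFIX_TEMPLATE` and `render_suffix` are hypothetical names.

```python
# Simplified stand-in for the suffix returned by agent_prompt_suffix.
SUFFIX_TEMPLATE = (
    "{tools_output}\n\n"
    "{reply_context}\n\n"
    "# Conversation until now:"
)


def render_suffix(agent_input: dict) -> str:
    """Fill the template; every placeholder must have a key in agent_input."""
    return SUFFIX_TEMPLATE.format(**agent_input)


with_reply = render_suffix({"tools_output": "", "reply_context": "[you said: hi]"})
without_reply = render_suffix({"tools_output": "", "reply_context": ""})

try:
    render_suffix({"tools_output": ""})      # key omitted entirely
    missing_key_error = None
except KeyError as exc:
    missing_key_error = exc.args[0]

print(repr(missing_key_error))
```

An empty string renders as blank lines in the prompt, while a missing key raises before the prompt is built.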
@@ -80,7 +80,7 @@ def agent_prompt_prefix(prefix, cat):
     )

     # --- Build system prompt (matches get_evil_system_prompt structure) ----------
-    return f"""{preamble}
+    full_prefix = f"""{preamble}

 ---

@@ -97,6 +97,13 @@ def agent_prompt_prefix(prefix, cat):
 ⚠️ MOOD REMINDER — YOUR CURRENT STATE IS: {mood_name.upper()}
 Respond in the voice and attitude of your {mood_name.replace('_', ' ')} mood. This mood defines how you sound RIGHT NOW."""
+
+    # Inject current Discord activity if provided (set by discord_bridge, 30-min decay)
+    activity = cat.working_memory.get('activity')
+    if activity:
+        full_prefix += f"\nHer Discord status: {activity}"
+
+    return full_prefix


 @hook(priority=100)
 def agent_prompt_suffix(suffix, cat):
@@ -112,9 +119,12 @@ def agent_prompt_suffix(suffix, cat):

 {{tools_output}}

+{{reply_context}}
+
 [Current mood: {mood_name.upper()} — respond accordingly]

-# Conversation until now:"""
+# Conversation until now:
+(Note: In the conversation below, "Human" = the person you're talking to, "AI" = you, Evil Miku. Pay attention to who said what.)"""


 @hook(priority=100)
@@ -69,6 +69,11 @@ def agent_prompt_prefix(prefix, cat):
 Miku is currently feeling: {mood_description}
 Please respond in a way that reflects this emotional tone."""

+    # Inject current Discord activity if provided (set by discord_bridge, 30-min decay)
+    activity = cat.working_memory.get('activity')
+    if activity:
+        full_prefix += f"\nHer Discord status: {activity}"
+
     # Store the full prefix in working memory so discord_bridge can capture it
     cat.working_memory['full_system_prefix'] = full_prefix
     return full_prefix
@@ -86,7 +91,10 @@ def agent_prompt_suffix(suffix, cat):

 {tools_output}

-# Conversation until now:"""
+{reply_context}
+
+# Conversation until now:
+(Note: In the conversation below, "Human" = the person you're talking to, "AI" = you, Miku. Pay attention to who said what.)"""


 @hook(priority=100)
@@ -38,6 +38,15 @@ models:
       - japanese
       - japanese-model

+  # Qwen3.5 for ComfyUI prompt generation
+  qwen3.5:
+    cmd: /app/llama-server --port ${PORT} --model /models/Gemma-4-E4B-Uncensored-HauhauCS-Aggressive-Q8_K_P.gguf -ngl 99 -c 8192 --host 0.0.0.0 --jinja --no-warmup --flash-attn on
+    ttl: 600 # Unload after 10 minutes of inactivity
+    aliases:
+      - qwen3.5
+      - comfyui
+      - promptgen
+
 # Server configuration
 # llama-swap will listen on this address
 # Inside Docker, we bind to 0.0.0.0 to allow bot container to connect