Organize documentation: Move all .md files to readmes/ directory

readmes/TESTING_V2.md
# Testing Autonomous System V2

## Quick Start Guide

### Step 1: Enable V2 System (Optional - Test Mode)

The V2 system can run **alongside** V1 for comparison. To enable it:

**Option A: Edit `bot.py` to start V2 on bot ready**

Add this to the `on_ready()` function in `bot/bot.py`:

```python
# After existing setup code, add:
from utils.autonomous_v2_integration import start_v2_system_for_all_servers

# Start V2 autonomous system
await start_v2_system_for_all_servers(client)
```
**Option B: Manual API testing (no code changes needed)**

Just use the API endpoints to check what V2 is thinking, without actually running it.

### Step 2: Test the V2 Decision System

#### Check what V2 is "thinking" for a server:

```bash
# Get current social stats
curl http://localhost:3939/autonomous/v2/stats/<GUILD_ID>

# Example response:
{
  "status": "ok",
  "guild_id": 759889672804630530,
  "stats": {
    "loneliness": "0.42",
    "boredom": "0.65",
    "excitement": "0.15",
    "curiosity": "0.20",
    "chattiness": "0.70",
    "action_urgency": "0.48"
  }
}
```
#### Trigger a manual V2 analysis:

```bash
# See what V2 would decide right now
curl http://localhost:3939/autonomous/v2/check/<GUILD_ID>

# Example response:
{
  "status": "ok",
  "guild_id": 759889672804630530,
  "analysis": {
    "stats": { ... },
    "interest_score": "0.73",
    "triggers": [
      "KEYWORD_DETECTED (0.60): Interesting keywords: vocaloid, miku",
      "CONVERSATION_PEAK (0.60): Lots of people are chatting"
    ],
    "recent_messages": 15,
    "conversation_active": true,
    "would_call_llm": true
  }
}
```
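For intuition, here is a minimal sketch of how several trigger scores might be folded into one interest score and compared against the LLM-call threshold. The probabilistic-OR combination rule and the function name are assumptions for illustration only; the real logic lives in `bot/utils/autonomous_v2.py`:

```python
# Hypothetical sketch: combining trigger scores into an interest score.
# The probabilistic-OR rule below is an assumption, not the bot's actual formula.

LLM_CALL_THRESHOLD = 0.6  # matches the default shown in Step 4

def combine_interest(trigger_scores):
    """Fold individual 0.0-1.0 trigger scores into one interest score."""
    interest = 0.0
    for score in trigger_scores:
        interest = interest + score * (1.0 - interest)  # probabilistic OR
    return interest

# e.g. KEYWORD_DETECTED (0.60) + CONVERSATION_PEAK (0.60)
interest = combine_interest([0.60, 0.60])
would_call_llm = interest >= LLM_CALL_THRESHOLD
print(f"interest={interest:.2f} would_call_llm={would_call_llm}")
```

Under this rule two medium triggers push the score well past a single trigger alone, which is consistent with the "multiple things happening at once" responses shown above.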
#### Get overall V2 status:

```bash
# See V2 status for all servers
curl http://localhost:3939/autonomous/v2/status

# Example response:
{
  "status": "ok",
  "servers": {
    "759889672804630530": {
      "server_name": "Example Server",
      "loop_running": true,
      "action_urgency": "0.52",
      "loneliness": "0.30",
      "boredom": "0.45",
      "excitement": "0.20",
      "chattiness": "0.70"
    }
  }
}
```
### Step 3: Monitor Behavior

#### Watch for V2 log messages:

```bash
docker compose logs -f bot | grep -E "🧠|🎯|🤔"
```

You'll see messages like:

```
🧠 Starting autonomous decision loop for server 759889672804630530
🎯 Interest score 0.73 - Consulting LLM for server 759889672804630530
🤔 LLM decision: YES, someone mentioned you (Interest: 0.73)
```

#### Compare V1 vs V2:

**V1 logs:**
```
💬 Miku said something general in #miku-chat
```

**V2 logs:**
```
🎯 Interest score 0.82 - Consulting LLM
🤔 LLM decision: YES
💬 Miku said something general in #miku-chat
```
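When comparing logs over a longer run, it can help to pull the numeric scores out of the 🎯 lines programmatically. A small sketch; the log format is taken from the examples above, and the helper name is made up:

```python
import re

# Matches the "🎯 Interest score 0.73 - Consulting LLM ..." lines shown above
SCORE_RE = re.compile(r"Interest score (\d+\.\d+)")

def interest_scores(log_lines):
    """Extract interest scores from V2 decision log lines."""
    scores = []
    for line in log_lines:
        match = SCORE_RE.search(line)
        if match:
            scores.append(float(match.group(1)))
    return scores

logs = [
    "🎯 Interest score 0.82 - Consulting LLM",
    "🤔 LLM decision: YES",
    "💬 Miku said something general in #miku-chat",
]
print(interest_scores(logs))  # [0.82]
```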
### Step 4: Tune the System

Edit `bot/utils/autonomous_v2.py` to adjust behavior:

```python
# How sensitive is the decision system?
self.LLM_CALL_THRESHOLD = 0.6  # Lower = more responsive (more LLM calls)
self.ACTION_THRESHOLD = 0.5    # Lower = more chatty

# How fast do stats build?
LONELINESS_BUILD_RATE = 0.01   # Higher = gets lonely faster
BOREDOM_BUILD_RATE = 0.01      # Higher = gets bored faster

# Check intervals
MIN_SLEEP = 30    # Seconds between checks during active chat
MAX_SLEEP = 180   # Seconds between checks when quiet
```
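One plausible way `MIN_SLEEP` and `MAX_SLEEP` could interact is a linear interpolation on channel activity: busy channels get checked every 30 s, dead channels every 180 s. This helper is a sketch under that assumption, not the module's actual code:

```python
MIN_SLEEP = 30   # seconds between checks during active chat
MAX_SLEEP = 180  # seconds between checks when quiet

def next_sleep(activity: float) -> float:
    """Interpolate the check interval from an activity level in [0.0, 1.0].

    Illustrative only: assumes a linear mapping between the two extremes.
    """
    activity = max(0.0, min(1.0, activity))  # clamp out-of-range input
    return MAX_SLEEP - (MAX_SLEEP - MIN_SLEEP) * activity

print(next_sleep(1.0), next_sleep(0.0), next_sleep(0.5))  # 30.0 180.0 105.0
```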
### Step 5: Understanding the Stats

#### Loneliness (0.0 - 1.0)
- **Increases**: When not mentioned for >30 minutes
- **Decreases**: When mentioned or engaged
- **Effect**: At 0.7+, seeks attention

#### Boredom (0.0 - 1.0)
- **Increases**: When the channel is quiet and Miku hasn't spoken in >1 hour
- **Decreases**: When Miku shares content or conversation happens
- **Effect**: At 0.7+, likely to share tweets/content

#### Excitement (0.0 - 1.0)
- **Increases**: During active conversations
- **Decreases**: Fades over time (decays fast)
- **Effect**: Higher = more likely to jump into conversation

#### Curiosity (0.0 - 1.0)
- **Increases**: When interesting keywords are detected
- **Decreases**: Fades over time
- **Effect**: High curiosity = asks questions

#### Chattiness (0.0 - 1.0)
- **Set by mood**:
  - excited/bubbly: 0.85-0.9
  - neutral: 0.5
  - shy/sleepy: 0.2-0.3
  - asleep: 0.0
- **Effect**: Base multiplier for all interactions
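To make the build/decay behaviour concrete, here is a toy per-tick update for two of the stats, using the `LONELINESS_BUILD_RATE` from Step 4. The decay constant, the update rule, and the function name are all assumptions for illustration; the real rules in `autonomous_v2.py` may differ:

```python
LONELINESS_BUILD_RATE = 0.01  # per tick, from Step 4
EXCITEMENT_DECAY = 0.05       # assumed value: excitement "decays fast"

def tick(stats, minutes_since_mention):
    """One update tick: build loneliness when ignored, always decay excitement.

    All stats are clamped to the documented [0.0, 1.0] range.
    """
    updated = dict(stats)
    if minutes_since_mention > 30:
        updated["loneliness"] = min(1.0, updated["loneliness"] + LONELINESS_BUILD_RATE)
    else:
        updated["loneliness"] = max(0.0, updated["loneliness"] - LONELINESS_BUILD_RATE)
    updated["excitement"] = max(0.0, updated["excitement"] - EXCITEMENT_DECAY)
    return updated

stats = {"loneliness": 0.42, "excitement": 0.15}
print(tick(stats, minutes_since_mention=45))
```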
### Step 6: Trigger Examples

Test specific triggers by creating the right conditions:

#### Test MENTIONED trigger:
1. Mention @Miku in the autonomous channel
2. Check stats: `curl http://localhost:3939/autonomous/v2/check/<GUILD_ID>`
3. Should show: `"triggers": ["MENTIONED (0.90): Someone mentioned me!"]`

#### Test KEYWORD trigger:
1. Say "I love Vocaloid music" in the channel
2. Check stats
3. Should show: `"triggers": ["KEYWORD_DETECTED (0.60): Interesting keywords: vocaloid, music"]`

#### Test CONVERSATION_PEAK:
1. Have 3+ people chat within 5 minutes
2. Check stats
3. Should show: `"triggers": ["CONVERSATION_PEAK (0.60): Lots of people are chatting"]`

#### Test LONELINESS:
1. Don't mention Miku for 30+ minutes
2. Check stats: `curl http://localhost:3939/autonomous/v2/stats/<GUILD_ID>`
3. Watch loneliness increase over time
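The CONVERSATION_PEAK condition above (3+ people within 5 minutes) can be sketched as a sliding-window count of distinct authors. This is an illustrative reimplementation, not the bot's actual detector:

```python
def conversation_peak(messages, now, window_seconds=300, min_authors=3):
    """True when >= min_authors distinct people posted within the window."""
    recent_authors = {
        msg["author"] for msg in messages
        if now - msg["timestamp"] <= window_seconds
    }
    return len(recent_authors) >= min_authors

# Three distinct authors inside a 5-minute window -> peak detected
messages = [
    {"author": "alice", "timestamp": 100},
    {"author": "bob", "timestamp": 150},
    {"author": "carol", "timestamp": 290},
]
print(conversation_peak(messages, now=300))  # True
```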
### Step 7: Debugging

#### V2 won't start?
```bash
# Check if the import works
docker compose exec bot python -c "from utils.autonomous_v2 import autonomous_system_v2; print('OK')"
```

#### V2 never calls LLM?
```bash
# Check interest scores
curl http://localhost:3939/autonomous/v2/check/<GUILD_ID>

# If interest_score is always < 0.6:
# - Channel might be too quiet
# - Stats might not be building
# - Try mentioning Miku (instant 0.9 score)
```

#### V2 calls LLM too much?
```python
# Increase the threshold in autonomous_v2.py:
self.LLM_CALL_THRESHOLD = 0.7  # Was 0.6
```
## Performance Monitoring

### Expected LLM Call Frequency

**Quiet server (few messages):**
- V1: ~10 random calls/day
- V2: ~2-5 targeted calls/day
- **GPU usage: LOWER with V2**

**Active server (100+ messages/day):**
- V1: ~10 random calls/day (same)
- V2: ~10-20 targeted calls/day (responsive to activity)
- **GPU usage: SLIGHTLY HIGHER, but much more relevant**

### Check GPU Usage

```bash
# Monitor GPU while the bot is running
nvidia-smi -l 1

# V1: GPU spikes randomly every 15 minutes
# V2: GPU spikes only when something interesting happens
```

### Monitor LLM Queue

If you notice lag:
1. Check how many LLM calls are queued
2. Increase `LLM_CALL_THRESHOLD` to reduce frequency
3. Increase check intervals for quieter periods
## Migration Path

### Phase 1: Testing (Current)
- V1 running (scheduled actions)
- V2 running (parallel, logging decisions)
- Compare behaviors
- Tune V2 parameters

### Phase 2: Gradual Replacement
```python
# In server_manager.py, comment out the V1 jobs:
# scheduler.add_job(
#     self._run_autonomous_for_server,
#     IntervalTrigger(minutes=15),
#     ...
# )

# Keep V2 running
autonomous_system_v2.start_loop_for_server(guild_id, client)
```

### Phase 3: Full Migration
- Disable all V1 autonomous jobs
- Keep only the V2 system
- Keep manual triggers for testing
## Troubleshooting

### "Module not found: autonomous_v2"
```bash
# Restart the bot container
docker compose restart bot
```

### "Stats always show 0.00"
- The V2 decision loop might not be running
- Check: `curl http://localhost:3939/autonomous/v2/status`
- Should show: `"loop_running": true`

### "Interest score always low"
- Channel might be genuinely quiet
- Try creating activity: post messages, images, mention Miku
- Loneliness/boredom build over time (30-60 min)

### "LLM called too frequently"
- Increase the thresholds in `autonomous_v2.py`
- Check which triggers are firing: use `/autonomous/v2/check`
- Adjust trigger scores if needed

## API Endpoints Reference

```
GET /autonomous/v2/stats/{guild_id}   - Get social stats
GET /autonomous/v2/check/{guild_id}   - Manual analysis (what would V2 do?)
GET /autonomous/v2/status             - V2 status for all servers
```
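Note that the endpoints above return stat values as strings (e.g. `"loneliness": "0.42"`), so a monitoring script needs to convert them. A minimal standard-library sketch; the URL layout comes from the reference above, while the helper names and `fetch_stats` wrapper are illustrative:

```python
import json
import urllib.request

BASE_URL = "http://localhost:3939"

def stats_url(guild_id):
    """Build the stats endpoint URL for a guild."""
    return f"{BASE_URL}/autonomous/v2/stats/{guild_id}"

def parse_stats(payload):
    """Convert the API's string-valued stats into floats."""
    return {name: float(value) for name, value in payload["stats"].items()}

def fetch_stats(guild_id):
    """Fetch and parse stats from a running bot (requires the API to be up)."""
    with urllib.request.urlopen(stats_url(guild_id)) as resp:
        return parse_stats(json.load(resp))

# parse_stats works on the example response from Step 2 without a live bot:
example = {"status": "ok", "stats": {"loneliness": "0.42", "boredom": "0.65"}}
print(parse_stats(example))  # {'loneliness': 0.42, 'boredom': 0.65}
```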
## Next Steps

1. Run V2 for 24-48 hours
2. Compare decision quality vs V1
3. Tune thresholds based on server activity
4. Gradually phase out V1 if V2 works well
5. Add a dashboard for real-time stats visualization