# Testing Autonomous System V2

## Quick Start Guide

### Step 1: Enable V2 System (Optional - Test Mode)

The V2 system can run **alongside** V1 for comparison. To enable it:

**Option A: Edit `bot.py` to start V2 on bot ready**

Add this to the `on_ready()` function in `bot/bot.py`:

```python
# After existing setup code, add:
from utils.autonomous_v2_integration import start_v2_system_for_all_servers

# Start V2 autonomous system
await start_v2_system_for_all_servers(client)
```

**Option B: Manual API testing (no code changes needed)**

Just use the API endpoints to check what V2 is thinking, without actually running it.

### Step 2: Test the V2 Decision System

#### Check what V2 is "thinking" for a server:

```bash
# Get current social stats
curl http://localhost:3939/autonomous/v2/stats/

# Example response:
{
  "status": "ok",
  "guild_id": 759889672804630530,
  "stats": {
    "loneliness": "0.42",
    "boredom": "0.65",
    "excitement": "0.15",
    "curiosity": "0.20",
    "chattiness": "0.70",
    "action_urgency": "0.48"
  }
}
```

#### Trigger a manual V2 analysis:

```bash
# See what V2 would decide right now
curl http://localhost:3939/autonomous/v2/check/

# Example response:
{
  "status": "ok",
  "guild_id": 759889672804630530,
  "analysis": {
    "stats": { ... },
    "interest_score": "0.73",
    "triggers": [
      "KEYWORD_DETECTED (0.60): Interesting keywords: vocaloid, miku",
      "CONVERSATION_PEAK (0.60): Lots of people are chatting"
    ],
    "recent_messages": 15,
    "conversation_active": true,
    "would_call_llm": true
  }
}
```

#### Get overall V2 status:

```bash
# See V2 status for all servers
curl http://localhost:3939/autonomous/v2/status

# Example response:
{
  "status": "ok",
  "servers": {
    "759889672804630530": {
      "server_name": "Example Server",
      "loop_running": true,
      "action_urgency": "0.52",
      "loneliness": "0.30",
      "boredom": "0.45",
      "excitement": "0.20",
      "chattiness": "0.70"
    }
  }
}
```

### Step 3: Monitor Behavior

#### Watch for V2 log messages:

```bash
docker compose logs -f bot | grep -E "🧠|🎯|🤔"
```

You'll see messages like:

```
🧠 Starting autonomous decision loop for server 759889672804630530
🎯 Interest score 0.73 - Consulting LLM for server 759889672804630530
🤔 LLM decision: YES, someone mentioned you (Interest: 0.73)
```

#### Compare V1 vs V2:

**V1 logs:**
```
💬 Miku said something general in #miku-chat
```

**V2 logs:**
```
🎯 Interest score 0.82 - Consulting LLM
🤔 LLM decision: YES
💬 Miku said something general in #miku-chat
```

### Step 4: Tune the System

Edit `bot/utils/autonomous_v2.py` to adjust behavior:

```python
# How sensitive is the decision system?
self.LLM_CALL_THRESHOLD = 0.6  # Lower = more responsive (more LLM calls)
self.ACTION_THRESHOLD = 0.5    # Lower = more chatty

# How fast do stats build?
LONELINESS_BUILD_RATE = 0.01  # Higher = gets lonely faster
BOREDOM_BUILD_RATE = 0.01     # Higher = gets bored faster

# Check intervals
MIN_SLEEP = 30   # Seconds between checks during active chat
MAX_SLEEP = 180  # Seconds between checks when quiet
```

### Step 5: Understanding the Stats

#### Loneliness (0.0 - 1.0)
- **Increases**: When not mentioned for >30 minutes
- **Decreases**: When mentioned or engaged
- **Effect**: At 0.7+, seeks attention

#### Boredom (0.0 - 1.0)
- **Increases**: When quiet and hasn't spoken in >1 hour
- **Decreases**: When it shares content or conversation happens
- **Effect**: At 0.7+, likely to share tweets/content

#### Excitement (0.0 - 1.0)
- **Increases**: During active conversations
- **Decreases**: Fades over time (decays fast)
- **Effect**: Higher = more likely to jump into conversation

#### Curiosity (0.0 - 1.0)
- **Increases**: Interesting keywords detected
- **Decreases**: Fades over time
- **Effect**: High curiosity = asks questions

#### Chattiness (0.0 - 1.0)
- **Set by mood**:
  - excited/bubbly: 0.85-0.9
  - neutral: 0.5
  - shy/sleepy: 0.2-0.3
  - asleep: 0.0
- **Effect**: Base multiplier for all interactions

### Step 6: Trigger Examples

Test specific triggers by creating conditions:

#### Test MENTIONED trigger:
1. Mention @Miku in the autonomous channel
2. Check stats: `curl http://localhost:3939/autonomous/v2/check/`
3. Should show: `"triggers": ["MENTIONED (0.90): Someone mentioned me!"]`

#### Test KEYWORD trigger:
1. Say "I love Vocaloid music" in channel
2. Check stats
3. Should show: `"triggers": ["KEYWORD_DETECTED (0.60): Interesting keywords: vocaloid, music"]`

#### Test CONVERSATION_PEAK:
1. Have 3+ people chat within 5 minutes
2. Check stats
3. Should show: `"triggers": ["CONVERSATION_PEAK (0.60): Lots of people are chatting"]`

#### Test LONELINESS:
1. Don't mention Miku for 30+ minutes
2. Check stats: `curl http://localhost:3939/autonomous/v2/stats/`
3. Watch loneliness increase over time

### Step 7: Debugging

#### V2 won't start?
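If nothing happens at startup, one thing worth ruling out first (a general asyncio sketch, not the actual V2 code — `decision_loop` here is a stand-in) is a loop coroutine that was created but never scheduled on the event loop:

```python
import asyncio

async def decision_loop() -> str:
    # Stand-in for a per-server autonomous decision loop.
    await asyncio.sleep(0)
    return "loop ran"

async def main() -> str:
    # Calling decision_loop() without awaiting it (or without wrapping it
    # in asyncio.create_task) yields a coroutine object that never runs.
    task = asyncio.create_task(decision_loop())  # scheduled: OK
    return await task

result = asyncio.run(main())
print(result)  # → loop ran
```

If the startup hook calls the V2 entry point without `await` (or `create_task`), Python logs a "coroutine was never awaited" warning and the loop silently does nothing. Also verify the import itself: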
```bash
# Check if import works
docker compose exec bot python -c "from utils.autonomous_v2 import autonomous_system_v2; print('OK')"
```

#### V2 never calls LLM?

```bash
# Check interest scores
curl http://localhost:3939/autonomous/v2/check/

# If interest_score is always < 0.6:
# - Channel might be too quiet
# - Stats might not be building
# - Try mentioning Miku (instant 0.9 score)
```

#### V2 calls LLM too much?

Increase the threshold in `autonomous_v2.py`:

```python
self.LLM_CALL_THRESHOLD = 0.7  # Was 0.6
```

## Performance Monitoring

### Expected LLM Call Frequency

**Quiet server (few messages):**
- V1: ~10 random calls/day
- V2: ~2-5 targeted calls/day
- **GPU usage: LOWER with V2**

**Active server (100+ messages/day):**
- V1: ~10 random calls/day (same)
- V2: ~10-20 targeted calls/day (responsive to activity)
- **GPU usage: SLIGHTLY HIGHER, but much more relevant**

### Check GPU Usage

```bash
# Monitor GPU while bot is running
nvidia-smi -l 1

# V1: GPU spikes randomly every 15 minutes
# V2: GPU spikes only when something interesting happens
```

### Monitor LLM Queue

If you notice lag:
1. Check how many LLM calls are queued
2. Increase `LLM_CALL_THRESHOLD` to reduce frequency
3. Increase check intervals for quieter periods

## Migration Path

### Phase 1: Testing (Current)
- V1 running (scheduled actions)
- V2 running (parallel, logging decisions)
- Compare behaviors
- Tune V2 parameters

### Phase 2: Gradual Replacement

```python
# In server_manager.py, comment out V1 jobs:
# scheduler.add_job(
#     self._run_autonomous_for_server,
#     IntervalTrigger(minutes=15),
#     ...
# )

# Keep V2 running
autonomous_system_v2.start_loop_for_server(guild_id, client)
```

### Phase 3: Full Migration
- Disable all V1 autonomous jobs
- Keep only V2 system
- Keep manual triggers for testing

## Troubleshooting

### "Module not found: autonomous_v2"

```bash
# Restart the bot container
docker compose restart bot
```

### "Stats always show 0.00"
- V2 decision loop might not be running
- Check: `curl http://localhost:3939/autonomous/v2/status`
- Should show: `"loop_running": true`

### "Interest score always low"
- Channel might be genuinely quiet
- Try creating activity: post messages, images, mention Miku
- Loneliness/boredom build over time (30-60 min)

### "LLM called too frequently"
- Increase thresholds in `autonomous_v2.py`
- Check which triggers are firing: use `/autonomous/v2/check`
- Adjust trigger scores if needed

## API Endpoints Reference

```
GET /autonomous/v2/stats/{guild_id}   - Get social stats
GET /autonomous/v2/check/{guild_id}   - Manual analysis (what would V2 do?)
GET /autonomous/v2/status             - V2 status for all servers
```

## Next Steps

1. Run V2 for 24-48 hours
2. Compare decision quality vs V1
3. Tune thresholds based on server activity
4. Gradually phase out V1 if V2 works well
5. Add dashboard for real-time stats visualization
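The endpoints above return stat values as JSON strings. A small helper (a sketch assuming the response shapes shown in Step 2; `parse_stats` and `should_call_llm` are illustrative names, not part of the bot) makes them easier to script against when building a dashboard:

```python
import json

def parse_stats(payload: str) -> dict:
    """Convert the string-valued stats from /autonomous/v2/stats into floats."""
    data = json.loads(payload)
    return {name: float(value) for name, value in data["stats"].items()}

def should_call_llm(interest_score: float, threshold: float = 0.6) -> bool:
    """Mirror of the LLM_CALL_THRESHOLD gate from Step 4 (assumed semantics)."""
    return interest_score >= threshold

# Example using the response shown in Step 2:
payload = '''{"status": "ok", "guild_id": 759889672804630530,
              "stats": {"loneliness": "0.42", "boredom": "0.65",
                        "excitement": "0.15", "curiosity": "0.20",
                        "chattiness": "0.70", "action_urgency": "0.48"}}'''
stats = parse_stats(payload)
print(stats["boredom"])       # → 0.65
print(should_call_llm(0.73))  # → True
```

Converting once at the edge keeps threshold comparisons and plotting code free of string handling.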