miku-discord

Files

koko210Serve 5a740c9334 feat(memory): hybrid trivial-message classifier (heuristics + LLM batch)

Step 3 of memory system overhaul: smart junk detection.

Replaces the old 37-pattern frozenset (44% accuracy) with a 3-tier hybrid:

TIER 1 - DEFINITELY_TRIVIAL (instant delete, no LLM):
  50+ exact-match patterns, pure emoji, single char, punctuation-only

TIER 2 - DEFINITELY_IMPORTANT (instant keep, no LLM):
  8+ words, questions with substance, first-person statements,
  numbers/dates, links, mentions

TIER 3 - BORDERLINE (batch → LLM for economical classification):
  2-7 word messages without clear markers
  Compact prompt: ~150-200 tokens per 20-message batch
  Safety default: KEEP on any parsing error
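
A minimal sketch of how the three tiers above might be routed. This is not the code from discord_bridge.py: the pattern set is a hypothetical, abbreviated stand-in for the 50+ exact-match patterns, and the regexes, the 3-word threshold for a "question with substance", and the names `Tier` and `classify_tier` are all assumptions made for illustration.

```python
import re
from enum import Enum


class Tier(Enum):
    TRIVIAL = "definitely_trivial"
    IMPORTANT = "definitely_important"
    BORDERLINE = "borderline"


# Hypothetical, abbreviated stand-in for the 50+ exact-match patterns.
TRIVIAL_PATTERNS = frozenset({"ok", "lol", "yes", "no", "thanks", "hi", "brb"})

# Assumed markers of an important message; the real heuristics may differ.
FIRST_PERSON = re.compile(r"\b(i|i'm|i've|my|me)\b", re.IGNORECASE)
HAS_DIGIT = re.compile(r"\d")
HAS_LINK = re.compile(r"https?://")
HAS_MENTION = re.compile(r"<@!?\d+>|@\w+")


def is_symbols_only(text):
    """True when the text has no letters or digits (emoji, punctuation)."""
    return not any(ch.isalnum() for ch in text)


def classify_tier(message):
    """Route a message to one of the three tiers without calling an LLM."""
    text = message.strip()
    words = text.split()

    # Tier 1: single char, emoji/punctuation-only, or exact-match filler.
    if len(text) <= 1 or is_symbols_only(text) or text.lower() in TRIVIAL_PATTERNS:
        return Tier.TRIVIAL

    # Tier 2: long messages, or messages carrying clear markers.
    if len(words) >= 8:
        return Tier.IMPORTANT
    if HAS_LINK.search(text) or HAS_MENTION.search(text) or HAS_DIGIT.search(text):
        return Tier.IMPORTANT
    if FIRST_PERSON.search(text):
        return Tier.IMPORTANT
    if text.endswith("?") and len(words) >= 3:  # assumed "substance" threshold
        return Tier.IMPORTANT

    # Tier 3: everything else is queued for the batched LLM pass.
    return Tier.BORDERLINE
```

In this sketch a single unmatched word also falls through to BORDERLINE, which errs on the side of not deleting it outright.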

Real-time filtering (discord_bridge) uses conservative heuristics only:
  - 1-char, pure reactions, single emoji, custom emoji-only
  - 50+ single-word fillers
  - Never deletes multi-word messages in real-time
  - Philosophy: a false negative (junk stored) is preferable to a
    false positive (data lost)
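
The real-time rules above could look roughly like the sketch below. It assumes Discord's `<:name:id>` / `<a:name:id>` syntax for custom emoji; the filler set is a hypothetical, abbreviated stand-in for the 50+ single-word fillers, and `should_drop_realtime` is an illustrative name, not the function in discord_bridge.py.

```python
import re

# Hypothetical, abbreviated stand-in for the 50+ single-word fillers.
SINGLE_WORD_FILLERS = frozenset({"ok", "lol", "hmm", "yeah", "nice"})

# Discord custom emoji tokens: <:name:id> or animated <a:name:id>.
CUSTOM_EMOJI = re.compile(r"<a?:\w+:\d+>")


def should_drop_realtime(message):
    """Conservative real-time check: return True only for obvious junk."""
    text = message.strip()

    # Single character (or empty) messages.
    if len(text) <= 1:
        return True

    # Messages made up only of custom emoji tokens.
    if not CUSTOM_EMOJI.sub("", text).strip():
        return True

    # Never delete multi-word messages in real time.
    if len(text.split()) > 1:
        return False

    # A single token with no letters or digits: emoji or punctuation.
    if not any(ch.isalnum() for ch in text):
        return True

    # A single word is dropped only if it is a known filler.
    return text.lower() in SINGLE_WORD_FILLERS
```

Anything the check is unsure about is kept, so junk that slips through is left for the consolidation pass rather than risking data loss here.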

Consolidation runs the full hybrid pipeline and sends only borderline
cases to the LLM. This improves substantially on the old 44% accuracy
while keeping token costs minimal, because the LLM is called only
during nightly consolidation, never during real-time chat.
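
A rough sketch of the batched borderline pass with the KEEP-on-error default. The prompt wording, the one-letter-per-message reply format, and the `ask_llm` callable are assumptions for illustration; the commit states only the batch size of 20, the approximate token budget, and the safety default.

```python
BATCH_SIZE = 20  # messages per LLM call, as stated in the commit message


def build_prompt(batch):
    """Build a compact prompt: one instruction line plus one line per message."""
    lines = [f"{i + 1}. {msg}" for i, msg in enumerate(batch)]
    header = "Label each message K (keep) or D (delete). Reply with one letter per message, no spaces."
    return header + "\n" + "\n".join(lines)


def parse_labels(reply, batch_len):
    """Return one keep-flag per message; keep everything if the reply is malformed."""
    labels = reply.strip().upper()
    if len(labels) != batch_len or set(labels) - {"K", "D"}:
        return [True] * batch_len  # safety default: KEEP on any parsing error
    return [ch == "K" for ch in labels]


def classify_borderline(messages, ask_llm):
    """Classify borderline messages in batches.

    ask_llm is a hypothetical callable that takes a prompt string and
    returns the model's reply as a string.
    """
    keep = []
    for start in range(0, len(messages), BATCH_SIZE):
        batch = messages[start:start + BATCH_SIZE]
        try:
            reply = ask_llm(build_prompt(batch))
        except Exception:
            reply = ""  # a failed call is treated like an unparseable reply
        keep.extend(parse_labels(reply, len(batch)))
    return keep
```

With this shape, 45 borderline messages cost three LLM calls, and a garbled or failed reply affects only its own batch, which is kept in full.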

2026-05-15 14:07:35 +03:00

discord_bridge.py

feat(memory): hybrid trivial-message classifier (heuristics + LLM batch)

2026-05-15 14:07:35 +03:00

plugin.json

Phase 3: Unified Cheshire Cat integration with WebSocket-based per-user isolation

2026-02-07 20:22:03 +02:00

settings.json

Phase 3: Unified Cheshire Cat integration with WebSocket-based per-user isolation

2026-02-07 20:22:03 +02:00