- Drop custom Dockerfiles; docker-compose now uses pre-built ghcr.io images that ship llama-swap + llama-server, with no pinned versions (always latest)
- NVIDIA GTX 1660 (6GB): add `-fit off --no-kv-offload --cache-type-k q4_0 --cache-type-v q4_0` to fix the OOM segfault caused by llama.cpp b9014's new GPU-side KV-cache default
- AMD RX 6800 (16GB): flags unchanged; KV cache stays on GPU for maximum speed
- Both running llama-swap v211 + llama.cpp b9014 (2026-05-05)
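For illustration, the GTX 1660 workaround above would land in the llama-swap model config as extra llama-server flags. This is only a sketch: the model name, file paths, and overall config schema here are assumptions, and only the flags themselves come from the notes above.

```yaml
# llama-swap config sketch (hypothetical model name and paths).
# The flags keep the 6GB card from OOMing under llama.cpp b9014's
# GPU-side KV-cache default: KV cache stays in system RAM, quantized to q4_0.
models:
  "example-model":
    cmd: >
      llama-server -m /models/example.gguf
      -fit off --no-kv-offload
      --cache-type-k q4_0 --cache-type-v q4_0
```

On the 16GB RX 6800 these flags would be left out so the KV cache stays on the GPU for maximum speed.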
Description: Llama.cpp-powered Hatsune Miku Discord bot with autonomous features, chat, and image generation.