koko210Serve 9eb081efb1 llama-swap: use pre-built images (:cuda, :rocm) with GPU-specific flags
- Drop custom Dockerfiles; docker-compose uses ghcr.io pre-built images
  which ship llama-swap + llama-server with no pinned versions (always latest)
- NVIDIA GTX 1660 (6GB): add -fit off --no-kv-offload --cache-type-k q4_0 --cache-type-v q4_0
  to fix an OOM segfault caused by llama.cpp b9014's new GPU-side KV cache default
- AMD RX 6800 (16GB): flags unchanged; KV cache stays on GPU for max speed
- Both running llama-swap v211 + llama.cpp b9014 (2026-05-05)
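A minimal sketch of what the llama-swap config for the NVIDIA host could look like with the flags above. The model name, file paths, port, and proxy URL are assumptions for illustration, not taken from this repo; only the llama-server flags come from the commit message.

```yaml
# Hypothetical llama-swap config entry for the 6GB GTX 1660 host.
# Model name and paths are made up; the flags are from the commit above:
# -fit off disables automatic fitting, --no-kv-offload keeps the KV cache
# in system RAM, and q4_0 K/V cache types shrink it further to avoid OOM.
models:
  "example-model":
    cmd: >
      /app/llama-server -m /models/example.gguf
      --port 9001
      -fit off --no-kv-offload
      --cache-type-k q4_0 --cache-type-v q4_0
    proxy: http://127.0.0.1:9001
```

On the 16GB RX 6800 host the same entry would omit the three KV-cache flags, since the commit keeps its KV cache on the GPU for speed.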
2026-05-05 16:53:34 +03:00

Description
Llama.cpp-powered Hatsune Miku Discord bot with autonomous features, chat, and image generation