- Added the `--no-warmup` flag to both the llama3.1 and vision models
- Reduces model switch time by 2-5 seconds per swap
- No impact on response quality; the only cost is a minor increase in first-token latency after a swap
- Better suited to the frequent-model-switching use case and the tight VRAM budget
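
For reference, a minimal sketch of where the flag lands, assuming these are llama.cpp `llama-server` launch commands (which support `--no-warmup` to skip the empty warmup run at load time); the model names, file paths, ports, and the `--mmproj` projector file are illustrative assumptions, not the actual entries:

```sh
# Sketch only: hypothetical launch commands. Model files, ports, and other
# flags are assumptions; --no-warmup is the flag this change adds.
llama-server -m llama-3.1-8b-instruct-Q4_K_M.gguf --port 9001 --no-warmup
llama-server -m vision-model-Q4_K_M.gguf --mmproj mmproj.gguf --port 9002 --no-warmup
```

The trade-off is that warmup normally pre-touches weights so the first request is fast; skipping it moves that cost onto the first token of the first request instead of the swap itself, which is the right trade when swaps are frequent.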