saw-int4
Shell
★ 27
updated 2mo ago
Official implementation of the paper "System-Aware 4-Bit KV-Cache Quantization for Real-World LLM Serving".