gitmyhub

saw-int4

Shell ★ 27 updated 2mo ago

Official implementation of the paper "System-Aware 4-Bit KV-Cache Quantization for Real-World LLM Serving"
