
SparseX


vLLM implementation of the paper "SparseX: Efficient Segment-Level KV Cache Sharing for Interleaved LLM Serving".
