FlexPrefill

Python | ★ 169 | updated 8 months ago

Code for the paper "FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference" (ICLR 2025, Oral).
