
native-sparse-attention-pytorch

Python · ★ 808 · updated 10mo ago

Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper
