ffpa-attn
Python
⭐
310
updated 3d ago
FFPA: Extends FlashAttention-2 via Split-D for large headdims, 1.5x~3x faster than SDPA, up to 430T on H200.