
long-context-attention


USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long-Context Transformer Model Training and Inference
