hybrid-linear-attention
Python
★ 36
updated 2mo ago
Code and models for the paper: Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts
No plain-English explanation yet — one is being written right now. Check back in a minute.