linear-attention-transformer
Python
★ 838
updated 2y ago
Transformer based on a variant of attention whose complexity is linear with respect to sequence length
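As a rough illustration of what "linear in sequence length" can mean, here is a minimal NumPy sketch of one well-known formulation, kernel feature-map attention. It is an assumption for illustration only and is not taken from this repository, which may use a different variant. The names `feature_map`, `linear_attention`, and `quadratic_reference` are hypothetical. The idea is to reorder the matrix products so that the n-by-n attention matrix is never formed.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: a strictly positive feature map (an illustrative choice,
    # not necessarily what this repository uses).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_attention(q, k, v):
    # q, k: (n, d); v: (n, d_v). Non-causal, single head.
    q, k = feature_map(q), feature_map(k)
    kv = k.T @ v                 # (d, d_v), cost O(n * d * d_v)
    z = q @ k.sum(axis=0)        # (n,), per-query normalizer
    return (q @ kv) / z[:, None] # (n, d_v), no (n, n) matrix is built

def quadratic_reference(q, k, v):
    # Same result computed the O(n^2) way, for comparison only.
    q, k = feature_map(q), feature_map(k)
    scores = q @ k.T             # (n, n)
    weights = scores / scores.sum(axis=1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
q = rng.normal(size=(6, 4))
k = rng.normal(size=(6, 4))
v = rng.normal(size=(6, 3))
out = linear_attention(q, k, v)
```

Both functions compute the same output; the linear version simply evaluates `k.T @ v` first, so time and memory grow with n rather than n squared for fixed feature sizes.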