
Mixture-of-Transformers

Python ★ 248 updated 9mo ago

Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models. TMLR 2025.
