Skywork-MoE
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models