ZhiLight
C++
★ 905
updated 3mo ago
A highly optimized LLM inference acceleration engine for Llama and its variants.