gitmyhub

ZhiLight

C++ ★ 905 updated 3mo ago

A highly optimized LLM inference acceleration engine for Llama and its variants.
