PrefixQuant
Python
★ 0
updated 5mo ago
⑂ fork
An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization
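The difference between static and dynamic quantization can be sketched with generic symmetric 4-bit rounding. This is a minimal illustration, not PrefixQuant's implementation: the helper names `scale_for` and `quantize` and the sample values are hypothetical. It assumes the usual meanings: in W4A4 both weights and activations use 4 bits, a static scale is fixed ahead of time from calibration data, and a dynamic scale is recomputed from each input at run time.

```python
# Illustrative sketch only: generic symmetric quantization, not PrefixQuant's code.
# All names and sample values here are hypothetical.

def scale_for(values, bits):
    """Symmetric scale mapping the largest magnitude onto the top integer level."""
    qmax = 2 ** (bits - 1) - 1              # 7 for 4-bit, 127 for 8-bit
    return (max(abs(v) for v in values) / qmax) or 1.0  # avoid a zero scale

def quantize(values, scale, bits):
    """Round to signed `bits`-bit integers with `scale`, clip, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    qmin = -qmax - 1
    return [max(qmin, min(qmax, round(v / scale))) * scale for v in values]

# Static: the scale is computed once from calibration data and then reused.
calibration = [0.5, -1.4, 0.7, 1.2]
static_scale = scale_for(calibration, bits=4)

# Dynamic: the scale is recomputed from the activations actually seen.
activations = [0.3, -2.8, 0.1, 0.9]
dynamic_scale = scale_for(activations, bits=4)

static_out = quantize(activations, static_scale, bits=4)
dynamic_out = quantize(activations, dynamic_scale, bits=4)

# The outlier -2.8 exceeds the calibrated range, so the static scale clips it,
# while the dynamic scale covers it at the cost of coarser steps elsewhere.
print(static_out[1], dynamic_out[1])
```

Static scales avoid the run-time cost of computing a maximum per input, but they depend on the calibration data covering the values seen later. Dynamic scales adapt to each input and spend the extra computation to do so.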