AutoAWQ

AutoAWQ implements the AWQ (Activation-aware Weight Quantization) algorithm for 4-bit weight quantization, with a 2x speedup during inference.
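To make "4-bit quantization" concrete, here is a minimal sketch of plain round-to-nearest quantization of one group of weights to 16 integer levels. It is an illustration only: it is not AutoAWQ's API, and it leaves out the activation-aware scaling that distinguishes AWQ. The function names `quantize_group` and `dequantize_group` are hypothetical.

```python
def quantize_group(weights, n_bits=4):
    """Asymmetric round-to-nearest quantization of one weight group.

    Illustrative only: AWQ additionally rescales weights using
    activation statistics before a step like this one.
    """
    q_max = 2 ** n_bits - 1                    # 15 for 4-bit codes
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / q_max or 1.0     # fall back to 1.0 for a constant group
    zero = round(-w_min / scale)               # integer code that represents 0.0
    codes = [min(q_max, max(0, round(w / scale) + zero)) for w in weights]
    return codes, scale, zero


def dequantize_group(codes, scale, zero):
    """Map integer codes back to approximate floating-point weights."""
    return [(c - zero) * scale for c in codes]


weights = [-0.8, -0.1, 0.0, 0.3, 0.7]
codes, scale, zero = quantize_group(weights)
restored = dequantize_group(codes, scale, zero)
```

Each weight is stored as a 4-bit code plus a per-group `scale` and `zero`, so the reconstruction error per weight is bounded by roughly one quantization step.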
