Awesome-LLM-Inference
Python
5.3k
updated 2mo ago
A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.