Members
- dflash
  DFlash: Block Diffusion for Flash Speculative Decoding
  Python ★ 5.2k 1mo ago
- paroquant
  [ICLR 2026] ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
  Python ★ 311 17d ago
- sparselora
  [ICML 2025] SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity
  Python ★ 76 3mo ago
- flash-colreduce
  Fast, memory-efficient attention column reduction (e.g., sum, mean, max)
  Python ★ 49 4mo ago