5-day current streak · 87-day longest streak
- vllm ★ PINNED ⑂
  A high-throughput and memory-efficient inference and serving engine for LLMs
  Python ★ 8 8h ago
- punica_triton_kernel
  No description.
  Python ★ 4 1y ago
- flashinfer ⑂
  FlashInfer: Kernel Library for LLM Serving
  Python ★ 0 1mo ago
- peft ⑂
  🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
  ★ 0 3mo ago
- recipes ⑂
  Common recipes to run vLLM
  Jupyter Notebook ★ 0 4mo ago
- DeepGEMM ⑂
  DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
  Cuda ★ 0 5mo ago
- vllm-project.github.io ⑂
  No description.
  ★ 0 7mo ago
- DeepEP ⑂
  DeepEP: an efficient expert-parallel communication library
  ★ 0 11mo ago
- bitsandbytes ⑂
  Accessible large language models via k-bit quantization for PyTorch.
  Python ★ 0 1y ago
- pytorch ⑂
  Tensors and Dynamic neural networks in Python with strong GPU acceleration
  Python ★ 0 1y ago
- cutlass ⑂
  CUDA Templates for Linear Algebra Subroutines
  ★ 0 1y ago