blislab ★ PINNED ⑂
BLISlab: A Sandbox for Optimizing GEMM
C ★ 0 6y ago
pytorch ★ PINNED ⑂
Tensors and Dynamic neural networks in Python with strong GPU acceleration
C++ ★ 0 1y ago
FBGEMM ★ PINNED ⑂
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
C++ ★ 0 7mo ago
flash-attention ★ PINNED ⑂
Fast and memory-efficient exact attention
Python ★ 0 1y ago
tvm ★ PINNED ⑂
Open deep learning compiler stack for cpu, gpu and specialized accelerators
Python ★ 0 6y ago
Tsinghua_Data_Center
sed
Python ★ 4 13y ago
crawler
Baidu training
C++ ★ 3 13y ago
Awesome-LLM-Inference ⑂
📖A curated list of Awesome LLM/VLM Inference Papers with codes, such as FlashAttention, PagedAttention, Parallelism, etc. 🎉🎉
★ 1 1y ago
LAFF ⑂
Learn the theory of linear algebra hand-in-hand with the practice of software library development.
JavaScript ★ 1 12y ago
HowToOptimizeGemm
No description.
C ★ 1 9y ago
jianyuh.github.io
No description.
SCSS ★ 0 17h ago
AReaL ⑂
Distributed RL System for LLM Reasoning
★ 0 1y ago
llama-toolchain ⑂
Model components of the Llama Stack APIs
★ 0 1y ago
claude-code-cheat-sheet ⑂
Ultimate collection of Claude Code tips, tricks, hacks, and workflows that you can use to master Claude Code in minutes
★ 0 5mo ago
jianyuh
No description.
★ 0 5mo ago
triton ⑂
GitHub mirror of the triton-lang/triton repo.
★ 0 6mo ago
vllm ⑂
A high-throughput and memory-efficient inference and serving engine for LLMs
Python ★ 0 8mo ago
batch_invariant_ops ⑂
No description.
★ 0 8mo ago
torchrec-3 ⑂
Pytorch domain library for recommendation systems
★ 0 2y ago
pplx-kernels ⑂
Perplexity GPU Kernels
★ 0 1y ago
tblis ⑂
TBLIS is a library and framework for performing tensor operations, especially tensor contraction, using efficient native algorithms.
C ★ 0 9y ago
nano-vllm ⑂
Nano vLLM
★ 0 1y ago
sglang ⑂
SGLang is a fast serving framework for large language models and vision language models.
★ 0 1y ago
flashinfer ⑂
FlashInfer: Kernel Library for LLM Serving
★ 0 1y ago
picotron ⑂
Minimalistic 4D-parallelism distributed training framework for education purpose
★ 0 1y ago
TransformerEngine ⑂
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference.
Python ★ 0 2y ago
xformers ⑂
Hackable and optimized Transformers building blocks, supporting a composable construction.
Python ★ 0 2y ago
cutlass ⑂
CUDA Templates for Linear Algebra Subroutines
C++ ★ 0 4y ago
effectivepython ⑂
Effective Python: Second Edition — Source Code and Errata for the Book
★ 0 5y ago
torchrec-1 ⑂
Pytorch domain library for recommendation systems
★ 0 4y ago
friendLunarBirthday
generate "csv" format data of my friends' Chinese lunar Birthday for Google Calendar import
Python ★ 0 9y ago
param ⑂
PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for evaluation of training and inference platforms.
★ 0 3y ago
sparse-ads-baselines ⑂
No description.
★ 0 6y ago
tutorials ⑂
PyTorch tutorials.
★ 0 6y ago
glow ⑂
Compiler for Neural Network hardware accelerators
★ 0 6y ago
pytext ⑂
A natural language modeling framework based on PyTorch
★ 0 6y ago
hub ⑂
No description.
★ 0 6y ago
asmjit ⑂
Complete x86/x64 JIT and AOT Assembler for C++
C++ ★ 0 6y ago
tblis-strassen
No description.
★ 0 9y ago
wizard
Automatically synthesizing program from pseudo description
Java ★ 0 11y ago
sudonohup.github.com
No description.
HTML ★ 0 9y ago
CS378_PfCandP ⑂
CS378 Programming for Correctness and Performance
C ★ 0 9y ago
me
about page for my personal website
CSS ★ 0 10y ago
about
No description.
HTML ★ 0 10y ago
cpu_gpu_profiling
No description.
C ★ 0 10y ago
vimbackup
vim plugin backup
VimL ★ 0 9y ago
bf-knn ⑂
Brute-Force k-Nearest Neighbors Search on the GPU
Cuda ★ 0 10y ago
neon ⑂
Nervana's python based Deep Learning Framework
Python ★ 0 10y ago
csv-parser-cplusplus
Automatically exported from code.google.com/p/csv-parser-cplusplus
Shell ★ 0 11y ago
sparse-coding-with-gpus
Automatically exported from code.google.com/p/sparse-coding-with-gpus
Cuda ★ 0 11y ago
katy ⑂
Katy SIMD code generator
C++ ★ 0 12y ago
NLP
No description.
Java ★ 0 11y ago
LLVM
No description.
C++ ★ 0 11y ago
FOP
No description.
Java ★ 0 11y ago
PythonTools
Some useful tools I write in Python
Python ★ 0 11y ago
pl0-compiler-2011-fall
the course project for compiler in undergrad
C++ ★ 0 11y ago
Strassen
No description.
C ★ 0 10y ago
operating-system-2012-spring
the course project for operating system in undergrad
C ★ 0 12y ago
crack_passwd
Crack the password of the library account of buaa with the enumeration
Python ★ 0 13y ago
OpenStack-Grizzly-Install-Guide ⑂
A full install guide for OpenStack Grizzly
Shell ★ 0 13y ago
renrenIncreaseVisit
Increase the visitor statistics of renren.com
Python ★ 0 13y ago
OpenStack-Folsom-Install-guide ⑂
A full installation guide for OpenStack Folsom with Quantum
Shell ★ 0 13y ago
gotgithub ⑂
GotGitHub: an open source E-book about GitHub in Chinese
Python ★ 0 13y ago
Three-Phase-Commit
No description.
Java ★ 0 12y ago
iClickerToBlackBoard
The Grading script for uploading the students' grade from iClicker to BlackBoard for my TA course: CS303E elements of computers and programming
Python ★ 0 12y ago
resume ⑂
My resume, generated with moderncv
★ 0 13y ago
renren-relationship ⑂
Renren friend relationships
Python ★ 0 13y ago
maple ⑂
A dynamic analysis framework for concurrent programs (x86 binaries). It is shipped with a few tools written using this framework for testing concurrent programs.
C++ ★ 0 13y ago
ospf-2012-spring
the course project for computer network, the implementation for OSPF protocol
C++ ★ 0 12y ago