dynamo ★ PINNED
A Datacenter Scale Distributed Inference Serving Framework
Rust ★ 7.3k 2m ago
nixl ★ PINNED
NVIDIA Inference Xfer Library (NIXL)
C++ ★ 1.1k 1h ago
aiconfigurator ★ PINNED
Offline optimization of your disaggregated Dynamo graph
Python ★ 347 3m ago
aiperf ★ PINNED
AIPerf is a comprehensive benchmarking tool that measures the performance of generative AI models served by your preferred inference solution.
Python ★ 388 1h ago
grove ★ PINNED
Kubernetes enhancements for Network Topology Aware Gang Scheduling & Autoscaling
Go ★ 225 11h ago
modelexpress ★ PINNED
Model Express is a Rust-based component meant to be placed next to existing model inference systems to speed up their startup times and improve overall performance.
Rust ★ 76 22m ago
aitune
NVIDIA AITune is an inference toolkit designed for tuning and deploying Deep Learning models with a focus on NVIDIA GPUs.
Python ★ 277 20d ago
flextensor
FlexTensor is a tensor offloading and management library for PyTorch that enables running large models on limited GPU memory by intelligently offloading tensors between GPU and CPU memory.
Python ★ 106 20d ago
enhancements
Enhancement Proposals and Architecture Decisions
★ 12 25d ago
frontend-crates
No description.
Rust ★ 5 59m ago
agent-plugins
No description.
TypeScript ★ 5 19m ago
velo
No description.
Rust ★ 5 12d ago
.github
No description.
★ 1 10mo ago