k8s-device-plugin

Go ★ 3.8k updated 2d ago

NVIDIA device plugin for Kubernetes

This project is a plugin that allows Kubernetes to recognize and use NVIDIA graphics cards (GPUs) installed in a server cluster. Kubernetes is a system that manages many containers (packaged software units) running across multiple machines. Without this plugin, Kubernetes has no way of knowing that GPUs exist or of assigning them to workloads that need them.

Once the plugin is installed, software running inside the cluster can request GPU access the same way it requests memory or CPU time. This matters for machine learning training jobs, video processing, and other tasks that run much faster on a GPU than on a standard processor.
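As an illustration of such a request, the sketch below builds a minimal Pod manifest in Go and prints it as JSON, which kubectl accepts as well as YAML. It assumes the plugin advertises GPUs under the extended resource name nvidia.com/gpu; the pod name and container image are hypothetical placeholders, and the helper gpuPodManifest is invented for this example rather than taken from the project.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// gpuPodManifest returns a minimal Pod manifest as a generic map.
// The GPU count goes under resources.limits, next to where CPU and
// memory limits would normally be listed.
func gpuPodManifest(name, image string, gpus int) map[string]any {
	return map[string]any{
		"apiVersion": "v1",
		"kind":       "Pod",
		"metadata":   map[string]any{"name": name},
		"spec": map[string]any{
			"restartPolicy": "Never",
			"containers": []any{
				map[string]any{
					"name":  name,
					"image": image, // hypothetical image, replace with a real one
					"resources": map[string]any{
						"limits": map[string]any{
							// Assumed resource name exposed by the plugin.
							"nvidia.com/gpu": fmt.Sprint(gpus),
						},
					},
				},
			},
		},
	}
}

func main() {
	manifest := gpuPodManifest("gpu-demo", "example.com/cuda-job:latest", 1)
	out, err := json.MarshalIndent(manifest, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The printed JSON could be saved to a file and submitted with kubectl apply -f. The scheduler should then place the pod only on a node that still has a free GPU to hand out.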

The plugin can be deployed with a single command for basic testing. For production use it can be installed through Helm, a package manager for Kubernetes, which gives more control over configuration. The plugin also supports sharing a single GPU among multiple workloads, either through time-slicing or through NVIDIA's Multi-Process Service (MPS). Sharing can reduce hardware costs when no single job needs the full GPU.
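The arithmetic behind time-slicing can be sketched as follows. The sketch assumes that sharing is configured with a per-resource replica count and that each physical GPU is then advertised that many times; the function name advertisedGPUs is hypothetical, and the exact configuration keys should be checked against the README.

```go
package main

import "fmt"

// advertisedGPUs estimates how many schedulable GPU resources a node
// exposes when every physical GPU is shared "replicas" ways.
// A replica count below 1 is treated as "no sharing".
func advertisedGPUs(physicalGPUs, replicas int) int {
	if replicas < 1 {
		replicas = 1
	}
	return physicalGPUs * replicas
}

func main() {
	// A node with 8 physical GPUs, each shared by up to 4 workloads.
	fmt.Println(advertisedGPUs(8, 4)) // prints 32
}
```

Under this assumption, 32 single-GPU pods could be scheduled on a node that would otherwise hold 8. Workloads sharing a time-sliced GPU take turns on the same hardware, so the gain is in utilization, not in per-job speed.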

Configuration can be provided as command-line flags, environment variables, or a configuration file. The README covers prerequisites in detail, including the need to install NVIDIA drivers and the NVIDIA Container Toolkit before the plugin will work.
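When the same option is set in more than one place, some precedence rule has to decide which value wins. The sketch below assumes the common ordering of command-line flag first, then environment variable, then configuration file; this ordering is not stated in the summary above and should be confirmed in the README. The resolve helper is hypothetical, and the MIG strategy option is used only as an example setting.

```go
package main

import "fmt"

// resolve returns the first non-empty value in priority order:
// command-line flag, environment variable, config file.
// If none is set, it falls back to the default.
func resolve(flagVal, envVal, fileVal, def string) string {
	for _, v := range []string{flagVal, envVal, fileVal} {
		if v != "" {
			return v
		}
	}
	return def
}

func main() {
	// The config file says "single", the environment says "mixed",
	// and no flag was passed, so the environment value wins.
	fmt.Println(resolve("", "mixed", "single", "none")) // prints mixed
}
```

Treating an empty string as "not set" keeps the sketch short. A real implementation would need to distinguish an unset option from one deliberately set to an empty value.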