litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
LitGPT is a Python library with clean, from-scratch implementations of 20+ large language models like Llama 3 and Mistral, covering inference, fine-tuning, and pre-training on any GPU setup.
LitGPT is a Python library for working with large language models (LLMs), the AI systems that process and generate text. It provides clean, from-scratch implementations of more than 20 well-known models, including Llama 3, Gemma, Phi, Qwen, Falcon, and Mistral. Its distinguishing feature is that every model is written without layered abstractions, so the code stays readable and debuggable instead of being buried inside a framework that hides what is happening.
The library covers three main workflows. First, you can load a pre-trained model and use it to generate text, answer questions, or process documents. Second, you can fine-tune an existing model on your own data, which means training a general-purpose model further so that it specializes in a particular task or domain. Third, you can pre-train a model from scratch, which requires far more computing power but gives full control over what the model learns. LitGPT ships YAML configuration files, called recipes, that contain tested settings for each workflow, so you do not have to work out a good configuration yourself.
The library scales from a single consumer GPU up to clusters of a thousand or more GPUs, and it includes techniques for reducing memory usage so that models can run on hardware with limited memory. It is designed for both experimentation and production use. It is licensed under Apache 2.0, which permits commercial use provided the license's attribution and notice conditions are met.
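To give a feel for why memory-reduction techniques matter, the sketch below estimates how much memory a model's weights alone occupy at different numerical precisions. It assumes that storing weights at lower precision is one of the techniques in play, which the description above does not spell out. The 7-billion-parameter size is a hypothetical example, and the estimate ignores activations, optimizer state, and caches.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed to hold the weights only, in GB (10^9 bytes)."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1e9


# Hypothetical 7-billion-parameter model at several precisions.
for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(7e9, bits):.1f} GB")
```

Under these assumptions, the same model needs about 28 GB at 32-bit precision but about 3.5 GB at 4-bit, which is the difference between requiring a data-center GPU and fitting on a consumer card.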
To get started with basic inference, you install the package with pip, load a model by name, and call a generate function. The README shows this working in about five lines of Python code. For fine-tuning and pre-training, LitGPT uses a command-line interface where you point it at your data and a configuration file, and it handles the training loop.
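The basic inference path might look roughly like the sketch below. It assumes the package was installed with `pip install litgpt` and that the high-level `LLM` class with its `load` and `generate` methods behaves as shown in the project README. The model name `microsoft/phi-2` and the helper `generate_reply` are illustrative choices, not something this description prescribes.

```python
def generate_reply(prompt: str, model_name: str = "microsoft/phi-2") -> str:
    """Load a model by name and return generated text for one prompt.

    Hypothetical helper: the import is done lazily so that merely defining
    the function does not require litgpt or trigger a model download.
    """
    from litgpt import LLM  # assumes `pip install litgpt` has been run

    llm = LLM.load(model_name)   # fetches the weights on first use
    return llm.generate(prompt)  # returns the generated text as a string


# Example call (downloads several GB of weights the first time):
# print(generate_reply("Fix the spelling: Every fall, the familly goes to the mountains."))
```

Loading the model on every call keeps the sketch short. In real use you would load the model once and reuse the object across prompts.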
The project is maintained by Lightning AI, the company behind the PyTorch Lightning training framework. Lightning AI also offers cloud GPU infrastructure for running LitGPT workloads, though the library itself runs anywhere Python and PyTorch are available.
Where it fits
- Load a Llama 3 or Mistral model and generate text responses in about five lines of Python code
- Fine-tune a pre-trained language model on your own dataset using the command-line interface and tested YAML recipes
- Pre-train a custom language model from scratch on a multi-GPU cluster with built-in memory-efficient training techniques
- Study how a large language model is implemented without abstraction layers hiding the core architecture
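
As a rough illustration of the fine-tuning use case in the list above, the sketch below assembles a `litgpt finetune` command from Python without running it. The flags `--data JSON`, `--data.json_path`, `--out_dir`, and `--config` follow the project README as best understood here and should be checked against the installed version. The file names, the model name, and the helper `build_finetune_command` are hypothetical.

```python
import shlex
from typing import List, Optional


def build_finetune_command(
    model_name: str,
    data_path: str,
    out_dir: str,
    config_path: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for a fine-tuning run (hypothetical helper)."""
    cmd = [
        "litgpt", "finetune", model_name,
        "--data", "JSON",               # assumed: custom dataset in JSON format
        "--data.json_path", data_path,  # assumed flag for the dataset location
        "--out_dir", out_dir,           # where checkpoints would be written
    ]
    if config_path is not None:
        # Assumed flag: a YAML recipe holding tested training settings.
        cmd += ["--config", config_path]
    return cmd


cmd = build_finetune_command("microsoft/phi-2", "my_dataset.json", "out/custom-model")
print(shlex.join(cmd))
# To launch the run: subprocess.run(cmd, check=True)
```

Building the command as a list instead of a single string avoids shell-quoting problems when paths contain spaces, and it makes the settings easy to log alongside the resulting checkpoints.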