trl

Python ★ 19k updated 3h ago

Train transformer language models with reinforcement learning.

A Python library for fine-tuning and aligning AI language models after initial training, using techniques like supervised fine-tuning and human preference optimization.

Python, PyTorch, Transformers, PEFT, LoRA. Setup: moderate. Complexity: 4/5

TRL (Transformer Reinforcement Learning) is a Python library for post-training: taking language models that have already been pre-trained and improving them further with techniques applied after that initial phase. It is built on top of the Hugging Face Transformers ecosystem and supports a range of model architectures.

The library provides ready-to-use trainer classes for different post-training approaches. Supervised Fine-Tuning (SFT) continues training a model on new example data. Direct Preference Optimization (DPO) aligns a model's outputs with human preferences by training directly on pairs of preferred and rejected responses, with no separate reward model or reinforcement learning loop. Group Relative Policy Optimization (GRPO) is a reinforcement learning method that samples a group of responses per prompt and scores each one relative to the rest of its group, which avoids the separate value model that PPO-style setups require. There is also a RewardTrainer for training separate models that score how good a response is.
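The two alignment objectives named above can be sketched in a few lines of plain Python. This is a minimal illustration of the underlying arithmetic, not TRL's implementation: the function names, the `beta` default, and the `eps` constant are assumptions made for this example, and the GRPO sketch uses the population standard deviation, which real implementations may not.

```python
import math
from statistics import mean, pstdev


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair, given summed sequence log-probs.

    Hypothetical helper for illustration; beta=0.1 is an assumed default.
    """
    # How much more the policy favours each response than the frozen reference.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)), written as log(1 + exp(-margin)).
    return math.log1p(math.exp(-margin))


def group_relative_advantages(rewards, eps=1e-4):
    """GRPO-style advantages: standardise rewards within one prompt's group.

    Hypothetical helper; eps only guards against a zero standard deviation.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# A policy identical to the reference gives a loss of log(2), about 0.693.
neutral = dpo_loss(-10.0, -10.0, -10.0, -10.0)

# Raising the chosen response and lowering the rejected one reduces the loss.
improved = dpo_loss(-5.0, -9.0, -7.0, -7.0)

# Four sampled completions for one prompt, scored 1.0 (good) or 0.0 (bad).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(neutral, improved, advantages)
```

In the DPO case the only learning signal is the gap between the chosen and rejected log-ratios; in the GRPO case a completion is rewarded only for beating the average of its own group, which is why no separate value model is needed.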

Training can scale from a single graphics card to large multi-machine clusters. Integration with PEFT (Parameter-Efficient Fine-Tuning) tools like LoRA and QLoRA allows training of large models on more modest hardware by only updating a small fraction of the model's parameters. A command-line interface makes it possible to start fine-tuning runs without writing any code. The library is released under the Apache 2.0 license.
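To make "a small fraction of the model's parameters" concrete, the sketch below counts trainable weights for a single linear layer with and without a LoRA adapter. The layer size (4096 by 4096) and rank (16) are illustrative assumptions, not values taken from TRL or PEFT, and the helper names are hypothetical.

```python
def full_params(d_out, d_in):
    """Weights in a dense d_out x d_in linear layer (bias ignored)."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Weights in the two low-rank factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Assumed example sizes: one 4096 x 4096 projection, LoRA rank 16.
d_out, d_in, rank = 4096, 4096, 16

full = full_params(d_out, d_in)        # 16,777,216 weights
lora = lora_params(d_out, d_in, rank)  # 131,072 weights
fraction = lora / full

print(f"trainable with LoRA: {lora:,} of {full:,} ({fraction:.2%})")
```

For this assumed layer, LoRA trains 1/128 of the weights, about 0.78%, while the original matrix stays frozen. QLoRA additionally stores the frozen weights in a quantized format to cut memory further.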

Where it fits

Fine-tune a pre-trained language model on your own dataset to specialize it for a specific task.
Align a language model's responses with human preferences using DPO, or optimize them against reward functions with GRPO, without the separate value model that PPO-style reinforcement learning needs.
Train a reward model that scores how good a language model's responses are.
Run large model fine-tuning on modest hardware by combining LoRA with TRL's PEFT integration.
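For the reward-model use case in the list above, a common training objective is a pairwise ranking loss: the model should give the preferred response a higher score than the rejected one. The following is a minimal sketch of that idea in plain Python. The helper names and the example scores are hypothetical, and this is not presented as the exact loss RewardTrainer computes.

```python
import math


def pairwise_reward_loss(chosen_score, rejected_score):
    """Bradley-Terry style loss: -log(sigmoid(chosen_score - rejected_score))."""
    gap = chosen_score - rejected_score
    return math.log1p(math.exp(-gap))


def preference_accuracy(score_pairs):
    """Fraction of pairs where the chosen response outscores the rejected one."""
    correct = sum(1 for chosen, rejected in score_pairs if chosen > rejected)
    return correct / len(score_pairs)


# Made-up (chosen, rejected) scores from a hypothetical reward model.
pairs = [(2.0, 0.5), (0.1, 0.3), (1.0, -1.0)]

losses = [pairwise_reward_loss(c, r) for c, r in pairs]
accuracy = preference_accuracy(pairs)
print(losses, accuracy)
```

The loss shrinks as the score gap in favour of the chosen response grows, and it is largest for the second pair, where the model ranks the rejected response higher.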
