disco-rl-pytorch
Implementation and explorations into DiscoRL, Discovering state-of-the-art reinforcement learning algorithms, David Silver's last work at DeepMind
A work-in-progress PyTorch implementation of DiscoRL, a method from a 2025 Nature paper co-authored by David Silver that discovers reinforcement learning algorithms automatically instead of relying on human-designed ones.
This repository is a PyTorch implementation of DiscoRL, short for Discovering state-of-the-art reinforcement learning algorithms. The underlying research was published in Nature in 2025 and is described as the last work David Silver completed at DeepMind. Reinforcement learning is a field of AI in which a system learns by trial and error, receiving rewards for good actions and penalties for bad ones. DiscoRL targets the learning algorithm itself: rather than relying on a human-designed rule for how the system improves from those rewards, it discovers such algorithms automatically.
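The trial-and-error loop described above can be illustrated with a minimal sketch. This is not code from the repository and not the DiscoRL method: it is a plain epsilon-greedy bandit, and the names `run_bandit`, `reward_probs`, `epsilon`, and `lr` are hypothetical. It only shows what a hand-designed update rule looks like, which is the kind of component an algorithm-discovery method would aim to replace.

```python
import random


def run_bandit(reward_probs, steps=2000, epsilon=0.1, lr=0.1, seed=0):
    """Learn action values by trial and error (illustrative sketch only)."""
    rng = random.Random(seed)
    values = [0.0] * len(reward_probs)  # current estimate of each action's worth

    for _ in range(steps):
        # Explore a random action occasionally, otherwise exploit the best estimate.
        if rng.random() < epsilon:
            action = rng.randrange(len(values))
        else:
            action = max(range(len(values)), key=values.__getitem__)

        # Reward of +1 for a good outcome, penalty of -1 for a bad one.
        reward = 1.0 if rng.random() < reward_probs[action] else -1.0

        # Hand-designed update rule: move the estimate toward the observed reward.
        values[action] += lr * (reward - values[action])

    return values


if __name__ == "__main__":
    # Two actions; the second pays off more often in this made-up environment.
    print(run_bandit([0.2, 0.8]))
```

The single update line inside the loop was written by hand. In the framing above, that rule is what would be discovered automatically rather than specified by a person.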
The repository is marked as a work in progress, and the README is minimal: it contains a diagram, a brief description, and citation references for the underlying research paper and a related paper on test-time training. There is no setup guide, usage documentation, or code walkthrough provided at this stage. The project comes from lucidrains, a prolific open-source contributor known for implementing recent AI research papers in PyTorch as learning and reference resources.
Where it fits
- Study the DiscoRL algorithm from the 2025 Nature paper by reading a PyTorch reference implementation as it develops.
- Experiment with automated reinforcement learning algorithm discovery by adapting the code to your own environments, keeping in mind that no usage documentation is provided yet.