reinforcement-learning
Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, TensorFlow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.
A hands-on learning resource with Python code examples and exercises for reinforcement learning, aligned with the Sutton-Barto textbook and David Silver's lectures.
This repository is a learning resource for reinforcement learning — a branch of artificial intelligence where a software agent learns to make decisions by trial and error, receiving rewards for good actions and penalties for bad ones. Think of it like training a dog with treats, but applied to algorithms.
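As a rough illustration of the trial-and-error idea, here is a minimal sketch that is not taken from the repository. It assumes a hypothetical two-action problem in which each action pays a reward of 1 with a fixed probability. The agent keeps a running estimate of each action's average reward, usually picks the action that currently looks best, and occasionally tries the other one. The function name, payout probabilities, and parameters are all made up for this example.

```python
import random


def run_bandit(steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a hypothetical two-action reward problem."""
    rng = random.Random(seed)
    payout_prob = [0.2, 0.8]  # assumed chance that each action yields reward 1
    estimates = [0.0, 0.0]    # agent's running estimate of each action's value
    counts = [0, 0]           # how often each action has been tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(2)
        else:
            # Exploit: pick the action that currently looks best.
            action = 0 if estimates[0] >= estimates[1] else 1

        reward = 1.0 if rng.random() < payout_prob[action] else 0.0

        # Incremental average: nudge the estimate toward the observed reward.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


estimates, counts = run_bandit()
print("estimated values:", estimates)
print("times each action was chosen:", counts)
```

Under these assumptions the agent ends up choosing the better-paying action most of the time, without ever being told the payout probabilities.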
The code is designed to accompany two specific learning materials: the textbook "Reinforcement Learning: An Introduction" (2nd edition) by Sutton and Barto, and David Silver's reinforcement learning lecture course at University College London (UCL). Each folder in the repo corresponds to a chapter or topic from those materials, and contains exercises, worked solutions, a summary of the key concepts, and links to further reading.
The implemented algorithms cover a progression from foundational to more advanced techniques: dynamic programming (planning when you have a complete model of the environment), Monte Carlo methods (learning from complete episodes of experience), temporal difference learning (updating estimates after every step instead of waiting for an episode to end), Q-Learning (an off-policy temporal difference method that learns the value of the greedy policy while the agent follows an exploratory one), and Deep Q-Learning (replacing the Q-Learning value table with a neural network so it can handle high-dimensional inputs such as Atari game frames). Policy gradient methods and an actor-critic algorithm are also included.
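To make the Q-Learning entry concrete, here is a minimal tabular sketch. It is not the repository's implementation and does not use OpenAI Gym. It assumes a hypothetical five-state chain in which the agent starts at the left end, can move left or right, and receives a reward of 1 only on reaching the rightmost state. The function name, environment, and hyperparameters are illustrative assumptions.

```python
import random


def q_learning_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain.

    Actions: 0 = move left, 1 = move right. The episode ends with
    reward 1 when the agent reaches the last state.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Behaviour policy: epsilon-greedy with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])

            # Deterministic transition; moving left from state 0 stays put.
            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == goal
            reward = 1.0 if done else 0.0

            # Off-policy target: bootstrap from the greedy value of the
            # next state, regardless of which action is taken there next.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state

    return q


q_table = q_learning_chain()
# Greedy action for every non-terminal state (1 means "move right").
greedy_policy = [max((0, 1), key=lambda a: q_table[s][a])
                 for s in range(len(q_table) - 1)]
print(greedy_policy)
```

The `max(q[next_state])` term is what makes the update off-policy: the agent explores with epsilon-greedy behaviour, but the values it learns are those of the greedy policy. Deep Q-Learning keeps the same target and swaps the table `q` for a neural network.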
Everything is written in Python 3 using Jupyter Notebooks — interactive documents that mix code, explanations, and output — and uses OpenAI Gym for training environments and TensorFlow for the neural network-based algorithms.
You would use this repo if you are studying reinforcement learning and want hands-on code alongside the theory.
Where it fits
- Study reinforcement learning algorithms step by step with working code examples and explanations.
- Train agents to play Atari games using deep Q-learning and neural networks.
- Work through exercises from the Sutton-Barto textbook with ready-made solutions and implementations.
- Understand the progression from simple methods like Monte Carlo to advanced techniques like actor-critic algorithms.