EqR
[ICML 2026] Code for Equilibrium Reasoners: learning attractor dynamics for scalable reasoning
Research code from Carnegie Mellon that trains neural networks to solve hard reasoning tasks like extreme Sudoku and maze navigation by running the same reasoning step repeatedly until the answer converges.
EqR (Equilibrium Reasoners) is a research project from Carnegie Mellon University that explores a different way to do multi-step reasoning in neural networks. Instead of producing an answer in a single forward pass, the model applies the same reasoning layer repeatedly until its output converges to a stable state, called an attractor. The code in this repository reproduces the experiments from the accompanying paper.
The two test tasks are Sudoku-Extreme (very hard Sudoku puzzles that standard models struggle with) and Maze-Unique (30x30 maze navigation problems where exactly one solution path exists). These tasks were chosen because they require extended step-by-step reasoning. By running the model iteratively and checking when the output stops changing, EqR can apply more compute to harder problems without changing the network's size.
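The iterate-until-stable idea can be illustrated with a minimal sketch in plain Python. This is not the EqR model: `toy_step` is a hypothetical stand-in for the reasoning layer (a simple contraction whose fixed point is known), and the names `iterate_to_attractor`, `max_steps`, and `tol` are assumptions made for illustration only.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def iterate_to_attractor(
    step: Callable[[Vector, Vector], Vector],
    x: Vector,
    z0: Vector,
    max_steps: int = 64,
    tol: float = 1e-6,
) -> Tuple[Vector, int]:
    """Apply `step` repeatedly until the state stops changing.

    Returns the final state and the number of steps actually used.
    Inputs that settle quickly use few steps; inputs that settle slowly
    use more, up to the `max_steps` budget.
    """
    z = list(z0)
    for n in range(1, max_steps + 1):
        z_next = step(z, x)
        # Largest per-coordinate change between consecutive iterates.
        change = max(abs(a - b) for a, b in zip(z_next, z))
        z = z_next
        if change < tol:
            return z, n
    # Budget exhausted without meeting the tolerance.
    return z, max_steps


def toy_step(z: Vector, x: Vector) -> Vector:
    """Hypothetical stand-in for a reasoning layer (assumption).

    z -> 0.5 * z + x is a contraction whose fixed point is z* = 2 * x.
    """
    return [0.5 * zi + xi for zi, xi in zip(z, x)]


if __name__ == "__main__":
    state, steps = iterate_to_attractor(toy_step, x=[1.0, -2.0], z0=[0.0, 0.0])
    print("converged state:", state)
    print("steps used:", steps)
```

In this toy setting the loop stops as soon as consecutive states agree to within `tol`, so the amount of computation depends on the input rather than on the size of the step function.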
The repository contains training and evaluation scripts, pre-trained checkpoints, and, for both tasks, either dataset builders or download scripts that fetch pre-built datasets from Hugging Face. Training runs on distributed GPU setups via torchrun and requires two CUDA-compiled extensions: a custom optimizer called adam-atan2 and FlashAttention for the maze task. Both must be compiled from source, and the README includes detailed notes on getting the build right.
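Before launching a training run, it can help to confirm that the compiled extensions are importable. The sketch below is not part of the repository; it assumes the extensions install under the Python module names `adam_atan2` and `flash_attn`, which should be checked against the repository's own build notes.

```python
import importlib.util
from typing import Iterable, List

# Assumed import names for the two extensions (not confirmed by the source).
REQUIRED_MODULES = ["adam_atan2", "flash_attn"]


def missing_extensions(modules: Iterable[str]) -> List[str]:
    """Return the top-level modules that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_extensions(REQUIRED_MODULES)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All required extensions are importable.")
```

This only checks that the modules can be located; it does not verify that they were compiled against the installed CUDA and PyTorch versions.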
Evaluation lets you control how many iterative reasoning steps the model runs (the halt_max_steps parameter) and how many different starting points to try in parallel (breadth search). The model is compared against a standard transformer baseline under the same compute budget.
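The two evaluation knobs can be sketched together on a toy problem. Everything here is illustrative: `toy_step` is a hypothetical stand-in for the reasoning layer, the starting points are run one after another rather than in parallel, and picking the candidate with the smallest final change between iterates is an assumed selection rule, not necessarily the one EqR uses. Only the `halt_max_steps` name comes from the description above.

```python
import random
from typing import Callable, List, Tuple

Vector = List[float]


def toy_step(z: Vector, x: Vector) -> Vector:
    """Hypothetical reasoning step (assumption): contraction toward 2 * x."""
    return [0.5 * zi + xi for zi, xi in zip(z, x)]


def run_from_start(
    step: Callable[[Vector, Vector], Vector],
    x: Vector,
    z0: Vector,
    halt_max_steps: int,
) -> Tuple[Vector, float]:
    """Run exactly `halt_max_steps` iterations from one starting point.

    Returns the final state and the size of the last update, used here
    as a rough measure of how settled the state is.
    """
    z = list(z0)
    residual = float("inf")
    for _ in range(halt_max_steps):
        z_next = step(z, x)
        residual = max(abs(a - b) for a, b in zip(z_next, z))
        z = z_next
    return z, residual


def breadth_search(
    step: Callable[[Vector, Vector], Vector],
    x: Vector,
    breadth: int,
    halt_max_steps: int,
    seed: int = 0,
) -> Tuple[Vector, float]:
    """Try `breadth` random starting points and keep the most settled result."""
    rng = random.Random(seed)
    best_state: Vector = []
    best_residual = float("inf")
    for i in range(breadth):
        z0 = [rng.uniform(-1.0, 1.0) for _ in x]
        state, residual = run_from_start(step, x, z0, halt_max_steps)
        # Always accept the first candidate; afterwards keep the smaller residual.
        if i == 0 or residual < best_residual:
            best_state, best_residual = state, residual
    return best_state, best_residual


if __name__ == "__main__":
    state, residual = breadth_search(toy_step, [1.0, -2.0], breadth=4, halt_max_steps=30)
    print("best state:", state)
    print("final update size:", residual)
```

Raising `halt_max_steps` spends more sequential compute per starting point, while raising `breadth` spends more compute across starting points; neither changes the size of the step function.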
The codebase builds on two earlier repositories (HRM and TRM) and is released under the Apache 2.0 license. Large-scale inference code using Google's XLA compiler is listed as a planned future release.
Where it fits
- Reproduce the EqR paper experiments on Sudoku-Extreme and Maze-Unique to validate iterative reasoning results.
- Test how increasing the number of reasoning iterations improves accuracy on hard problems compared to a standard transformer baseline.