HRM

Python ★ 13k updated 2mo ago

Hierarchical Reasoning Model Official Release

A tiny 27-million-parameter AI model that uses two cooperating modules, one for slow abstract planning and one for fast detailed computation, to solve complex reasoning tasks like Sudoku and mazes better than much larger models.

Python, PyTorch, FlashAttention, CUDA, Weights and Biases, Hugging Face; setup: hard; complexity: 4/5

This repository contains the official code for the Hierarchical Reasoning Model (HRM), a research AI architecture designed for complex reasoning tasks. Most current large language models rely on Chain-of-Thought, a technique in which the model generates long sequences of intermediate reasoning steps to solve a problem. HRM takes a different approach, drawing inspiration from how the human brain uses different processing speeds for different types of thinking: a slower, higher-level module handles abstract planning while a faster, lower-level module handles detailed computation. The two modules run in a loop within a single forward pass, producing capable reasoning without explicit step-by-step supervision during training.
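
The two-speed loop described above can be illustrated with a toy sketch. This is not the repository's code: the function names, the scalar states, and the fixed weights are all hypothetical stand-ins for the learned recurrent modules, and the cycle counts are arbitrary. It only shows the control flow, in which the fast module takes several steps under a fixed plan before the slow module updates that plan once.

```python
import math


def low_level_step(z_low, z_high, x):
    # Fast module (hypothetical): refines its state from the input
    # and the current high-level plan.
    return math.tanh(0.5 * z_low + 0.3 * z_high + 0.2 * x)


def high_level_step(z_high, z_low):
    # Slow module (hypothetical): updates the plan from the result
    # the fast module settled on during this cycle.
    return math.tanh(0.7 * z_high + 0.3 * z_low)


def hierarchical_forward(x, n_cycles=2, steps_per_cycle=4):
    """Run one forward pass of the toy two-timescale loop.

    Returns the final high-level state and a trace string in which
    "L" marks a low-level update and "H" a high-level update.
    """
    z_high, z_low = 0.0, 0.0
    trace = []
    for _ in range(n_cycles):
        # The plan z_high stays fixed while the fast module iterates.
        for _ in range(steps_per_cycle):
            z_low = low_level_step(z_low, z_high, x)
            trace.append("L")
        # One slow update per cycle.
        z_high = high_level_step(z_high, z_low)
        trace.append("H")
    return z_high, "".join(trace)


if __name__ == "__main__":
    state, trace = hierarchical_forward(1.0, n_cycles=2, steps_per_cycle=3)
    print(trace)
```

In the real model both modules would be neural networks operating on tensors; the nesting of a fast inner loop inside a slow outer loop is the only part of this sketch meant to carry over.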

What makes HRM notable is its size. The model has only 27 million parameters, which is extremely small compared to the large language models that currently dominate AI. Despite its small size and a training set of only 1,000 examples, it achieves near-perfect results on tasks like solving very difficult Sudoku puzzles and finding optimal paths through large mazes. It also outperforms much larger models on the Abstraction and Reasoning Corpus (ARC) benchmark, a standard test of general reasoning ability in AI systems. The paper describing the architecture is available on arXiv.

The repository lets you train HRM from scratch or load pre-trained checkpoints from Hugging Face. Pre-trained checkpoints are available for the ARC-AGI-2 benchmark, extreme Sudoku puzzles, and hard maze-solving. Training requires a GPU with CUDA support. The quick-start guide walks through training a Sudoku solver on a single laptop GPU in around ten hours, while full-scale experiments are designed to run on an 8-GPU setup.

The code depends on PyTorch, FlashAttention (with different versions for different GPU generations), and Weights and Biases for tracking training metrics. Setup instructions in the README cover installing CUDA and the required Python packages. A puzzle visualizer is included as an HTML file to help you explore the training data visually.
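
As a rough illustration of matching the FlashAttention version to the GPU generation, the helper below maps a CUDA compute capability to a FlashAttention major version. The function name is hypothetical, and the mapping (version 3 for Hopper, compute capability 9.x, and version 2 otherwise) is an assumption based on FlashAttention 3 targeting Hopper; the README's setup instructions remain the authority.

```python
def flash_attention_major_version(compute_capability):
    """Suggest a FlashAttention major version for a GPU.

    compute_capability is a (major, minor) tuple, the same shape that
    torch.cuda.get_device_capability() returns.

    Assumption: Hopper GPUs (compute capability 9.x) use FlashAttention 3
    and every other generation uses FlashAttention 2.
    """
    major, _minor = compute_capability
    return 3 if major == 9 else 2


if __name__ == "__main__":
    # Example capabilities: Ampere A100, Ada RTX 4070, Hopper H100.
    for name, cap in [("A100", (8, 0)), ("RTX 4070", (8, 9)), ("H100", (9, 0))]:
        print(name, "-> FlashAttention", flash_attention_major_version(cap))
```

On a machine with PyTorch and a CUDA device, the tuple could be obtained with `torch.cuda.get_device_capability()` after confirming `torch.cuda.is_available()`.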

Where it fits