AI-Scientist

Jupyter Notebook ★ 14k updated 6mo ago

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬

A system that uses large language models to automate the full scientific research cycle, generating hypotheses, running experiments, and producing formatted academic papers with AI peer review.

Python, Jupyter Notebook, LaTeX, NVIDIA CUDA, OpenAI API, Anthropic API, Google Gemini API; setup: hard; complexity: 5/5

The AI Scientist is a system from Sakana AI that attempts to automate the full cycle of scientific research using large language models. Given a research template and a set of starting ideas, it can generate new research hypotheses, write and run experiments, analyze results, and produce a formatted academic paper, including a review of that paper by another AI model. The aim is to have AI conduct research with minimal human involvement, rather than just assisting human researchers.

The system works through experiment templates that define a research domain. Three templates are included: NanoGPT (a small language model training setup), 2D Diffusion (a generative modeling task), and Grokking (a phenomenon in neural network learning). Each template gives the system a codebase to modify and experiment with. The AI generates ideas, writes code changes, runs the experiments on a GPU machine, reads the results, and then writes a LaTeX paper summarizing what it found. A separate reviewer pass uses an LLM to evaluate the generated paper.
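The stage order described above can be sketched as a simple sequential pipeline. This is only an illustration of the control flow, not the project's actual code: every function name, the `RunState` class, and the placeholder values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RunState:
    """Everything one run accumulates (hypothetical structure)."""
    template: str                                   # label for the chosen template
    idea: str = ""
    artifacts: Dict[str, str] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)


# Each stage is a placeholder; a real system would call an LLM or launch a job here.
def generate_idea(state: RunState) -> None:
    state.idea = f"placeholder idea for {state.template}"

def edit_code(state: RunState) -> None:
    state.artifacts["patch"] = "placeholder diff against the template codebase"

def run_experiment(state: RunState) -> None:
    state.artifacts["results"] = "placeholder metrics from the GPU run"

def analyze_results(state: RunState) -> None:
    state.artifacts["analysis"] = "placeholder notes on the results"

def write_paper(state: RunState) -> None:
    state.artifacts["paper"] = "placeholder LaTeX source"

def review_paper(state: RunState) -> None:
    state.artifacts["review"] = "placeholder LLM review of the paper"


PIPELINE: List[Callable[[RunState], None]] = [
    generate_idea, edit_code, run_experiment,
    analyze_results, write_paper, review_paper,
]


def run(template: str) -> RunState:
    """Run every stage in order, recording which stages completed."""
    state = RunState(template=template)
    for stage in PIPELINE:
        stage(state)
        state.history.append(stage.__name__)
    return state
```

Calling `run("grokking")` returns a state whose `history` lists the six stages in order. The point of the sketch is that each stage consumes what the previous one produced, and the reviewer pass runs last, on the finished paper.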

Running the system requires a Linux machine with NVIDIA GPUs, a Python environment, a LaTeX installation (for PDF generation), and an API key for at least one supported frontier model such as GPT-4o, Claude, or Gemini. The README lists the supported model providers: OpenAI, Anthropic, and Google, plus access through Amazon Bedrock and Vertex AI. The project recommends using only frontier-grade models, because weaker models produce poor-quality research.
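A pre-flight check for these requirements might look like the sketch below. It is not part of the project. The environment variable names and the choice of `nvidia-smi` and `pdflatex` as the executables to look for are assumptions; the README is the authority on what the project actually reads.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional

# Assumed names: verify against the project's README before relying on them.
API_KEY_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GEMINI_API_KEY")
REQUIRED_TOOLS = ("nvidia-smi", "pdflatex")


def preflight(
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    env = os.environ if env is None else env
    problems = []

    # GPU driver tooling and LaTeX must be on PATH.
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"missing executable: {tool}")

    # At least one provider key must be set to a non-empty value.
    if not any(env.get(name) for name in API_KEY_VARS):
        problems.append("no API key set (looked for: " + ", ".join(API_KEY_VARS) + ")")

    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(problem)
```

The `env` and `which` parameters exist only so the check can be exercised without touching the real machine.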

The project includes an important safety warning: the system executes code written by the LLM, and that code may access the network, read or write files, or install packages. Running it in a containerized environment with restricted network access is strongly advised.
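One way to approximate that advice is to launch the work inside a Docker container with networking disabled. The helper below only builds the command line and is a sketch under assumptions: the image name and mount path are hypothetical, and the project may document a different container setup. The flags themselves (`--rm`, `--gpus`, `--network`, `-v`, `-w`) are standard `docker run` options.

```python
from typing import List


def sandbox_command(image: str, workdir: str, allow_network: bool = False) -> List[str]:
    """Build a `docker run` argument list for an isolated, GPU-enabled container."""
    cmd = ["docker", "run", "--rm", "--gpus", "all"]
    if not allow_network:
        # `--network none` gives the container no network interfaces except loopback.
        cmd += ["--network", "none"]
    # Mount only the working directory so file operations stay inside it.
    cmd += ["-v", f"{workdir}:/workspace", "-w", "/workspace", image]
    return cmd


if __name__ == "__main__":
    # "ai-scientist-sandbox" is a hypothetical image name used for illustration.
    print(" ".join(sandbox_command("ai-scientist-sandbox", "/tmp/run1")))
```

Note the trade-off: with networking fully disabled, calls to hosted LLM APIs from inside the container would fail as well, so a practical setup would likely need either an allowlist for the provider endpoints or a split in which only the generated experiment code runs in the isolated container.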

Sample papers produced by the system are available in the repository and in a shared Google Drive folder from the original research runs. Community-contributed templates beyond the three official ones are accepted, but the Sakana AI team does not maintain them.

Where it fits

Automate generation of novel research papers on small language model training using GPT-4o or Claude as the driving model
Run experiments on neural network grokking phenomena without manually writing any experiment code
Generate and AI-review a 2D diffusion model research paper using the included experiment template
