Sakana AI ORG

@SakanaAI · Tokyo · sakana.ai

On a quest to create a new kind of foundation model based on nature-inspired intelligence.

54 repos
3.3k followers
0 following

Python 79%
Jupyter Notebook 11%
Red 2%
Cuda 2%
HTML 2%

Members

Kaixhin
Taishi-N324

All public repos (54)


AI-Scientist

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬

A system that uses large language models to automate the full scientific research cycle, generating hypotheses, running experiments, and producing formatted academic papers with AI peer review.

Jupyter Notebook ★ 14k 6mo ago
AI-Scientist-v2

The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search

Python ★ 6.6k 6mo ago
continuous-thought-machines

Continuous Thought Machines, because thought takes time and reasoning is a process.

Python ★ 2.0k 5mo ago
evolutionary-model-merge

Official repository of Evolutionary Optimization of Model Merging Recipes

Python ★ 1.4k 1y ago
text-to-lora

Hypernetworks that adapt LLMs to specific benchmark tasks using only a textual task description as input

Python ★ 1.3k 1y ago
ShinkaEvolve

ShinkaEvolve: Towards Open-Ended and Sample-Efficient Program Evolution 🧬

Python ★ 1.2k 12d ago
self-adaptive-llms

A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time!

Python ★ 1.2k 1y ago
doc-to-lora

Hypernetworks that update LLMs to remember factual information

Python ★ 756 6d ago
treequest

A Tree Search Library with Flexible API for LLM Inference-Time Scaling

Python ★ 552 4mo ago
asal

Automating the Search for Artificial Life with Foundation Models!

Jupyter Notebook ★ 474 8mo ago
RLT

Training teacher models with reinforcement learning to teach LLMs how to reason, for test-time scaling.

Python ★ 362 1y ago
evo-memory

Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers.

Python ★ 359 1y ago
AI-Scientist-ICLR2025-Workshop-Experiment

No description.

Python ★ 301 1y ago
sparser-faster-llms

CUDA kernels that leverage LLM sparsity to improve throughput and reduce memory requirements during inference and training.

Cuda ★ 245 1mo ago
DiffusionBlocks

DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation

Python ★ 229 4mo ago
DroPE

Extending the Context of Pretrained LLMs by Dropping Their Positional Embedding

Python ★ 218 5mo ago
drq

Digital Red Queen: Adversarial Program Evolution in Core War with LLMs

Red ★ 205 5mo ago
DiscoPOP ⑂

Code for Discovering Preference Optimization Algorithms with and for Large Language Models

Python ★ 197 2y ago
ALE-Bench

The official repository of ALE-Bench

Python ★ 188 17d ago
natural_niches

The code repository of the paper: Competition and Attraction Improve Model Fusion

Jupyter Notebook ★ 171 10mo ago
TinySwallow-ChatUI

Browser-based chat UI for TinySwallow-1.5B that runs without API calls.

CSS ★ 136 6mo ago
TAID

Official implementation of "TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models"

Python ★ 123 8mo ago
ab-mcts-arc2

No description.

Python ★ 115 11mo ago
robust-kbench

No description.

Python ★ 98 7mo ago
kame

No description.

Python ★ 92 1mo ago
digital-ecosystem

Interactive multi-agent NCA ecosystem simulation

JavaScript ★ 82 2mo ago
repo

RePo: Language Models with Context Re-Positioning

Python ★ 77 2mo ago
petri-dish-nca

No description.

Python ★ 58 7mo ago
TinySwallow-ChatUI-Local

Python-based chat demo for TinySwallow-1.5B that works completely offline

Python ★ 58 1y ago
CycleQD

CycleQD is a framework for parameter space model merging.

Python ★ 49 1y ago
IASC

LLMs for Constructed Languages

HTML ★ 48 2mo ago
shachi

Reimagining Agent-based Modeling with Large Language Model Agents via Shachi

Python ★ 46 6d ago
edinet2dataset

edinet2dataset is a tool for constructing financial datasets from EDINET.

Python ★ 40 3mo ago
EDINET-Bench

[ICLR 2026] Evaluating the performance of LLMs on challenging Japanese financial tasks.

Python ★ 35 3mo ago
Kamon

Data and code for understanding and generation of Kamon.

Python ★ 35 3mo ago
vllm ⑂

A high-throughput and memory-efficient inference and serving engine for LLMs

★ 35 2y ago
kame_finetune

No description.

Python ★ 30 1mo ago
L2D

Large language models to diffusion finetuning code

Python ★ 27 1y ago
TransEvalnia

Reasoning-based Evaluation and Ranking of Translations.

Python ★ 20 18d ago
fast-weight-product-key-memory

Code for Fast-weight Product Key Memory (FwPKM)

Python ★ 19 3mo ago
neuroevolution-for-ai

Neuroevolution Community

★ 14 7mo ago
AC-DC

No description.

Python ★ 11 2mo ago
nca-alife

Learning Neural Cellular Automata that produce Open-Ended Alife!

Jupyter Notebook ★ 11 1y ago
rl-razor-mnist

Replication of the MNIST experiments from the paper "RL's Razor: Why Online Reinforcement Learning Forgets Less"

Python ★ 9 3mo ago
Sudoku-Bench

An AI benchmark for creative, human-like problem solving using Sudoku variants

★ 7 17d ago
DreamCubed

No description.

Jupyter Notebook ★ 6 1mo ago
google-code-golf-2025

No description.

Python ★ 6 7mo ago
orcaclaw ⑂

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞 Now with more orca.

★ 5 4mo ago
ike

A DeepSpeed-based framework for distributed training and inference of language models.

Python ★ 2 3mo ago
BALROG ⑂

Benchmarking Agentic LLM and VLM Reasoning On Games

★ 2 10mo ago
AC-DC-eval_harness

No description.

Python ★ 1 2mo ago
mle-bench-shinka-agent ⑂

MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering

Python ★ 1 7mo ago
LanguageEvolution

No description.

Python ★ 0 9d ago
KamonBench

KamonBench: A Grammar-Based Dataset for Evaluating Compositional Factor Recovery in Vision-Language Models

Python ★ 0 1mo ago