Taishi Nakamura

@Taishi-N324 · Japan · taishi-n324.github.io

56 repos
144 followers
503 following

HTML 50%
Shell 25%
Python 25%

Hi 👋 My name is Taishi 🎓 PhD Student at Institute of Science Tokyo (formerly Tokyo Institute of Technology) --- 🌐 Connect with me…

All public repos (56)


Drop-Upcycling

[ICLR 2025] Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization

Shell ★ 24 8mo ago
Awesome-RL-Reasoning

Awesome-RL-Reasoning

★ 15 21d ago
Qualsimu

Interactive visual teaching materials for quantum topics

HTML ★ 2 1y ago
maxtext ⑂

A simple, performant and scalable JAX LLM!

Python ★ 1 10mo ago
long-context ⑂

YaRN: Efficient Context Window Extension of Large Language Models

Python ★ 1 2y ago
multimodal ⑂

An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.

★ 1 2y ago
llama-recipes ⑂

Examples and recipes for Llama 2 model

★ 1 2y ago
fmengine ⑂

Utilities for Training Very Large Models

Python ★ 1 2y ago
Taishi-N324.github.io

No description.

HTML ★ 0 1mo ago
modded-nanogpt ⑂

NanoGPT (124M) in 2 minutes

Python ★ 0 1mo ago
tilelang ⑂

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

★ 0 1mo ago
autoresearch ⑂

AI agents running research on single-GPU nanochat training automatically

Python ★ 0 1mo ago
parameter-golf ⑂

Train the smallest LM you can that fits in 16MB. Best model wins!

Python ★ 0 1mo ago
Automodel ⑂

🚀 PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support

Python ★ 0 2mo ago
Taishi-N324

Config files for my GitHub profile.

★ 0 2mo ago
openclaw ⑂

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

★ 0 2mo ago
nanochat ⑂

The best ChatGPT that $100 can buy.

Python ★ 0 2mo ago
ThunderKittens ⑂

Tile primitives for speedy kernels

★ 0 2mo ago
Awesome-RL-Agent

No description.

★ 0 3mo ago
terminal-bench-3 ⑂

🚧 Accepting Task Submissions 🚧

★ 0 3mo ago
verl ⑂

verl: Volcano Engine Reinforcement Learning for LLMs

Python ★ 0 11mo ago
sglang ⑂

SGLang is a fast serving framework for large language models and vision language models.

★ 0 7mo ago
vllm ⑂

A high-throughput and memory-efficient inference and serving engine for LLMs

Python ★ 0 6mo ago
transformers ⑂

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python ★ 0 6mo ago
DeepEP ⑂

DeepEP: an efficient expert-parallel communication library

★ 0 8mo ago
DeepGEMM ⑂

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

★ 0 8mo ago
Curator ⑂

Scalable data preprocessing and curation toolkit for LLMs

Python ★ 0 11mo ago
torchtitan ⑂

A PyTorch native platform for training generative AI models

★ 0 1y ago
lm-evaluation-harness ⑂

A framework for few-shot evaluation of language models.

★ 0 11mo ago
evalplus ⑂

Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024

★ 0 1y ago
OLMo-core ⑂

PyTorch building blocks for the OLMo ecosystem

Python ★ 0 1y ago
keras-tuner-alpha ⑂

No description.

Python ★ 0 1y ago
nccl-tests ⑂

NCCL Tests

Cuda ★ 0 1y ago
lm-evaluation-harness-aurora ⑂

No description.

★ 0 2y ago
hpsc-2024 ⑂

No description.

Shell ★ 0 2y ago
Megatron-LM-LUMI ⑂

Ongoing research training transformer models at scale

★ 0 2y ago
LLaVA ⑂

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

★ 0 2y ago
EasyContext ⑂

No description.

★ 0 2y ago
alignment-handbook ⑂

Robust recipes to align language models with human and AI preferences

Python ★ 0 2y ago
llm-leaderboard ⑂

LLM evaluation project for Japanese tasks

Python ★ 0 2y ago
llama ⑂

Inference code for LLaMA models

★ 0 2y ago
AgentBench ⑂

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)

Python ★ 0 2y ago
FastChat ⑂

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

★ 0 2y ago
SEED ⑂

Empowers LLMs with the ability to see and draw.

Python ★ 0 2y ago
Robin ⑂

No description.

★ 0 2y ago
llm-jp-sakura-ansible ⑂

No description.

Jinja ★ 0 2y ago
mlmm-evaluation ⑂

Multilingual Large Language Models Evaluation Benchmark

★ 0 2y ago
VMLU ⑂

No description.

★ 0 2y ago
FIN-bench ⑂

Evaluation of Finnish generative models

★ 0 2y ago
bigcode-evaluation-harness ⑂

A framework for the evaluation of autoregressive code generation language models.

★ 0 2y ago
llm-jp-sft ⑂

No description.

★ 0 2y ago
t5x ⑂

No description.

★ 0 2y ago
Megatron-LLM ⑂

Distributed trainer for LLMs

★ 0 2y ago
Megatron-DeepSpeed

No description.

Python ★ 0 2y ago
llm-foundry ⑂

LLM training code for MosaicML foundation models

Python ★ 0 2y ago
RedPajama-Data ⑂

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

Python ★ 0 3y ago