Hi 👋 My name is Taishi
🎓 PhD Student at Institute of Science Tokyo (formerly Tokyo Institute of Technology)
---
🌐 Connect with me




-
Drop-Upcycling
[ICLR 2025] Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
Shell ★ 24 8mo ago
-
Awesome-RL-Reasoning
Awesome-RL-Reasoning
★ 15 21d ago
-
Qualsimu
Interactive visual teaching material for quantum mechanics
HTML ★ 2 1y ago
-
maxtext ⑂
A simple, performant and scalable Jax LLM!
Python ★ 1 10mo ago
-
long-context ⑂
YaRN: Efficient Context Window Extension of Large Language Models
Python ★ 1 2y ago
-
multimodal ⑂
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
★ 1 2y ago
-
llama-recipes ⑂
Examples and recipes for Llama 2 model
★ 1 2y ago
-
fmengine ⑂
Utilities for Training Very Large Models
Python ★ 1 2y ago
-
Taishi-N324.github.io
No description.
HTML ★ 0 1mo ago
-
modded-nanogpt ⑂
NanoGPT (124M) in 2 minutes
Python ★ 0 1mo ago
-
tilelang ⑂
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
★ 0 1mo ago
-
autoresearch ⑂
AI agents running research on single-GPU nanochat training automatically
Python ★ 0 1mo ago
-
parameter-golf ⑂
Train the smallest LM you can that fits in 16MB. Best model wins!
Python ★ 0 1mo ago
-
Automodel ⑂
🚀 PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support
Python ★ 0 2mo ago
-
Taishi-N324
Config files for my GitHub profile.
★ 0 2mo ago
-
openclaw ⑂
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
★ 0 2mo ago
-
nanochat ⑂
The best ChatGPT that $100 can buy.
Python ★ 0 2mo ago
-
ThunderKittens ⑂
Tile primitives for speedy kernels
★ 0 2mo ago
-
Awesome-RL-Agent
No description.
★ 0 3mo ago
-
terminal-bench-3 ⑂
🚧 Accepting Task Submissions 🚧
★ 0 3mo ago
-
verl ⑂
verl: Volcano Engine Reinforcement Learning for LLMs
Python ★ 0 11mo ago
-
sglang ⑂
SGLang is a fast serving framework for large language models and vision language models.
★ 0 7mo ago
-
vllm ⑂
A high-throughput and memory-efficient inference and serving engine for LLMs
Python ★ 0 6mo ago
-
transformers ⑂
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Python ★ 0 6mo ago
-
DeepEP ⑂
DeepEP: an efficient expert-parallel communication library
★ 0 8mo ago
-
DeepGEMM ⑂
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
★ 0 8mo ago
-
Curator ⑂
Scalable data preprocessing and curation toolkit for LLMs
Python ★ 0 11mo ago
-
torchtitan ⑂
A PyTorch native platform for training generative AI models
★ 0 1y ago
-
lm-evaluation-harness ⑂
A framework for few-shot evaluation of language models.
★ 0 11mo ago
-
evalplus ⑂
Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024
★ 0 1y ago
-
OLMo-core ⑂
PyTorch building blocks for the OLMo ecosystem
Python ★ 0 1y ago
-
keras-tuner-alpha ⑂
No description.
Python ★ 0 1y ago
-
nccl-tests ⑂
NCCL Tests
Cuda ★ 0 1y ago
-
lm-evaluation-harness-aurora ⑂
No description.
★ 0 2y ago
-
hpsc-2024 ⑂
No description.
Shell ★ 0 2y ago
-
Megatron-LM-LUMI ⑂
Ongoing research training transformer models at scale
★ 0 2y ago
-
LLaVA ⑂
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
★ 0 2y ago
-
EasyContext ⑂
No description.
★ 0 2y ago
-
alignment-handbook ⑂
Robust recipes to align language models with human and AI preferences
Python ★ 0 2y ago
-
llm-leaderboard ⑂
LLM evaluation project for Japanese tasks
Python ★ 0 2y ago
-
llama ⑂
Inference code for LLaMA models
★ 0 2y ago
-
AgentBench ⑂
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
Python ★ 0 2y ago
-
FastChat ⑂
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
★ 0 2y ago
-
SEED ⑂
Empowers LLMs with the ability to see and draw.
Python ★ 0 2y ago
-
Robin ⑂
No description.
★ 0 2y ago
-
llm-jp-sakura-ansible ⑂
No description.
Jinja ★ 0 2y ago
-
mlmm-evaluation ⑂
Multilingual Large Language Models Evaluation Benchmark
★ 0 2y ago
-
VMLU ⑂
No description.
★ 0 2y ago
-
FIN-bench ⑂
Evaluation of Finnish generative models
★ 0 2y ago
-
bigcode-evaluation-harness ⑂
A framework for the evaluation of autoregressive code generation language models.
★ 0 2y ago
-
llm-jp-sft ⑂
No description.
★ 0 2y ago
-
t5x ⑂
No description.
★ 0 2y ago
-
Megatron-LLM ⑂
distributed trainer for LLMs
★ 0 2y ago
-
Megatron-DeepSpeed
No description.
Python ★ 0 2y ago
-
llm-foundry ⑂
LLM training code for MosaicML foundation models
Python ★ 0 2y ago
-
RedPajama-Data ⑂
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
Python ★ 0 3y ago