👋 Hi, I'm Eugene Yan
I build recommendation systems and AI-powered experiences that serve customers at scale. Currently, I'm a Principal Applied Scientist at Amazon. Outside of work, I also...
- Write, speak, and prototype on ideas in machine learning, RecSys, and LLMs.
- Teach how to apply ML/LLMs effectively at ApplyingML.com & Applied-LLMs.org.
- Send a newsletter about data, ML, and what I'm learning to 10,000+ subscribers.
- In 2026, I'm learning: LLMs for security, memory and retrieval, and human-agent collaboration.
- Fun fact: I don't use the QWERTY keyboard (I use Dvorak instead).
📝 Recent Writing
<!-- writing starts -->
- How to Work and Compound with AI - Sun, 03 May 2026
- 2025 Year in Review - Sun, 14 Dec 2025
- Product Evals in Three Simple Steps - Sun, 23 Nov 2025
- Advice for New Principal Tech ICs (i.e., Notes to Myself) - Sun, 19 Oct 2025
- Training an LLM-RecSys Hybrid for Steerable Recs with Semantic IDs - Sun, 14 Sep 2025
View the archives (<!-- writing_count starts -->210<!-- writing_count ends --> posts) @ eugeneyan.com.
---
- applied-ml: 📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production. (★ 30k, 1y ago)
- open-llms: 📋 A list of open LLMs available for commercial use. (★ 13k, 1y ago)
- ml-surveys: 📋 Survey papers summarizing advances in deep learning, NLP, CV, graphs, reinforcement learning, recommendations, etc. (★ 2.9k, 3y ago)
- ml-design-docs: 📝 Design doc template & examples for machine learning systems (requirements, methodology, implementation, etc.) (★ 704, 3y ago)
- obsidian-copilot: 🤖 A prototype assistant for writing and thinking (Python, ★ 562, 2y ago)
- 1-on-1s: 🌱 1-on-1 questions and resources from my time as a manager. (★ 386, 2y ago)
- news-agents: 📰 Building News Agents to Summarize News with MCP, Q, and tmux (Python, ★ 319, 11mo ago)
- testing-ml: 🔍 Minimal examples of machine learning tests for implementation, behaviour, and performance. (Python, ★ 271, 3y ago)
- llm-paper-notes: Notes from the Latent Space paper club. Follow along or start your own! (★ 250, 1y ago)
- applyingml: 📌 Papers, guides, and mentor interviews on applying machine learning for ApplyingML.com, the ghost knowledge of machine learning. (MDX, ★ 210, 2y ago)
- papermill-mlflow: 🧪 Simple data science experimentation & tracking with jupyter, papermill, and mlflow. (Jupyter Notebook, ★ 191, 1y ago)
- python-collab-template: 🛠 Python project template with unit tests, code coverage, linting, type checking, Makefile wrapper, and GitHub Actions. (Python, ★ 154, 2y ago)
- recsys-nlp-graph: 🛒 Simple recommender with matrix factorization, graph, and NLP. Beating the regular collaborative filtering baseline. (Python, ★ 147, 1y ago)
- semantic-ids-llm: Semantic IDs: How to train an LLM-Recommender Hybrid with steerability and reasoning on recommendations. (Jupyter Notebook, ★ 126, 9mo ago)
- align-app: No description. (TypeScript, ★ 98, 1y ago)
- visualizing-finetunes: No description. (Jupyter Notebook, ★ 78, 2y ago)
- fastapi-html: Sample repository demonstrating how to use FastAPI to serve HTML web apps. (Python, ★ 76, 2y ago)
- eugeneyan: No description. (Python, ★ 56, 21d ago)
- framework-comparison: No description. (TypeScript, ★ 40, 1y ago)
- discord-llm: Experimenting with LLMs to Research, Reflect, and Plan (LLM assistants, retrieval, and Discord integration) (Jupyter Notebook, ★ 33, 1y ago)
- raspberry-llm: Calling LLM APIs on a Raspberry Pi for lulz (Python, ★ 24, 3y ago)
- poc-docker-template: Simple template showing how to set up docker for reproducible data science with Jupyter notebooks. (Jupyter Notebook, ★ 23, 2y ago)
- text-to-image: No description. (Jupyter Notebook, ★ 20, 3y ago)
- my-cs-degree ⑂: A CS degree I designed for myself, 2020 (★ 19, 5y ago)
- learning-typescript: No description. (JavaScript, ★ 16, 3y ago)
- awesome-mlops ⑂: A curated list of references for MLOps (★ 14, 5y ago)
- nocode-ml: 😝 End-to-end machine learning; "no code" required! (★ 13, 5y ago)
- awesome-fastapi ⑂: A curated list of awesome things related to FastAPI (★ 11, 5y ago)
- deep-rl: Repository for deep reinforcement learning with OpenAI (Python, ★ 8, 8y ago)
- design-patterns: No description. (Java, ★ 8, 4y ago)
- testing-pipelines: No description. (Python, ★ 7, 3y ago)
- kaggle_springleaf: Code for Kaggle Springleaf Email Prediction Challenge (Python, ★ 6, 10y ago)
- Computational-Thinking-and-Data-Science: edX: Introduction to Computational Thinking and Data Science (Oct 2014) (Python, ★ 6, 11y ago)
- DeepLearningBook ⑂: MIT Deep Learning Book in PDF format (★ 5, 10y ago)
- Mining-Massive-Datasets: Coursera: Mining Massive Datasets (Sep 2014) (R, ★ 5, 11y ago)
- ama: Ask Me Anything (★ 5, 5y ago)
- search_engineering ⑂: Search Engineering course materials (Python, ★ 4, 3y ago)
- Computer-Science-and-Programming-In-Python: edX: Introduction to Computer Science and Programming in Python (July 2014) (Python, ★ 4, 11y ago)
- datagene: No description. (Jupyter Notebook, ★ 4, 3y ago)
- Data-Analysis-and-Statistical-Inference-Project: Coursera: Data Analysis & Statistical Inference Project (Feb 2014) (R, ★ 4, 12y ago)
- Statistical-Learning: Stanford OpenX: Introduction to Statistical Learning (HTML, ★ 4, 11y ago)
- Statistical-Inference: This repository contains the lab assignments for the facilitation of Johns Hopkins University's Coursera MOOC on Statistical Inference. (R, ★ 4, 11y ago)
- kaggle_titanic: Code for Kaggle Titanic Challenge (and other learning) (HTML, ★ 4, 11y ago)
- openai-cookbook ⑂: Examples and guides for using the OpenAI API (Jupyter Notebook, ★ 4, 3y ago)
- search_fundamentals_course ⑂: No description. (Python, ★ 4, 3y ago)
- workshop ⑂: AI and Machine Learning with Kubeflow, Amazon EKS, and SageMaker (★ 4, 6y ago)
- neural_networks_and_deep_learning: No description. (★ 3, 10y ago)
- Getting-and-Cleaning-Data: Coursera: Getting and Cleaning Data (May 2014) (R, ★ 3, 11y ago)
- search_with_machine_learning_course ⑂: No description. (Jupyter Notebook, ★ 3, 3y ago)
- Time-Series-Analysis: Simple forecasting with Regression Model (R, ★ 3, 11y ago)
- R-Programming: Coursera: R Programming (May 2014) (R, ★ 2, 12y ago)
- Machine-Learning: Coursera: Machine Learning (Aug 2014) (Matlab, ★ 2, 11y ago)
- kaggle_otto: Code for Kaggle Otto Group Product Classification Challenge (R, ★ 2, 11y ago)
- evals ⑂: Evals is a framework for evaluating OpenAI models and an open-source registry of benchmarks. (Python, ★ 2, 3y ago)
- openai-experiments ⑂: Just me playing around with OpenAI (★ 2, 3y ago)
- model_test ⑂: A proof of concept library for generating and running machine learning model tests (★ 2, 5y ago)
- Twitter-SMA: Twitter Streaming and Analysis with Python and R (R, ★ 2, 12y ago)
- scratch: No description. (Jupyter Notebook, ★ 2, 6y ago)
- workspace-testing: No description. (Python, ★ 1, 2y ago)
- ProgrammingAssignment2 ⑂: Repository for Programming Assignment 2 for R Programming on Coursera (R, ★ 1, 12y ago)
- eslint-plugin-react ⑂: React specific linting rules for ESLint (★ 1, 5y ago)
- Interactive-Programming-in-Python: Coursera: Interactive Programming in Python (Apr 2014) (Python, ★ 1, 12y ago)
- Visualizations: Random Visualizations (R, ★ 1, 11y ago)
- Demand-Forecasting: Prototyping various forecasting techniques (R, ★ 1, 11y ago)
- json-to-utterances: No description. (Jupyter Notebook, ★ 1, 5y ago)
- eugeneyan-comments: No description. (★ 1, 5y ago)
- devtools-angels ⑂: Active angel investors in developer tools! (★ 1, 5y ago)
- awesome-self-supervised-learning ⑂: A curated list of awesome self-supervised methods (★ 1, 5y ago)
- Misc: No description. (R, ★ 1, 11y ago)
- xgboost ⑂: Large-scale and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, on single node, hadoop yarn and more. (C++, ★ 1, 11y ago)
- py_ml_utils ⑂: Some small utility modules to help with pandas, numpy and sklearn usage in other projects (Python, ★ 1, 11y ago)
- mooc-setup ⑂: Information for setting up for the BerkeleyX Spark Intro MOOC, and lab assignments for the course (★ 1, 11y ago)
- DKSG-HOME: Sharing my R script used in the DKSG DataLearn for home (R, ★ 1, 11y ago)
- tensorflow-mnist ⑂: Tensorflow MNIST example using Dataset API (Python, ★ 1, 8y ago)
- ISLR-Labs ⑂: Notes and exercise attempts for "An Introduction to Statistical Learning" (R, ★ 1, 11y ago)
- blog ⑂: No description. (★ 0, 2y ago)
- instructor ⑂: OpenAI function calls for humans (★ 0, 2y ago)
- railway-nextjs ⑂: The barebones NextJS app (★ 0, 1y ago)
- dssg ⑂: dssg (JavaScript, ★ 0, 11y ago)
- IPython-notebook-extensions ⑂: Some js extension for IPython notebook (JavaScript, ★ 0, 11y ago)