happy-llm

Jupyter Notebook ★ 31k updated 1mo ago

📚 Build a large language model from scratch

Free, open-source course teaching how large language models work and how to build one from scratch using PyTorch, with theory and hands-on coding in Jupyter Notebooks.

Python · PyTorch · Jupyter Notebook · Transformer · LLaMA2 · setup: moderate · complexity 3/5

Happy-LLM is a free, open-source course that teaches how large language models (LLMs), the technology behind AI chat tools, work under the hood and how to build one from scratch. The course is written in Chinese and uses Jupyter Notebooks to combine explanations with runnable code.

The course is divided into two parts. The first covers the foundational theory: what natural language processing (NLP) is, how the Transformer architecture works (the Transformer is the core design used by virtually all modern AI language models), and how pre-trained language models are structured. The second part is hands-on: learners implement a complete LLM based on the LLaMA2 design, train it from scratch using PyTorch (a machine learning framework), then apply it using techniques such as RAG (Retrieval-Augmented Generation, where the model looks up external information before answering) and Agents (systems where the model takes actions rather than only answering questions).
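To make the RAG idea concrete, here is a minimal sketch of the "look up external information before answering" step. It is not taken from the course: the function names (`tokenize`, `retrieve`, `build_prompt`), the sample documents, and the word-count similarity are all illustrative assumptions. A real system would typically use learned embeddings and pass the resulting prompt to an LLM.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase word counts; a crude stand-in for a learned embedding (assumption)."""
    return Counter(re.findall(r"[a-z0-9\-]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents, k=1):
    """Prepend retrieved context to the question; the LLM call itself is omitted."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Hypothetical knowledge base, for illustration only.
DOCS = [
    "LoRA adapts a model by training small low-rank matrices.",
    "The Transformer architecture relies on self-attention.",
    "PyTorch is a framework for machine learning.",
]

prompt = build_prompt("what is self-attention in the transformer", DOCS)
print(prompt)
```

The model then answers from the prompt, which now carries the retrieved passage alongside the question.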

The course also covers fine-tuning, which means taking an existing pre-trained model and adapting it to a specific task, using methods such as LoRA and QLoRA, efficient techniques that reduce the computing resources needed.
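A rough sketch of why LoRA saves resources: instead of updating a full weight matrix W, it trains two small matrices B and A and uses W + (alpha / r) * B A as the effective weight. The code below is plain Python for illustration rather than the course's PyTorch code, and the layer size (4096 x 4096), the rank (8), and the helper names are assumptions chosen for the example.

```python
def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def lora_param_counts(d_out, d_in, rank):
    """Trainable parameters for full fine-tuning vs. LoRA on one weight matrix."""
    full = d_out * d_in            # every entry of W is trainable
    lora = rank * (d_out + d_in)   # only B (d_out x rank) and A (rank x d_in)
    return full, lora

def lora_effective_weight(w, b, a, alpha, rank):
    """Compute W + (alpha / rank) * B @ A with W kept frozen."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Illustrative sizes (assumed, not from the course).
full, lora = lora_param_counts(4096, 4096, 8)
print(full, lora, full // lora)

# Tiny worked example: a 2 x 2 weight with a rank-1 update.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]    # 2 x 1
A = [[0.5, 0.5]]      # 1 x 2
W_eff = lora_effective_weight(W, B, A, alpha=2, rank=1)
print(W_eff)
```

With these assumed sizes, the LoRA matrices hold 256 times fewer trainable parameters than the full weight matrix. QLoRA applies the same low-rank idea on top of a quantized base model.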

It suits anyone who wants to go beyond using AI tools and understand how they work internally, as well as students, researchers, and developers looking for a structured, hands-on path into LLM development without paying for a course.

Where it fits

Learn how transformer models and LLMs work by studying theory and implementing them step-by-step.
Build and train your own language model from scratch using PyTorch without paying for a course.
Understand fine-tuning techniques like LoRA to adapt pre-trained models to specific tasks efficiently.
Implement advanced LLM features like RAG and Agents to make models that retrieve information and take actions.
