so-large-lm

★ 7.4k updated 6mo ago

Large Model Foundations: Understand the Basics of Large Models in One Article

A free 14-chapter Chinese-language course on how large language models work, from Transformer architecture through training, fine-tuning, safety, and AI agent design, rooted in Stanford CS324.

setup: easy, complexity 1/5

So-Large-LM is an open-source Chinese-language educational project that teaches how large language models work, from foundational concepts through practical training and deployment. It is maintained by Datawhale, a Chinese open-source learning community, and is structured as a 14-chapter course rooted in the Stanford CS324 curriculum and a generative AI course by Professor Hung-yi Lee.

The course covers the full lifecycle of a large language model. Early chapters explain the architecture decisions that make these models work: how the Transformer structure processes language, how positional encoding helps the model understand word order, and how attention mechanisms let the model weigh relationships between words. Later chapters cover data preparation, training strategies, and efficient fine-tuning methods that let researchers adapt a pre-trained model to a specific task without retraining it from scratch.
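As a rough illustration of two of the ideas named above, the sketch below adds sinusoidal positional encodings to toy token vectors and then runs scaled dot-product self-attention over them. It is not taken from the course material: the function names, the 4-dimensional toy embeddings, and the choice to skip learned query/key/value projections are all assumptions made to keep the example small.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def positional_encoding(position, dim):
    """Sinusoidal encoding: even indices use sin, odd indices use cos."""
    return [
        math.sin(position / 10000 ** (i / dim)) if i % 2 == 0
        else math.cos(position / 10000 ** ((i - 1) / dim))
        for i in range(dim)
    ]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(key dimension).
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical toy embeddings for a three-token sentence (made-up numbers).
tokens = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 2.0, 0.0, 2.0],
    [1.0, 1.0, 1.0, 1.0],
]

# Inject word-order information by adding a position-dependent vector.
inputs = [
    [x + p for x, p in zip(tok, positional_encoding(pos, len(tok)))]
    for pos, tok in enumerate(tokens)
]

# Self-attention: the same vectors act as queries, keys, and values.
# A real Transformer would first apply learned projections; omitted here.
outputs, weights = attention(inputs, inputs, inputs)

for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how strongly one token attends to every token in the sentence, and each row sums to 1.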

The project also addresses topics that many purely technical tutorials skip: environmental costs such as carbon emissions from large training runs, legal questions around copyright and fair use of training data, social harms including bias and hallucination, and how AI agents are structured. A dedicated chapter traces the full history of Meta's Llama model family from version 1 through version 3.

The README and all course content are written in Chinese. Companion video lectures are available on Bilibili. Datawhale positions this project as the theoretical foundation in a three-part learning path, with separate sibling repositories covering hands-on application development and open-source model deployment.

The target audience includes students, researchers, industry professionals, and policy specialists who want a thorough grounding in how large language models are built and governed.

Where it fits

Study the full lifecycle of a large language model chapter by chapter, from architecture basics to deployment.
Learn efficient fine-tuning methods to adapt a pre-trained model to a specific task without retraining from scratch.
Understand the social harms of AI, such as bias and hallucination, and the legal questions around training data, for policy or governance work.
