the-algorithm-ml
Source code for Twitter's Recommendation Algorithm
Twitter's open-sourced machine learning models that power the For You feed, including the Heavy Ranker that decides what content appears on your home timeline and TwHIN embeddings that represent users and content as numerical vectors.
This repository contains open-sourced machine learning models that Twitter uses to power parts of its recommendation system. The code covers two models: the Heavy Ranker, which decides what shows up in the For You feed on the home timeline, and TwHIN embeddings, which represent Twitter users and content as numerical vectors for use in recommendation tasks. A research paper on TwHIN is linked from the README for anyone who wants the technical background.
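To make "users and content as numerical vectors" concrete, here is a minimal sketch of how embeddings can be compared for recommendation. It is not taken from the repository: the vectors, the post names, and the helper functions `cosine_similarity` and `rank_candidates` are all hypothetical. Cosine similarity is used as a generic stand-in scoring function, not as the scoring that TwHIN itself defines.

```python
import math

# Hypothetical 4-dimensional embeddings for illustration only.
# Real learned embeddings have many more dimensions and come from training.
user_embedding = [0.9, 0.1, 0.0, 0.3]
candidate_embeddings = {
    "post_a": [0.8, 0.2, 0.1, 0.4],
    "post_b": [0.0, 0.9, 0.7, 0.0],
    "post_c": [0.5, 0.5, 0.0, 0.1],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_u * norm_v)


def rank_candidates(user_vec, candidates):
    """Score every candidate against the user and sort best-first."""
    scored = [
        (name, cosine_similarity(user_vec, vec))
        for name, vec in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked = rank_candidates(user_embedding, candidate_embeddings)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Candidates whose vectors point in a similar direction to the user's vector score higher, which is the basic intuition behind using embeddings as inputs to recommendation tasks.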
The project is written in Python and is intended to run inside a Python virtual environment on Linux machines. It also depends on torchrec, a library for large-scale recommendation systems that works best with an NVIDIA GPU. If you do not have a Linux machine with an NVIDIA GPU, running this code locally will likely require extra workarounds that the README does not cover.
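The requirements above can be checked before attempting setup. The following is a rough sketch, not part of the repository: the function name `check_environment` and the result keys are hypothetical, and looking for `nvidia-smi` on the PATH is only a heuristic for the presence of an NVIDIA driver, not a guarantee that a GPU is usable.

```python
import importlib.util
import platform
import shutil
import sys


def check_environment():
    """Return a dict of boolean checks for the prerequisites described above."""
    return {
        # The project is intended to run on Linux.
        "linux": platform.system() == "Linux",
        # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
        "virtualenv": sys.prefix != getattr(sys, "base_prefix", sys.prefix),
        # find_spec returns None when a top-level package cannot be found.
        "torchrec_installed": importlib.util.find_spec("torchrec") is not None,
        # Heuristic only: nvidia-smi usually ships with the NVIDIA driver.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

Any check reported as missing points to one of the cases where local setup is likely to need workarounds.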
Setup is handled by a single shell script, and each sub-project within the repository has its own README with more specific instructions for running that model. The top-level README is brief and points readers to those individual sub-project folders for details.
The README is sparse overall. It identifies what is included and how to get started at a high level, but it does not describe in plain terms how the ranking or embedding models work, what inputs they take, or how they were trained. Readers who want deeper context need to explore the sub-project folders and the linked research paper directly.
This repository is primarily useful to people with a machine learning background who want to study or adapt the actual models Twitter uses. It is not a product users interact with directly, and it is not a tool for general-purpose use without significant technical knowledge.
Where it fits
- Study how Twitter's For You feed ranking model selects and orders content to show to users
- Use TwHIN embeddings as a starting point for building your own social media recommendation system
- Adapt the Heavy Ranker model architecture for a custom large-scale content ranking task