
the-algorithm

Scala · ★ 73k · updated 9mo ago

Source code for the X Recommendation Algorithm

The source code for X's (formerly Twitter's) recommendation system, which decides which posts appear in your For You feed. It is a multi-stage pipeline covering candidate selection, ranking, and filtering.

Scala · Python · Rust · Bazel · setup: hard · complexity 5/5

This is the source code for the recommendation algorithm that powers X (formerly Twitter). Its job is to decide which posts appear in your "For You" feed, which notifications you receive, and what shows up when you search or explore the platform. In short, it answers the question: out of hundreds of millions of posts, which ones should this specific user see right now?

The system works in several stages. First, candidate sources gather a large pool of potentially relevant posts, both from accounts you follow and from accounts you don't. Then ranking models score each candidate on factors such as how likely you are to engage with it, how reputable the author is, and whether it matches your interests. Finally, filtering layers remove content that violates policies or legal requirements before the final feed is assembled and delivered to you. Key internal components include SimClusters (which groups users into interest communities), TwHIN (which learns graph embeddings that capture relationships between users and posts), and Tweepcred (a PageRank-style scorer of author reputation).
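The three stages described above can be illustrated with a toy sketch. This is not code from the repository: the `Post` type, the function names, the `flagged` field, and the 0.7/0.3 weights are all hypothetical, chosen only to show the order of candidate gathering, ranking, and filtering. The real system uses learned models and many separate services.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: int
    author: str
    topic: str
    flagged: bool = False  # hypothetical stand-in for a policy/legal violation


def gather_candidates(in_network, out_of_network):
    """Stage 1: merge both candidate pools, dropping duplicate post IDs."""
    seen = set()
    pool = []
    for post in list(in_network) + list(out_of_network):
        if post.post_id not in seen:
            seen.add(post.post_id)
            pool.append(post)
    return pool


def score(post, interests, reputation):
    """Stage 2: toy score mixing interest match and author reputation.

    The weights are arbitrary assumptions, not values from the source.
    """
    interest_match = 1.0 if post.topic in interests else 0.0
    return 0.7 * interest_match + 0.3 * reputation.get(post.author, 0.0)


def build_feed(in_network, out_of_network, interests, reputation, limit=10):
    """Run all three stages: gather, rank, then filter."""
    pool = gather_candidates(in_network, out_of_network)
    ranked = sorted(pool, key=lambda p: score(p, interests, reputation), reverse=True)
    allowed = [p for p in ranked if not p.flagged]  # Stage 3: filtering
    return allowed[:limit]


# Example data (entirely made up for illustration).
followed = [Post(1, "alice", "scala"), Post(2, "bob", "cooking")]
discovered = [
    Post(2, "bob", "cooking"),                 # duplicate of a followed post
    Post(3, "carol", "scala", flagged=True),   # removed by the filter stage
    Post(4, "dave", "scala"),
]
reputation = {"alice": 0.9, "bob": 0.5, "dave": 0.2}

feed = build_feed(followed, discovered, {"scala"}, reputation)
print([p.post_id for p in feed])  # prints [1, 4, 2]
```

In this sketch, filtering runs after ranking, matching the order given in the description. A production system might also filter earlier to avoid scoring posts that can never be shown.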

You would look at this repository if you are a researcher studying recommendation systems, a developer curious about how large-scale feed algorithms are structured, or someone interested in transparency around algorithmic content selection. It is not a standalone runnable application but rather a collection of services and machine learning jobs that require the broader X infrastructure to operate. The primary languages are Scala and Python, with some Rust for high-performance model serving (a component called Navi). Build tooling uses Bazel. This is reference and study material, not a plug-and-play product.

Where it fits