semantic
Parsing, analyzing, and comparing source code across many languages
Semantic is a library and command-line tool, originally built by GitHub, for reading and analyzing source code written in various programming languages. It can parse code (read the text and build a structured representation of it), extract symbols such as function and class names, and compare two versions of a file to determine what changed. GitHub used it internally to power code navigation features on github.com, such as jumping to a function definition or listing all references to a variable.
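To make "extracting symbols" and "comparing two versions" concrete, here is a minimal sketch of the idea. It does not use Semantic or tree-sitter: it substitutes Python's standard-library `ast` module as a stand-in parser, handles only Python source, and the names `extract_symbols` and `diff_symbols` are hypothetical helpers invented for this illustration.

```python
import ast


def extract_symbols(source):
    """Return {name: kind} for every function and class defined in `source`.

    Illustration only: uses Python's built-in parser, not Semantic.
    """
    symbols = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            symbols[node.name] = "function"
        elif isinstance(node, ast.ClassDef):
            symbols[node.name] = "class"
    return symbols


def diff_symbols(old_source, new_source):
    """Compare two versions of a file by the symbol names they define."""
    old = extract_symbols(old_source)
    new = extract_symbols(new_source)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
    }


OLD = "def greet(name):\n    return 'hi ' + name\n"
NEW = (
    "class Greeter:\n"
    "    def greet(self, name):\n"
    "        return 'hi ' + name\n"
    "\n"
    "def farewell(name):\n"
    "    return 'bye ' + name\n"
)

if __name__ == "__main__":
    print(diff_symbols(OLD, NEW))
    # {'added': ['Greeter', 'farewell'], 'removed': []}
```

This sketch compares symbols by name only, so a function that moves into a class (like `greet` here) is not reported as a change. A real tool would also need to track position and scope.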
The tool supports a range of programming languages, including Python, JavaScript, TypeScript, Ruby, Go, and PHP, with the level of support varying by language. For each language, it can produce several kinds of output: a tree showing the structure of the code, a list of symbols in JSON or a binary (Protocol Buffers) format, or statistics about how long parsing took. You run it from the command line by pointing it at one or more source code files.
Under the hood, Semantic is written in Haskell and uses tree-sitter, a separate tool for building language parsers, as its foundation. Building and running it requires a Haskell toolchain (GHC and Cabal) that many developers will not already have installed, and the setup process involves several steps.
This repository is no longer maintained by GitHub and will not receive further updates or bug fixes. GitHub recommends that anyone who wants to continue development fork the repository. If you need an actively maintained code analysis tool, this project is not a good choice in its current state.