lit

TypeScript ★ 3.7k updated 1d ago

The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework-agnostic interface.

LIT, short for Learning Interpretability Tool, is a browser-based tool for understanding why machine learning models behave the way they do. It is aimed at researchers and developers who have already built or trained a model and want to inspect it: finding cases where it fails, understanding why it made a particular prediction, or checking whether it behaves consistently under small changes to the input.

The tool works with text, image, and tabular data, and is compatible with popular machine learning frameworks like TensorFlow and PyTorch. It can be run as a standalone web server on your machine, or embedded directly inside notebook environments like Jupyter or Google Colab for more interactive use.

The browser interface offers several ways to explore a model. Local explanations use salience maps, which highlight the parts of an input that most influenced a prediction. Aggregate analysis lets you compute custom metrics, slice your data into subgroups, and visualize the model's embedding space, a geometric representation of how the model relates different inputs to each other. A counterfactual generator lets you edit an example and immediately see how the model's prediction changes, which is useful for testing robustness or finding edge cases. Side-by-side mode lets you compare two different models on the same data.
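To make the salience and counterfactual ideas above concrete, here is a minimal sketch in plain Python. It does not use LIT itself: the toy keyword model and the function names (toy_sentiment_score, leave_one_out_salience, counterfactual_delta) are hypothetical, and leave-one-out ablation is just one simple way to estimate salience, not necessarily the method LIT applies.

```python
from typing import Callable, List

# Hypothetical keyword lists for a toy stand-in model (not part of LIT).
POSITIVE = {"great", "good", "love"}
NEGATIVE = {"bad", "boring", "hate"}


def toy_sentiment_score(tokens: List[str]) -> float:
    """Toy model: returns a score in (0, 1); higher means more positive."""
    pos = sum(t.lower() in POSITIVE for t in tokens)
    neg = sum(t.lower() in NEGATIVE for t in tokens)
    return (1 + pos) / (2 + pos + neg)


def leave_one_out_salience(
    tokens: List[str], score_fn: Callable[[List[str]], float]
) -> List[float]:
    """Salience of token i = drop in score when token i is removed."""
    base = score_fn(tokens)
    return [
        base - score_fn(tokens[:i] + tokens[i + 1:])
        for i in range(len(tokens))
    ]


def counterfactual_delta(
    tokens: List[str],
    index: int,
    replacement: str,
    score_fn: Callable[[List[str]], float],
) -> float:
    """Change in score after replacing one token with another."""
    edited = tokens[:index] + [replacement] + tokens[index + 1:]
    return score_fn(edited) - score_fn(tokens)


if __name__ == "__main__":
    sentence = ["the", "plot", "was", "great"]
    # Only "great" should receive non-zero salience under this toy model.
    print(leave_one_out_salience(sentence, toy_sentiment_score))
    # Swapping "great" for "boring" should lower the score.
    print(counterfactual_delta(sentence, 3, "boring", toy_sentiment_score))
```

In this sketch a positive salience value means the token pushed the score up, and a negative counterfactual delta means the edit made the prediction less positive. A tool like LIT presents the same kind of information visually rather than as raw numbers.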

LIT is extensible: you can connect your own model by writing a short Python wrapper that follows the tool's data and model APIs. Additional interpretability components can be added on both the backend and the frontend. The project is developed by PAIR (People and AI Research at Google) and includes live demos, a user guide, and a published research paper describing the design.
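The wrapper idea can be sketched roughly as follows. This is an illustration of the pattern only, written without importing LIT so that it runs on its own: the class KeywordSentimentWrapper and the helper validate_examples are hypothetical, the spec values are plain strings rather than LIT's type objects, and the method names (input_spec, output_spec, predict) are an assumption about the shape of the interface that should be checked against LIT's documentation.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical keyword lists for a toy model (not part of LIT).
POSITIVE = {"great", "good", "love"}
NEGATIVE = {"bad", "boring", "hate"}


class KeywordSentimentWrapper:
    """Illustrative wrapper: declares its inputs/outputs and runs predictions."""

    LABELS = ["negative", "positive"]

    def input_spec(self) -> Dict[str, str]:
        # Field name -> description of the expected input type (assumed naming).
        return {"text": "TextSegment"}

    def output_spec(self) -> Dict[str, str]:
        # Field name -> description of the produced output type (assumed naming).
        return {"probas": "MulticlassPreds"}

    def predict(self, inputs: Iterable[Dict[str, str]]) -> Iterator[Dict[str, List[float]]]:
        """Yield one output dict per input example, in order."""
        for example in inputs:
            tokens = example["text"].lower().split()
            pos = sum(t in POSITIVE for t in tokens)
            neg = sum(t in NEGATIVE for t in tokens)
            p_positive = (1 + pos) / (2 + pos + neg)
            yield {"probas": [1.0 - p_positive, p_positive]}


def validate_examples(model: KeywordSentimentWrapper, examples: List[Dict[str, str]]) -> None:
    """Raise ValueError if any example lacks a field named in the input spec."""
    required = set(model.input_spec())
    for i, example in enumerate(examples):
        missing = required - set(example)
        if missing:
            raise ValueError(f"example {i} is missing fields: {sorted(missing)}")


if __name__ == "__main__":
    model = KeywordSentimentWrapper()
    data = [{"text": "I love it"}, {"text": "so boring"}]
    validate_examples(model, data)
    for example, output in zip(data, model.predict(data)):
        print(example["text"], output["probas"])
```

The design point is that the wrapper describes its own inputs and outputs, so a generic front end can decide which views apply to a model without knowing the framework behind it.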
