pytorch-grad-cam
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
pytorch-grad-cam is a Python library that generates visual heatmaps showing which parts of an image caused an AI model to make a specific prediction, helping you understand and debug image classifiers.
This is a Python library that helps you understand why an image-recognition AI reaches the decisions it does. When a neural network classifies a photo as a dog or a cat, it processes thousands of internal signals, and by default there is no easy way to ask the network to show its work. This library generates visual heatmaps, called class activation maps, that highlight the specific pixels or regions in an image that pushed the network toward a particular answer.
The library supports many different methods for producing these heatmaps. GradCAM is the most common: it weights the activation maps of a chosen layer by the gradients of the target class score to show where the model focused. Other methods such as ScoreCAM, AblationCAM, and EigenCAM approach the same problem from different angles, each with trade-offs in speed, accuracy, and how faithfully they represent the model's actual reasoning. More than a dozen methods are included, and the README compares them in a table so you can choose the one that fits your situation.
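To make the GradCAM idea concrete, here is a minimal NumPy sketch of the weighting step on its own. It is an illustration, not the library's implementation: the function name grad_cam and the toy arrays are made up for this example, and it assumes you already have one layer's activations and the gradients of the class score with respect to them.

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Combine one layer's activations and gradients into a heatmap.

    Both arrays have shape (channels, height, width); `gradients` holds
    d(class score) / d(activations). Hypothetical helper for illustration.
    """
    # One weight per channel: the gradient averaged over spatial positions.
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of the channel maps, giving a (height, width) map.
    cam = np.tensordot(weights, activations, axes=1)
    # Keep only regions that push the class score up.
    cam = np.maximum(cam, 0.0)
    # Scale to [0, 1] for display; leave an all-zero map untouched.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy input: channel 0 fires top-left and supports the class,
# channel 1 fires bottom-right and counts against it.
activations = np.array([[[1.0, 0.0], [0.0, 0.0]],
                        [[0.0, 0.0], [0.0, 2.0]]])
gradients = np.array([[[1.0, 1.0], [1.0, 1.0]],
                      [[-1.0, -1.0], [-1.0, -1.0]]])
heatmap = grad_cam(activations, gradients)
```

In this toy case only the top-left cell stays lit, because the second channel's negative weight is clipped away. In a real model the resulting low-resolution map is then upsampled to the input image size.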
Beyond plain image classification, the library works with object detection, semantic segmentation, image similarity comparisons, and multimodal models like CLIP. A collection of built-in metrics checks how reliable a given explanation is, which helps researchers and developers spot when a heatmap may be misleading rather than informative.
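One common way to check an explanation is to perturb the image according to the heatmap and see how the model's score moves. The sketch below is a toy version of that idea, not the library's metric code: confidence_change, score_fn, and the 2x2 arrays are all hypothetical stand-ins chosen so the arithmetic is easy to follow.

```python
import numpy as np


def confidence_change(score_fn, image: np.ndarray, heatmap: np.ndarray) -> float:
    """Score change after multiplying the image by the heatmap.

    Hypothetical helper: regions the heatmap ignores are dimmed to zero,
    so a faithful heatmap should cost the model little or no confidence.
    """
    masked = image * heatmap
    return score_fn(masked) - score_fn(image)


# Toy "model": the score depends only on the top-left pixel.
evidence = np.array([[1.0, 0.0], [0.0, 0.0]])


def score_fn(x: np.ndarray) -> float:
    return float((x * evidence).sum())


image = np.ones((2, 2))
faithful = np.array([[1.0, 0.0], [0.0, 0.0]])    # highlights the real evidence
misleading = np.array([[0.0, 0.0], [0.0, 1.0]])  # highlights an irrelevant pixel

faithful_change = confidence_change(score_fn, image, faithful)
misleading_change = confidence_change(score_fn, image, misleading)
```

The faithful heatmap leaves the score unchanged, while the misleading one removes the only pixel the toy model relies on and the score drops. A large drop like that is the kind of signal that suggests a heatmap is not tracking what the model actually uses.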
Installation is a single command: pip install grad-cam. The library is tested against standard convolutional network architectures and newer Vision Transformer designs. It supports processing images in batches for efficiency, and smoothing options are available to make the output heatmaps easier to read.
The project is intended both for people actively building or diagnosing computer vision models and for researchers comparing explainability techniques. If you want to understand why your image classifier made a wrong prediction, or to check that it is paying attention to the right parts of an image, this library gives you the visual tools to investigate.
Where it fits
- Visualize which pixels in a medical image made an AI classifier flag it as abnormal, to check whether the model is focusing on the right areas.
- Debug a misclassified image by generating a heatmap showing what the model paid attention to instead of the correct feature.
- Compare multiple explainability methods side by side to choose the most reliable one for your computer vision model.
- Validate that a trained object detector is highlighting the actual object and not background artifacts.