CVPR2026_Similarity_as_Evidence


Python implementation of our CVPR 2026 paper "Similarity-as-Evidence: Calibrating Overconfident VLMs for Interpretable and Label-Efficient Medical Active Learning."

Official CVPR 2026 code that uses BiomedCLIP image similarity to estimate uncertainty in active learning for medical image classification, helping identify which unlabeled scans are most worth sending to a human expert for labeling.

Python · BiomedCLIP · PyTorch · setup: hard · complexity 4/5

This repository contains the official code for a research paper titled "Similarity-as-Evidence," published at the CVPR 2026 computer vision conference. The paper addresses a problem in AI systems used to classify medical images: AI models trained on limited labeled data tend to be overconfident in their predictions, meaning they give high certainty scores even when they should not. This overconfidence makes it harder to know which images would be most useful to have a human expert label next.

The method works within a framework called active learning, where an AI system starts from a small set of labeled examples and then selects additional examples for a human to label, trying to choose the ones that will improve the model the most. The key idea is to use a biomedical vision-language model, BiomedCLIP, to measure how similar an unlabeled image is to the already-labeled examples. That similarity score is then treated as evidence about how uncertain the model should be, instead of trusting the model's own confidence estimate, which tends to be inflated.
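To make the similarity-as-evidence idea concrete, here is a minimal sketch in NumPy. It is not taken from the repository. It assumes image embeddings have already been extracted (for example with BiomedCLIP), and it assumes one simple aggregation rule: the evidence for a class is the sum of non-negative cosine similarities to the labeled examples of that class. The function name `similarity_evidence` and the aggregation rule are hypothetical; the paper may define evidence differently.

```python
import numpy as np

def similarity_evidence(unlabeled_emb, labeled_emb, labeled_y, num_classes):
    """Turn cosine similarity to labeled examples into per-class evidence.

    Assumption (not from the source): evidence for class k is the sum of
    non-negative cosine similarities to the labeled examples of class k.
    """
    # L2-normalise rows so that a dot product equals cosine similarity.
    u = unlabeled_emb / np.linalg.norm(unlabeled_emb, axis=1, keepdims=True)
    l = labeled_emb / np.linalg.norm(labeled_emb, axis=1, keepdims=True)

    # Clip negative similarities to zero so evidence is never negative.
    sim = np.clip(u @ l.T, 0.0, None)            # (n_unlabeled, n_labeled)

    # One-hot labels route each similarity to the class of its labeled example.
    one_hot = np.eye(num_classes)[labeled_y]     # (n_labeled, num_classes)
    return sim @ one_hot                         # (n_unlabeled, num_classes)

# Toy 2-D "embeddings": one labeled example per class.
labeled = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = np.array([0, 1])
unlabeled = np.array([[1.0, 0.0], [1.0, 1.0]])
evidence = similarity_evidence(unlabeled, labeled, labels, num_classes=2)
```

In this toy case the first unlabeled vector matches only the class 0 example, so all of its evidence goes to class 0, while the second sits halfway between the two labeled examples and receives equal evidence for both classes.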

The uncertainty is broken down into two components. Vacuity measures how little evidence the model has about an image overall. Dissonance measures how much the evidence points in conflicting directions, for instance when an image looks similar to examples from two different disease categories at once. The system combines these two measures with adjustable weights and uses the result to rank which unlabeled images to ask an expert to label next. It also tries to keep the selected images balanced across different disease categories to avoid spending all the labeling budget on one type of case.
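The sketch below illustrates how vacuity and dissonance could be computed from per-class evidence, combined with weights, and used for a class-balanced pick. It uses the standard subjective-logic definitions (belief b_k = e_k / S with S = sum of evidence + K, vacuity = K / S, dissonance built from the pairwise balance of beliefs); whether the paper uses exactly these formulas is an assumption. The function names, the default weights `w_vac` and `w_dis`, and the round-robin balancing rule are hypothetical and not taken from the repository.

```python
import numpy as np

def vacuity_dissonance(evidence):
    """Subjective-logic vacuity and dissonance from non-negative evidence."""
    evidence = np.asarray(evidence, dtype=float)
    k = evidence.shape[1]
    strength = evidence.sum(axis=1, keepdims=True) + k   # Dirichlet strength S
    belief = evidence / strength                         # b_k = e_k / S
    vacuity = k / strength[:, 0]                         # high when evidence is scarce

    # Pairwise balance Bal(b_j, b_k) = 1 - |b_j - b_k| / (b_j + b_k),
    # defined as 0 when either belief is zero.
    bi, bj = belief[:, :, None], belief[:, None, :]
    total = bi + bj
    safe_total = np.where(total > 0, total, 1.0)
    bal = np.where((bi > 0) & (bj > 0), 1.0 - np.abs(bi - bj) / safe_total, 0.0)

    off_diag = 1.0 - np.eye(k)                           # exclude j == k
    num = (bj * bal * off_diag).sum(axis=2)              # sum_{j != k} b_j * Bal
    den = (bj * off_diag).sum(axis=2)                    # sum_{j != k} b_j
    ratio = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    dissonance = (belief * ratio).sum(axis=1)            # high when beliefs conflict
    return vacuity, dissonance

def acquisition_score(evidence, w_vac=0.5, w_dis=0.5):
    """Weighted sum of the two components (weights are illustrative)."""
    vacuity, dissonance = vacuity_dissonance(evidence)
    return w_vac * vacuity + w_dis * dissonance

def select_balanced(scores, pred_class, budget):
    """Round-robin over predicted classes, taking the highest score first."""
    queues = {}
    for idx in np.argsort(-np.asarray(scores), kind="stable"):
        queues.setdefault(int(pred_class[idx]), []).append(int(idx))
    picked = []
    while len(picked) < budget and any(queues.values()):
        for c in sorted(queues):
            if queues[c] and len(picked) < budget:
                picked.append(queues[c].pop(0))
    return picked
```

With evidence rows `[0, 0]`, `[10, 10]`, and `[20, 0]`, the first row has maximal vacuity (no evidence at all), the second has low vacuity but high dissonance (strong, evenly split evidence), and the third is low on both. Round-robin selection is only one possible way to keep a batch spread across classes.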

The code supports ten medical image datasets covering brain tumors, breast ultrasound, skin conditions, knee X-rays, lung tissue, and retinal scans, among others. The repository does not include the medical images themselves; it points to a separate data guide that explains how to obtain and format them. Installation consists of a standard Python dependency install, plus an optional script that downloads the model weights.

Where it fits

Apply Similarity-as-Evidence active learning to a medical image dataset to rank which unlabeled scans would benefit most from expert labeling.
Compare vacuity and dissonance uncertainty scores against standard active learning baselines across ten benchmark medical imaging datasets.
Adapt BiomedCLIP similarity as an uncertainty proxy in your own active learning pipeline by modifying the provided acquisition function code.
