labelme
Image annotation with Python. Supports polygon, rectangle, circle, line, point, and AI-assisted annotation.
A graphical tool for drawing labels on images: you draw shapes around objects to create training data for AI vision models, and AI-assisted annotation speeds up the process.
Labelme is a graphical tool for drawing annotations (labels) on images, a task required before training AI models to recognize objects, detect regions, or segment scenes. Think of it as a digital marker: you open an image, draw shapes around objects (polygons, rectangles, circles, lines, or single points), give each shape a label such as "cat" or "car", and save the labeled data for later use in machine learning projects.
The tool supports several annotation styles: bounding boxes (rectangles around objects), polygon outlines, semantic segmentation (coloring every pixel by category), and instance segmentation (separating individual objects of the same type). It can also annotate video frame by frame.
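The difference between semantic and instance segmentation is easier to see on a tiny grid. The sketch below is illustrative only and does not use Labelme itself: the `fill_rect` helper, the 8x4 grid, and the two rectangular "cat" regions are all made up for the example. Both cats share one class id in the semantic map, while each gets its own id in the instance map.

```python
def fill_rect(grid, box, value):
    """Write `value` into every cell of `grid` covered by `box`.

    `box` is (x0, y0, x1, y1) with the right and bottom edges exclusive.
    """
    x0, y0, x1, y1 = box
    for y in range(y0, y1):
        for x in range(x0, x1):
            grid[y][x] = value


width, height = 8, 4
class_ids = {"cat": 1}  # hypothetical label-to-class mapping; 0 means background

# Two hypothetical objects with the same label, given as simple boxes.
objects = [("cat", (0, 0, 3, 2)), ("cat", (5, 1, 8, 4))]

semantic = [[0] * width for _ in range(height)]
instance = [[0] * width for _ in range(height)]

for instance_id, (label, box) in enumerate(objects, start=1):
    fill_rect(semantic, box, class_ids[label])  # same value for every cat
    fill_rect(instance, box, instance_id)       # a distinct value per object

for row in semantic:
    print(row)
print()
for row in instance:
    print(row)
```

In the semantic map the two regions are indistinguishable once drawn; the instance map keeps them apart, which is what lets a model learn to count or track individual objects.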
A standout feature is AI-assisted annotation: instead of drawing every shape manually, you can use built-in AI models (SAM, EfficientSAM) that suggest a polygon or mask from a single click, or use text-to-annotation models such as YOLO-World to label objects by typing their name.
Labelme is written in Python and uses Qt for its graphical interface. It can be installed via pip (Python's package manager), downloaded as a standalone app from labelme.io without needing Python at all, or obtained through Linux system packages. Annotations are saved as JSON files and can be exported in standard formats used by computer vision training pipelines. The interface is available in 20 languages.
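Because annotations are plain JSON, they can be inspected with nothing more than Python's standard library. The sketch below parses a small hand-written sample and counts shapes per label and per shape type. The key names (`shapes`, `label`, `shape_type`, `points`, `imagePath`, `imageHeight`, `imageWidth`) reflect the layout commonly seen in Labelme output but are assumptions here, as are the sample values and the `summarize` helper; check a file saved by your own version before relying on them.

```python
import json
from collections import Counter

# Hand-written sample that mimics a Labelme annotation file.
# Key names and values are assumptions for illustration, not taken from a real file.
sample_text = """
{
  "flags": {},
  "shapes": [
    {"label": "cat", "shape_type": "polygon",
     "points": [[10, 20], [60, 20], [60, 80], [10, 80]]},
    {"label": "car", "shape_type": "rectangle",
     "points": [[100, 40], [180, 90]]},
    {"label": "cat", "shape_type": "point",
     "points": [[30, 35]]}
  ],
  "imagePath": "example.jpg",
  "imageHeight": 120,
  "imageWidth": 200
}
"""


def summarize(annotation):
    """Return (shapes per label, shapes per shape type) as plain dicts."""
    labels = Counter(shape["label"] for shape in annotation["shapes"])
    types = Counter(shape["shape_type"] for shape in annotation["shapes"])
    return dict(labels), dict(types)


annotation = json.loads(sample_text)
label_counts, type_counts = summarize(annotation)
print(label_counts)
print(type_counts)
```

To run the same summary on a real file, replace `json.loads(sample_text)` with `json.load(open(path))` for a path of your choosing.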
Where it fits
- Label thousands of images with bounding boxes and polygon outlines to build a training dataset for an object detection model.
- Use AI-assisted annotation to auto-suggest shapes from a single click, so fewer outlines have to be drawn by hand.
- Annotate video frame-by-frame for instance segmentation tasks in a computer vision pipeline.
- Export labeled data in standard ML formats to feed directly into a training script.
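As one example of what such an export might involve, the sketch below converts a shape's points into a YOLO-style text line: a class id followed by the box center, width, and height, each normalized by the image size. This is a hand-rolled illustration, not Labelme's own exporter; the `to_yolo_line` helper, the `class_ids` mapping, and the sample shape are hypothetical.

```python
def to_yolo_line(points, class_id, image_width, image_height):
    """Format the bounding box of `points` as 'class xc yc w h', normalized to [0, 1]."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    box_width = (x_max - x_min) / image_width
    box_height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {box_width:.6f} {box_height:.6f}"


# Hypothetical label-to-class mapping and a single annotated shape.
class_ids = {"cat": 0, "car": 1}
shapes = [{"label": "car", "points": [[100, 40], [180, 90]]}]
image_width, image_height = 200, 120

lines = [
    to_yolo_line(s["points"], class_ids[s["label"]], image_width, image_height)
    for s in shapes
]
print("\n".join(lines))
```

Taking the min and max over all points means the same function works for a two-corner rectangle and for a polygon, at the cost of discarding the polygon's exact outline.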