AlphaPose
Real-Time and Accurate Full-Body Multi-Person Pose Estimation & Tracking System
A research tool from Shanghai Jiao Tong University that detects and tracks human body joint positions for every person simultaneously in images or video, supporting real-time multi-person pose estimation in crowded scenes.
AlphaPose is a research tool from Shanghai Jiao Tong University for detecting and tracking human body positions in images and video. It takes a photo or video as input and outputs the locations of body joints, such as shoulders, elbows, wrists, hips, knees, and ankles, for every person in the frame at once. This is called multi-person pose estimation.
The system is designed to run in real time and to handle crowded scenes where multiple people overlap. It can detect the 17 standard body keypoints used in common benchmarks, or use extended sets of 26 or 136 keypoints that add feet, face, and hands. An optional 3D mode uses a separate model to estimate body shape and pose in three dimensions.
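To make the 17-keypoint layout concrete, here is a minimal sketch that assumes the keypoints follow the COCO benchmark order and arrive as a flat list of `x, y, confidence` triples. The `decode_keypoints` helper is hypothetical and is not part of AlphaPose; check the repository's output documentation for the exact format it writes.

```python
from typing import Dict, List, Tuple

# The 17 body keypoints in COCO order (assumed to match the 17-keypoint mode).
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def decode_keypoints(flat: List[float]) -> Dict[str, Tuple[float, float, float]]:
    """Map a flat [x1, y1, c1, x2, y2, c2, ...] list to named joints.

    Hypothetical helper: the flat triple layout is an assumption borrowed
    from the COCO keypoint convention, not a documented AlphaPose API.
    """
    expected = 3 * len(COCO_KEYPOINTS)
    if len(flat) != expected:
        raise ValueError(f"expected {expected} values, got {len(flat)}")
    return {
        name: (flat[3 * i], flat[3 * i + 1], flat[3 * i + 2])
        for i, name in enumerate(COCO_KEYPOINTS)
    }
```

Each person in a frame would yield one such dictionary, so a crowded image produces a list of them.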
AlphaPose pairs pose detection with a tracker called PoseFlow, which connects body detections across video frames so that each person keeps a consistent identity as they move. This makes it useful for video analysis rather than just static images.
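The idea of keeping a consistent identity across frames can be illustrated with a deliberately simplified sketch. This is not the PoseFlow algorithm; it is a greedy nearest-pose matcher, and `pose_distance`, `assign_ids`, and the `max_dist` threshold are all hypothetical names chosen for illustration.

```python
import math
from typing import Dict, List, Tuple

Pose = List[Tuple[float, float]]  # one (x, y) pair per joint


def pose_distance(a: Pose, b: Pose) -> float:
    """Mean Euclidean distance between corresponding joints (non-empty poses)."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def assign_ids(
    prev: Dict[int, Pose], current: List[Pose], max_dist: float, next_id: int
) -> Tuple[Dict[int, Pose], int]:
    """Greedily carry person IDs from the previous frame to the current one.

    Illustrative only: a real tracker such as PoseFlow is more involved.
    """
    # Every (previous ID, current detection) pair, closest first.
    pairs = sorted(
        (pose_distance(p, c), pid, ci)
        for pid, p in prev.items()
        for ci, c in enumerate(current)
    )
    assigned: Dict[int, Pose] = {}
    used_ids, used_cur = set(), set()
    for dist, pid, ci in pairs:
        if dist > max_dist:
            break  # remaining pairs are even farther apart
        if pid in used_ids or ci in used_cur:
            continue
        assigned[pid] = current[ci]
        used_ids.add(pid)
        used_cur.add(ci)
    # Detections with no close match are treated as new people.
    for ci, c in enumerate(current):
        if ci not in used_cur:
            assigned[next_id] = c
            next_id += 1
    return assigned, next_id
```

Calling `assign_ids` once per frame, feeding each result back in as `prev`, keeps an ID attached to a person as long as their pose moves less than `max_dist` between frames.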
According to the benchmark numbers in the README, AlphaPose outperformed earlier systems such as OpenPose on standard evaluation datasets at the time of those comparisons. The README describes it as the first open-source system to exceed 70 mAP on the COCO dataset and 80 mAP on the MPII dataset.
Running the tool requires a GPU. Installation and model download steps are documented in separate files in the repository. A Colab notebook is available if you want to try it without setting up a local environment. The project is a research release from the MVIG lab and includes citation instructions for academic use.
Where it fits
- Detect body joint positions for every person in a crowd video to analyze movement or posture at scale.
- Track individuals across video frames with consistent person IDs using the PoseFlow tracker.
- Estimate 3D body pose and shape from a video using AlphaPose's optional 3D estimation mode.
- Try multi-person pose estimation on your own images without a local GPU by using the provided Colab notebook.