AlphaPose

Python · ★ 8.6k · updated 2y ago

Real-Time and Accurate Full-Body Multi-Person Pose Estimation & Tracking System

A research tool from Shanghai Jiao Tong University that detects and tracks human body joint positions for every person simultaneously in images or video, supporting real-time multi-person pose estimation in crowded scenes.

Python · PyTorch · CUDA · setup: hard · complexity 4/5

AlphaPose is a research tool from Shanghai Jiao Tong University for detecting and tracking human body positions in images and video. It takes a photo or video as input and outputs the locations of body joints, such as shoulders, elbows, wrists, hips, knees, and ankles, for every person in the frame at once. This is called multi-person pose estimation.

The system runs in real time and is designed to handle crowded scenes where multiple people overlap. It can detect the 17 standard body keypoints used in common benchmarks, or expand to 26 or 136 keypoints, with the larger sets adding feet, hands, and face. A separate model provides a 3D pose mode, which estimates body shape and position in three dimensions.
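To make the 17-keypoint set concrete, here is a minimal Python sketch of how one person's detection could be turned into named joints. It assumes the standard COCO keypoint order and a flat list of x, y, score triples. Both the layout and the helper name named_keypoints are assumptions made for illustration, so check the repository's output documentation for the real format.

```python
# Assumption: 17 keypoints in the standard COCO order, stored as a flat
# [x0, y0, score0, x1, y1, score1, ...] list. The actual output layout
# should be confirmed against the repository's documentation.
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def named_keypoints(flat, min_score=0.0):
    """Turn a flat [x, y, score, ...] list into {joint_name: (x, y, score)}.

    Hypothetical helper: joints scoring below min_score are dropped.
    """
    expected = 3 * len(COCO_KEYPOINTS)
    if len(flat) != expected:
        raise ValueError(f"expected {expected} values, got {len(flat)}")
    joints = {}
    for i, name in enumerate(COCO_KEYPOINTS):
        x, y, score = flat[3 * i: 3 * i + 3]
        if score >= min_score:
            joints[name] = (x, y, score)
    return joints


# Made-up detection for one person (illustrative numbers, not real output).
example = [float(v) for i in range(17) for v in (10 * i, 20 * i, 0.9)]
person = named_keypoints(example)
print(person["left_shoulder"])
```

Filtering on the per-joint score is a common way to ignore joints that are occluded by other people in crowded scenes.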

AlphaPose pairs pose detection with a tracker called PoseFlow, which connects body detections across video frames so that each person keeps a consistent identity as they move. This makes it useful for video analysis rather than just static images.
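The idea of keeping a consistent identity across frames can be illustrated with a toy tracker. The sketch below is not PoseFlow. It is a simple greedy nearest-pose matcher, and the function names, the max_dist threshold, and the pose representation (a list of (x, y) points) are all assumptions chosen for the example.

```python
import math


def pose_distance(a, b):
    """Mean Euclidean distance between two poses given as lists of (x, y)."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def assign_ids(prev_tracks, poses, next_id, max_dist=50.0):
    """Greedily match current-frame poses to tracks from the previous frame.

    prev_tracks: {track_id: pose}; poses: list of poses in the current frame.
    Returns (new_tracks, next_id). Unmatched poses receive fresh ids.
    This is a toy stand-in for a real tracker, not PoseFlow itself.
    """
    # All (distance, track id, pose index) candidates, closest first.
    pairs = sorted(
        (pose_distance(prev, pose), tid, idx)
        for tid, prev in prev_tracks.items()
        for idx, pose in enumerate(poses)
    )
    new_tracks, used_tids, used_idx = {}, set(), set()
    for dist, tid, idx in pairs:
        if dist > max_dist:
            break  # remaining candidates are even farther apart
        if tid in used_tids or idx in used_idx:
            continue
        new_tracks[tid] = poses[idx]
        used_tids.add(tid)
        used_idx.add(idx)
    # Anything left unmatched is treated as a new person.
    for idx, pose in enumerate(poses):
        if idx not in used_idx:
            new_tracks[next_id] = pose
            next_id += 1
    return new_tracks, next_id


# Two people with two joints each; frame 2 lists them in swapped order.
frame1 = [[(0, 0), (0, 10)], [(100, 0), (100, 10)]]
frame2 = [[(103, 1), (102, 11)], [(2, 1), (1, 9)]]
tracks, next_id = assign_ids({}, frame1, next_id=0)
tracks, next_id = assign_ids(tracks, frame2, next_id)
print(tracks)
```

Even though the detector reports the two people in a different order in the second frame, each one keeps the id it was given in the first. A real tracker has to cope with missed detections, occlusion, and people entering or leaving the scene, which this sketch ignores.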

According to the benchmark numbers in the README, AlphaPose outperformed earlier systems such as OpenPose on standard evaluation datasets at the time of those comparisons. The README describes the project as the first open-source system to exceed 70 mAP on the COCO dataset and 80 mAP on the MPII dataset.

Running the tool requires a GPU. Installation and model download steps are documented in separate files in the repository. A Colab notebook is available if you want to try it without setting up a local environment. The project is a research release from the MVIG lab and includes citation instructions for academic use.

Where it fits