py-faster-rcnn

Python ★ 8.3k updated 6y ago

Faster R-CNN (Python implementation) -- see https://github.com/ShaoqingRen/faster_rcnn for the official MATLAB version

A deprecated Python implementation of the Faster R-CNN object detection system, which identifies objects in images and draws bounding boxes around them. It requires Caffe and a GPU; users are directed to the newer Detectron project.

Python · Caffe · CUDA · MATLAB · setup: hard · complexity 5/5

Note first: this repository has been deprecated. The authors direct users to a newer project called Detectron, which has superseded it.

Faster R-CNN is a computer vision system for detecting and locating objects in images. Given a photograph, it can identify where specific things appear in the picture and draw a box around each one. The original research came out of Microsoft Research and was published in 2015. The core idea behind Faster R-CNN was to make object detection fast enough to approach real-time use by combining the region-proposal step and the classification step into a single network, rather than treating them as two separate passes.
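The benefit of sharing one network can be sketched with a toy example. This is only an illustration of the data flow, not code from this repository: `backbone`, `propose_regions`, `classify_regions`, and both `detect_*` functions are hypothetical stand-ins, and the "features" are simple row sums rather than convolutional feature maps. The point is that the expensive feature computation runs once when both steps share it.

```python
# Toy illustration only: none of these functions come from py-faster-rcnn.
calls = {"backbone": 0}  # counts how often the expensive step runs


def backbone(image):
    """Stand-in for the expensive convolutional feature extractor."""
    calls["backbone"] += 1
    return [sum(row) for row in image]  # fake "feature map": one value per row


def propose_regions(features):
    """Stand-in for the region-proposal step: indices of candidate regions."""
    return [i for i, value in enumerate(features) if value > 0]


def classify_regions(features, regions):
    """Stand-in for the classification step: one label per candidate region."""
    return [(r, "object" if features[r] > 1 else "background") for r in regions]


def detect_separate(image):
    # Two separate passes: each step computes its own features.
    regions = propose_regions(backbone(image))
    return classify_regions(backbone(image), regions)


def detect_shared(image):
    # Single network: features are computed once and used by both steps.
    features = backbone(image)
    return classify_regions(features, propose_regions(features))
```

Calling `detect_separate` and `detect_shared` on the same input returns the same labels, but the shared version invokes `backbone` once instead of twice.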

This repository is a Python reimplementation of the original code, which was written in MATLAB. The Python version is slightly slower than the MATLAB original and produces results that are close but not identical, so it is not a drop-in replacement for reproducing the original paper's exact numbers. It also adds an approximate joint training mode that is around 1.5 times faster than the alternating optimization approach described in the paper.

Running the code requires a fairly involved setup. You need to build Caffe, which is a deep learning framework, along with its Python bindings. You also need a compatible GPU with several gigabytes of memory. Smaller network architectures need about 3 GB, while the larger VGG16 architecture needs roughly 11 GB for some training modes. Pre-trained model weights are downloaded separately via included scripts. Training data comes from the PASCAL VOC dataset, also downloaded separately.
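As a rough planning aid, the memory figures above can be turned into a quick check. This is a minimal sketch: the dictionary keys and the `networks_that_fit` helper are hypothetical names, and the numbers are the approximate figures quoted above, not measured values. Actual usage depends on the training mode.

```python
# Approximate GPU memory needs in GB, taken from the figures quoted above.
# The keys and the helper below are hypothetical, not part of the repository.
APPROX_GPU_MEMORY_GB = {
    "small": 3,   # smaller network architectures
    "VGG16": 11,  # larger VGG16 architecture, for some training modes
}


def networks_that_fit(available_gb):
    """Return the architectures whose approximate requirement fits in memory."""
    return [
        name
        for name, needed_gb in APPROX_GPU_MEMORY_GB.items()
        if needed_gb <= available_gb
    ]
```

For example, `networks_that_fit(8)` lists only the smaller architectures, while a 12 GB card also covers VGG16.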

The project is written in Python and released under the MIT license. It was a widely referenced starting point for object detection research during its active years, which explains the high star count despite its deprecation.

Where it fits

Study the original Faster R-CNN architecture as a research reference for understanding region proposal networks.
Train a baseline object detector on PASCAL VOC data using VGG16 to reproduce 2015-era benchmark results.
Understand how combining region proposal and classification into one network made detection faster than earlier methods that ran the two steps as separate passes.
Use as a historical starting point for reviewing the lineage of modern object detection frameworks like Detectron.
