
dust3r

Python · ★ 7.2k · updated 9mo ago

DUSt3R: Geometric 3D Vision Made Easy

A research tool from Naver Labs that reconstructs 3D scenes from a set of ordinary photos by automatically estimating camera positions and surface geometry, no camera calibration needed.

Python · PyTorch · conda · Hugging Face · Docker · setup: hard · complexity 4/5

DUSt3R is a Python research tool that reconstructs 3D scenes from regular photographs. You give it a set of images taken from different angles of the same place, and it figures out the 3D geometry, estimating where every visible surface sits in three-dimensional space and how the camera was positioned in each shot. This is useful in robotics, visual effects, mapping, and any application where you need a 3D model built from photos alone.

The project comes from Naver Labs Europe and was presented at CVPR 2024. The key idea is that you do not need to know anything about the camera in advance, such as its focal length or other calibration parameters. The system estimates what it needs from the images themselves. It pairs the input images, runs a neural network on each pair, and then applies a global alignment step that reconciles all the pairwise estimates into one consistent 3D point cloud.
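
To make the pairing step concrete, here is a minimal standalone sketch of exhaustive pairing, in which every image is matched with every other image. It is an illustration only, not the project's own code: the function name make_image_pairs, the symmetrize option, and the file names are hypothetical, and whether the tool pairs images exhaustively by default is an assumption.

```python
from itertools import combinations


def make_image_pairs(image_names, symmetrize=True):
    """Return every unordered pair of images.

    With symmetrize=True, each pair is also added in reversed order,
    so the network would see (a, b) as well as (b, a).
    """
    pairs = list(combinations(image_names, 2))
    if symmetrize:
        pairs += [(b, a) for a, b in pairs]
    return pairs


# Hypothetical file names, used only for illustration.
views = ["room_01.jpg", "room_02.jpg", "room_03.jpg"]
for first, second in make_image_pairs(views):
    print(first, "->", second)
```

Exhaustive pairing grows quadratically with the number of images, so larger photo sets would likely need a sparser pairing strategy.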

To get started, you clone the repository, set up a Python environment using conda, install the required packages, and download one of the pre-trained model checkpoints from the project's servers or automatically via Hugging Face. Three checkpoint variants are provided, differing in input resolution and decoder architecture. Once the model is ready, you can run an interactive web demo with a single command, select your images through a browser interface, and view the 3D reconstruction when it finishes. A Docker setup is also provided for those who prefer a container-based workflow with optional GPU support.

The code can also be used directly in Python scripts. The README includes a short example showing how to load images, create image pairs, run the model, and then call the global aligner to produce the final point cloud. The project is licensed for non-commercial use only under CC BY-NC-SA 4.0.
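
The script workflow described above might look roughly like the sketch below. The module paths, function names, checkpoint name, and optimizer settings are written from memory of the project's README and should be treated as assumptions to verify against the current repository. The wrapper function reconstruct is hypothetical, and the dust3r imports sit inside it so that the file loads even where the package is not installed.

```python
def reconstruct(image_paths,
                model_name="naver/DUSt3R_ViTLarge_BaseDecoder_512_dpt",
                device="cuda",
                niter=300):
    """Hypothetical wrapper: load images, pair them, run the model, align globally."""
    if len(image_paths) < 2:
        raise ValueError("at least two images are needed to form a pair")

    # Assumed API, recalled from the README; check names against the repository.
    from dust3r.inference import inference
    from dust3r.model import AsymmetricCroCo3DStereo
    from dust3r.utils.image import load_images
    from dust3r.image_pairs import make_pairs
    from dust3r.cloud_opt import global_aligner, GlobalAlignerMode

    # Load a pre-trained checkpoint (downloaded via Hugging Face if missing).
    model = AsymmetricCroCo3DStereo.from_pretrained(model_name).to(device)

    # Load and resize the images, then build all image pairs.
    images = load_images(list(image_paths), size=512)
    pairs = make_pairs(images, scene_graph="complete",
                       prefilter=None, symmetrize=True)

    # Run the network on each pair.
    output = inference(pairs, model, device, batch_size=1)

    # Reconcile the pairwise predictions into one consistent scene.
    scene = global_aligner(output, device=device,
                           mode=GlobalAlignerMode.PointCloudOptimizer)
    scene.compute_global_alignment(init="mst", niter=niter,
                                   schedule="cosine", lr=0.01)

    # 3D points per image, camera poses, and estimated focal lengths.
    return scene.get_pts3d(), scene.get_im_poses(), scene.get_focals()
```

Calling reconstruct(["photo_a.jpg", "photo_b.jpg"]) with two overlapping photos of your own would, under these assumptions, return the per-image 3D points along with the estimated camera poses and focal lengths. Pass device="cpu" if no GPU is available, at the cost of much slower inference.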

Where it fits