
GenRecon

Python ★ 519 updated 5d ago

A research system that reconstructs detailed 3D indoor scene meshes with surface materials from multiple photos, using a generative AI prior to fill in geometry beyond what the photos directly show.

Python · PyTorch · CUDA · conda · COLMAP · setup: hard · complexity 5/5

GenRecon is a research system from the Technical University of Munich that reconstructs detailed three-dimensional models of indoor rooms from a set of ordinary photographs. Given multiple photos taken from different angles around a room, the system produces a complete 3D mesh with surface materials, not just a point cloud or rough shape.

The approach is unusual in that it uses a generative model, an AI model that has learned from large datasets what indoor spaces generally look like, as a prior during reconstruction. Most reconstruction methods work purely from the input photos. GenRecon conditions on those photos too, but also draws on the generative model's learned knowledge to fill in areas that are unclear or partially occluded in the photographs. The system divides a large scene into overlapping sections, reconstructs each one, and assembles them into a coherent whole.
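The divide-into-overlapping-sections idea can be illustrated with a small sketch. This is not GenRecon's actual chunking code, which this summary does not describe. The function names, the axis-aligned floor-plan grid, and the `size` and `overlap` parameters are all assumptions chosen for illustration.

```python
from itertools import product


def overlapping_windows(lo, hi, size, overlap):
    """Split the interval [lo, hi] into windows of length `size`
    in which neighbouring windows share `overlap` units."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    stride = size - overlap
    windows = []
    start = lo
    while True:
        # Clamp the last window so it never extends past the scene bounds.
        windows.append((start, min(start + size, hi)))
        if start + size >= hi:
            break
        start += stride
    return windows


def chunk_floor_plan(bounds_min, bounds_max, size, overlap):
    """Hypothetical helper: tile a 2D floor-plan bounding box with
    overlapping square chunks, returned as ((x0, y0), (x1, y1)) pairs."""
    xs = overlapping_windows(bounds_min[0], bounds_max[0], size, overlap)
    ys = overlapping_windows(bounds_min[1], bounds_max[1], size, overlap)
    return [((x0, y0), (x1, y1)) for (x0, x1), (y0, y1) in product(xs, ys)]


if __name__ == "__main__":
    # A 10 x 4 floor plan, 4-unit chunks, 1-unit overlap between neighbours.
    for chunk in chunk_floor_plan((0, 0), (10, 4), size=4, overlap=1):
        print(chunk)
```

In a sketch like this, the overlap gives adjacent chunks a shared region where their reconstructions can be reconciled at assembly time. How GenRecon actually merges its sections is not stated here.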

The outputs are mesh files with physically based rendering (PBR) materials, so both the geometry and the surface appearance are stored in a format that game engines and professional 3D software can use directly. The accompanying paper reports that the system outperforms other reconstruction methods on standard benchmarks by about 16 percent.

Setup requires a CUDA-capable GPU and the CUDA toolkit, plus a setup script that installs the Python dependencies, including PyTorch and several compiled extensions. Pretrained weights for the pipeline's three neural network components are available for download. Training the models from scratch requires preparing large indoor scene datasets and running three separate training stages.
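Before running the project's setup script, it can help to confirm that the basic tooling is visible on the machine. The following is a minimal preflight sketch that uses only the Python standard library. It is not part of GenRecon, and `cuda_preflight` is a hypothetical helper name. It only checks whether the `nvidia-smi`, `nvcc`, and `conda` executables are on `PATH`, and does not check that their versions match what the repository expects.

```python
import shutil


def cuda_preflight(tools=("nvidia-smi", "nvcc", "conda")):
    """Return a dict mapping each tool name to True if an executable
    with that name is found on PATH, else False."""
    return {tool: shutil.which(tool) is not None for tool in tools}


if __name__ == "__main__":
    for tool, found in cuda_preflight().items():
        print(f"{tool}: {'found' if found else 'missing'}")
```

A missing `nvcc` usually means the CUDA toolkit is not installed or not on `PATH`, which would likely cause compiled extensions to fail to build.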

The codebase is research code released alongside an academic paper published in May 2026. It is designed for researchers and engineers working on 3D reconstruction, computer vision, or applications that need realistic 3D scans of interior spaces from smartphone video or structured photo captures.


Where it fits