GN0

Python ★ 32 updated 16d ago

The official implementation of GN0: Toward a Unified Paradigm for Generation, Evaluation, and Policy Learning in Visual-Language Navigation

A research framework for evaluating AI agents that navigate realistic indoor 3D environments by following spoken or written directions, using Gaussian Splatting reconstructions of real spaces as the test environments.

Python · CUDA · setup: hard · complexity 4/5

GN0 is a research framework for training and testing AI agents that navigate indoor spaces by following spoken or written directions. The broader research field is called Vision-and-Language Navigation, where the goal is to build AI systems that can understand an instruction like "go to the bedroom and stand near the lamp" and then physically move through a 3D environment to carry it out.

What makes this project distinct is its use of 3D Gaussian Splatting as the underlying scene representation. 3D Gaussian Splatting is a newer technique that reconstructs real indoor spaces from photographs so that they look highly realistic when rendered from any viewpoint. Rather than working with simple 3D models or pre-rendered video, GN0 uses these reconstructed scenes as the environments where agents are tested.

The framework has three main components. GN-Matrix is a large dataset of navigation routes through these realistic scenes, including virtual human figures. GN-Bench is the simulation environment where agents are run and evaluated. GN-BAE is a navigation model trained to actually follow instructions through these spaces.

This repository specifically contains the GN-Bench evaluation workflow. A researcher can download the InteriorGS scene dataset and a pre-trained navigation model, then run a script to evaluate how well the model performs. The evaluation reports standard metrics for this field: how far the agent ends up from its target, whether it reached the goal at all, and how efficiently it traveled compared to the shortest possible path.
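The three metrics described above match what the navigation literature usually calls navigation error (NE), success rate (SR), and success weighted by path length (SPL). The sketch below shows how they are commonly computed from per-episode results. It is an illustration, not code from this repository: the Episode record, the summarize function, and the 3-metre success radius are all assumptions made for the example, and GN-Bench may define or threshold these metrics differently.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """Hypothetical per-episode record; not a GN-Bench data structure."""
    final_distance_m: float   # distance from the agent's stopping point to the goal
    path_length_m: float      # length of the path the agent actually travelled
    shortest_path_m: float    # length of the shortest start-to-goal path


def summarize(episodes, success_radius_m=3.0):
    """Return NE, SR and SPL averaged over episodes.

    success_radius_m is an assumed threshold, chosen only for illustration.
    """
    n = len(episodes)
    if n == 0:
        raise ValueError("at least one episode is required")

    # NE: mean distance between where the agent stopped and the goal.
    ne = sum(e.final_distance_m for e in episodes) / n

    # SR: fraction of episodes that end within the success radius.
    successes = [e.final_distance_m <= success_radius_m for e in episodes]
    sr = sum(successes) / n

    # SPL: each success is weighted by shortest / max(travelled, shortest),
    # so detours lower the score; failed episodes contribute zero.
    spl_total = 0.0
    for e, ok in zip(episodes, successes):
        if not ok:
            continue
        denom = max(e.path_length_m, e.shortest_path_m)
        spl_total += e.shortest_path_m / denom if denom > 0 else 1.0
    spl = spl_total / n

    return {"NE": ne, "SR": sr, "SPL": spl}


if __name__ == "__main__":
    demo = [
        Episode(final_distance_m=1.0, path_length_m=10.0, shortest_path_m=8.0),
        Episode(final_distance_m=5.0, path_length_m=12.0, shortest_path_m=9.0),
    ]
    print(summarize(demo))
```

In the two demo episodes, only the first stops within the assumed radius, so half the episodes count as successes and only that one contributes to SPL, scaled by the ratio of its shortest path to the path it actually travelled.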

The project is associated with an academic paper published in 2026 and is intended for researchers working on embodied AI, robotics simulation, and language-guided navigation. Setting it up requires a GPU machine with CUDA support.

Where it fits