Embodied Gaussians introduces a novel dual "Gaussian-Particle" representation that bridges the gap between physical simulation and visual perception for robotics. Our approach:
- 🎯 Unifies geometric, physical, and visual world representations
- 🔮 Enables predictive simulation of future states
- 🔄 Allows online correction from visual observations
- 🌐 Integrates with an XPBD physics system
- 🎨 Renders high-quality images through 3D Gaussian splatting
For robots to robustly understand and interact with the physical world, it is highly beneficial to have a comprehensive representation -- modelling geometry, physics, and visual observations -- that informs perception, planning, and control algorithms. We propose a novel dual "Gaussian-Particle" representation that models the physical world while (i) enabling predictive simulation of future states and (ii) allowing online correction from visual observations in a dynamic world.
Our representation comprises particles that capture the geometry of objects in the world and can be used alongside a particle-based physics system to anticipate physically plausible future states. Attached to these particles are 3D Gaussians that render images from any viewpoint through a splatting process, thus capturing the visual state. By comparing the predicted and observed images, our approach generates "visual forces" that correct the particle positions while respecting known physical constraints.
By integrating predictive physical modeling with continuous visually-derived corrections, our unified representation reasons about the present and future while synchronizing with reality. We validate our approach on 2D and 3D tracking tasks as well as photometric reconstruction quality.
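In heavily simplified form, the correction loop above amounts to gradient descent on photometric error. The sketch below uses a toy 1D "renderer" and finite differences instead of the paper's Gaussian splatting and physics-constrained solve; all names and parameters are illustrative, not the repository's API.

```python
import numpy as np

def visual_force_step(positions, render, observed, stiffness=0.01, eps=1e-3):
    """One 'visual force' correction: estimate the gradient of the photometric
    error with respect to particle positions (finite differences here, for
    clarity only) and nudge the particles to reduce the mismatch between the
    rendered and observed images."""
    base = np.mean((render(positions) - observed) ** 2)
    force = np.zeros_like(positions)
    for idx in np.ndindex(positions.shape):
        p = positions.copy()
        p[idx] += eps
        force[idx] = -(np.mean((render(p) - observed) ** 2) - base) / eps
    return positions + stiffness * force

# Toy 1D "renderer": splats one Gaussian bump per particle onto a pixel row.
xs = np.linspace(0.0, 1.0, 64)
def toy_render(pos):
    return sum(np.exp(-((xs - p) ** 2) / 0.002) for p in pos.ravel())

observed = toy_render(np.array([[0.5]]))  # image of the true state
est = np.array([[0.45]])                  # physics prediction, slightly off
for _ in range(50):
    est = visual_force_step(est, toy_render, observed)
# est is pulled toward the observed position near 0.5
```

In the actual system the gradient comes from differentiable splatting and the correction is applied as a force inside the physics solver rather than as a raw position update.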
This repository provides a reference implementation of Embodied Gaussians with some differences from the paper:
- 🔷 Rigid Bodies Only: Currently, this implementation only supports rigid body dynamics. The shape matching functionality described in the paper is not included.
- 🔨 Simplified Physics: Due to the rigid body constraint, the physics simulation is more straightforward but less flexible than the full implementation described in the paper.
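The solver is XPBD-style, i.e. it corrects positions directly by projecting constraints. As a rough illustration of the technique (a generic distance constraint between two particles, not this repository's rigid-body code), a single XPBD projection looks like:

```python
import numpy as np

def xpbd_distance_step(x, w, i, j, rest, compliance, dt, lam):
    """One XPBD projection of a distance constraint between particles i and j.
    x: (N, 3) positions (modified in place), w: inverse masses,
    lam: accumulated Lagrange multiplier for this constraint."""
    n = x[i] - x[j]
    dist = np.linalg.norm(n)
    c = dist - rest                      # constraint violation
    n = n / (dist + 1e-12)               # unit direction from j to i
    alpha = compliance / dt ** 2         # XPBD compliance term
    dlam = (-c - alpha * lam) / (w[i] + w[j] + alpha)
    x[i] += w[i] * dlam * n
    x[j] -= w[j] * dlam * n
    return lam + dlam

x = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
w = np.array([1.0, 1.0])
lam = xpbd_distance_step(x, w, 0, 1, rest=1.0, compliance=0.0, dt=1 / 60, lam=0.0)
# with zero compliance, a single projection satisfies the constraint exactly
```

With nonzero compliance the constraint becomes soft, which is how XPBD models stiffness independently of the timestep.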
First, install pixi.

Then build the dependencies:

```bash
pixi r build
```

To run the included demo:

```bash
pixi r demo
```
These scripts require Intel RealSense cameras connected directly to your device. Note: offline image processing is not currently supported.
First, detect the ground plane by running:

```bash
python scripts/find_ground.py temp/ground_plane.json --extrinsics scripts/example_extrinsics.json --visualize
```
You will be prompted to segment the ground in the interface. The script will then calculate the ground points and plane parameters.
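Fitting plane parameters to the segmented ground points is, conceptually, a total least-squares problem. A minimal SVD-based sketch (an illustration of the idea, not the script's actual code):

```python
import numpy as np

def fit_plane(points):
    """Total least-squares plane fit via SVD: returns a unit normal n and
    offset d such that n @ p + d ~= 0 for points p on the plane."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]                      # direction of smallest variance
    return normal, -normal @ centroid

# Noisy points sampled from the plane z = 0.2
rng = np.random.default_rng(0)
pts = np.column_stack([rng.uniform(-1, 1, 200),
                       rng.uniform(-1, 1, 200),
                       np.full(200, 0.2) + rng.normal(0, 1e-3, 200)])
n, d = fit_plane(pts)   # n is close to the z-axis (up to sign)
```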
Convert the detected ground plane into Gaussian representations:

```bash
python scripts/build_body_from_pointcloud.py temp/ground_body.json --extrinsics scripts/example_extrinsics.json --points scripts/example_ground_plane.npy --visualize
```
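Conceptually, seeding Gaussians from a point cloud amounts to placing one Gaussian per point and sizing it by the local point density. A hedged sketch of that idea (the repository's actual initialization scheme may differ, and the field names below are made up):

```python
import numpy as np

def seed_gaussians(points):
    """Illustrative initialization: one isotropic Gaussian per point, with
    standard deviation set to half the nearest-neighbor distance so that
    neighboring Gaussians roughly tile the surface."""
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)         # exclude self-distance
    scales = 0.5 * np.sqrt(d2.min(axis=1))
    return {"means": points, "scales": scales}

# On a unit grid the nearest-neighbor distance is 1, so every scale is 0.5
grid = np.array([[x, y, 0.0] for x in range(3) for y in range(3)], float)
g = seed_gaussians(grid)
```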
Generate embodied Gaussian representations of objects using multiple viewpoints (more viewpoints yield better results):
Run the scene building script:

```bash
python scripts/build_simple_body.py objects/tblock.json \
    --extrinsics scripts/example_extrinsics.json \
    --ground scripts/example_ground_plane.json \
    --visualize
```
For each camera viewpoint:
- A segmentation GUI will appear.
- Click to select the target object.
- Press Escape when satisfied with the selection.
- Repeat for all viewpoints.
The script will generate a JSON file containing both particle and Gaussian representations of your object.
You can visualize the object with:

```bash
python scripts/visualize_object.py scripts/example_object.json
```
If you find this work useful, please consider citing our paper:
```bibtex
@inproceedings{abouchakra-embodiedgaussians,
  title={Physically Embodied Gaussian Splatting: A Realtime Correctable World Model for Robotics},
  author={Jad Abou-Chakra and Krishan Rana and Feras Dayoub and Niko Suenderhauf},
  booktitle={8th Annual Conference on Robot Learning},
  year={2024},
  url={https://openreview.net/forum?id=AEq0onGrN2}
}
```
For videos and additional information, visit our project page.
This software is provided as a research prototype and is not production-quality software. Please note that the code may contain missing features, bugs, and errors. RAI Institute does not offer maintenance or support for this software.