RoboTransfer is a diffusion-based video generation framework for robotic data synthesis. Unlike previous methods, RoboTransfer integrates multi-view geometry with explicit control over scene components such as background and object attributes. By incorporating cross-view feature interactions and global depth/normal conditions, RoboTransfer enforces geometric consistency across views, enabling fine-grained control such as background edits and object swaps.
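To make the conditioning idea concrete, here is a minimal sketch of stacking per-view depth and normal maps channel-wise with the video inputs before denoising. The tensor shapes and the `build_condition` helper are illustrative assumptions, not the actual RoboTransfer API:

```python
# Minimal sketch of geometry conditioning (hypothetical shapes/names, not the
# actual RoboTransfer API): multi-view frames are denoised jointly, with each
# view's depth and normal maps concatenated along the channel dimension so the
# model sees consistent geometry across views.
import torch

def build_condition(rgb: torch.Tensor, depth: torch.Tensor, normal: torch.Tensor) -> torch.Tensor:
    """Concatenate geometric conditions with the (noisy) RGB input.

    rgb:    (views, frames, 3, H, W)
    depth:  (views, frames, 1, H, W)
    normal: (views, frames, 3, H, W)
    """
    return torch.cat([rgb, depth, normal], dim=2)  # -> (views, frames, 7, H, W)

views, frames, H, W = 3, 16, 64, 64
cond = build_condition(
    torch.randn(views, frames, 3, H, W),
    torch.rand(views, frames, 1, H, W),
    torch.randn(views, frames, 3, H, W),
)
print(cond.shape)  # torch.Size([3, 16, 7, 64, 64])
```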
We use uv to manage dependencies. To set up the environment:
```bash
git clone https://github.com/HorizonRobotics/RoboTransfer.git
cd RoboTransfer
export UV_HTTP_TIMEOUT=600
uv sync
uv pip install -e .
uv run main.py  # add --mem_efficient to reduce VRAM usage, e.g. on an RTX 4090
```
To install the additional dependencies for the data pipeline:
```bash
uv sync --extra data
```
You can obtain more simulation data from the RoboTwin CVPR Challenge.
You can then use the process_sim.sh script to convert the raw data (.pickle and .hdf5 files) into the RoboTransfer format with geometric conditioning.
```bash
script/process_sim.sh
```
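For orientation, the sketch below shows the kind of restructuring such a conversion performs: splitting a recorded episode into per-view streams alongside their geometric conditions. The HDF5 keys, array shapes, and output layout here are assumptions for illustration, not the script's actual contract:

```python
# Hypothetical sketch of one sim-episode conversion step. The dataset keys,
# shapes, and output layout are illustrative assumptions only.
import os
import pickle
import h5py
import numpy as np

def convert_episode(hdf5_path: str, meta_path: str, out_dir: str) -> None:
    with open(meta_path, "rb") as f:
        meta = pickle.load(f)  # assumed: per-view camera intrinsics/extrinsics
    with h5py.File(hdf5_path, "r") as f:
        rgb = np.asarray(f["observation/rgb"])      # assumed key, (T, V, H, W, 3)
        depth = np.asarray(f["observation/depth"])  # assumed key, (T, V, H, W)
    # Write one folder per camera view so downstream loading stays simple.
    for v in range(rgb.shape[1]):
        view_dir = os.path.join(out_dir, f"view_{v}")
        os.makedirs(view_dir, exist_ok=True)
        np.save(os.path.join(view_dir, "rgb.npy"), rgb[:, v])
        np.save(os.path.join(view_dir, "depth.npy"), depth[:, v])
    with open(os.path.join(out_dir, "meta.pkl"), "wb") as f:
        pickle.dump(meta, f)
```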
For real-world data collected with the ALOHA-AgileX robot system, see the RoboTransfer-RealData dataset. You can then use the process_real.sh script to convert the raw RGB images into the RoboTransfer format with geometric conditioning.
```bash
script/process_real.sh
```
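The released pipeline relies on dedicated estimators (e.g. Video-Depth-Anything for depth, Lotus for normals) to produce the geometric conditions from RGB. As a self-contained illustration of the underlying idea, one standard way to derive a normal map when only depth is available is finite differences on the depth image:

```python
# Self-contained illustration: surface normals from a depth map via finite
# differences. This is a generic technique, not the pipeline's actual estimator.
import numpy as np

def normals_from_depth(depth: np.ndarray) -> np.ndarray:
    """depth: (H, W) float array -> unit normals, shape (H, W, 3)."""
    dz_dy, dz_dx = np.gradient(depth)  # gradients along y (rows) and x (cols)
    n = np.dstack([-dz_dx, -dz_dy, np.ones_like(depth)])
    n /= np.linalg.norm(n, axis=2, keepdims=True)
    return n

# Toy depth ramp: a plane tilting along x.
depth = np.fromfunction(lambda y, x: 1.0 + 0.01 * x, (64, 64))
print(normals_from_depth(depth).shape)  # (64, 64, 3)
```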
RoboTransfer builds upon the following amazing projects and models: 🌟 Video-Depth-Anything 🌟 Lotus 🌟 GPT4o 🌟 GroundSam 🌟 IOPaint
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
If you use RoboTransfer in your research or projects, please cite:
```bibtex
@misc{liu2025robotransfergeometryconsistentvideodiffusion,
      title={RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer},
      author={Liu Liu and Xiaofeng Wang and Guosheng Zhao and Keyu Li and Wenkang Qin and Jiaxiong Qiu and Zheng Zhu and Guan Huang and Zhizhong Su},
      year={2025},
      eprint={2505.23171},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2505.23171},
}
```