A synthetic data generator for video caption pairs.
Simian creates synthetic data usable for generative video and video captioning tasks. The data consists of videos and captions. The videos are generated with Blender, an open-source 3D creation suite.
NOTE: Simian requires Python 3.11.
- Install dependencies:

```bash
pip install -r requirements.txt
```

- Download the datasets:

```bash
./scripts/data/get_data.sh
```

- [OPTIONAL] If you're on a headless Linux server, install Xorg and start it:

```bash
sudo apt-get install xserver-xorg -y && \
sudo python3 scripts/start_x_server.py start
```

Generate scenes without movement (static videos):

```bash
python3 -m simian.combiner --count 1000 --seed 42
```

Add movement to all or no objects (the camera stays stationary):

```bash
python3 -m simian.combiner --count 1000 --seed 42 --movement
```

Allow objects to be on top of each other (static or with movement):

```bash
python3 -m simian.combiner --count 1000 --seed 42 --ontop
```

Make the camera follow an object:

```bash
python3 -m simian.combiner --count 1000 --seed 42 --camera_follow
```

Randomly apply movement, object stacking, and camera-follow effects:

```bash
python3 -m simian.combiner --count 1000 --seed 42 --random
```

Configure the flags as needed:
- `--width` and `--height` set the resolution of the video.
- `--start_index` and `--end_index` select the range of combinations to render; 0-100 renders all 100 videos.
- `--combination_index` is the index of the single combination to render.
- `--output_dir` is the directory where the rendered video is saved.
- `--hdri_path` is the directory containing the background images.
- `--start_frame` and `--end_frame` are the start and end frames of the video.
- `--images` outputs images at random frames instead of a video, creating multiple images of varying sizes per combination.
- `blend_file <absolute path to blend file>` lets you supply your own blend file to use as the terrain.
- `animation_length` is a percentage from 0-100 describing how fast the animation should occur within the frames.
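As an illustration of how these flags fit together, the helper below is a hypothetical sketch (not part of Simian; the flag names are taken from this README) that assembles a render argument list which could be handed to `subprocess`:

```python
import shlex

def build_render_args(width=1024, height=576, start_index=0, end_index=100,
                      output_dir="renders", start_frame=1, end_frame=2,
                      images=False):
    """Assemble an argument list from the flags documented above.
    Hypothetical helper for illustration only."""
    args = [
        "--width", str(width), "--height", str(height),
        "--start_index", str(start_index), "--end_index", str(end_index),
        "--output_dir", output_dir,
        "--start_frame", str(start_frame), "--end_frame", str(end_frame),
    ]
    if images:
        # emit images at random frames instead of a video
        args.append("--images")
    return args

print(shlex.join(build_render_args(images=True)))
```

Keeping the flags in one place like this makes it easy to launch many render jobs with consistent settings.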
Or generate all or part of the combination set using the batch.py script. Run:

```bash
python3 -m simian.batch
```
Use the arrow keys to select an option and press Enter, then supply the arguments.

To generate video(s):

```bash
--start_index 0 --end_index 1000 --width 1024 --height 576 --start_frame 1 --end_frame 2
```

To generate video(s) with your own blend file:

```bash
--start_index 0 --end_index 1000 --width 1024 --height 576 --start_frame 1 --end_frame 3 --blend <absolute path to blend file>
```

To generate image(s):

```bash
--start_index 0 --end_index 1000 --width 1024 --height 576 --start_frame 1 --end_frame 2 --images
```

You must first embed all the data:

```bash
python3 server/server.py
```
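To illustrate what "embedding the data" means in general terms (mapping captions to numeric vectors), here is a toy bag-of-words sketch; this is only an illustration and is not Simian's actual embedding method:

```python
from collections import Counter

def toy_embed(caption, vocab):
    """Map a caption to a bag-of-words count vector over a fixed vocabulary.
    Toy illustration only; Simian's server uses its own embedding method."""
    counts = Counter(caption.lower().split())
    return [counts[word] for word in vocab]

vocab = ["cube", "sphere", "rotates", "red"]
vec = toy_embed("A red cube rotates beside a sphere", vocab)
print(vec)  # [1, 1, 1, 1]
```

Real pipelines use learned embedding models, but the shape of the operation is the same: each caption becomes a fixed-length vector that downstream tasks can index and compare.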
You can make a free Redis account here.
For local testing and multiple local workers, you can use the following script to set up a local Redis instance:

```bash
scripts/setup_redis.sh
```

You can get a Hugging Face API key here.
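Before starting workers, it can help to sanity-check your connection settings. The sketch below is a hypothetical helper (not part of Simian) that builds a `redis://` URL from the same `REDIS_*` environment variables described in this README:

```python
import os

def redis_url_from_env(env=None):
    """Build a redis:// connection URL from the REDIS_* variables the workers use.
    Hypothetical helper for illustration; not part of Simian."""
    env = os.environ if env is None else env
    host = env.get("REDIS_HOST", "localhost")
    port = env.get("REDIS_PORT", "6379")  # 6379 is the standard Redis port
    user = env.get("REDIS_USER", "default")
    password = env.get("REDIS_PASSWORD", "")
    auth = f"{user}:{password}@" if password else ""
    return f"redis://{auth}{host}:{port}"

print(redis_url_from_env({"REDIS_HOST": "myhost.com", "REDIS_PORT": "1337",
                          "REDIS_USER": "default", "REDIS_PASSWORD": "somepassword"}))
# redis://default:somepassword@myhost.com:1337
```

Printing the URL (minus the password) before launching workers is a quick way to catch a missing or mistyped variable.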
Now, start your workers:

```bash
export REDIS_HOST=<myhost>.com
export REDIS_PORT=1337
export REDIS_USER=default
export REDIS_PASSWORD=<somepassword>
export HF_TOKEN=<token>
export HF_REPO_ID=<repo_id>
celery -A simian.worker worker --loglevel=info
```

You can also build and run the worker with Docker:
```bash
# build the container
docker build -t simian-worker .

# run the container with .env
docker run --env-file .env simian-worker

# run the container with environment variables
docker run -e REDIS_HOST={myhost} -e REDIS_PORT={port} -e REDIS_USER=default -e REDIS_PASSWORD={some password} -e HF_TOKEN={token} -e HF_REPO_ID={repo_id} simian-worker
```

Finally, issue work to your task queue:

```bash
python3 -m simian.distributed --width 1024 --height 576
```

If you want to use a custom or hosted Redis instance (recommended), you can add the Redis details like this:
```bash
export REDIS_HOST=<myhost>.com
export REDIS_PORT=1337
export REDIS_USER=default
export REDIS_PASSWORD=<somepassword>
```

To run all tests:

```bash
python3 -m simian.tests.__run__
```
To run an individual test, look in the tests folder and run whichever test file you want:

```bash
python3 -m simian.tests.object_test
```

We are currently using the following datasets: Objaverse
Backgrounds are loaded from: Poly Haven
We welcome contributions! We're especially interested in help adding and refining datasets, improving generation quality, adding new features and dynamics, and enabling the project to cover more use cases.
- Check out the issues here.
- Join our Discord here.
- Get in touch with us so we can coordinate on development.
- Or, you know, just YOLO a pull request. We're pretty chill.
This project is licensed under the MIT License - see the LICENSE file for details.
If you use it, please cite us:
```bibtex
@misc{Simverse,
  author = {Deep AI, Inc},
  title = {Simverse: A Synthetic Data Generator for Video Caption Pairs},
  year = {2024},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/DeepAI-Research/Simverse}}
}
```

This project follows the all-contributors specification. Contributions of any kind are welcome!
- Eric S 🚇💻
- M̵̞̗̝̼̅̏̎͝Ȯ̴̝̻̊̃̋̀Õ̷̼͋N̸̩̿͜ ̶̜̠̹̼̩͒ 🚇 💻
Deep AI Research is sponsored by the following organizations:
Interested in working with us? Join our Discord or post an issue to get in touch.

