
DeepAI-Research/SimData3D

SimData3D Dataset

A filtered subset of the Cap3D dataset that keeps only simple, high-quality objects.

Overview

This project provides a script that filters captions from the Cap3D dataset, removing 3D objects that contain many sub-objects. GLiNER performs named entity recognition (NER) on each caption to count the objects it mentions, and only captions with at most 2 detected objects are kept. The filter script can be found in the filtered folder.
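The thresholding step described above can be sketched roughly as follows. This is a minimal illustration, not the repository's filter.py: the function names `filter_captions` and `make_gliner_counter`, the `"object"` label, the 0.5 score threshold, and the `urchade/gliner_base` checkpoint are all assumptions chosen for the example. The counting function is passed in as a parameter so the filtering logic can be tried without downloading a model.

```python
from typing import Callable, Dict, Sequence

MAX_OBJECTS = 2  # the README states a maximum threshold of <= 2 objects


def filter_captions(
    captions: Dict[str, str],
    count_objects: Callable[[str], int],
    max_objects: int = MAX_OBJECTS,
) -> Dict[str, str]:
    """Keep only the captions whose object count is within the threshold."""
    return {
        uid: text
        for uid, text in captions.items()
        if count_objects(text) <= max_objects
    }


def make_gliner_counter(
    model_name: str = "urchade/gliner_base",  # assumed checkpoint
    labels: Sequence[str] = ("object",),      # assumed label set
    threshold: float = 0.5,                   # assumed score cutoff
) -> Callable[[str], int]:
    """Build a counter backed by GLiNER; the import is deferred until called."""
    from gliner import GLiNER

    model = GLiNER.from_pretrained(model_name)

    def count(text: str) -> int:
        entities = model.predict_entities(text, list(labels), threshold=threshold)
        # Count distinct mentions so a repeated noun is not counted twice.
        return len({entity["text"].lower() for entity in entities})

    return count
```

With GLiNER installed, `filter_captions(captions, make_gliner_counter())` would apply the model-based count; any other callable that maps a caption to an integer can be substituted for experimentation.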

Requirements

The install.sh script installs GLiNER. Make sure the script has executable permissions before running it:

chmod +x install.sh
./install.sh

Alternatively, install GLiNER directly with pip:

pip3 install gliner

Data Filter

To filter the data, run the script in the filter folder:

python3 filter.py

To split the cap3d_captions file into multiple smaller files, go to the filter folder and run:

python3 split.py

The resulting files are saved to a local prepare folder.
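For readers curious what such a split might look like, here is a rough sketch that writes a text file out in fixed-size chunks. It is not the repository's split.py: the function name `split_lines`, the chunk size, and the `_partN` naming scheme are hypothetical, and it assumes one record per line (a CSV with quoted multi-line fields would need the `csv` module instead).

```python
from pathlib import Path
from typing import List


def _write_chunk(lines: List[str], out_dir: Path, stem: str, suffix: str, index: int) -> Path:
    """Write one chunk to out_dir and return the path that was written."""
    target = out_dir / f"{stem}_part{index}{suffix}"
    target.write_text("".join(lines), encoding="utf-8")
    return target


def split_lines(source: Path, out_dir: Path, lines_per_file: int = 1000) -> List[Path]:
    """Split source into files of at most lines_per_file lines each."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    chunk: List[str] = []
    with source.open("r", encoding="utf-8") as handle:
        for line in handle:
            chunk.append(line)
            if len(chunk) == lines_per_file:
                written.append(
                    _write_chunk(chunk, out_dir, source.stem, source.suffix, len(written))
                )
                chunk = []
    if chunk:  # flush the final, possibly shorter, chunk
        written.append(
            _write_chunk(chunk, out_dir, source.stem, source.suffix, len(written))
        )
    return written
```

Streaming line by line keeps memory use flat regardless of the input size, which matters for a large captions file.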

Clone

  1. Clone the repository and navigate to the project directory:
git clone https://github.com/RaccoonResearch/simdata
cd simdata

About

SimData3D dataset for Simverse
