feat:v3.0 to v2.1 (and v2.1 to v2.0) #2109
Pull Request Overview
This PR adds functionality to convert LeRobot datasets from v3.0 format back to v2.1 format for backward compatibility. This is the reverse operation of the existing v2.1 to v3.0 conversion.
- Adds a new conversion script that transforms the consolidated v3.0 file layout back to the legacy per-episode structure of v2.1
- Implements reverse transformations for data files, video files, metadata, and configuration
- Provides command-line interface for dataset conversion with options for local directories and force conversion
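The core of the data-file step described above is regrouping consolidated rows into one unit per episode. The following is a minimal sketch of that idea in plain Python, not the PR's implementation: the function name `split_by_episode`, the path template, and the chunk size of 1000 are all assumptions made for illustration and should be checked against a real v2.1 dataset's info.json.

```python
from collections import defaultdict

# Assumed v2.1-style per-episode path template (illustrative, not taken from the PR).
ASSUMED_DATA_PATH = "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet"


def split_by_episode(rows, chunk_size=1000, template=ASSUMED_DATA_PATH):
    """Group frame rows by episode_index and map each group to a per-episode path.

    `rows` is an iterable of dicts that each carry an "episode_index" key.
    `chunk_size` (assumed) is the number of episodes stored per chunk directory.
    """
    per_episode = defaultdict(list)
    for row in rows:
        per_episode[row["episode_index"]].append(row)

    files = {}
    for episode_index in sorted(per_episode):
        path = template.format(
            episode_chunk=episode_index // chunk_size,
            episode_index=episode_index,
        )
        files[path] = per_episode[episode_index]
    return files


rows = [
    {"episode_index": 0, "frame_index": 0},
    {"episode_index": 0, "frame_index": 1},
    {"episode_index": 1001, "frame_index": 0},
]
files = split_by_episode(rows)
```

In a real conversion each group would be written out as its own file rather than kept in memory; the sketch only shows how rows map to per-episode paths.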
Co-authored-by: Copilot <[email protected]> Signed-off-by: Yihao Liu <[email protected]>
Hey y'all; thanks for the useful script! I ran into a bug (in the context of fine-tuning GR00T-N1.5-3B with the Isaac-GR00T repo) which I fixed by making the following changes to the v2.1-converted dataset's info.json. Old lines: New lines:
And for stats.json, I removed the "count" entry for the "action" and "observation.state" entries. This now matches actual v2.1 metadata sufficiently for GR00T fine-tuning.
I believe y'all need to change the "LEGACY_DATA_PATH_TEMPLATE" and "LEGACY_VIDEO_PATH_TEMPLATE" variables in y'all's script + probably something else.
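The stats.json workaround from this comment can be sketched as a small post-processing step. This is a hedged illustration, not part of the PR: the helper name `drop_count` is hypothetical, and the sample statistics are made-up placeholder values; only the feature names "action" and "observation.state" and the "count" key come from the comment.

```python
import copy


def drop_count(stats, features=("action", "observation.state")):
    """Return a copy of `stats` with the "count" entry removed for the given features.

    Features not listed (or not present) are left untouched, and the input
    dictionary is not modified.
    """
    cleaned = copy.deepcopy(stats)
    for name in features:
        # pop with a default so features that have no "count" entry are simply skipped
        cleaned.get(name, {}).pop("count", None)
    return cleaned


# Placeholder statistics, for illustration only.
stats = {
    "action": {"mean": [0.0], "count": [10]},
    "observation.state": {"mean": [1.0], "count": [10]},
    "other": {"mean": [2.0], "count": [10]},
}
cleaned = drop_count(stats)
```

To apply it to a file, load stats.json with `json.load`, pass the result through the helper, and write it back with `json.dump`.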
You probably also need a script from v2.1 to v2.0: check it out here.
def convert_dataset(
    repo_id: str,
Thanks for the conversion script; this will be really helpful, since NVIDIA GR00T N1.5 currently expects the LeRobot format to be v2.1. However, can you also add support for converting datasets locally from v3.0 to v2.1, instead of only supporting datasets uploaded to Hugging Face?
What this does
It converts a v3.0 dataset back to v2.1 in case you need backward compatibility.
Examples:
How it was tested
Convert a dataset and visualize it using lerobot tools to confirm data validity.