
Commit 2449b3c

Merge branch 'user/adil-zouitine/2025-1-7-port-hil-serl-new' of github.com:huggingface/lerobot into user/adil-zouitine/2025-1-7-port-hil-serl-new

2 parents: 6718003 + 36714a1

13 files changed: +409 −0 lines

examples/10_use_so100.md

Lines changed: 16 additions & 0 deletions
````diff
@@ -94,6 +94,8 @@ python lerobot/scripts/find_motors_bus_port.py
 
 #### b. Example outputs
 
+#### b. Example outputs
+
 Example output when identifying the leader arm's port (e.g., `/dev/tty.usbmodem575E0031751` on Mac, or possibly `/dev/ttyACM0` on Linux):
 ```
 Finding all available ports for the MotorBus.
@@ -117,6 +119,8 @@ The port of this DynamixelMotorsBus is /dev/tty.usbmodem575E0032081
 Reconnect the usb cable.
 ```
 
+#### c. Troubleshooting
+On Linux, you might need to give access to the USB ports by running:
 #### c. Troubleshooting
 On Linux, you might need to give access to the USB ports by running:
 ```bash
@@ -233,6 +237,7 @@ Follow the video for removing gears. You need to remove the gear for the motors
 Follow the video for adding the motor horn. For SO-100, you need to align the holes on the motor horn to the motor spline to be approximately 1:30, 4:30, 7:30 and 10:30.
 Try to avoid rotating the motor while doing so to keep position 2048 set during configuration. It is especially tricky for the leader motors as it is more sensible without the gears, but it's ok if it's a bit rotated.
 
+## D. Assemble the arms
 ## D. Assemble the arms
 
 <details>
@@ -244,6 +249,7 @@ Try to avoid rotating the motor while doing so to keep position 2048 set during
 
 Follow the video for assembling the arms. It is important to insert the cables into the motor that is being assembled before you assemble the motor into the arm! Inserting the cables beforehand is much easier than doing this afterward. The first arm should take a bit more than 1 hour to assemble, but once you get used to it, you can do it under 1 hour for the second arm.
 
+## E. Calibrate
 ## E. Calibrate
 
 Next, you'll need to calibrate your SO-100 robot to ensure that the leader and follower arms have the same position values when they are in the same physical position. This calibration is essential because it allows a neural network trained on one SO-100 robot to work on another.
@@ -268,6 +274,8 @@ python lerobot/scripts/control_robot.py \
   --control.arms='["main_follower"]'
 ```
 
+#### b. Manual calibration of leader arm
+Follow step 6 of the [assembly video](https://youtu.be/FioA2oeFZ5I?t=724) which illustrates the manual calibration. You will need to move the leader arm to these positions sequentially:
 #### b. Manual calibration of leader arm
 Follow step 6 of the [assembly video](https://youtu.be/FioA2oeFZ5I?t=724) which illustrates the manual calibration. You will need to move the leader arm to these positions sequentially:
 
@@ -284,6 +292,7 @@ python lerobot/scripts/control_robot.py \
   --control.arms='["main_leader"]'
 ```
 
+## F. Teleoperate
 ## F. Teleoperate
 
 **Simple teleop**
@@ -296,6 +305,7 @@ python lerobot/scripts/control_robot.py \
 ```
 
 
+#### a. Teleop with displaying cameras
 #### a. Teleop with displaying cameras
 Follow [this guide to setup your cameras](https://github.com/huggingface/lerobot/blob/main/examples/7_get_started_with_real_robot.md#c-add-your-cameras-with-opencvcamera). Then you will be able to display the cameras on your computer while you are teleoperating by running the following code. This is useful to prepare your setup before recording your first dataset.
 ```bash
@@ -304,6 +314,7 @@ python lerobot/scripts/control_robot.py \
   --control.type=teleoperate
 ```
 
+## G. Record a dataset
 ## G. Record a dataset
 
 Once you're familiar with teleoperation, you can record your first dataset with SO-100.
@@ -337,6 +348,7 @@ python lerobot/scripts/control_robot.py \
 
 Note: You can resume recording by adding `--control.resume=true`. Also if you didn't push your dataset yet, add `--control.local_files_only=true`.
 
+## H. Visualize a dataset
 ## H. Visualize a dataset
 
 If you uploaded your dataset to the hub with `--control.push_to_hub=true`, you can [visualize your dataset online](https://huggingface.co/spaces/lerobot/visualize_dataset) by copy pasting your repo id given by:
@@ -351,6 +363,7 @@ python lerobot/scripts/visualize_dataset_html.py \
   --local-files-only 1
 ```
 
+## I. Replay an episode
 ## I. Replay an episode
 
 Now try to replay the first episode on your robot:
@@ -365,6 +378,7 @@ python lerobot/scripts/control_robot.py \
 
 Note: If you didn't push your dataset yet, add `--control.local_files_only=true`.
 
+## J. Train a policy
 ## J. Train a policy
 
 To train a policy to control your robot, use the [`python lerobot/scripts/train.py`](../lerobot/scripts/train.py) script. A few arguments are required. Here is an example command:
@@ -388,6 +402,7 @@ Let's explain it:
 
 Training should take several hours. You will find checkpoints in `outputs/train/act_so100_test/checkpoints`.
 
+## K. Evaluate your policy
 ## K. Evaluate your policy
 
 You can use the `record` function from [`lerobot/scripts/control_robot.py`](../lerobot/scripts/control_robot.py) but with a policy checkpoint as input. For instance, run this command to record 10 evaluation episodes:
@@ -411,6 +426,7 @@ As you can see, it's almost the same command as previously used to record your t
 1. There is an additional `--control.policy.path` argument which indicates the path to your policy checkpoint with (e.g. `outputs/train/eval_act_so100_test/checkpoints/last/pretrained_model`). You can also use the model repository if you uploaded a model checkpoint to the hub (e.g. `${HF_USER}/act_so100_test`).
 2. The name of dataset begins by `eval` to reflect that you are running inference (e.g. `${HF_USER}/eval_act_so100_test`).
 
+## L. More Information
 ## L. More Information
 
 Follow this [previous tutorial](https://github.com/huggingface/lerobot/blob/main/examples/7_get_started_with_real_robot.md#4-train-a-policy-on-your-data) for a more in-depth tutorial on controlling real robots with LeRobot.
````
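Every hunk in this file follows the same pattern: the merge re-adds a heading or sentence that already exists on the adjacent context line, so the rendered tutorial now repeats sections such as `## E. Calibrate` back to back. A quick way to audit a file for this kind of merge artifact is to scan for consecutive repeated non-blank lines; the sketch below is a throwaway helper written for this review, not part of the repository.

```python
# Throwaway sketch: report lines that repeat the previous non-blank line,
# the duplication pattern this merge introduced into the tutorial.
from pathlib import Path


def find_consecutive_duplicates(path: str) -> list[tuple[int, str]]:
    duplicates = []
    prev = None
    for lineno, line in enumerate(Path(path).read_text().splitlines(), start=1):
        stripped = line.strip()
        if stripped and stripped == prev:
            duplicates.append((lineno, stripped))
        prev = stripped or prev  # blank lines do not reset the comparison
    return duplicates


for lineno, text in find_consecutive_duplicates("examples/10_use_so100.md"):
    print(f"{lineno}: {text}")
```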

examples/advanced/2_calculate_validation_loss.py

Lines changed: 4 additions & 0 deletions
````diff
@@ -12,6 +12,10 @@
 
 import torch
 
+from lerobot.common.datasets.lerobot_dataset import (
+    LeRobotDataset,
+    LeRobotDatasetMetadata,
+)
 from lerobot.common.datasets.lerobot_dataset import (
     LeRobotDataset,
     LeRobotDatasetMetadata,
````
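The hunk re-adds an import block that already exists immediately below it, so the script now imports `LeRobotDataset` and `LeRobotDatasetMetadata` twice; Python tolerates this, it is just dead weight. For context, a minimal sketch of how this script uses the two classes together, with the `repo_id` and the 90/10 episode split as illustrative assumptions rather than values taken from the diff:

```python
from lerobot.common.datasets.lerobot_dataset import (
    LeRobotDataset,
    LeRobotDatasetMetadata,
)

repo_id = "lerobot/pusht"  # hypothetical dataset id, for illustration only

# Metadata is cheap to load and tells us how many episodes the dataset has.
metadata = LeRobotDatasetMetadata(repo_id)
num_episodes = metadata.total_episodes

# Hold out the last 10% of episodes to compute a validation loss on.
val_episodes = list(range(int(num_episodes * 0.9), num_episodes))
val_dataset = LeRobotDataset(repo_id, episodes=val_episodes)
```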

lerobot/common/datasets/transforms.py

Lines changed: 12 additions & 0 deletions
````diff
@@ -61,6 +61,9 @@ def __init__(
             raise ValueError(
                 f"n_subset should be in the interval [1, {len(transforms)}]"
             )
+            raise ValueError(
+                f"n_subset should be in the interval [1, {len(transforms)}]"
+            )
 
         self.transforms = transforms
         total = sum(p)
@@ -124,6 +127,9 @@ def _check_input(self, sharpness):
             raise ValueError(
                 "If sharpness is a single number, it must be non negative."
             )
+            raise ValueError(
+                "If sharpness is a single number, it must be non negative."
+            )
             sharpness = [1.0 - sharpness, 1.0 + sharpness]
             sharpness[0] = max(sharpness[0], 0.0)
         elif isinstance(sharpness, collections.abc.Sequence) and len(sharpness) == 2:
@@ -132,11 +138,17 @@ def _check_input(self, sharpness):
             raise TypeError(
                 f"{sharpness=} should be a single number or a sequence with length 2."
             )
+            raise TypeError(
+                f"{sharpness=} should be a single number or a sequence with length 2."
+            )
 
         if not 0.0 <= sharpness[0] <= sharpness[1]:
             raise ValueError(
                 f"sharpnesss values should be between (0., inf), but got {sharpness}."
             )
+            raise ValueError(
+                f"sharpnesss values should be between (0., inf), but got {sharpness}."
+            )
 
         return float(sharpness[0]), float(sharpness[1])
 
````
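Each duplicated `raise` added above sits directly after an identical `raise` in the same branch, so it is unreachable and the validation behaves exactly as before. To make that validation easier to follow, here is a standalone sketch of what `_check_input` does with a scalar sharpness value; the function name is mine, not part of lerobot's API:

```python
import collections.abc


def normalize_sharpness(sharpness):
    # Standalone sketch of _check_input's logic; the helper name is hypothetical.
    if isinstance(sharpness, (int, float)):
        if sharpness < 0:
            raise ValueError("If sharpness is a single number, it must be non negative.")
        # A scalar s becomes the symmetric range [1 - s, 1 + s], clamped at 0.
        sharpness = [1.0 - sharpness, 1.0 + sharpness]
        sharpness[0] = max(sharpness[0], 0.0)
    elif isinstance(sharpness, collections.abc.Sequence) and len(sharpness) == 2:
        sharpness = list(sharpness)
    else:
        raise TypeError(f"{sharpness=} should be a single number or a sequence with length 2.")

    if not 0.0 <= sharpness[0] <= sharpness[1]:
        raise ValueError(f"sharpness values should be between (0., inf), but got {sharpness}.")

    return float(sharpness[0]), float(sharpness[1])


# Example: a scalar 0.5 expands to the factor range (0.5, 1.5).
assert normalize_sharpness(0.5) == (0.5, 1.5)
```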

lerobot/common/datasets/v2/batch_convert_dataset_v1_to_v2.py

Lines changed: 54 additions & 0 deletions
````diff
@@ -121,6 +121,10 @@
         "single_task": "Pick up the candy and unwrap it.",
         **ALOHA_STATIC_INFO,
     },
+    "aloha_static_candy": {
+        "single_task": "Pick up the candy and unwrap it.",
+        **ALOHA_STATIC_INFO,
+    },
     "aloha_static_coffee": {
         "single_task": "Place the coffee capsule inside the capsule container, then place the cup onto the center of the cup tray, then push the 'Hot Water' and 'Travel Mug' buttons.",
         **ALOHA_STATIC_INFO,
@@ -162,11 +166,21 @@
         **ALOHA_STATIC_INFO,
     },
     "aloha_static_vinh_cup": {
+        "single_task": "Pick up the plastic cup with the right arm, then pop its lid open with the left arm.",
         "single_task": "Pick up the plastic cup with the right arm, then pop its lid open with the left arm.",
         **ALOHA_STATIC_INFO,
     },
     "aloha_static_vinh_cup_left": {
         "single_task": "Pick up the plastic cup with the left arm, then pop its lid open with the right arm.",
+        "single_task": "Pick up the plastic cup with the left arm, then pop its lid open with the right arm.",
+        **ALOHA_STATIC_INFO,
+    },
+    "aloha_static_ziploc_slide": {
+        "single_task": "Slide open the ziploc bag.",
+        **ALOHA_STATIC_INFO,
+    },
+    "aloha_sim_insertion_scripted": {
+        "single_task": "Insert the peg into the socket.",
         **ALOHA_STATIC_INFO,
     },
     "aloha_static_ziploc_slide": {
@@ -185,6 +199,10 @@
         "single_task": "Insert the peg into the socket.",
         **ALOHA_STATIC_INFO,
     },
+    "aloha_sim_insertion_human": {
+        "single_task": "Insert the peg into the socket.",
+        **ALOHA_STATIC_INFO,
+    },
     "aloha_sim_insertion_human_image": {
         "single_task": "Insert the peg into the socket.",
         **ALOHA_STATIC_INFO,
@@ -213,11 +231,23 @@
         "single_task": "Push the T-shaped block onto the T-shaped target.",
         **PUSHT_INFO,
     },
+    "pusht": {
+        "single_task": "Push the T-shaped block onto the T-shaped target.",
+        **PUSHT_INFO,
+    },
+    "pusht_image": {
+        "single_task": "Push the T-shaped block onto the T-shaped target.",
+        **PUSHT_INFO,
+    },
     "unitreeh1_fold_clothes": {"single_task": "Fold the sweatshirt.", **UNITREEH_INFO},
     "unitreeh1_rearrange_objects": {
         "single_task": "Put the object into the bin.",
         **UNITREEH_INFO,
     },
+    "unitreeh1_rearrange_objects": {
+        "single_task": "Put the object into the bin.",
+        **UNITREEH_INFO,
+    },
     "unitreeh1_two_robot_greeting": {
         "single_task": "Greet the other robot with a high five.",
         **UNITREEH_INFO,
@@ -239,6 +269,18 @@
         "single_task": "Pick up the cube and lift it.",
         **XARM_INFO,
     },
+    "xarm_lift_medium_image": {
+        "single_task": "Pick up the cube and lift it.",
+        **XARM_INFO,
+    },
+    "xarm_lift_medium_replay": {
+        "single_task": "Pick up the cube and lift it.",
+        **XARM_INFO,
+    },
+    "xarm_lift_medium_replay_image": {
+        "single_task": "Pick up the cube and lift it.",
+        **XARM_INFO,
+    },
     "xarm_push_medium": {"single_task": "Push the cube onto the target.", **XARM_INFO},
     "xarm_push_medium_image": {
         "single_task": "Push the cube onto the target.",
@@ -252,6 +294,18 @@
         "single_task": "Push the cube onto the target.",
         **XARM_INFO,
     },
+    "xarm_push_medium_image": {
+        "single_task": "Push the cube onto the target.",
+        **XARM_INFO,
+    },
+    "xarm_push_medium_replay": {
+        "single_task": "Push the cube onto the target.",
+        **XARM_INFO,
+    },
+    "xarm_push_medium_replay_image": {
+        "single_task": "Push the cube onto the target.",
+        **XARM_INFO,
+    },
     "umi_cup_in_the_wild": {
         "single_task": "Put the cup on the plate.",
         "license": "apache-2.0",
````

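Several entries added here (`aloha_static_candy`, `pusht`, `pusht_image`, `unitreeh1_rearrange_objects`, and the `xarm_*` variants) duplicate keys that already exist in the same dict literal. Python accepts duplicate keys in a literal and silently keeps only the last occurrence, so the resulting dict is unchanged at runtime; only the source grows. A minimal demonstration:

```python
# Duplicate keys in a dict literal are legal Python: the later entry
# silently overwrites the earlier one, so the key is counted once.
tasks = {
    "pusht": {"single_task": "Push the T-shaped block onto the T-shaped target."},
    "pusht": {"single_task": "Push the T-shaped block onto the T-shaped target."},
}
assert len(tasks) == 1
```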
lerobot/common/envs/factory.py

Lines changed: 97 additions & 0 deletions
````diff
@@ -15,6 +15,7 @@
 # limitations under the License.
 import importlib
 from collections import deque
+from collections import deque
 
 import gymnasium as gym
 
@@ -164,3 +165,99 @@ def _get_obs(self, observation):
         ret["state"] = observation
         ret["pixels"] = images
         return ret
+
+
+def make_maniskill_env(
+    cfg: DictConfig, n_envs: int | None = None
+) -> gym.vector.VectorEnv | None:
+    """Make ManiSkill3 gym environment"""
+    from mani_skill.vector.wrappers.gymnasium import ManiSkillVectorEnv
+
+    env = gym.make(
+        cfg.env.task,
+        obs_mode=cfg.env.obs,
+        control_mode=cfg.env.control_mode,
+        render_mode=cfg.env.render_mode,
+        sensor_configs=dict(width=cfg.env.image_size, height=cfg.env.image_size),
+        num_envs=n_envs,
+    )
+    # cfg.env_cfg.control_mode = cfg.eval_env_cfg.control_mode = env.control_mode
+    env = ManiSkillVectorEnv(env, ignore_terminations=True)
+    # state should have the size of 25
+    # env = ConvertToLeRobotEnv(env, n_envs)
+    # env = PixelWrapper(cfg, env, n_envs)
+    env._max_episode_steps = env.max_episode_steps = (
+        50  # gym_utils.find_max_episode_steps_value(env)
+    )
+    env.unwrapped.metadata["render_fps"] = 20
+
+    return env
+
+
+class PixelWrapper(gym.Wrapper):
+    """
+    Wrapper for pixel observations. Works with Maniskill vectorized environments
+    """
+
+    def __init__(self, cfg, env, num_envs, num_frames=3):
+        super().__init__(env)
+        self.cfg = cfg
+        self.env = env
+        self.observation_space = gym.spaces.Box(
+            low=0,
+            high=255,
+            shape=(num_envs, num_frames * 3, cfg.env.render_size, cfg.env.render_size),
+            dtype=np.uint8,
+        )
+        self._frames = deque([], maxlen=num_frames)
+        self._render_size = cfg.env.render_size
+
+    def _get_obs(self, obs):
+        frame = obs["sensor_data"]["base_camera"]["rgb"].cpu().permute(0, 3, 1, 2)
+        self._frames.append(frame)
+        return {
+            "pixels": torch.from_numpy(np.concatenate(self._frames, axis=1)).to(
+                self.env.device
+            )
+        }
+
+    def reset(self, seed):
+        obs, info = self.env.reset()  # (seed=seed)
+        for _ in range(self._frames.maxlen):
+            obs_frames = self._get_obs(obs)
+        return obs_frames, info
+
+    def step(self, action):
+        obs, reward, terminated, truncated, info = self.env.step(action)
+        return self._get_obs(obs), reward, terminated, truncated, info
+
+
+# TODO: Remove this
+class ConvertToLeRobotEnv(gym.Wrapper):
+    def __init__(self, env, num_envs):
+        super().__init__(env)
+
+    def reset(self, seed=None, options=None):
+        obs, info = self.env.reset(seed=seed, options={})
+        return self._get_obs(obs), info
+
+    def step(self, action):
+        obs, reward, terminated, truncated, info = self.env.step(action)
+        return self._get_obs(obs), reward, terminated, truncated, info
+
+    def _get_obs(self, observation):
+        sensor_data = observation.pop("sensor_data")
+        del observation["sensor_param"]
+        images = []
+        for cam_data in sensor_data.values():
+            images.append(cam_data["rgb"])
+
+        images = torch.concat(images, axis=-1)
+        # flatten the rest of the data which should just be state data
+        observation = common.flatten_state_dict(
+            observation, use_torch=True, device=self.base_env.device
+        )
+        ret = dict()
+        ret["state"] = observation
+        ret["pixels"] = images
+        return ret
````
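For orientation, here is a rough usage sketch of the `make_maniskill_env` factory and `PixelWrapper` added above. It assumes ManiSkill3 and omegaconf are installed; the task id, control mode, and image sizes are illustrative guesses at the `cfg.env.*` fields the new code reads, and `PixelWrapper` is applied manually since the factory currently leaves it commented out. Note that `PixelWrapper.reset` takes a `seed` argument but, as written, does not forward it to the underlying env.

```python
from omegaconf import OmegaConf

# Hypothetical config exposing the cfg.env.* fields read by make_maniskill_env
# and PixelWrapper; the concrete values are assumptions for illustration.
cfg = OmegaConf.create(
    {
        "env": {
            "task": "PickCube-v1",
            "obs": "rgb",
            "control_mode": "pd_ee_delta_pose",
            "render_mode": "rgb_array",
            "image_size": 64,
            "render_size": 64,
        }
    }
)

env = make_maniskill_env(cfg, n_envs=4)
env = PixelWrapper(cfg, env, num_envs=4)  # stacks the last 3 RGB frames per env

obs, info = env.reset(seed=0)
# obs["pixels"] has shape (4, 9, 64, 64): 4 envs, 3 stacked frames x 3 channels.
```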
