
What do the actions in 'multi_post_retargeting_v3.pkl' mean? #22

Open
bb0928 opened this issue Dec 6, 2024 · 1 comment

Comments

bb0928 commented Dec 6, 2024

Hi, when I try to load the action sequence from your demonstration files, I find that the hand pose shakes randomly, which does not seem to reflect the true actions. My code is as follows:

    import pickle

    # Load one demonstration file; `env` is assumed to be an already-constructed DexMV env
    with open('./demonstrations/relocate-mustard_bottle.pkl', 'rb') as f:
        data = pickle.load(f)
    actions = data['seq_10/multi_post_retargeting_v3.pkl']['actions']
    # qpos = data['seq_100/multi_post_retargeting_v3.pkl']['sim_data']

    # Replay the action sequence step by step for visualization
    for i in range(actions.shape[0]):
        action = actions[i]
        obs, reward, done, _ = env.step(action)
        env.render()

Could you please tell me what the actions in the .pkl file mean? Do they correspond to the actions defined in MuJoCo? Thanks.

yzqin (Owner) commented Dec 7, 2024

@bb0928

Your observation is correct. The actions are derived from inverse dynamics calculations on the estimated pose sequence. Since these poses are estimated from video rather than manually annotated, they contain significant noise, which leads to high cumulative errors in the action sequence.
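To illustrate why inverse dynamics on estimated poses is so noisy, here is a toy sketch (not DexMV code; timestep, noise scale, and the unit-mass finite-difference model are all illustrative assumptions): differentiating a pose sequence twice to recover accelerations amplifies even tiny estimation noise by roughly 1/dt².

```python
# Sketch: finite-difference accelerations from a noisy pose sequence
# amplify the noise by ~1/dt**2 (assumed dt and noise scale are illustrative).
import numpy as np

rng = np.random.default_rng(1)
dt = 0.01
t = np.arange(0, 2, dt)
clean_pose = np.sin(t)
noisy_pose = clean_pose + rng.normal(scale=1e-3, size=t.size)  # small pose-estimation noise

def accel(q, dt):
    # second-order central difference: q''_t ~ (q_{t+1} - 2 q_t + q_{t-1}) / dt^2
    return (q[2:] - 2 * q[1:-1] + q[:-2]) / dt**2

err = np.abs(accel(noisy_pose, dt) - accel(clean_pose, dt))
print(f"pose noise std ~1e-3, resulting acceleration error (mean): {err.mean():.2f}")
```

Even millimeter-scale pose noise turns into large acceleration (and hence torque) errors, which is the core reason the stored actions look like shaking when replayed.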

This is why replaying the action sequence in simulation doesn't produce accurate results. For this reason, DexMV's imitation learning avoids trajectory-level methods in favor of approaches like DAPG. These methods process imitation data as individual state-action pairs, where noise has less impact. Sequential replay of actions would compound errors during simulation.

This accumulated error is an inherent limitation of video-based pose estimation: it cannot perfectly replicate physical dynamics. For applications requiring high precision where data-collection scaling is not a concern, teleoperation would be more suitable. When learning from video, it's best to avoid relying on detailed physics, especially for long sequences of actions.
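The difference between sequential replay and per-pair use of the data can be seen in a toy sketch (not DexMV code; the 1-D integrator dynamics, horizon, and noise scale are illustrative assumptions): open-loop replay of noisy actions drifts like a random walk, while the error seen by any single (state, action) pair stays at the per-step noise level.

```python
# Toy sketch: open-loop replay of noisy actions accumulates drift,
# while each (state, action) pair in isolation only carries per-step noise.
import numpy as np

rng = np.random.default_rng(0)
T = 200
true_actions = np.sin(np.linspace(0, 4 * np.pi, T))  # hypothetical reference actions
noise = rng.normal(scale=0.05, size=T)               # per-step estimation noise

# 1-D integrator dynamics: the state is the running sum of the actions taken.
true_states = np.cumsum(true_actions)
replayed_states = np.cumsum(true_actions + noise)    # sequential (open-loop) replay

open_loop_error = np.abs(true_states - replayed_states)
per_step_error = np.abs(noise)                       # noise affecting one (s, a) pair

print(f"mean open-loop drift: {open_loop_error.mean():.3f}")
print(f"mean per-step noise:  {per_step_error.mean():.3f}")
```

This is the intuition behind preferring methods like DAPG, which consume the demonstrations as independent state-action pairs rather than replaying them as trajectories.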
