[bug?] sim data collection combines actions with next_observation instead of the observation on which the action is based
#636
Labels
simulation
Matters involving system simulation or modeling
In the data collection script for sim envs, the action (a_t) is determined based on the current observation (o_t).
However, when the `step` method is called, the observation is overwritten with the new observation o_{t+1}, and this pair (a_t, o_{t+1}) is recorded as a demonstration step. I believe this is a mistake: each recorded step should pair a_t with o_t, the observation the action was actually based on. A simplified and corrected data collection loop is given below.

Note that I also explicitly call `reset`, to avoid storing the last observation with an action that is never executed (the autoreset ignores the action if `step` is called on an environment that needs to reset).
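A minimal sketch of such a loop, not the actual script from the repository. It assumes an environment that follows the Gymnasium-style `reset()` / `step()` signatures; `CountingEnv`, `collect_episodes`, and the frame dictionary keys are hypothetical names used only for illustration, and writing the frames into a LeRobot dataset is left out.

```python
from __future__ import annotations

from typing import Any, Callable


class CountingEnv:
    """Hypothetical toy env with Gymnasium-style reset/step signatures."""

    def __init__(self, horizon: int = 3) -> None:
        self.horizon = horizon
        self.t = 0

    def reset(self) -> tuple[int, dict]:
        self.t = 0
        return self.t, {}

    def step(self, action: Any) -> tuple[int, float, bool, bool, dict]:
        self.t += 1
        terminated = self.t >= self.horizon
        return self.t, 0.0, terminated, False, {}


def collect_episodes(
    env: Any, policy: Callable[[Any], Any], num_episodes: int
) -> list[list[dict]]:
    """Record (o_t, a_t) pairs; the final observation is never stored."""
    episodes = []
    for _ in range(num_episodes):
        # Explicit reset, so no frame depends on autoreset behaviour.
        observation, _info = env.reset()
        frames = []
        done = False
        while not done:
            action = policy(observation)  # a_t is computed from o_t
            next_observation, reward, terminated, truncated, _info = env.step(action)
            # Store the observation the action was based on, not o_{t+1}.
            frames.append(
                {"observation": observation, "action": action, "reward": reward}
            )
            observation = next_observation
            done = terminated or truncated
        episodes.append(frames)
    return episodes
```

The frame is appended before `observation` is overwritten, so the recorded pair is always (o_t, a_t). The observation returned by the last `step` of an episode is dropped, because no action is ever taken from it.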
I have not run the script; I was merely looking for code that would let me collect demonstrations for my own gym Env and store them in the LeRobot dataset format.