Enhance SAC with Mixture-of-Expert and BEE Operator for Improved Stability and Performance #788

Open
wants to merge 1 commit into
base: main
15 changes: 9 additions & 6 deletions examples/baselines/sac/README.md
@@ -8,19 +8,21 @@ Note that ManiSkill is still in beta, so we have not finalized training scripts

Below are sample commands, already lightly tuned, for training a state-based policy with SAC on various tasks. Note that the control mode can be changed, and the choice can matter for sample efficiency.


```bash
python sac.py --env_id="PushCube-v1" \
--num_envs=32 --utd=0.5 --buffer_size=500_000 \
--total_timesteps=500_000 --eval_freq=50_000 --control-mode="pd_ee_delta_pos"
python sac.py --env_id="PickCube-v1" \
--num_envs=32 --utd=0.5 --buffer_size=500_000 \
--total_timesteps=500_000 --eval_freq=50_000 --control-mode="pd_ee_delta_pos"
python sac_moe.py --env_id="PushT-v1" \
--num_envs=32 --utd=0.5 --buffer_size=500_000 \
--total_timesteps=1_000_000 --eval_freq=50_000 --control-mode="pd_ee_delta_pos"
```

## Vision Based RL (RGBD)

Below is a sample of lightly tuned commands for training an image-based policy with SAC. You will need to tune the buffer size accordingly, as image-based observations can take up a lot of memory. The settings below should all take less than 16GB of GPU memory. The examples.sh file has a full list of tested commands for running vision-based SAC successfully on many tasks. Change the `--obs_mode` argument to "rgb", "rgb+depth", or "depth" to train on RGB, RGBD, or depth observations.

```bash
python sac_rgbd.py --env_id="PickCube-v1" --obs_mode="rgb" \
@@ -31,7 +33,7 @@ python sac_rgbd.py --env_id="PickCube-v1" --obs_mode="rgb" \
python sac_rgbd.py --env_id="PickCube-v1" --obs_mode rgb+depth \
--num_envs=32 --utd=0.5 --buffer_size=300_000 \
--control-mode="pd_ee_delta_pos" --camera_width=64 --camera_height=64 \
--total_timesteps=1_000_000 --eval_freq=10_000
```
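
The buffer-size advice above can be sanity-checked with a rough back-of-the-envelope estimate. The sketch below is illustrative only: `replay_buffer_gib` is a hypothetical helper, not part of `sac_rgbd.py`, and it assumes that every pixel value is stored as one byte and that both the observation and the next observation are kept for each transition. The actual storage layout of the script may differ, for example if depth is stored at higher precision.

```python
def replay_buffer_gib(buffer_size, height, width, channels,
                      bytes_per_value=1, copies=2):
    """Estimate image storage for a replay buffer, in GiB.

    Assumptions (not taken from the training script):
    - bytes_per_value=1 means each pixel value is stored as a single byte.
    - copies=2 means both obs and next_obs are stored per transition.
    """
    total_bytes = (buffer_size * height * width * channels
                   * bytes_per_value * copies)
    return total_bytes / 2**30


# 300_000 transitions of 64x64 frames, matching the rgb+depth command above.
rgb_only = replay_buffer_gib(300_000, 64, 64, channels=3)
rgbd = replay_buffer_gib(300_000, 64, 64, channels=4)

print(f"RGB only:    {rgb_only:.2f} GiB")
print(f"RGB + depth: {rgbd:.2f} GiB")
```

Under these assumptions both estimates stay below 16GB, and the estimate grows linearly with `--buffer_size` and with the number of pixels per frame, so doubling `--camera_width` and `--camera_height` would call for roughly a quarter of the buffer size to keep the same footprint.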

### Notes and Optimization Tips
@@ -43,6 +45,7 @@ You can add `--no-include-state` to exclude any state based information from obs
## Citation

If you use this baseline, please cite the following:

```
@inproceedings{DBLP:conf/icml/HaarnojaZAL18,
author = {Tuomas Haarnoja and
@@ -66,4 +69,4 @@ If you use this baseline please cite the following
biburl = {https://dblp.org/rec/conf/icml/HaarnojaZAL18.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```