This repository is a PyTorch implementation of Soft Actor-Critic (SAC). The implementation is based on version 2 of the algorithm, which removes the value function approximator.
Install the dependencies with Docker. Alternatively, install them from requirements.txt (for example, with `pip install -r requirements.txt`).
To build the Docker image, run this command.
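To illustrate what dropping the value approximator means in practice, here is a minimal sketch of the critic target used in that version of SAC: the soft state value is computed from two target critics and the policy's log-probability instead of a separate value network. The function name, scalar inputs, and default `alpha` and `gamma` values are assumptions for illustration and are not taken from this repository's code.

```python
def soft_q_target(reward, done, q1_next, q2_next, log_prob_next,
                  alpha=0.2, gamma=0.99):
    """Bellman target shared by both critics when no value network is used.

    q1_next, q2_next: target-critic estimates for (s', a'), where a' is
    sampled from the current policy at the next state s'.
    log_prob_next: log pi(a' | s') for that sampled action.
    alpha, gamma: illustrative defaults (assumed, not the repository's).
    """
    # Clipped double-Q: use the smaller of the two target-critic estimates.
    min_q_next = min(q1_next, q2_next)
    # Soft state value, which the separate value network used to approximate.
    soft_value_next = min_q_next - alpha * log_prob_next
    # Do not bootstrap past a terminal transition.
    return reward + gamma * (1.0 - float(done)) * soft_value_next
```

In a batched PyTorch version the same arithmetic would typically be applied element-wise to tensors under `torch.no_grad()`.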
# format: docker build -t <image_name> .
docker build -t sac .
After building the image, use the following command to run the Docker container.
docker run -ti --gpus '"device='<gpu number>'"' -v <your working directory>:/app --ipc=host --name <container_name> <image_name> /bin/bash
# Alternatively, edit docker_run.sh to match your setup, then run:
./docker_run.sh <gpu num> <container_name>
To train your own agent with the SAC algorithm in the Pendulum-v1 or LunarLanderContinuous-v2 environment, run the matching command inside the Docker container.
# For Pendulum-v1
python train_.py --config config_pendulum.yaml
# For LunarLanderContinuous-v2
python train_.py --config config_lunarlander.yaml
You can freely change the hyperparameters as needed.
You can test with the pretrained networks, which can be downloaded from the following links.
To render an episode played by the network, run
# Before running this command, set the path to the checkpoint in the config file.
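As a rough sketch of one way to experiment with hyperparameters without editing a base configuration in place, the helper below merges a small set of overrides into a nested dictionary. The helper name and the keys (`env`, `train`, `lr`, `batch_size`) are hypothetical; the actual layout of config_pendulum.yaml and config_lunarlander.yaml is not shown here and may differ.

```python
import copy


def with_overrides(config, overrides):
    """Return a copy of `config` with `overrides` merged in recursively."""
    merged = copy.deepcopy(config)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are nested sections: merge them key by key.
            merged[key] = with_overrides(merged[key], value)
        else:
            # Scalars, lists, and new keys simply replace the old value.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical keys; the real config files may be organised differently.
base = {"env": "Pendulum-v1", "train": {"lr": 3e-4, "batch_size": 256}}
tuned = with_overrides(base, {"train": {"lr": 1e-3}})
```

Because the base dictionary is copied rather than mutated, the original settings stay available for comparison between runs.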
# For Pendulum-v1
python render.py --config config_pendulum.yaml
# For LunarLanderContinuous-v2
python render.py --config config_lunarlander.yaml
The rendered result of playing Pendulum-v1.
The rendered result of playing LunarLanderContinuous-v2.
The training logs of the SAC algorithm for Pendulum-v1.
The training logs of the SAC algorithm for LunarLanderContinuous-v2.