Soft Actor Critic

This repository is a PyTorch implementation of Soft Actor-Critic (SAC). The implementation is based on the second version of the algorithm, which removes the value function approximator.

Pseudocode of the SAC algorithm
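
As a rough illustration of what removing the value approximator means, the critic target can be computed directly from the two target critics and the policy's log-probability. The sketch below is plain Python for a single transition and is not taken from this repository's code. The function name `soft_q_target` and the default values of `gamma` and `alpha` are assumptions chosen for illustration.

```python
def soft_q_target(reward, done, next_q1, next_q2, next_log_prob,
                  gamma=0.99, alpha=0.2):
    """Soft Bellman target for one transition, without a separate value network.

    next_q1, next_q2: target-critic estimates for (s', a'), with a' sampled
                      from the current policy at s'
    next_log_prob:    log-probability of that sampled action under the policy
    gamma, alpha:     discount factor and entropy temperature (illustrative
                      defaults, not the repository's settings)
    """
    # Clipped double-Q: use the smaller of the two target-critic estimates.
    min_next_q = min(next_q1, next_q2)
    # Soft state value derived from the critics instead of a learned V network.
    soft_value = min_next_q - alpha * next_log_prob
    # Do not bootstrap past a terminal state.
    return reward + gamma * (1.0 - float(done)) * soft_value


target = soft_q_target(reward=1.0, done=False,
                       next_q1=5.0, next_q2=4.0, next_log_prob=-1.0)
```

In a batched PyTorch implementation the same expression would typically be applied element-wise to tensors, with the target computed under `torch.no_grad()`.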

Set up

Install the dependencies with Docker. Alternatively, install them from requirements.txt.

To build the Docker image, run this command.

# format: docker build -t <image_name> .
docker build -t sac .

After building the image, use the following command to run the Docker container.

docker run -ti --gpus '"device='<gpu number>'"' -v <your working directory>:/app --ipc=host --name <container_name> <image_name> /bin/bash

# Alternatively, edit docker_run.sh to match your setup, then run:
./docker_run.sh <gpu num> <container_name>

Train

To train your own agent with the SAC algorithm in the Pendulum-v1 or LunarLanderContinuous-v2 environment, run one of these commands inside the Docker container.

# For Pendulum-v1
python train_.py --config config_pendulum.yaml

# For LunarLanderContinuous-v2
python train_.py --config config_lunarlander.yaml

You can freely change the hyperparameters as needed.

Test

You can test with the pretrained networks, which can be downloaded from the following links.

Pendulum-v1

LunarLanderContinuous-v2

To render an episode played by the network, run:

# Before running this command, set the path to the checkpoint in the config file.
# For Pendulum-v1
python render.py --config config_pendulum.yaml

# For LunarLanderContinuous-v2
python render.py --config config_lunarlander.yaml

Results

The rendered result of playing Pendulum-v1.

The rendered result of playing LunarLanderContinuous-v2.

The training logs of the SAC algorithm for Pendulum-v1.

The training logs of the SAC algorithm for LunarLanderContinuous-v2.

Acknowledgement

https://github.com/seungeunrho/minimalRL/blob/master/sac.py
