A multi-agent RL solution to Unity's Tennis environment, a collaboration task
Overview: The goal is to train two cooperating agents to solve Unity's Tennis environment.
Goal & Rewards: In this environment, two agents control rackets to bounce a ball over a net. If an agent hits the ball over the net, it receives a reward of +0.1. If an agent lets a ball hit the ground or hits the ball out of bounds, it receives a reward of -0.01. Thus, the goal of each agent is to keep the ball in play.
State Space: The state space consists of 8 variables corresponding to the position and velocity of the ball and racket. Each agent receives its own, local observation.
Action Space: Actions are continuous. Each action is a vector with 2 numbers, corresponding to moves toward (or away from) the net, and jumping.
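For orientation, here is a minimal sketch of stepping the environment with random actions. It assumes the `unityagents` package installed in the setup steps below; the `file_name` path is illustrative and depends on your OS (see the download instructions later in this README).

```python
import numpy as np
from unityagents import UnityEnvironment

# Illustrative path; see the download/setup steps below for your OS.
env = UnityEnvironment(file_name="Tennis_Linux/Tennis.x86_64")
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

env_info = env.reset(train_mode=False)[brain_name]
num_agents = len(env_info.agents)              # 2 rackets
action_size = brain.vector_action_space_size   # 2: toward/away from net, jump
print(env_info.vector_observations.shape)      # one local observation per agent

while True:
    # Random continuous actions, clipped to the valid range [-1, 1].
    actions = np.clip(np.random.randn(num_agents, action_size), -1, 1)
    env_info = env.step(actions)[brain_name]
    if np.any(env_info.local_done):            # episode ends for either agent
        break
env.close()
```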
Solving condition: The task is episodic, and in order to solve the environment, the agents must get an average score of +0.5 over 100 consecutive episodes, after taking the maximum over both agents. Specifically,
- After each episode, we add up the rewards that each agent received (without discounting) to get a score for each agent. This yields 2 (potentially different) scores. We then take the maximum of these 2 scores, which yields a single score for the episode.
- The environment is considered solved when the average (over 100 episodes) of those scores is at least +0.5, as the sketch below illustrates.
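Concretely, the per-episode score and the solving check can be computed as in this sketch (variable names are illustrative; `env_info.rewards` is the per-agent reward list returned by the environment at each step):

```python
from collections import deque
import numpy as np

scores_window = deque(maxlen=100)        # scores of the last 100 episodes

for episode in range(1, 3001):           # illustrative episode budget
    episode_rewards = np.zeros(2)        # undiscounted return, one sum per agent
    # ... reset the environment, then at every step of the episode:
    #     episode_rewards += env_info.rewards

    score = np.max(episode_rewards)      # maximum over the 2 agents
    scores_window.append(score)

    # Solved once the 100-episode average reaches +0.5.
    if len(scores_window) == 100 and np.mean(scores_window) >= 0.5:
        print(f"Solved in {episode} episodes; "
              f"average score {np.mean(scores_window):.2f}")
        break
```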
Getting Started:
- Create (and activate) a new environment with Python 3.6.

  Linux or Mac:
  ```bash
  conda create --name collaboration_competition python=3.6
  source activate collaboration_competition
  ```

  Windows:
  ```bash
  conda create --name collaboration_competition python=3.6
  activate collaboration_competition
  ```
- Clone the repository, then install the dependencies:
  ```bash
  git clone https://github.com/ramanshrivastava/multi-agent-rl-tennis.git
  cd multi-agent-rl-tennis
  pip install -r requirements.txt
  ```
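A quick way to confirm the install succeeded, assuming `requirements.txt` pins `torch` and the `unityagents` wrapper used by the notebook:

```python
# Should import cleanly inside the collaboration_competition environment.
import torch
from unityagents import UnityEnvironment

print(torch.__version__)
```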
- Create an IPython kernel for the `collaboration_competition` environment:
  ```bash
  python -m ipykernel install --user --name collaboration_competition --display-name "collaboration_competition"
  ```
- Before running code in a notebook, change the kernel to match the `collaboration_competition` environment by using the drop-down Kernel menu.
- Download the environment from one of the links below. You need only select the environment that matches your operating system:
  - Linux: click here
  - Mac OSX: click here
  - Windows (32-bit): click here
  - Windows (64-bit): click here

  (For Windows users) Check out this link if you need help determining whether your computer is running a 32-bit or 64-bit version of the Windows operating system.

  (For AWS) If you'd like to train the agent on AWS (and have not enabled a virtual screen), then please use this link to obtain the environment.
- Place the downloaded file in the root folder of the `multi-agent-rl-tennis` repository and unzip (or decompress) it.
- Open the `Tennis.ipynb` Jupyter notebook and execute each code cell to train the agent.
- Once the necessary packages are imported, provide the `file_name` for the `UnityEnvironment` in the next step.
- Execute the remaining cells in order.
- In code cell 9, provide the `PATH` for the `.pth` files to save the actor and critic model weights. (A sketch of these steps follows this list.)
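For reference, the `file_name` hand-off and the cell-9 weight saving look roughly like the sketch below. The build paths are illustrative, and the placeholder networks merely stand in for the repo's actual trained actor and critic:

```python
import torch
import torch.nn as nn
from unityagents import UnityEnvironment

# file_name points at the unzipped build for your OS (illustrative paths):
#   Linux: "Tennis_Linux/Tennis.x86_64", Mac: "Tennis.app",
#   Windows (64-bit): "Tennis_Windows_x86_64/Tennis.exe"
env = UnityEnvironment(file_name="Tennis_Linux/Tennis.x86_64")

# ... the intervening cells build and train the agent ...

# Placeholders standing in for the trained networks (not the repo's models).
actor = nn.Linear(8, 2)     # 8-variable state -> 2-dim action
critic = nn.Linear(10, 1)   # state-action input -> scalar value estimate

# Code cell 9: PATHs where the .pth weight files are written.
torch.save(actor.state_dict(), "checkpoint_actor.pth")
torch.save(critic.state_dict(), "checkpoint_critic.pth")

env.close()
```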