This is a PyTorch implementation of the classic Deep Q-Network (DQN) paper by Mnih et al.: Human-level control through deep reinforcement learning. It implements the vanilla DQN without additional techniques such as double DQN or dueling DQN, and matches the paper's results on Pong and Breakout (other games were not tested); the results are summarized below.
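For reference, the piece that distinguishes vanilla DQN from double DQN is the bootstrap target: the target network both selects and evaluates the greedy next action. Below is a minimal sketch of that loss in PyTorch; the network and batch names are illustrative assumptions, not the repo's actual code.

```python
# Sketch of the vanilla DQN loss (illustrative only, not the repo's exact code).
# `online_net`, `target_net`, and the batch tensors are assumed names.
import torch
import torch.nn.functional as F

def dqn_loss(online_net, target_net, batch, gamma=0.99):
    # actions: long tensor [B]; dones: float tensor of 0/1 flags [B]
    states, actions, rewards, next_states, dones = batch
    # Q(s, a) for the actions actually taken, from the online network
    q = online_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Vanilla DQN: the target network both selects and evaluates the greedy action.
        # (Double DQN would instead select the action with the online network.)
        next_q = target_net(next_states).max(dim=1).values
        target = rewards + gamma * (1.0 - dones) * next_q
    # Huber (smooth L1) loss, as in the original paper, for stability
    return F.smooth_l1_loss(q, target)
```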
- Pong: maximum testing reward of 19 in 6 million steps (vs. benchmark 18.9 +/- 1.3)
- Breakout: maximum testing reward of 377.6 in 60 million steps (vs. benchmark 401.2 +/- 26.9)
Here, the testing reward (labeled eval_reward_0.05.avg in the reference charts below) is the mean over a rolling window of the 50 most recent eval episodes, where an eval episode is run once every 10k training steps with a 5% exploration rate.
Pong: metrics for 6 million training steps
Breakout: metrics for 60 million training steps
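As a rough sketch of how that testing reward could be tracked (the function and variable names below are assumptions, not the repo's code):

```python
# Rolling mean of the last 50 eval episodes, evaluated every 10k steps at epsilon = 0.05.
from collections import deque

eval_rewards = deque(maxlen=50)  # keeps only the 50 most recent eval episode rewards

def maybe_evaluate(step, run_eval_episode):
    """Every 10k training steps, play one eval episode with 5% exploration."""
    if step % 10_000 != 0:
        return None
    eval_rewards.append(run_eval_episode(epsilon=0.05))
    # Reported testing reward: mean over the rolling window
    return sum(eval_rewards) / len(eval_rewards)
```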
Installing dependencies
$ pip install -r requirements.txt
$ sudo apt-get install libgl1-mesa-glx
Train a model for a specific game
$ python train.py <game> <number of training steps>
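For example, a Pong run like the one reported above might look like the following (the exact game-name format expected by train.py is an assumption, so this invocation is hypothetical):

$ python train.py pong 6000000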
Use a trained model to play a game
$ python play.py <game> <path_to_model>
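For example, to watch a trained Pong agent (both the game name and the checkpoint path here are hypothetical; use whatever path your training run actually produced):

$ python play.py pong references/pong/model.pt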
The references directory contains the TensorBoard log files, models, and video recordings from the 6 million step Pong training and the 60 million step Breakout training, using the default config in config.py.