Reinforcement.Learning

Contributions are welcome.

Progress

- Deep Q Network
- Dueling Q Network
- Policy Gradient: REINFORCE
- Advantage Actor-Critic
- Deep Deterministic Policy Gradient

TODO

- Asynchronous Advantage Actor-Critic (A3C)
- Evaluate the concrete performance of each algorithm

Licence

MIT Licence