This project aims to reproduce the results of several model-free RL algorithms in continuous action domains (MuJoCo environments).
This projects
The first stage of my work is to reproduce this figure from the PPO paper.
In the next stage, I want to implement
The stage after that covers discrete action space problems and raw video input (Atari) problems:
Rainbow on Atari with only 3M: It works but may need further tuning.
After that come model-based algorithms (not planned).
TODOs:
The PPO implementation is of high quality: it matches the performance of openai.baselines.
Recently, I added Rainbow and DQN. The Rainbow implementation is of high quality on Atari games, solid enough for you to modify and build your own research paper on. The DQN implementation is minimal and reaches good performance on MountainCar (a simple task, but many implementations on GitHub either fail to perform well on it or need additional reward/environment engineering). This is enough for quickly testing your research ideas.
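For readers new to DQN, the sketch below shows the one-step TD target that a minimal DQN regresses its Q-values toward, plus a Huber loss, which is a common choice for that regression. It is not this repository's code: the function names `dqn_td_target` and `huber_loss`, the default `gamma`, and the example Q-values are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def dqn_td_target(reward: float,
                  next_q_values: Sequence[float],
                  done: bool,
                  gamma: float = 0.99) -> float:
    """One-step TD target: r + gamma * max_a' Q_target(s', a').

    `next_q_values` stands for the target network's Q-values at the next
    state (hypothetical input; a real agent would compute them with a network).
    """
    if done:
        # No bootstrapping past a terminal state.
        return reward
    return reward + gamma * max(next_q_values)


def huber_loss(prediction: float, target: float, delta: float = 1.0) -> float:
    """Huber loss: quadratic for small errors, linear for large ones."""
    error = abs(prediction - target)
    if error <= delta:
        return 0.5 * error * error
    return delta * (error - 0.5 * delta)


if __name__ == "__main__":
    # MountainCar-style transition: reward of -1 per step, three discrete actions.
    target = dqn_td_target(-1.0, [-10.0, -9.5, -11.0], done=False)
    print(target, huber_loss(prediction=-9.0, target=target))
```

Because MountainCar returns the same reward on every step until the goal is reached, the targets carry very little signal early in training. That may be one reason simple implementations struggle on it without reward or environment engineering.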