Awesome Open Source


This project aims to reproduce the results of several model-free RL algorithms in continuous action domains (MuJoCo environments).

This project

  • uses the PyTorch package
  • implements each algorithm independently in separate, minimal files
  • is written in the simplest possible style
  • tries to follow the original papers and reproduce their results

My first stage of work is to reproduce this figure in the PPO paper.

  • [x] A2C
  • [x] ACER (A2C + Trust Region): this implementation seems to have some problems (bug reports are welcome)
  • [x] CEM
  • [x] TRPO (TRPO single path)
  • [x] PPO (PPO clip)
  • [x] Vanilla PG
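
As a rough illustration of what "PPO clip" in the list above refers to, here is a minimal sketch of the clipped surrogate loss. It is written in plain Python rather than PyTorch so it runs without dependencies, and it is not taken from this repository: the function name `ppo_clip_loss`, its arguments, and the default `clip_eps=0.2` are assumptions made for the example.

```python
import math


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Negative clipped surrogate objective, averaged over a batch.

    Hypothetical helper for illustration only; all inputs are plain lists of
    floats with one entry per sampled (state, action) pair.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    # The objective is maximized, so the loss is its negative.
    return -total / len(advantages)


if __name__ == "__main__":
    # The new policy doubles the probability of an action with positive advantage.
    print(ppo_clip_loss([math.log(2.0)], [0.0], [1.0]))
```

With a positive advantage the clipped term caps how much a large ratio can lower the loss, which is what discourages overly large policy updates.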

In the next stage, I want to implement

After that, the following stage covers discrete action spaces and raw video input (Atari):

  • [x] Rainbow: DQN and relevant techniques (target network / double Q-learning / prioritized experience replay / dueling network structure / distributional RL)
  • [x] PPO with random network distillation (RND)
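
To make two of the Rainbow ingredients listed above concrete (the target network and double Q-learning), here is a minimal sketch of the double Q-learning target for a single transition. It uses plain Python lists instead of PyTorch tensors, and the name `double_q_target` and its parameters are hypothetical, not this repository's API.

```python
def double_q_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """One-step double Q-learning target for a single transition.

    Hypothetical helper for illustration only.
    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    """
    # The online network chooses the greedy action ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... and the target network evaluates it, which reduces overestimation.
    bootstrap = 0.0 if done else next_q_target[best_action]
    return reward + gamma * bootstrap


if __name__ == "__main__":
    # The online network prefers action 1, so the target network's value for
    # action 1 is used even though its own maximum is at action 0.
    print(double_q_target(1.0, False, [0.1, 0.9], [5.0, 2.0], gamma=0.5))
```

Plain DQN would instead bootstrap from the maximum of `next_q_target`, which tends to overestimate values when the estimates are noisy.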

Rainbow on Atari with only 3M: It works but may need further tuning.

And then model-based algorithms (not planned)

  • [ ] PILCO
  • [ ] PE-TS


  • [ ] change how the reward is counted: the current approach may underestimate the reward (evaluation should use a deterministic model rather than a stochastic/exploratory one)
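
The difference between evaluating a deterministic and a stochastic policy can be sketched as follows. This assumes a diagonal Gaussian policy, which is common for continuous control but is an assumption here, not something the text above states; `select_action` and its parameters are hypothetical names.

```python
import random


def select_action(mean, std, deterministic=False, rng=random):
    """Pick an action from a diagonal Gaussian policy (assumed for this sketch).

    mean, std: per-dimension lists of floats produced by the policy.
    deterministic: if True, return the mean action with no exploration noise.
    rng: any object with a gauss(mu, sigma) method, e.g. random.Random(seed).
    """
    if deterministic:
        # Evaluation mode: act on the mean, so the score reflects the policy itself.
        return list(mean)
    # Training mode: sample around the mean to explore.
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


if __name__ == "__main__":
    mean, std = [0.5, -1.0], [0.1, 0.2]
    print(select_action(mean, std, deterministic=True))
    print(select_action(mean, std, rng=random.Random(0)))
```

Reporting returns collected with sampled actions mixes exploration noise into the score, which is one way the reward could end up underestimated.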

PPO Implementation

The PPO implementation is of high quality: it matches the performance of openai.baselines.


Recently, I added Rainbow and DQN. The Rainbow implementation performs well on Atari games and is solid enough to modify as a base for your own research paper. The DQN implementation is minimal and reaches good performance on MountainCar (a simple task, but many implementations on GitHub either do not achieve good performance on it or need additional reward/environment engineering). This is enough for quickly testing your research ideas.
