RL-botics is a toolbox of highly optimized implementations of Deep Reinforcement Learning algorithms for robotics, built with Keras and TensorFlow in Python 3.
The objective is a modular, clean, and easy-to-read codebase that the research community can build on with ease. The implementations can be integrated with OpenAI Gym environments. The majority of the algorithms are policy search methods, as the toolbox is targeted at robotic applications.
It is highly recommended to install this package in a virtual environment, such as Miniconda. Please find the Conda installation here.
To create a new conda environment called `RL`:

```shell
conda create -n RL python=3
```
To activate the environment:

```shell
source activate RL
```
To deactivate the environment:

```shell
source deactivate
```
To install the package, we recommend cloning the original repository:

```shell
git clone https://github.com/Suman7495/rl-botics.git
cd rl-botics
pip install -e .
```
To run any algorithm with its default settings, simply run:

```shell
cd rl_botics/<algo>/
python main.py
```
For example, to run TRPO:

```shell
cd rl_botics/trpo/
python main.py
```
Numerous other options can be passed on the command line as well, but it is recommended to modify the hyperparameters in `hyperparameters.py`.
The algorithms implemented are:
To be added:
All environments are in the `envs` directory. The environments currently available are:
All the algorithms are in the `rl_botics` directory. Each algorithm specified above has its own directory.
`common` contains shared modular classes to easily build new algorithms:

- `approximators`: Basic deep neural networks (Dense, Conv, LSTM)
- `data_collection`: Performs rollouts and collects observations and rewards
- `logger`: Logs training data and other information
- `plotter`: Plots graphs
- `policies`: Common policies such as Random, Softmax, Parametrized Softmax, and Gaussian
- `utils`: Functions to compute the expected return, the Generalized Advantage Estimation (GAE), etc.
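As an illustration of what such a utility computes, here is a minimal NumPy sketch of GAE; the function name and signature are illustrative, not the toolbox's actual API:

```python
import numpy as np

def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation for a single trajectory.

    rewards: list of per-step rewards, length T.
    values:  value estimates of length T + 1 (a bootstrap value is
             appended for the state after the last step).
    Returns the advantage estimate for each step.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    # One-step TD residuals: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
    deltas = rewards + gamma * values[1:] - values[:-1]
    # Advantages are an exponentially weighted sum of future residuals
    advantages = np.zeros_like(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = deltas[t] + gamma * lam * running
        advantages[t] = running
    return advantages
```

With `gamma = lam = 1` and a zero value function, the advantage at each step reduces to the undiscounted return-to-go, which is a quick sanity check for the recursion.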
Each algorithm directory contains at least 3 files:

- `main.py`: Main script to run the algorithm
- `hyperparameters.py`: Contains the default hyperparameters
- `<algo>.py`: Implementation of the algorithm
- `utils.py`: (Optional) Utility functions
Some algorithm directories may have additional files specific to the algorithm.
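Putting this together for the TRPO example above, an algorithm directory looks roughly like:

```
rl_botics/trpo/
├── main.py
├── hyperparameters.py
├── trpo.py
└── utils.py
```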
To contribute to this package, it is recommended to follow this structure:
`main.py` should contain at least the following functions:

- `main`: Parses input arguments, builds the environment and agent, and trains the agent.
- `argparse`: Parses input arguments and loads default hyperparameters from `hyperparameters.py`.
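A minimal sketch of such a `main.py` is shown below; the flag names (`--env`, `--num_ep`), their defaults, and the `argparser` helper name are hypothetical, chosen only to illustrate the structure:

```python
import argparse

def argparser():
    # Hypothetical flags and defaults; in the toolbox the defaults
    # would be loaded from hyperparameters.py instead.
    parser = argparse.ArgumentParser(description="Run an RL-botics agent")
    parser.add_argument("--env", type=str, default="CartPole-v1",
                        help="Gym environment id")
    parser.add_argument("--num_ep", type=int, default=1000,
                        help="Number of training episodes")
    return parser

def main(argv=None):
    args = argparser().parse_args(argv)
    # Build the environment and agent, then train, e.g.:
    #   env = gym.make(args.env)
    #   agent = Agent(args, env)
    #   agent.train()
    return args

if __name__ == "__main__":
    main()
```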
`<algo>.py` should contain at least the following methods:

- `__init__`: Initializes the class
- `_build_graph`: Calls the following methods to build the TensorFlow graph:
  - `_init_placeholders`: Initializes TensorFlow placeholders
  - `_build_policy`: Builds the policy TensorFlow graph
  - `_build_value_function`: Builds the value function TensorFlow graph
  - `_loss`: Builds the policy loss TensorFlow graph
- `train`: Main training loop, called by `main`
- `update_policy`: Updates the policy
- `update_value`: Updates the value function
- `print_results`: Prints the training results
- `process_paths`: (Optional) Processes collected trajectories to return the feed dictionary for TensorFlow
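The recommended layout can be sketched as a plain-Python class skeleton. The method names come from the list above; the bodies are placeholder stubs, not the toolbox's actual TensorFlow graph-building code:

```python
class Agent:
    """Skeleton mirroring the recommended <algo>.py layout.

    Bodies are placeholders: a real implementation would build the
    TensorFlow graph and run sessions inside these methods.
    """

    def __init__(self, args, env):
        self.args = args
        self.env = env
        self._build_graph()

    def _build_graph(self):
        # Calls the graph-building steps in order, as the guidelines specify.
        self._init_placeholders()
        self._build_policy()
        self._build_value_function()
        self._loss()

    def _init_placeholders(self):
        self.placeholders = {}          # e.g. observations, actions, advantages

    def _build_policy(self):
        self.policy = None              # policy network output op

    def _build_value_function(self):
        self.value_fn = None            # value network output op

    def _loss(self):
        self.loss_op = None             # policy loss op

    def train(self):
        # Main loop: collect rollouts, then update and report.
        self.update_policy()
        self.update_value()
        self.print_results()

    def update_policy(self):
        pass                            # one policy optimization step

    def update_value(self):
        pass                            # one value-function fit

    def print_results(self):
        pass                            # log returns, losses, etc.
```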
It is recommended to check `ppo.py` and follow a similar structure.