Learn To Flap V1

An extensive survey of methods for training a Flappy Bird AI (CS403 course project)
Alternatives To Learn To Flap V1

| Project Name | Stars | Most Recent Commit | Open Issues | License | Language | Description |
|---|---|---|---|---|---|---|
| Flappylearning | 3,934 | a month ago | 11 | mit | JavaScript | Program learning to play Flappy Bird by machine learning (Neuroevolution) |
| Machine Learning Flappy Bird | 1,443 | 5 years ago | 6 | mit | JavaScript | Machine Learning for Flappy Bird using Neural Network and Genetic Algorithm |
| Flappybirdrl | 860 | 5 years ago | 1 | | JavaScript | Flappy Bird hack using Reinforcement Learning |
| Flappy Es | 140 | a year ago | | | Python | Flappy Bird AI using Evolution Strategies |
| Flappy Bird Genetic Algorithms | 75 | 6 years ago | 2 | mit | Python | Use genetic algorithms to train flappy bird |
| Dqn | 56 | 6 years ago | | | Python | Implementation of q-learning using TensorFlow |
| Asteroidslearning | 40 | 6 years ago | 1 | | JavaScript | Program that learns to avoid asteroids by machine learning (Neuroevolution) |
| Flappybird Es | 31 | 6 years ago | 1 | | Python | An AI agent learning to play Flappy Bird using Evolution Strategies and deep learning models |
| Q Bird | 19 | 6 months ago | | | JavaScript | Flappy Bird with Q-learning |
| Flappy Bird | 19 | 2 years ago | | other | C# | Flappy Bird solved using reinforcement learning in Unity with ML-Agents |

Learn to Flap

The idea is to show the power of reinforcement learning by building supervised-learning baselines and comparing their scores against a Q-learning bot. The training data was collected by running simulations from here and recording the state and action at each step. We then trained a standard model-free Q-learning agent and compared the results.
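
For concreteness, here is a minimal sketch of what that logging step might look like; `record_episode` and the CSV layout are illustrative assumptions rather than the project's actual code, though the three state features match the state space described below.

```python
import csv

def record_episode(frames, path="training_data.csv"):
    """Append one game's (state, action) pairs to a CSV file.

    `frames` is an iterable of ((vel_y, dx, dy), flap) tuples: the bird's
    vertical velocity, its horizontal and vertical distances to the next
    pipe, and the action taken on that frame. This helper is a hypothetical
    stand-in for the project's actual recording code.
    """
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for (vel_y, dx, dy), flap in frames:
            writer.writerow([vel_y, dx, dy, int(flap)])
```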

Results

The baseline SVM and neural-net models required a lot of data, and such data is not feasible to obtain for an arbitrary game; the point of the baselines is to highlight, by contrast, the effectiveness of RL algorithms. We also experimented with hyperparameter tuning for the neural nets, ran each bot multiple times, and recorded the scores. The trained models are included as well, so you can reproduce very similar results. After some hyperparameter tuning, we settled on 4 layers with tanh and relu activations.
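
As a rough illustration, a 4-layer baseline of this kind could be built as below. Only the depth and the tanh/relu activation choice come from the write-up; the use of Keras, the layer widths, the optimizer, and the loss are assumptions.

```python
# Illustrative 4-layer flap/no-flap classifier. Only the depth and the
# tanh/relu activation choice come from the text above; everything else
# (Keras, layer widths, optimizer, loss) is an assumption.
from tensorflow import keras

def build_baseline(activation="relu"):  # pass "tanh" for the tanh variant
    model = keras.Sequential([
        keras.layers.Input(shape=(3,)),               # vel_y, dx, dy
        keras.layers.Dense(32, activation=activation),
        keras.layers.Dense(32, activation=activation),
        keras.layers.Dense(16, activation=activation),
        keras.layers.Dense(1, activation="sigmoid"),  # P(flap)
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

The SVM baseline would correspond to a standard classifier (e.g. scikit-learn's `SVC`, matching the "SVC" label in the plots) trained on the same three features.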

[Figure: score histograms for the SVC, tanh, relu, and Q-learning models]

As we can see, the last histogram shows a denser distribution of high scores than the others: the baseline models barely ever reach a score of 300+, whereas the Q-learning model does so regularly. Below is a scatter plot of the scores the Q-learning bot obtained over ~500 iterations of the game.

[Figure: scatter plot of Q-learning bot scores over ~500 game iterations]

Q-learning nitty-gritties

To make convergence faster, we employed an epsilon-greedy approach during training, which searches the state space more thoroughly and avoids settling on locally optimal policies, together with experience replay, which keeps previous experiences in memory so the agent can learn from them. The state space comprised three parameters: the bird's vertical velocity and its horizontal and vertical distances from the nearest pipe. The value of epsilon was decayed at a suitable rate, and updates were applied backwards (from the last state to the first) so that information about terminal states propagates first. Training was stopped after sufficient convergence. The reward scheme was also kept very simple: 1 point for staying alive, 2 points for crossing a pipe, and -100000 points for crashing. This harsh crash penalty ensured very quick convergence.
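
A minimal sketch of how these pieces could fit together is shown below. The rewards, epsilon-greedy selection, epsilon decay, and backward sweep follow the description above; the learning rate, discount factor, and exact state discretization are assumptions.

```python
import random
from collections import defaultdict

ALPHA, GAMMA = 0.7, 0.95   # assumed learning rate and discount factor
ACTIONS = (0, 1)           # 0 = do nothing, 1 = flap
Q = defaultdict(float)     # Q[(state, action)] -> estimated value

def choose_action(state, epsilon):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update_episode(history, crashed=True):
    """Sweep one episode backwards so the crash penalty propagates first.

    `history` is a list of (state, action, next_state, passed_pipe) tuples
    in the order they occurred; states are discretized tuples of the bird's
    vertical velocity and horizontal/vertical distances to the nearest pipe.
    """
    for i, (s, a, s_next, passed_pipe) in enumerate(reversed(history)):
        terminal = (i == 0 and crashed)
        if terminal:
            reward = -100000   # crashing is punished very harshly
        elif passed_pipe:
            reward = 2         # crossed a pipe
        else:
            reward = 1         # survived one more frame
        best_next = 0.0 if terminal else max(Q[(s_next, a2)] for a2 in ACTIONS)
        Q[(s, a)] += ALPHA * (reward + GAMMA * best_next - Q[(s, a)])

# After each episode, decay exploration, e.g.:
# epsilon = max(0.01, epsilon * 0.995)   # assumed decay schedule
```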

Road ahead

Many different algorithms can be tried with this project as a baseline, since Flappy Bird is a simple game that can easily be extended. Other RL algorithms (A3C, for example) and genetic algorithms could also be tried on the game.

Demo

Here are some videos:

- SVC model
- NN model

Credits

Thanks to sourabhv/FlapPyBird for providing the raw environment.
