A code repository of projects and tutorials in R and Python covering a variety of topics in data visualization, statistics, sports analytics, and general applications of probability theory.

Project Name | Stars | Most Recent Commit | Open Issues | License | Language | Description |
---|---|---|---|---|---|---|
Datascienceprojects | 449 | 4 months ago | 1 | | Jupyter Notebook | The code repository for projects and tutorials in R and Python that covers a variety of topics in data visualization, statistics, sports analytics and general application of probability theory. |
Ncaahoopr | 176 | 11 days ago | 5 | mit | R | An R package for working with NCAA Basketball Play-by-Play Data |
Bettor | 55 | 9 days ago | 1 | other | R | R Package for Sports betting |
Cfbscrapr | 22 | 2 years ago | | other | R | A scraping and aggregating package using the CollegeFootballData API |
News Shot Classification | 13 | 6 years ago | 1 | | Python | Extracts the shot classes and generic visual features for a broadcast news video. |
Tennis_match_prediction | 11 | 2 years ago | | mit | Jupyter Notebook | Research on calculating win probability and forecasting serve performance in tennis matches. |
Odds.converter | 9 | 5 years ago | | | R | Convert Sports Betting Odds |
Field Goal Models | 6 | 7 years ago | | | R | Modeling NFL Field Goal Probabilities in R |
Modeling The World Cup 2018 | 3 | 4 years ago | | | Jupyter Notebook | Making World Cup 2018 predictions using statistical modeling with Python and player data from the FIFA 18 video game. |
Pybettor | 3 | 4 days ago | | mit | Python | |


Readme

In this repository, you will find the source code for various projects I have worked on or am still working on. Most of the projects are accompanied by a Medium blog post at tuannguyen-doan.medium.com. I have published almost exclusively in the Towards Data Science publication through Medium's Partner Program, so please check out these articles as a way to support me and my future projects. Alternatively, you can find my blog posts on my personal website.

My interests lie at the intersection of statistical techniques, data visualization, and sports (especially football). All the code is written in Python or R. I don't have a strong preference for either language, nor do I make a concerted effort to code in a specific language or platform. The choice mostly depends on how well a language supports the functionality a project needs (for example, scraping in Python and data processing with dplyr pipes in R).

A collection of projects that explore the intricate statistical aspects of the Beautiful Game

- Empirical Bayes and penalty taking ability - Using Bayesian statistics to make meaningful comparisons between players across Europe.
- Poisson process and match prediction - Here we learn about the Poisson process and how a random model outperforms football experts with its predictions.
- The mathematics of football betting strategies - With the Poisson model and some additional help from mathematical research, can we beat the bookies?
- Fisher vs Neyman-Pearson debate and Paul the Octopus - We go over the theory (or many theories) of hypothesis testing and see how it applies to the psychic ability of Paul the Octopus.
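
The Poisson idea behind the match-prediction and betting posts above can be sketched in a few lines. This is only a minimal illustration, not the code from the articles: it assumes each team's goals follow independent Poisson distributions, and the function names and the scoring rates used in the example are hypothetical.

```python
import math


def poisson_pmf(k: int, rate: float) -> float:
    """Probability of exactly k goals when goals arrive at the given average rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def match_outcome_probabilities(home_rate: float, away_rate: float, max_goals: int = 10):
    """Return (home win, draw, away win) probabilities.

    Assumes the two teams score independently (a simplification) and
    truncates the score grid at max_goals goals per team.
    """
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    # Renormalize so the truncated grid still sums to one.
    total = home_win + draw + away_win
    return home_win / total, draw / total, away_win / total


if __name__ == "__main__":
    # Hypothetical average goals per match for the home and away sides.
    home, tie, away = match_outcome_probabilities(1.6, 1.1)
    print(f"home win: {home:.3f}, draw: {tie:.3f}, away win: {away:.3f}")
```

In practice the two rates would be estimated from historical results, and comparing the resulting probabilities with bookmaker odds is the starting point for the betting-strategy question.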

- Bayes theorem and a probabilistic argument for God - Bayes' theorem and how people have used it to argue for the necessary existence of God.
- Dating with probability theory - Here we explore what probability theory has to say about the optimal strategy for finding the love of your life.
- Bayes theorem and why it matters to my workout routine - A lightweight introduction to Bayes' theorem and how it helps convince me to hit the gym.
- The Rule of Three and its application - A short introduction to the Rule of Three and how we can apply it to estimate the probability of events that have yet to happen, with applications in voting, vaccine development, product quality monitoring, etc.
- Lindy's effect - A (slightly) mathematical description of the Lindy effect and how one can use it as a guide for life.
- Normal Distribution with High Dimensionality - A statistical investigation into the myth of the "average Joe."
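
As a small illustration of the Rule of Three mentioned in the list above: if an event has not occurred in n independent trials, 3/n is an approximate 95% upper bound on its probability. The sketch below compares that shortcut with the exact bound obtained by solving (1 - p)^n = 0.05; the function names are my own and the trial counts are made up.

```python
def rule_of_three_upper_bound(n_trials: int) -> float:
    """Approximate 95% upper bound on an event's probability after
    observing zero occurrences in n_trials independent trials."""
    return 3.0 / n_trials


def exact_upper_bound(n_trials: int, confidence: float = 0.95) -> float:
    """Exact bound: the largest p for which seeing zero events in
    n_trials trials still has probability at least 1 - confidence."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n_trials)


if __name__ == "__main__":
    # Hypothetical trial counts, e.g. units inspected with no defects found.
    for n in (30, 100, 1000):
        approx = rule_of_three_upper_bound(n)
        exact = exact_upper_bound(n)
        print(f"n={n}: rule of three={approx:.5f}, exact={exact:.5f}")
```

The shortcut works because -ln(0.05) is roughly 3, so it is always slightly conservative compared with the exact bound.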

- A robust and scalable method to compare Percentile metrics in online experiments (Quora Data Blog, 2022) - Conducting statistical tests for Percentile metrics can be tricky, as they have less neat mathematical properties than more common metrics such as averages or ratios. I discuss Quora's method to A/B test these metrics in a statistically valid and scalable manner.
- How social learning amplifies moral outrage expression in online social networks (Science Advances, 2021) - Moral outrage shapes fundamental aspects of social life and is now widespread in online social networks. Here, we show how social learning processes amplify online moral outrage expressions over time.
- Application of machine learning models in predicting length of stay among healthcare workers in underserved communities in South Africa (Human Resources for Health, 2018) - We aim to use machine learning methods to predict health professionals' length of practice in the rural public healthcare sector based on their demographic information.

- NetworkX and Basemap - Here is a comprehensive tutorial of how we can visualize geographical data with powerful tools that support Python.
- Tkinter and Python - Building your own firework shows with Tkinter (and some math chops).
- Data visualization with Matplotlib and Seaborn - Learn how to construct publication-worthy visualizations with the Matplotlib and Seaborn packages.

- End-to-end Machine Learning project with R - Here is a full data science project that covers data collection, cleaning, visualization, machine learning and validation.
- Unsupervised Learning - Clustering method with R - An introduction to an array of unsupervised learning algorithms: hierarchical clustering, k-means, and factor analysis.
- Collaborative Filtering with Python - A comprehensive guide to the mathematical details and implementation of popular Matrix Factorization methods.
