Data Science Toolkit

Collection of stats, modeling, and data science tools in Python and R.

Welcome! The purpose of this repository is to serve as a stockpile of statistical methods, modeling techniques, and data science tools. The content includes everything from educational vignettes on specific topics, to tailored functions and modeling pipelines built to enhance and optimize analyses, to notes and code from various data science conferences, to general data science utilities. This will remain a work in progress, and I welcome all contributions and constructive criticism. If you have a suggestion or request, please use the "Issues" tab and I will endeavor to respond expeditiously!

Note: GitHub often has trouble rendering larger .ipynb files. If you are unable to view one of the Jupyter notebooks linked below, I recommend copying and pasting the notebook's URL into Jupyter's nbviewer, which will generate a viewable link (for example, for my "Visualization with Plotly" notebook). To ensure that you are viewing the most up-to-date version of the notebook with nbviewer, add ?flush_cache=true to the end of the generated URL; otherwise, the rendered copy risks being slightly out of date.
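As a rough illustration of the nbviewer workaround described above, the sketch below builds an nbviewer link from a GitHub notebook URL and appends `?flush_cache=true`. The helper name `nbviewer_url` and the example notebook path are hypothetical and not part of this repository; the sketch also assumes the notebook lives on github.com and that nbviewer's usual `/github/<user>/<repo>/blob/<branch>/<path>` layout applies.

```python
from urllib.parse import urlsplit


def nbviewer_url(github_url: str, flush_cache: bool = True) -> str:
    """Turn a GitHub notebook URL into an nbviewer URL (hypothetical helper)."""
    parts = urlsplit(github_url)
    if parts.netloc != "github.com":
        raise ValueError("expected a github.com URL")
    # nbviewer mirrors the GitHub path under the /github prefix.
    url = f"https://nbviewer.org/github{parts.path}"
    if flush_cache:
        # Ask nbviewer to re-render instead of serving a cached copy.
        url += "?flush_cache=true"
    return url


# Hypothetical notebook path, for illustration only.
example = "https://github.com/some-user/some-repo/blob/master/notebooks/example.ipynb"
print(nbviewer_url(example))
```

Pasting the GitHub URL into nbviewer's web form should produce the same kind of link; the helper is only convenient when sharing many notebooks at once.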

Table of Contents

  1. Playground and Basics
    1. Rough Notes from ISLR Exercises -- R
    2. Rough Notes from Python Data Scientist Track -- Python
  2. Exploratory Data Analysis (EDA) and Visualization
    1. Practical Data Visualization with Python (Full Course) -- Python
    2. EDA and Basic Viz. -- R
    3. Visualizing Geographic Data -- Python
    4. Radar Charts -- Python
  3. Hypothesis Testing
    1. Kolmogorov-Smirnov Test (KS Test) -- R
    2. Useful Hypothesis Testing Functions -- R
  4. Classification
    1. Logistic Regression (Ridge and Lasso Methods Included) -- R
    2. Useful Classification Functions -- R
    3. Basic Tree Models -- R
    4. KNN -- R
  5. Regression
    1. Linear Regression -- Python
  6. Reinforcement Learning
  7. Text Mining and Natural Language Processing (NLP)
    1. Basic Text Mining and NLP -- R
  8. Time Series
    1. Time Series Forecasting with Facebook's Prophet Package -- Python
  9. Notes and Material from Data Science Conferences
    1. PyData 2018 DC Conference (Notes and Tutorial Code) -- Python
    2. Max Kuhn / RStudio Supervised Learning 2019 DC Conference -- R
    3. PyCon 2019 Conference (Notes and Session Code) -- Python
  10. Utilities
    1. HTML File Appender (Using Beautiful Soup) -- Python

Contribution Info

All are welcome and encouraged to contribute to this repository. My only requests are that you include a detailed description of your contribution, that your code be thoroughly commented, and that you test your contribution locally with the most recent version of the master branch integrated before submitting the PR.
