Data Science At Scale

A Pythonic introduction to methods for scaling your data science and machine learning work to larger datasets and larger models, using the tools and APIs you know and love from the PyData stack (such as numpy, pandas, and scikit-learn).
Alternatives To Data Science At Scale
Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language

Cudf: 6,936 stars; 35 months ago; 31 total releases; latest release October 12, 2023; 1,001 open issues; license apache-2.0; C++
cuDF - GPU DataFrame Library

Koalas: 3,291 stars; 1169 months ago; 47 total releases; latest release October 19, 2021; 112 open issues; license apache-2.0; Python
Koalas: pandas API on Apache Spark

Stumpy: 2,901 stars; 165 months ago; 28 total releases; latest release August 21, 2023; 57 open issues; license other; Python
STUMPY is a powerful and scalable Python library for modern time series analysis

Datascience Anthology Pydata: 302 stars; 7 years ago; 4 open issues; license unlicense
PyData, The Complete Works of

Stupid Itertools Tricks Pydata: 124 stars; 8 years ago; Python
Code for my "stupid itertools tricks" talk from PyData Seattle 2015

Pyvtreat: 113 stars; 29 months ago; 42 total releases; latest release September 28, 2023; 2 open issues; license other; Python
vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distributed under a BSD-3-Clause license.

Data Science At Scale: 95 stars; 3 years ago; 4 open issues; license mit; Jupyter Notebook
A Pythonic introduction to methods for scaling your data science and machine learning work to larger datasets and larger models, using the tools and APIs you know and love from the PyData stack (such as numpy, pandas, and scikit-learn).

Tales Science Data: 40 stars; 6 months ago; 77 open issues; license mit; Jupyter Notebook
WORK UNDER RESTRUCTURING

Geopython: 35 stars; 6 years ago; Jupyter Notebook

Awesome Python Videos: 18 stars; 4 years ago; 1 open issue; license mit
Learn and watch Python, machine learning, and data science videos from conferences (PyCon, PyData, SciPy, PyBay, EuroPython)
Jupyter Notebook
Tutorials
Data Science
Anaconda
Binder
Pydata