Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Sparkit Learn | 1,054 | 5 | 3 years ago | 13 | June 24, 2015 | 35 | apache-2.0 | Python | ||
PySpark + Scikit-learn = Sparkit-learn | ||||||||||
Tdigest | 332 | 9 | 19 | 2 years ago | 14 | August 27, 2016 | 12 | mit | Python | |
t-Digest data structure in Python. Useful for percentiles and quantiles, including in distributed environments like PySpark | ||||||||||
Replay | 109 | 1 | 3 months ago | 14 | November 24, 2023 | 13 | apache-2.0 | Python | ||
A Comprehensive Framework for Building End-to-End Recommendation Systems with State-of-the-Art Models | ||||||||||
Spark With Python | 98 | 4 years ago | mit | Jupyter Notebook | ||||||
Fundamentals of Spark with Python (using PySpark), code examples | ||||||||||
Song Playlist Recommendation | 43 | a year ago | 1 | HTML | ||||||
This project was a joint effort by Lucas De Oliveira, Chandrish Ambati, and Anish Mukherjee to create song and playlist embeddings for recommendations in a distributed fashion using a 1M playlist dataset by Spotify. | ||||||||||
Dlsa | 33 | 6 months ago | 2 | gpl-3.0 | Python | |||||
Distributed least squares approximation (dlsa) implemented with Apache Spark | ||||||||||
Pyspark Algorithms | 33 | 4 years ago | 2 | other | Python | |||||
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2 | ||||||||||
Spark Xarray | 8 | 6 years ago | 1 | December 06, 2023 | 4 | apache-2.0 | Jupyter Notebook | |||
This is an experimental project that seeks to integrate PySpark and xarray for Climate Data Analysis. | ||||||||||
Databrickstraining | 6 | 5 years ago | gpl-3.0 | Python | ||||||
Repository for Microsoft Databricks Training Events - Hosted by BlueGranite |