Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Synapseml | 4,967 | 6 | 10 days ago | 12 | November 27, 2023 | 335 | mit | Scala | ||
Simple and Distributed Machine Learning | ||||||||||
Machine Learning | 2,607 | | | | 4 months ago | | | 6 | mit | HTML |
:earth_americas: machine learning tutorials (mainly in Python3) | ||||||||||
Spark Py Notebooks | 1,515 | | | | a year ago | | | 9 | other | Jupyter Notebook |
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks | ||||||||||
Optimus | 1,446 | | | | 17 days ago | 32 | June 19, 2022 | 29 | apache-2.0 | Python |
:truck: Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark | ||||||||||
Hopsworks | 1,041 | | | | 3 months ago | 1 | September 11, 2019 | 12 | agpl-3.0 | Java |
Hopsworks - Data-Intensive AI platform with a Feature Store | ||||||||||
Pyspark Example Project | 1,034 | | | | a year ago | | | 11 | | Python |
Example project implementing best practices for PySpark ETL jobs and applications. | ||||||||||
Kuwala | 610 | | | | 2 years ago | | | 22 | apache-2.0 | JavaScript |
Kuwala is a no-code data platform for BI analysts and engineers that lets you build powerful analytics workflows. We have set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, and Great Expectations, together in one intuitive interface built with React Flow. We also feed third-party data into data science models and products, with a focus on geospatial data. The following data connectors are currently available worldwide: a) high-resolution demographics data, b) points of interest from OpenStreetMap, c) Google Popular Times | ||||||||||
Pandapy | 483 | | | | 3 years ago | 22 | January 25, 2020 | 2 | | Python |
PandaPy has the speed of NumPy and the usability of Pandas, and is 10x to 50x faster (by @firmai) | ||||||||||
Datacompy | 339 | 10 | 3 months ago | 20 | November 15, 2023 | 16 | apache-2.0 | Python | ||
Pandas and Spark DataFrame comparison for humans and more! | ||||||||||
Sk Dist | 283 | 2 | a year ago | 12 | May 14, 2020 | 8 | apache-2.0 | Python | ||
Distributed scikit-learn meta-estimators in PySpark |