Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Data Science IPython Notebooks | 25,668 | | | | a year ago | | | 34 | other | Python |
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines. | ||||||||||
Trino | 9,118 | 29 | a year ago | 83 | November 30, 2023 | 2,496 | apache-2.0 | Java | ||
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io) | ||||||||||
Vaex | 8,309 | 2 | 29 | 6 months ago | 69 | July 21, 2023 | 508 | mit | Python | |
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀 | ||||||||||
CatBoost | 7,564 | 12 | a year ago | 20 | September 19, 2023 | 539 | apache-2.0 | Python | ||
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU. | ||||||||||
Pachyderm | 6,035 | 1 | a year ago | 613 | December 04, 2023 | 897 | apache-2.0 | Go | ||
Data-Centric Pipelines and Data Versioning | ||||||||||
Feast | 5,770 | 28 | 2 months ago | 116 | September 07, 2023 | 149 | apache-2.0 | Python | ||
The Open Source Feature Store for Machine Learning | ||||||||||
SynapseML | 5,108 | 6 | 13 days ago | 12 | November 27, 2023 | 335 | mit | Scala | ||
Simple and Distributed Machine Learning | ||||||||||
Koalas | 3,291 | 1 | 16 | 2 years ago | 47 | October 19, 2021 | 112 | apache-2.0 | Python | |
Koalas: pandas API on Apache Spark | ||||||||||
Data Science Roadmap | 2,445 | | | | a year ago | | | 3 | mit | |
Data Science Roadmap from A to Z | ||||||||||
ArcticDB | 1,614 | 3 | 2 months ago | 35 | December 07, 2023 | 260 | other | C++ | ||
ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem. |