Awesome Bigdata

A curated list of awesome big data frameworks, resources and other awesomeness.
Alternatives To Awesome Bigdata
| Project Name | Stars | Downloads / Repos Using This / Packages Using This / Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
|---|---|---|---|---|---|---|---|---|
| Data Science Ipython Notebooks | 25,668 | 6 months ago | | | 34 | other | Python | Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines. |
| Awesome Bigdata | 12,759 | 2 months ago | | | 38 | mit | | A curated list of awesome big data frameworks, resources and other awesomeness. |
| Trino | 9,118 | 293 months ago | 83 | November 30, 2023 | 2,496 | apache-2.0 | Java | Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io) |
| Vaex | 8,161 | 229a month ago | 69 | July 21, 2023 | 508 | mit | Python | Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀 |
| Catboost | 7,564 | 123 months ago | 20 | September 19, 2023 | 539 | apache-2.0 | Python | A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU. |
| H2o 3 | 6,618 | 62333 months ago | 49 | August 09, 2023 | 2,746 | apache-2.0 | Jupyter Notebook | H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc. |
| Pachyderm | 6,035 | 13 months ago | 613 | December 04, 2023 | 897 | apache-2.0 | Go | Data-Centric Pipelines and Data Versioning |
| Feast | 5,053 | 283 months ago | 116 | September 07, 2023 | 149 | apache-2.0 | Python | Feature Store for Machine Learning |
| Synapseml | 4,960 | 67 days ago | 12 | November 27, 2023 | 335 | mit | Scala | Simple and Distributed Machine Learning |
| Koalas | 3,291 | 1167 months ago | 47 | October 19, 2021 | 112 | apache-2.0 | Python | Koalas: pandas API on Apache Spark |