Project Name	Stars	Repos Using This	Packages Using This	Most Recent Commit	Total Releases	Latest Release	Open Issues	License	Language
Spark	37,661	2,394	939	3 months ago	46	May 09, 2021	186	apache-2.0	Scala
Apache Spark - A unified analytics engine for large-scale data processing
Cookbook	12,557			4 months ago			111	apache-2.0
The Data Engineering Cookbook
God Of Bigdata	8,483			9 months ago			3
Focused on big data learning and interview preparation; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
Iceberg	5,179			3 months ago	3	October 29, 2022	1,485	apache-2.0	Java
Apache Iceberg
Bigdl	4,728		10	3 months ago	16	April 19, 2021	958	apache-2.0	Jupyter Notebook
Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using bigdl-llm
Sparkinternals	4,665			3 years ago			27
Notes talking about the design and implementation of Apache Spark
Tensorflowonspark	3,851	5		10 months ago	32	April 21, 2022	13	apache-2.0	Python
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
Spark Nlp	3,578		30	3 months ago	134	December 08, 2023	43	apache-2.0	Scala
State of the Art Natural Language Processing
Roaringbitmap	3,308	435	124	3 months ago	187	September 22, 2023	89	apache-2.0	Java
A better compressed bitset in Java: used by Apache Spark, Netflix Atlas, Tablesaw, and many others
Koalas	3,291	1	16	7 months ago	47	October 19, 2021	112	apache-2.0	Python
Koalas: pandas API on Apache Spark

Alternatives To Awesome Spark

Spark ⭐ 37,661

Apache Spark - A unified analytics engine for large-scale data processing

dependent packages 939; total releases 46; most recent commit 3 months ago

Cookbook ⭐ 12,557

The Data Engineering Cookbook

most recent commit 4 months ago

God Of Bigdata ⭐ 8,483

Focused on big data learning and interview preparation; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive.

most recent commit 9 months ago

Iceberg ⭐ 5,179

Apache Iceberg

total releases 3; most recent commit 3 months ago

Bigdl ⭐ 4,728

Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using bigdl-llm

dependent packages 10; total releases 16; most recent commit 3 months ago

Sparkinternals ⭐ 4,665

Notes talking about the design and implementation of Apache Spark

most recent commit 3 years ago

Tensorflowonspark ⭐ 3,851

TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.

total releases 32; most recent commit 10 months ago

Spark Nlp ⭐ 3,578

State of the Art Natural Language Processing

dependent packages 30; total releases 134; most recent commit 3 months ago

Roaringbitmap ⭐ 3,308

A better compressed bitset in Java: used by Apache Spark, Netflix Atlas, Tablesaw, and many others

dependent packages 124; total releases 187; most recent commit 3 months ago

Koalas ⭐ 3,291

Koalas: pandas API on Apache Spark

dependent packages 16; total releases 47; most recent commit 7 months ago

Popular Spark Projects

Data Science Ipython Notebooks ⭐ 25,668

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

most recent commit 7 months ago

Redash ⭐ 24,479

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

dependent packages 3; total releases 2; latest release May 05, 2020; most recent commit 3 months ago

Docker_practice ⭐ 23,279

Learn and understand Docker&Container technologies, with real DevOps practice!

total releases 9; latest release December 01, 2021; most recent commit 4 months ago

Data Engineering Zoomcamp ⭐ 19,461

Free Data Engineering course!

most recent commit 3 months ago

Bigdata Notes ⭐ 14,872

A beginner's guide to big data :star:

most recent commit 4 months ago

Popular Apache Projects

Superset ⭐ 58,778

Apache Superset is a Data Visualization and Data Exploration Platform

dependent packages 21; total releases 6; latest release April 18, 2023; most recent commit 3 days ago

Echarts ⭐ 58,775

Apache ECharts is a powerful, interactive charting and data visualization library for browser

dependent packages 6,345; total releases 119; latest release July 18, 2023; most recent commit 18 days ago

Awesome Cpp ⭐ 53,034

A curated list of awesome C++ (or C) frameworks, libraries, resources, and shiny things. Inspired by awesome-... stuff.

most recent commit 3 months ago

Awesome Android Ui ⭐ 47,955

A curated list of awesome Android UI/UX libraries

most recent commit 5 months ago

Acme.sh ⭐ 35,147

A pure Unix shell script implementing ACME client protocol

most recent commit 3 months ago

Popular Data Processing Categories