Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
---|---|---|---|---|---|---|---|---|---|---|---|
Data Engineering Zoomcamp | 19,461 | | | | 3 months ago | | | 27 | | Jupyter Notebook | Free Data Engineering course! |
Cookbook | 12,557 | | | | 4 months ago | | | 111 | apache-2.0 | | The Data Engineering Cookbook |
Dagster | 9,467 | | 2 | 133 | 3 months ago | 585 | December 07, 2023 | 2,343 | apache-2.0 | Python | An orchestration platform for the development, production, and observation of data assets. |
Mage Ai | 6,324 | | | | 3 months ago | 314 | December 06, 2023 | 189 | apache-2.0 | Python | 🧙 The modern replacement for Airflow. Build, run, and manage data pipelines for integrating and transforming data. |
Risingwave | 5,799 | | | | 3 months ago | 14 | December 07, 2023 | 1,010 | apache-2.0 | Rust | The distributed streaming database. Engineered to offer the simplest and most cost-efficient way for stream processing and management. |
Awesome Opensource Data Engineering | 1,331 | | | | 4 months ago | | | 9 | other | | An Awesome List of Open-Source Data Engineering Projects |
Pyspark Example Project | 1,034 | | | | a year ago | | | 11 | | Python | Example project implementing best practices for PySpark ETL jobs and applications. |
Around Dataengineering | 926 | | | | 2 years ago | | | 2 | | Python | A Data Engineering & Machine Learning Knowledge Hub |
Zingg | 828 | | | | 3 months ago | 1 | June 01, 2022 | 76 | agpl-3.0 | Java | Scalable identity resolution, entity resolution, data mastering and deduplication using ML |
Blaze | 784 | | | | 3 months ago | | | 14 | apache-2.0 | Rust | Blazing-fast query execution engine speaks Apache Spark language and has Arrow-DataFusion at its core. |