Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Doris | 11,047 | | | | 3 days ago | 8 | September 27, 2023 | 2,332 | apache-2.0 | Java |
Apache Doris is an easy-to-use, high performance and unified analytics database. | ||||||||||
Dagster | 9,467 | | 2 | 133 | 2 months ago | 585 | December 07, 2023 | 2,343 | apache-2.0 | Python |
An orchestration platform for the development, production, and observation of data assets. | ||||||||||
Mage Ai | 6,324 | | | | 2 months ago | 314 | December 06, 2023 | 189 | apache-2.0 | Python |
🧙 The modern replacement for Airflow. Build, run, and manage data pipelines for integrating and transforming data. | ||||||||||
Aws Glue Samples | 1,334 | | | | 5 months ago | | | 37 | mit-0 | Python |
AWS Glue code samples | ||||||||||
Pyspark Example Project | 1,034 | | | | a year ago | | | 11 | | Python |
Example project implementing best practices for PySpark ETL jobs and applications. | ||||||||||
Zingg | 828 | | | | 2 months ago | 1 | June 01, 2022 | 76 | agpl-3.0 | Java |
Scalable identity resolution, entity resolution, data mastering and deduplication using ML | ||||||||||
Goodreads_etl_pipeline | 593 | | | | 4 years ago | | | | mit | Python |
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform. | ||||||||||
Aws Glue Libs | 568 | | | | 8 months ago | | | 96 | other | Python |
AWS Glue Libraries are additions and enhancements to Spark for ETL operations. | ||||||||||
Metorikku | 536 | | | | a year ago | 126 | February 27, 2023 | 65 | mit | Scala |
A simplified, lightweight ETL Framework based on Apache Spark | ||||||||||
Spark Excel | 421 | | 3 | 6 | 2 months ago | 43 | February 22, 2021 | 83 | apache-2.0 | Scala |
A Spark plugin for reading and writing Excel files |