| Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Airbyte | 10,071 | | | | 10 hours ago | 90 | June 23, 2022 | 4,443 | other | Python | Data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes. |
| Dagster | 6,947 | | 2 | 89 | 12 hours ago | 495 | July 06, 2022 | 1,588 | apache-2.0 | Python | An orchestration platform for the development, production, and observation of data assets (see the first sketch after this table). |
| Benthos | 5,865 | | | 4 | a day ago | 518 | August 10, 2022 | 338 | mit | Go | Fancy stream processing made operationally mundane. |
| Cloudquery | 4,244 | | | 6 | 6 hours ago | 241 | August 14, 2022 | 181 | mpl-2.0 | Go | The open-source, high-performance data integration platform built for developers. |
| Mage Ai | 3,691 | | | | 20 hours ago | 9 | June 27, 2022 | 54 | apache-2.0 | Python | 🧙 The modern replacement for Airflow: build, run, and manage data pipelines for integrating and transforming data. |
| Aws Sdk Pandas | 3,374 | | | 34 | a day ago | 125 | June 28, 2022 | 53 | apache-2.0 | Python | pandas on AWS - easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatch Logs, DynamoDB, EMR, Secrets Manager, PostgreSQL, MySQL, SQL Server, and S3 (Parquet, CSV, JSON, and Excel); see the second sketch after this table. |
| Kestra | 3,244 | | | | a day ago | 28 | August 30, 2022 | 142 | apache-2.0 | Java | Kestra is an infinitely scalable orchestration and scheduling platform for creating, running, scheduling, and monitoring millions of complex pipelines. |
| Incubator Devlake | 1,976 | | | | 13 hours ago | 79 | August 26, 2022 | 122 | apache-2.0 | Go | Apache DevLake is an open-source dev data platform that ingests, analyzes, and visualizes fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth. |
| Pyspark Example Project | 1,034 | | | | 4 months ago | | | 11 | | Python | Example project implementing best practices for PySpark ETL jobs and applications. |
| Hamilton | 894 | | | | a month ago | 21 | July 03, 2022 | 12 | bsd-3-clause-clear | Python | A scalable, general-purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton |
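Two of the Python entries above are API-first libraries, so short sketches may help. First, Dagster's software-defined assets: a minimal sketch assuming only the `dagster` package; the asset names and values are invented for illustration, not taken from Dagster's docs.

```python
from dagster import asset, materialize


@asset
def raw_numbers() -> list:
    # Source asset; a real pipeline would pull from an API, file, or table.
    return [1, 2, 3]


@asset
def total(raw_numbers: list) -> int:
    # Dagster infers the dependency on raw_numbers from the parameter name.
    return sum(raw_numbers)


if __name__ == "__main__":
    # Materialize both assets in dependency order.
    materialize([raw_numbers, total])
```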
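Second, Aws Sdk Pandas (imported as `awswrangler`): a minimal sketch of the S3-and-Athena round trip its description advertises. The bucket, Glue database, and table names are placeholders for your own resources.

```python
import awswrangler as wr
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "name": ["foo", "bar"]})

# Write the frame to S3 as Parquet and register it in the Glue catalog.
# "my-bucket" and "my_db" are placeholders.
wr.s3.to_parquet(
    df=df,
    path="s3://my-bucket/my_table/",
    dataset=True,
    database="my_db",
    table="my_table",
)

# Read it back through Athena as a regular pandas DataFrame.
result = wr.athena.read_sql_query("SELECT * FROM my_table", database="my_db")
```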
Recap is a metadata toolkit written in Python. It reads and converts schemas in dozens of formats, including Parquet, Protocol Buffers, Avro, JSON Schema, BigQuery, Snowflake, and PostgreSQL, and it can generate `CREATE TABLE` DDL from schemas for popular database SQL dialects.

Install it with pip:

```bash
pip install recap-core
```
Read schemas from objects:

```python
s = from_proto(message)  # message is a Protocol Buffers message instance
```

Or files:

```python
s = schema("s3://corp-logs/2022-03-01/0.json")
```

Or databases:

```python
s = schema("snowflake://ycbjbzl-ib10693/TEST_DB/PUBLIC/311_service_requests")
```

And convert them to other formats:

```python
to_json_schema(s)
```
```json
{
  "type": "object",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "properties": {
    "id": {
      "type": "integer"
    },
    "name": {
      "type": "string"
    }
  },
  "required": [
    "id"
  ]
}
```
Or even generate `CREATE TABLE` statements:

```python
s = schema("/tmp/data/file.json")
to_ddl(s, "my_table", dialect="snowflake")
```

which produces:

```sql
CREATE TABLE "my_table" (
  "col1" BIGINT,
  "col2" STRUCT<"col3" VARCHAR>
)
```
See the Quickstart page to get started.
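Putting the pieces together, a minimal end-to-end sketch. Note that the import line is an assumption: the snippets above use `schema`, `to_json_schema`, and `to_ddl` as if they were top-level helpers but never show where they come from, so this sketch imports them from the package root; check the Quickstart for the actual import paths.

```python
# Assumed import path; the snippets above imply top-level helpers.
from recap import schema, to_ddl, to_json_schema

# Read a schema from a JSON file (the path is a placeholder)...
s = schema("/tmp/data/file.json")

# ...convert it to JSON Schema...
print(to_json_schema(s))

# ...and emit Snowflake CREATE TABLE DDL for the same structure.
print(to_ddl(s, "my_table", dialect="snowflake"))
```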