Spark Etl Atlas

A small project showing how to add lineage to Atlas when using Spark as an ETL tool.
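As a rough illustration of what "adding lineage" can involve, the sketch below builds the kind of JSON payload that Atlas's v2 REST API accepts for a lineage edge: a `Process` entity whose `inputs` and `outputs` reference existing tables by `qualifiedName`. This is not taken from the project itself. The helper `build_lineage_process`, the job name, the table names, and the `@cluster` suffix are all hypothetical, and the use of the `hive_table` type is an assumption.

```python
import json


def build_lineage_process(name, qualified_name, inputs, outputs, table_type="hive_table"):
    """Build an Atlas v2 entity payload for a Process linking input and output tables.

    Hypothetical helper for illustration; it only builds a dict and makes no network call.
    """

    def ref(table_qualified_name):
        # Reference an entity already registered in Atlas by its unique qualifiedName.
        return {
            "typeName": table_type,
            "uniqueAttributes": {"qualifiedName": table_qualified_name},
        }

    return {
        "entity": {
            "typeName": "Process",
            "attributes": {
                "name": name,
                "qualifiedName": qualified_name,
                "inputs": [ref(q) for q in inputs],
                "outputs": [ref(q) for q in outputs],
            },
        }
    }


# Hypothetical Spark ETL job that reads one table and writes another.
payload = build_lineage_process(
    name="spark_etl_job",
    qualified_name="spark_etl_job@cluster",
    inputs=["default.raw_events@cluster"],
    outputs=["default.clean_events@cluster"],
)
print(json.dumps(payload, indent=2))
```

A payload of this shape would typically be sent with an authenticated POST to the Atlas server's `/api/atlas/v2/entity` endpoint once the Spark job has finished writing its output. The referenced tables must already exist in Atlas for the references to resolve.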
Alternatives To Spark Etl Atlas
Project Name | Stars | Downloads / Repos Using This / Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description
Doris | 11,243 | - | 3 months ago | 8 | September 27, 2023 | 2,332 | apache-2.0 | Java | Apache Doris is an easy-to-use, high-performance and unified analytics database.
Dagster | 9,467 | 2133 | 5 months ago | 585 | December 07, 2023 | 2,343 | apache-2.0 | Python | An orchestration platform for the development, production, and observation of data assets.
Mage Ai | 6,324 | - | 5 months ago | 314 | December 06, 2023 | 189 | apache-2.0 | Python | 🧙 The modern replacement for Airflow. Build, run, and manage data pipelines for integrating and transforming data.
Aws Glue Samples | 1,334 | - | 8 months ago | - | - | 37 | mit-0 | Python | AWS Glue code samples
Pyspark Example Project | 1,034 | - | 2 years ago | - | - | 11 | - | Python | Example project implementing best practices for PySpark ETL jobs and applications.
Zingg | 828 | - | 5 months ago | 1 | June 01, 2022 | 76 | agpl-3.0 | Java | Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Goodreads_etl_pipeline | 593 | - | 4 years ago | - | - | - | mit | Python | An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Aws Glue Libs | 568 | - | a year ago | - | - | 96 | other | Python | AWS Glue Libraries are additions and enhancements to Spark for ETL operations.
Metorikku | 536 | - | a year ago | 126 | February 27, 2023 | 65 | mit | Scala | A simplified, lightweight ETL Framework based on Apache Spark
Spark Excel | 421 | 36 | 5 months ago | 43 | February 22, 2021 | 83 | apache-2.0 | Scala | A Spark plugin for reading and writing Excel files
Jupyter Notebook, Table, Spark, Hive, Etl