Awesome Open Source
Search results for etl apache spark
18 search results found
Goodreads_etl_pipeline ⭐ 593
An end-to-end GoodReads data pipeline for building a data lake, data warehouse, and analytics platform.
Cuelake ⭐ 266
Use SQL to build ELT pipelines on a data lakehouse.
Hydrograph ⭐ 138
A visual ETL development and debugging tool for big data
Cobrix ⭐ 131
A COBOL parser and Mainframe/EBCDIC data source for Apache Spark
Flowman ⭐ 85
Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.
Spark ⭐ 65
Open Source D-APM (Data-Application Performance Monitoring) for Apache Spark
Spark Etl ⭐ 62
Apache Spark based ETL Engine
Apachespark ⭐ 59
This repository helps you learn Databricks concepts through examples. It covers the important topics a data engineer needs in real-world work, using PySpark and Spark SQL for development. The course ends with a few case studies.
Datapipelines Essentials Python ⭐ 45
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, plus SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
Spark Ref Architecture ⭐ 38
Reference Architectures for Apache Spark
Daflow ⭐ 24
An Apache Spark-based data flow (ETL) framework that supports multiple read sources and write destinations of different types, as well as multiple categories of transformation rules.
Sql Based Etl With Apache Spark On Amazon Eks ⭐ 23
A solution that provides declarative data processing and automated workflow orchestration, helping business users (such as analysts and data scientists) access their data and create meaningful insights without manual IT processes.
Sparklanes ⭐ 16
A lightweight data processing framework for Apache Spark
Datacooker Etl ⭐ 10
Data transformation framework for ETL processing with SQL-like syntax and GIS extensions, based on Apache Spark
Apache Spark Etl Pipeline Example ⭐ 8
Demonstration of using Apache Spark to build robust ETL pipelines while taking advantage of open source, general purpose cluster computing.
Spark Etl Framework ⭐ 7
A generic ETL framework built on Spark SQL for transforming data by constructing pipelines with YAML/JSON/XML.
Spark Databricks ⭐ 6
🔥 Master Apache Spark & Databricks! Dive into a world of big data with exclusive insights from Udemy courses, personal notes, and practical guides. Whether you're starting out or scaling new heights in data engineering, this is your ultimate resource hub! 🌟🚀
Stock_streaming_pipeline_project ⭐ 5
A real-time streaming pipeline that extracts stock data using Apache NiFi, Debezium, Kafka, and Spark Streaming. The transformed data is loaded into a Glue database, with real-time dashboards built in Power BI and Tableau on top of Athena. The pipeline is orchestrated using Airflow.
Spark Structured Streaming Kafka ⭐ 5
Spark Structured Streaming + Kafka + Delta pipeline.
Copyright 2018-2024 Awesome Open Source. All rights reserved.