Awesome Open Source
Search results for spark data pipeline
data-pipeline
spark
11 search results found
Dagster
⭐
9,467
An orchestration platform for the development, production, and observation of data assets.
Mage Ai
⭐
6,324
🧙 The modern replacement for Airflow. Build, run, and manage data pipelines for integrating and transforming data.
Mleap
⭐
1,479
MLeap: Deploy ML Pipelines to Production
Zdh_web
⭐
379
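A big data collection and extraction platform. zdh_web is the visual management platform for the zdh family of services, covering data collection, scheduling, permissions, and approvals.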
Scalable Data Science Platform
⭐
153
Content for architecting a data science platform for products using Luigi, Spark & Flask.
Smart Data Lake
⭐
87
Smart Automation Tool for building modern Data Lakes and Data Pipelines
Delta Architecture
⭐
66
Streaming data changes to a Data Lake with Debezium and Delta Lake pipeline
Udacity Data Engineer Nanodegree
⭐
64
Classwork projects and homework completed for the Udacity Data Engineering Nanodegree.
Datapipelines Essentials Python
⭐
45
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
Trembita
⭐
43
Model complex data transformation pipelines easily
Spark Transformers
⭐
37
Spark-Transformers: Library for exporting Apache Spark MLlib models so they can be used in any Java application with no other dependencies.
Debussy_concert
⭐
29
Debussy is an opinionated Data Architecture and Engineering framework, enabling data analysts and engineers to build better platforms and pipelines.
Nebula Exchange
⭐
26
NebulaGraph Exchange is an Apache Spark application to parse data from different sources to NebulaGraph in a distributed environment. It supports both batch and streaming data in various formats and sources including other Graph Databases, RDBMS, Data warehouses, NoSQL, Message Bus, File systems, etc.
Jobanalytics_and_search
⭐
22
JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Spark Movies Etl
⭐
21
Spark data pipeline that ingests and transforms movie ratings data.
Pramen
⭐
20
Resilient data pipeline framework running on Apache Spark
Sparkplug
⭐
20
Spark package to "plug" holes in data using SQL-based rules ⚡️ 🔌
Data Pipeline Project
⭐
18
Data pipeline project
Marshmallow Pyspark
⭐
12
Marshmallow serializer integration with pyspark
Data Paths
⭐
11
Fake Data Pipeline
⭐
10
Data Generators -> Kafka -> Spark Streaming -> PostgreSQL -> Grafana
Awesome Data Pipeline
⭐
6
Awesome list for data pipelines
Udacity Data Engineering Nanodegree
⭐
5
A repository for the files and notebooks produced during my Udacity Data Engineering Nanodegree program.
Related Searches
Scala Spark (3,279)
Python Spark (2,053)
Java Spark (1,587)
Apache Spark (1,207)
Spark Hadoop (1,188)
Jupyter Notebook Spark (1,151)
Spark Kafka (985)
Spark Streaming (817)
Spark Pyspark (812)
Spark Hdfs (573)
Copyright 2018-2024 Awesome Open Source. All rights reserved.