Awesome Open Source
Search results for hadoop data engineering
14 search results found
Cookbook
⭐
12,557
The Data Engineering Cookbook
Awesome Opensource Data Engineering
⭐
1,331
An Awesome List of Open-Source Data Engineering Projects
Data Engineering Interview Questions
⭐
554
More than 2,000 data engineer interview questions.
Cascading
⭐
321
Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing workflows on various cluster computing platforms. See https://github.com/cwensel/cascading for access to all WIP branches.
Flowman
⭐
85
Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.
Waimak
⭐
73
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Apachespark
⭐
59
This repository helps you learn Databricks concepts through examples. It covers the important topics a data engineer needs in real-world work. We use PySpark and Spark SQL for development. The course ends with a few case studies.
Spark Distcp
⭐
18
A re-implementation of Hadoop DistCP in Apache Spark
Huemul Bigdatagovernance
⭐
10
Huemul BigDataGovernance is a framework that runs on Spark, Hive, and HDFS. It supports a corporate single-source-of-data strategy based on data governance best practices. It lets you implement tables with Primary Key and Foreign Key control when inserting and updating data through the library, plus validation of nulls, text lengths, maximum/minimum values for numbers and dates, unique values, and default values. It also lets you classify fields by applicability of der
Spooq
⭐
8
Awesome Data Pipeline
⭐
6
An awesome list for data pipelines
Data Engineer Portfolio
⭐
6
This repository showcases my details, skills, and projects, and tracks my progress in data analytics and data science topics.
Sparklyclean
⭐
6
Optimal distributed data deduplication and supervised learning pipeline using Apache Spark
Data Engineering Project With Hdfs And Kafka
⭐
5
Data Engineering Project with Hadoop HDFS and Kafka
Related Searches
Java Hadoop (2,117)
Spark Hadoop (1,188)
Hadoop Hdfs (1,082)
Hadoop Mapreduce (851)
Shell Hadoop (772)
Python Hadoop (761)
Hadoop Hive (703)
Copyright 2018-2024 Awesome Open Source. All rights reserved.