Awesome Open Source
Search results for docker etl
44 search results found
Orchest
⭐
3,876
Build data pipelines, the easy way 🛠️
Open Semantic Search
⭐
741
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Redun
⭐
464
Yet another redundant workflow engine
Abc
⭐
455
Power of appbase.io via CLI, with nifty imports from your favorite data sources
Neosync
⭐
413
A developer-first way to create high-fidelity synthetic data or anonymize sensitive data and sync it across all environments for testing, fine-tuning or model training.
Smooks
⭐
377
Extensible data integration Java framework for building XML and non-XML fragment-based applications
Beginner_de_project
⭐
276
Beginner data engineering project - batch edition
Usaspending Api
⭐
273
Server application to serve U.S. federal spending data via a RESTful API
Etl
⭐
135
LinkedPipes ETL is an RDF based, lightweight ETL tool
Aws Ecs Airflow
⭐
110
Run Airflow in AWS ECS(Elastic Container Service) using Fargate tasks
Deployml_course
⭐
93
Repository for the open course "Running Machine Learning Models in Production"
Openrefine Batch
⭐
70
Shell script to run OpenRefine in batch mode (import, transform, export). It orchestrates OpenRefine (server) and a python client that communicates with the OpenRefine API.
Discreetly
⭐
70
ETLy is an add-on dashboard service on top of Apache Airflow.
Openrefine Client
⭐
57
The OpenRefine Python Client from Paul Makepeace provides a library for communicating with an OpenRefine server. This fork extends the command line interface (CLI) and is distributed as a convenient one-file-executable (Windows, Linux, Mac). It is also available via Docker Hub, PyPI and Binder.
Etl Light
⭐
38
A light Kafka to HDFS/S3 ETL library based on Apache Spark
Ethereum_analytical_db
⭐
25
Ethereum Analytical Database - Ethereum data access solution that can be used for analytics and application development. The solution works on a fast DB - Clickhouse.
Hotsub
⭐
23
Command line tool to run batch jobs concurrently with ETL framework on AWS or other cloud computing resources
Aws Glue Docker
⭐
22
🐋 Docker image for AWS Glue Spark/Python
Aws_glue_etl_docker
⭐
20
Helper library to run AWS Glue ETL scripts in a Docker container for local testing and development in a Jupyter notebook
Airflow Tutorial
⭐
19
Tutorial-style code showing how to deploy Airflow using Docker and how to use the DockerOperator.
Openrefine Docker
⭐
19
OpenRefine is a free, open source power tool for working with messy data and improving it. This repository contains Dockerbuild files for automated builds.
Tap Airbyte Wrapper
⭐
17
A Singer tap that wraps Airbyte sources allowing them to be consumed by Singer targets
Covid News
⭐
16
A data engineering personal project for applying some of my skills
Ghcn D
⭐
14
Data Pipeline from the Global Historical Climatology Network DataSet
Atd Data Publishing
⭐
13
Python scripts for Austin Transportation's ETL tasks
Tezos Etl
⭐
12
Python scripts for ETL (extract, transform and load) jobs for Tezos blocks, balance updates, and operations
Rhizome
⭐
12
Generic ETL Engine / Visualization Framework for Polio and Beyond!
Airflowjob
⭐
11
Airflow POC demo : 1) env set up 2) airflow DAG 3) Spark/ML pipeline | #DE
Nifi
⭐
10
Production Grade Nifi & Nifi Registry. Deploy for VM (Virtual Machine) with Terraform + Ansible, Helm & Helmfile for Kubernetes (EKS)
Greatex
⭐
10
A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in Airflow.
Diem
⭐
10
DIEM Data Integration Engine Multipurpose
Covalent Awsbatch Plugin
⭐
10
Executor plugin interfacing Covalent with AWS Batch
Covalent Kubernetes Plugin
⭐
9
Executor plugin interfacing Covalent with Kubernetes
Data Engineering
⭐
9
This is an all-in-one repository for Data Engineers, ideal for beginners & interview preparation, which uses Python as the main programming language and incorporates MySQL, MongoDB and Docker
Etl Rest Server
⭐
9
This project hosts scripts to generate flat tables used for reporting purposes.
Bigdatawarehouse
⭐
8
Coronalytics
⭐
7
A full ETL pipeline to visualize the coronavirus outbreak.
Covalent Ecs Plugin
⭐
7
Executor plugin interfacing Covalent with Amazon ECS Fargate
Orbyter Demo
⭐
7
30daysofairflow
⭐
7
30 Days of Airflow
Elt_docker_deploy
⭐
5
Airflow Etl Mssql Sample
⭐
5
Airflow ETL MS SQL Sample Project
Stock_streaming_pipeline_project
⭐
5
Built a real-time streaming pipeline to extract stock data, using Apache Nifi, Debezium, Kafka, and Spark Streaming. Loaded the transformed data into Glue database and created real-time dashboards using Power BI and Tableau with Athena. The pipeline is orchestrated using Airflow.
Celo Etl
⭐
5
Python scripts for ETL (extract, transform and load) jobs for Celo blockchain blocks, transactions and more coming.