Awesome Open Source
Search results for python data engineering
247 search results found
De Zoomcamp Project
⭐
10
My personal project for data engineering zoomcamp
Machinelearningfoundations
⭐
10
Repository containing the teaching/learning materials of the Machine Learning Foundations Course by Sumudu Tennakoon
Data Pipeline With Dbt Using Airflow On Gcp
⭐
10
This project demonstrates how to build and automate an ETL pipeline using DAGs in Airflow and load the transformed data into BigQuery. The tools used include Astro, dbt, GCP, Airflow, and Metabase.
Business_closures_de_pipeline
⭐
10
Data Engineering pipeline hosted entirely in the AWS ecosystem utilizing DocumentDB as the database
Data Engineering
⭐
9
Common data manipulations in different languages and frameworks.
Data Engineering Bus Tracker
⭐
9
Data engineering project using the UK Bus Open Data Service (BODS) to calculate late buses in real time for any selected region in England. Built with Prefect, Docker, Terraform, Google Cloud Run, BigQuery and Streamlit.
Cv
⭐
9
Pydag
⭐
9
Scheduling Big Data Workloads and Data Pipelines in the Cloud with pyDag
Cis_households
⭐
9
Data engineering pipeline for the household COVID-19 Infection Survey (CIS)
Data Engineering
⭐
9
This is an all-in-one repository for data engineers, ideal for beginners and interview preparation, with Python as the main programming language alongside MySQL, MongoDB and Docker
Data Engineering
⭐
9
Wraps the DB by opening a REST API for storing and retrieving documents info & recommendations
Evolving_basic
⭐
9
Contains basic material (data structures, algorithms, Cracking coding Interview Q&A, etc.) for data engineers.
Dagger
⭐
9
Define sophisticated data pipelines with Python and run them on different distributed systems (such as Argo Workflows).
Predicting Retail Churn With Azure Ml Studio
⭐
9
Challenge to job: Data Scientist
Faizs Data Portofolio
⭐
9
This documentation is like a quick snapshot of my project in the data field, showing off my skills and know-how in this area.
Fastapi Your Data
⭐
8
Development of a modular and scalable backend template for Cloud data exposure using FastAPI, SQLAlchemy and Alembic
Prefect Alert
⭐
8
A decorator that sends an alert when a Prefect flow fails
Pyspark Template
⭐
8
A Python PySpark project with Poetry
Pyverse Exploring Python Frameworks
⭐
8
This repository is the ultimate guide to exploring and mastering Python libraries and frameworks: a collection of code and guides by me, Tushar!
Data Engineering Onboarding Starter
⭐
8
This repository contains a 10 step program to enter the world of Data Engineering
Spooq
⭐
8
Pydantic Benchmarks
⭐
8
Benchmarks for newer versions of Pydantic v2, written in Rust 🦀
Data Science Bootcamp
⭐
8
Scripts for some exercises and mini-projects in the context of DIT's 400h Bootcamp on Python, Data Visualisation, Web Scraping, Natural Language Processing, Data Engineering, Machine Learning, and Deep Learning
Clusterless
⭐
8
Clusterless is a tool for scheduling decentralized, scalable, and secure data pipelines for continuously arriving data, across clouds.
Data Engineering
⭐
7
Code for my blogs on Data Engineering
Soda Github Action
⭐
7
⚡ Prevent downstream data quality issues by integrating the Soda Library into your CI/CD pipeline.
Allstate Claims Severity
⭐
7
Udacity Machine Learning Engineer Nanodegree capstone proposal.
Mojap Arrow Pd Parser
⭐
7
Conforms pandas to "correct" datatypes to ensure data in/out using CSV, JSONL and Parquet is read the same (using arrow).
Data Engineering Interviews
⭐
7
Data engineering interview Q&A for the data community, by the data community
Au Azure Databricks
⭐
7
NHSX AU-Data Engineering - Azure Databricks Analytics
Final Project End To End Banking Campaign Pipeline
⭐
7
Final Project for IYKRA Data Fellowship 8 Program, creating an end-to-end banking campaign pipeline using lambda architecture (providing access to batch and stream processing)
Common_datasets
⭐
7
Common-datasets is a GitHub repository dedicated to providing a wide collection of common datasets for practicing and learning data science and machine learning.
Batchdata
⭐
7
Batch data processing with Luigi: a 90-minute workshop at PyCon Balkan 2018, Belgrade.
Data Engineer Challenge
⭐
7
Challenge Data Engineer
Stock Price Prediction Spark Cassandra
⭐
6
This is a data pipeline for predicting stock prices using Apache Spark, Apache Cassandra, and machine learning techniques. It collects and preprocesses stock data from Alpha Vantage API, engineers features, trains models, and performs data analysis and predictions.
Data Engineer Portfolio
⭐
6
This is a repository to demonstrate my details, skills, projects and to keep track of my progression in Data Analytics and Data Science topics.
Babbling.fish
⭐
6
My personal blog about Data Engineering. Powered by Gatsby.
Data Engineering Bootcamp
⭐
6
Data Engineering Bootcamp
Dataengineering Youtube Project
⭐
6
Data Engineering Youtube Project
Machinealgobox
⭐
6
Explore common ML algorithms, from implementations written from scratch to real-world use cases. Each algorithm is accompanied by clear explanations, code implementations, and real-world use cases, enabling you to grasp its underlying principles and apply it to different problem domains.
Unblind
⭐
6
Project for the Datatón Anticorrupción 2022, by Dataket 🔥
Data Careers Handbook 2024
⭐
6
Data Career Handbook for all
Realtime Market Data Pipeline
⭐
6
A real-time financial data streaming pipeline and visualization platform using Apache Kafka, Cassandra, and Bokeh.
Porto Seguro Safe Driver Prediction
⭐
6
Predict if a driver will file an insurance claim next year. (Kaggle Competition)
Mojap Metadata
⭐
6
Schema definitions and management of our metadata used by the Data Engineering Team at MoJ
Spark Databricks
⭐
6
🔥 Master Apache Spark & Databricks! Dive into a world of big data with exclusive insights from Udemy courses, personal notes, and practical guides. Whether you're starting out or scaling new heights in data engineering, this is your ultimate resource hub! 🌟🚀
Datacrafter
⭐
6
NoSQL extract, transform, load (ETL) toolkit with Python
Data.engineers.lunch
⭐
6
Resources from weekly Zoom lunches revolving around Data Engineering. Hosted by Anant Corporation.
Batchsat
⭐
6
This project aims to build software that can download Sentinel-2 satellite images in batch for data analysis.
Route1io Python Connectors
⭐
5
Connectors for interacting with popular APIs and services used in marketing analytics via clean and concise Python code.
Data_infra_repo
⭐
5
Collections of POC/dev data infrastructure. | #SE
Gcp Airflow Foundations
⭐
5
Opinionated framework based on Airflow 2.0 for building pipelines to ingest data into a BigQuery data warehouse
Wbz
⭐
5
A parallel implementation of the bzip2 data compressor in Python. The compression pipeline uses algorithms such as the Burrows–Wheeler transform (BWT) and move-to-front (MTF) to improve Huffman compression. For now, the tool focuses only on compressing .csv files and other files in tabular format.
Spark Structured Streaming Kafka
⭐
5
Spark Structured Streaming + Kafka + Delta pipeline.
Chartai
⭐
5
A Streamlit-powered GPT-3 application that lets you chat with tabular data. In addition to AI chart creation, it also provides insights.
Data Engineering Project With Hdfs And Kafka
⭐
5
Data Engineering Project with Hadoop HDFS and Kafka
Stock Market Real Time Data Pipeline With Apache Kafka And Cassandra
⭐
5
An end-to-end real-time stock market data pipeline with Python, AWS EC2, Apache Kafka, and Cassandra. Data is processed on AWS EC2 with Apache Kafka and stored in a local Cassandra database.
Udacity_nanodegree Data_analyst
⭐
5
Udacity Nanodegree - Data Analyst - Wrangling, Exploring, Analyzing, and Visualizing Data
Workspace
⭐
5
This repository provides containerized applications and microservices for the Information Systems and Databases Course @ Instituto Superior Técnico
Predict Which Customers A Call Center Should Contact
⭐
5
Predict which customers a call center should call for greater assertiveness in a sale
Leauge Of Legends Challenger Stats
⭐
5
League of Legends Challenger Stats
Codepack
⭐
5
CodePack - A Python package to easily make, run, and manage workflows
Dados Artigo Sbse 2023
⭐
5
This repository gathers the data used to develop the simulations, as well as the results obtained, presented in the article "Uma Avaliação em Série-Temporal Quase-Estática da Capacidade de Hospedagem de Geração FV em Redes de Distribuição", submitted to SBSE 2023.
Udacity Data Engineering Nanodegree
⭐
5
This is a repository to hold the files and notebooks produced throughout my Udacity Data Engineering Nanodegree program.
Rawbuilder
⭐
5
an elegant datasets factory
Prefecto
⭐
5
Library of Prefect tasks and utilities.
Data_linter
⭐
5
Docker image used to automatically validate data
Aiscalator
⭐
5
Tools to streamline Jupyter Notebook Prototypes into robust Data Products
Ds001 Scraping To Analysis Extra Store
⭐
5
✨ The current project is a basic process pipeline for extraction, transformation, loading, analysis and presentation. All of this was done using appropriate web scraping, data analysis/presentation and database tools.
Analytics_data_where_house
⭐
5
An analytics engineering sandbox focusing on real estate prices in Cook County, IL
Movie Recommendation Als Spark
⭐
5
EC-JANE Entertainment launches an innovative film recommendation project utilizing MovieLens data. This project aims to incorporate Big Data analysis and machine learning to enhance movie suggestions, leveraging Apache Spark, Elasticsearch, and a Flask API to provide a personalized and dynamic user experience.
201-247 of 247 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.