Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
---|---|---|---|---|---|---|---|---|---|---|---|
Data Engineering Zoomcamp | 13,734 | | | | 8 days ago | | | 48 | | Jupyter Notebook | Free Data Engineering course! |
Prefect | 12,096 | 1 | 70 | | 3 hours ago | 162 | July 05, 2022 | 518 | apache-2.0 | Python | The easiest way to orchestrate and observe your data pipelines |
Lakefs | 3,430 | 1 | | | 4 hours ago | 62 | June 15, 2022 | 566 | apache-2.0 | Go | lakeFS - Data version control for your data lake \| Git for data |
Everything Tech | 372 | | | | a year ago | | | | apache-2.0 | Go | A collection of online resources to help you on your Tech journey. |
Dataplane | 129 | | | | 2 months ago | | | 33 | other | JavaScript | Dataplane is an Airflow inspired data platform with additional data mesh capability to automate, schedule and design data pipelines and workflows. Dataplane is written in Golang with a React front end. |
Movalytics Data Warehouse | 74 | | | | 3 years ago | | | | | Python | Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow |
Cowait | 54 | | | | 3 months ago | 53 | April 01, 2022 | 39 | apache-2.0 | Python | Containerized distributed programming framework for Python |
Towardsdataengineering | 52 | | | | 4 months ago | | | 7 | | Python | This repo contains commands that data engineers use in day to day work. |
Camelboilerplate | 42 | | | | 2 years ago | | | | | Java | A Spring Boot Camel boilerplate that aims to consume events from Apache Kafka, process it and send to a PostgreSQL database. |
Rtdl | 39 | | | | 8 months ago | | | | mit | Go | rtdl makes it easy to build and maintain a real-time data lake |
Join the #course-data-engineering channel

Syllabus
All the materials of the course are freely available, so you can take the course at your own pace.
The best way to get support is to use DataTalks.Club's Slack. Join the #course-data-engineering channel.
To make discussions in Slack more organized:
Note: NYC TLC changed the format of the data we use to Parquet, but you can still access the CSV files here.
Putting everything we learned into practice
To get the most out of this course, you should feel comfortable with coding and the command line and know the basics of SQL. Prior experience with Python will be helpful, but you can pick up Python relatively fast if you have experience with other programming languages.
Prior experience with data engineering is not required.
For this course, you'll need to have the following software installed on your computer:
See Week 1 for more details about installing these tools.
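If you want a quick way to confirm that the required command-line tools are on your PATH before starting, something like the sketch below may help. It relies only on the standard library's `shutil.which`. The helper name `missing_tools` and the command names in the example list are placeholders and not taken from the course materials, so substitute the tools the course actually asks for.

```python
import shutil


def missing_tools(commands):
    """Return the commands from `commands` that are not found on PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


if __name__ == "__main__":
    # Placeholder command names (an assumption): replace them with the
    # tools listed for the course.
    required = ["docker", "python3", "terraform"]
    missing = missing_tools(required)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

This only checks that an executable with that name can be found. It does not verify versions or that the tool is configured correctly.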
Thanks to the course sponsors for making it possible to create this course.
Do you want to support our course and our community? Please reach out to [email protected]