Data Engineering Howto

A list of useful resources to learn Data Engineering from scratch
Alternatives To Data Engineering Howto
Data Engineering Howto
  Stars: 2,949 | Most recent commit: 6 months ago | Open issues: 4
  A list of useful resources to learn Data Engineering from scratch

Serpytor
  Stars: 18 | Most recent commit: a year ago | Open issues: 5 | License: MIT | Language: Python
  A distributed, low-code, end-to-end data collection and analysis tool for data folks. Take the pain out of data collection from your pipeline!

Bridgefour
  Stars: 16 | Most recent commit: 9 months ago | Language: Scala
  Bridge Four is a simple, functional, effectful, single-leader, multi-worker, distributed compute system optimized for embarrassingly parallel workloads.

Dagger
  Stars: 91 | Most recent commit: 2 years ago | Total releases: 11 | Latest release: September 30, 2021 | License: Apache-2.0 | Language: Python
  Define sophisticated data pipelines with Python and run them on different distributed systems (such as Argo Workflows).

Sparklyclean
  Stars: 6 | Most recent commit: 4 years ago | License: MIT | Language: Scala
  Optimal distributed data deduplication and supervised learning pipeline using Apache Spark