Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Data Engineering Howto | 2,662 | 7 months ago | 5 | |||||||
A list of useful resources to learn Data Engineering from scratch | ||||||||||
Serpytor | 18 | 8 months ago | 5 | MIT | Python | |||||
A distributed, low-code, end-to-end data collection and analysis tool for data folks. Take the pain out of data collection from your pipeline! | ||||||||||
Bridgefour | 16 | 2 months ago | Scala | |||||||
Bridge Four is a simple, functional, effectful, single-leader, multi-worker, distributed compute system optimized for embarrassingly parallel workloads. | ||||||||||
Dagger | 9 | 1 | 2 years ago | 11 | September 30, 2021 | Apache-2.0 | Python | |||
Define sophisticated data pipelines with Python and run them on different distributed systems (such as Argo Workflows). | ||||||||||
Sparklyclean | 6 | 3 years ago | MIT | Scala | ||||||
Optimal distributed data deduplication and supervised learning pipeline using Apache Spark |