Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Numaflow | 866 | 1 | 4 months ago | 37 | November 03, 2023 | 101 | apache-2.0 | Go | ||
Kubernetes-native platform to run massively parallel data/streaming jobs | ||||||||||
Dataflowjavasdk | 853 | 249 | 14 | 3 years ago | 38 | June 26, 2018 | 54 | |||
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines. | ||||||||||
Fondant | 293 | | | | 4 months ago | 24 | December 12, 2023 | 52 | apache-2.0 | Python |
Production-ready data processing made easy and shareable | ||||||||||
Forte | 215 | 12 | a year ago | 14 | June 29, 2022 | 104 | apache-2.0 | Python | ||
Forte is a flexible and powerful ML workflow builder. This is part of the CASL project: http://casl-project.ai/ | ||||||||||
Batchflow | 195 | | | | 4 months ago | 15 | August 01, 2023 | 33 | apache-2.0 | Python |
BatchFlow helps you conveniently work with random or sequential batches of your data and define data processing and machine learning workflows even for datasets that do not fit into memory. | ||||||||||
Dolphinnext | 92 | | | | 2 years ago | | | 34 | gpl-3.0 | PHP |
A graphical user interface for distributed data processing of high throughput genomics | ||||||||||
Breast Cancer Risk Prediction | 83 | | | | 3 years ago | | | 1 | mit | Jupyter Notebook |
Classification of Breast Cancer diagnosis Using Support Vector Machines | ||||||||||
Sentieon Scripts | 53 | | | | 5 months ago | | | 3 | bsd-2-clause | Shell |
Helper scripts for biological data processing from Sentieon | ||||||||||
Five Dollar Genome Analysis Pipeline | 40 | | | | 4 years ago | | | 4 | bsd-3-clause | wdl |
Workflows used for WGS data processing -- replaced by https://github.com/gatk-workflows/gatk4-genome-processing-pipeline | ||||||||||
Barnard59 | 20 | 1 | 5 | 4 months ago | 16 | June 08, 2022 | 63 | JavaScript | ||
An intuitive and flexible RDF pipeline solution designed to simplify and automate ETL processes for efficient data management. |