Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Petastorm | 1,693 | 8 | 6 months ago | 86 | February 03, 2023 | 174 | apache-2.0 | Python | ||
The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark and can be used from pure Python code. | ||||||||||
Tech.ml.dataset | 616 | | | | 4 months ago | 251 | January 05, 2021 | 10 | epl-1.0 | Clojure |
A high-performance data processing system for Clojure | ||||||||||
Skale | 398 | 2 | 2 | 3 years ago | 29 | June 27, 2017 | apache-2.0 | JavaScript | ||
High-performance distributed data processing engine | ||||||||||
Rumble | 194 | | | | a year ago | 4 | December 03, 2019 | 134 | other | Java |
⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark; run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...); no install required (just a jar to download); declarative machine learning and more | ||||||||||
Bigdata Playground | 154 | | | | 5 years ago | | | 4 | apache-2.0 | TypeScript |
A complete example of a big data application using: Kubernetes (kops/AWS), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, Node.js, Angular, GraphQL | ||||||||||
Functions | 35 | | | | 4 months ago | | | 7 | apache-2.0 | Jupyter Notebook |
MLRun template functions and examples |