Project Name	Stars	Repos Using This	Packages Using This	Most Recent Commit	Total Releases	Latest Release	Open Issues	License	Language
Petastorm	1,693		8	6 months ago	86	February 03, 2023	174	apache-2.0	Python
Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.
Tech.ml.dataset	616			4 months ago	251	January 05, 2021	10	epl-1.0	Clojure
A Clojure high performance data processing system
Skale	398	2	2	3 years ago	29	June 27, 2017		apache-2.0	JavaScript
High performance distributed data processing engine
Rumble	194			a year ago	4	December 03, 2019	134	other	Java
⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Bigdata Playground	154			5 years ago			4	apache-2.0	TypeScript
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Functions	35			4 months ago			7	apache-2.0	Jupyter Notebook
MLRun template functions and examples

Alternatives To Functions

Petastorm ⭐ 1,693

Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.

dependent packages 8 | total releases 86 | most recent commit 6 months ago

Tech.ml.dataset ⭐ 616

A Clojure high performance data processing system

total releases 251 | most recent commit 4 months ago

Skale ⭐ 398

High performance distributed data processing engine

dependent packages 2 | total releases 29 | most recent commit 3 years ago

Rumble ⭐ 194

⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more

total releases 4 | most recent commit a year ago

Bigdata Playground ⭐ 154

A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL

most recent commit 5 years ago

Functions ⭐ 35

MLRun template functions and examples

most recent commit 4 months ago

Alternative Project Comparisons

Functions vs Petastorm

Functions vs Tech.ml.dataset

Functions vs Skale

Functions vs Rumble

Functions vs Bigdata Playground

Popular Machine Learning Projects

Tensorflow ⭐ 180,196

An Open Source Machine Learning Framework for Everyone

dependent packages 78 | total releases 46 | latest release October 23, 2019 | most recent commit 4 months ago

Transformers ⭐ 124,049

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

dependent packages 2,484 | total releases 125 | latest release November 15, 2023 | most recent commit 2 months ago

Pytorch ⭐ 74,794

Tensors and Dynamic neural networks in Python with strong GPU acceleration

dependent packages 8,272 | total releases 39 | latest release November 15, 2023 | most recent commit 4 months ago

Netdata ⭐ 67,808

The open-source observability platform everyone needs!

most recent commit 2 months ago

Ml For Beginners ⭐ 63,698

12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all

most recent commit 5 months ago

Popular Parquet Projects

Iceberg ⭐ 5,179

Apache Iceberg

total releases 3 | latest release October 29, 2022 | most recent commit 4 months ago

Dsq ⭐ 3,401

Commandline tool for running SQL queries against JSON, CSV, Excel, Parquet, and more.

total releases 2 | latest release October 20, 2022 | most recent commit 8 months ago

Roapi ⭐ 2,969

Create full-fledged APIs for slowly moving datasets without writing a single line of code.

total releases 17 | latest release March 20, 2022 | most recent commit 5 months ago

Parquet Mr ⭐ 2,296

Apache Parquet

dependent packages 208 | total releases 17 | latest release May 12, 2023 | most recent commit 4 months ago

Qsv ⭐ 2,079

CSVs sliced, diced & analyzed.

total releases 148 | latest release November 20, 2023 | most recent commit 4 months ago

Popular Machine Learning Categories

Natural Language Processing

Neural Network

Neural

Computer Vision

Convolutional Neural Networks

Opencv

Downloads, Dependent Repos, Dependent Packages, Total Releases, Latest Releases data powered by Libraries.io.