Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Petastorm | 1,693 | 8 | 5 months ago | 86 | February 03, 2023 | 174 | apache-2.0 | Python | ||
The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code. | ||||||||||
Spark Py Notebooks | 1,515 | | | | a year ago | | | 9 | other | Jupyter Notebook |
Apache Spark & Python (PySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks | ||||||||||
Spark Iforest | 147 | | | | 4 years ago | | | 1 | apache-2.0 | Scala |
Isolation Forest on Spark | ||||||||||
Dampr | 101 | | | | 4 months ago | 9 | July 03, 2019 | | other | Python |
Python Data Processing library | ||||||||||
Phrase At Scale | 84 | | | | 5 years ago | | | 2 | | Python |
Detect common phrases in large amounts of text using a data-driven approach. Discovered phrases can be of arbitrary size. Can be used with languages other than English. | ||||||||||
Sparkdataset | 28 | | | | 2 years ago | 3 | November 01, 2021 | | mit | Jupyter Notebook |
Instant search for and access to many datasets in Pyspark. | ||||||||||
Isarn Sketches Spark | 27 | | | | 2 years ago | 19 | June 20, 2020 | 6 | apache-2.0 | Scala |
Routines and data structures for using isarn-sketches idiomatically in Apache Spark | ||||||||||
Detecting Malicious Url Machine Learning | 23 | | | | 6 years ago | | | 1 | | Jupyter Notebook |
Nyc Taxi Analysis | 17 | | | | 6 years ago | | | | | Jupyter Notebook |
Analyzing 200 GB of NYC taxi dataset. | ||||||||||
Pyspark | 15 | | | | 4 years ago | | | | mit | Python |
spark (scala and python) |