Sparkdataset

Instant search for and access to many datasets in Pyspark.
Alternatives To Sparkdataset
| Project Name | Stars | Downloads / Repos Using This / Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
|---|---|---|---|---|---|---|---|---|---|
| Petastorm | 1,693 | 8 | 5 months ago | 86 | February 03, 2023 | 174 | apache-2.0 | Python | Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code. |
| Spark Py Notebooks | 1,515 | | a year ago | | | 9 | other | Jupyter Notebook | Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks |
| Spark Iforest | 147 | | 4 years ago | | | 1 | apache-2.0 | Scala | Isolation Forest on Spark |
| Dampr | 101 | | 4 months ago | 9 | July 03, 2019 | | other | Python | Python Data Processing library |
| Phrase At Scale | 84 | | 5 years ago | | | 2 | | Python | Detect common phrases in large amounts of text using a data-driven approach. Size of discovered phrases can be arbitrary. Can be used in languages other than English |
| Sparkdataset | 28 | | 2 years ago | 3 | November 01, 2021 | | mit | Jupyter Notebook | Instant search for and access to many datasets in Pyspark. |
| Isarn Sketches Spark | 27 | | 2 years ago | 19 | June 20, 2020 | 6 | apache-2.0 | Scala | Routines and data structures for using isarn-sketches idiomatically in Apache Spark |
| Detecting Malicious Url Machine Learning | 23 | | 6 years ago | | | 1 | | Jupyter Notebook | |
| Nyc Taxi Analysis | 17 | | 6 years ago | | | | | Jupyter Notebook | Analyzing 200 GB of NYC taxi dataset. |
| Pyspark | 15 | | 4 years ago | | | | mit | Python | spark (scala and python) |