RDDs DataFrames Datasets Presentation 2016

Source for "RDDs, DataFrames and Datasets in Apache Spark" NEScala presentation
Alternatives to RDDs DataFrames Datasets Presentation 2016
| Project Name | Stars | Downloads / Repos Using This / Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
|---|---|---|---|---|---|---|---|---|---|
| Spark | 37,661 | 2,394 939 | 3 months ago | 46 | May 09, 2021 | 186 | apache-2.0 | Scala | Apache Spark - A unified analytics engine for large-scale data processing |
| SynapseML | 4,967 | 6 | 4 days ago | 12 | November 27, 2023 | 335 | mit | Scala | Simple and Distributed Machine Learning |
| BigDL | 4,728 | 10 | 3 months ago | 16 | April 19, 2021 | 958 | apache-2.0 | Jupyter Notebook | Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using bigdl-llm |
| SparkInternals | 4,665 | | 2 years ago | | | 27 | | | Notes on the design and implementation of Apache Spark |
| Spark NLP | 3,578 | 30 | 3 months ago | 134 | December 08, 2023 | 43 | apache-2.0 | Scala | State of the Art Natural Language Processing |
| CoolplaySpark | 3,447 | | 2 years ago | | | 35 | | Scala | Cool Play Spark: Spark source code analysis, Spark libraries, etc. |
| Koalas | 3,291 | 116 | 7 months ago | 47 | October 19, 2021 | 112 | apache-2.0 | Python | Koalas: pandas API on Apache Spark |
| Spark Notebook | 3,147 | | a year ago | | | 207 | apache-2.0 | JavaScript | Interactive and Reactive Data Science using Scala and Spark. |
| Deequ | 3,044 | 6 | 3 months ago | 37 | November 09, 2023 | 141 | apache-2.0 | Scala | Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. |
| Analytics Zoo | 2,592 | 3 | 4 months ago | 1 | July 29, 2022 | 533 | apache-2.0 | Jupyter Notebook | Distributed Tensorflow, Keras and PyTorch on Apache Spark/Flink & Ray |
Css
Dataset
Scala
Slides
Spark
Apache Spark