An open source framework for building data analytic applications.
Alternatives To Cdap
Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language
Deequ | 3,044 | 6 | 6 months ago | 37 | November 09, 2023 | 141 | apache-2.0 | Scala
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Spark Cassandra Connector | 1,929 | 109226 months ago | 81 | April 08, 2021 | 25 | apache-2.0 | Scala
DataStax Connector for Apache Spark to Apache Cassandra
Petastorm | 1,693 | 8 | 8 months ago | 86 | February 03, 2023 | 174 | apache-2.0 | Python
The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.
Spark Py Notebooks | 1,515 | a year ago | 9 | other | Jupyter Notebook
Apache Spark & Python (PySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
66 months ago | 22 | January 29, 2017 | 88 | mit | C#
C# and F# language bindings and extensions to Apache Spark
Spark Movie Lens | 757 | 3 years ago | 10 | other | Jupyter Notebook
An online movie recommender using Spark, Python Flask, and the MovieLens dataset
Cdap | 735566 months ago | 23 | September 01, 2023 | 98 | other | Java
An open source framework for building data analytic applications.
5 years ago | 1 | Python
Machine learning resources, including algorithms, papers, datasets, examples, and so on.
Complete Life Cycle Of A Data Science Project | 499 | 6 months ago | 4 | mit
Whylogs Java | 17923 years ago | 5 | November 01, 2020 | 2 | apache-2.0 | Java
Profile and monitor your ML data pipeline end-to-end