Databricks Datascience Titanic

A walk-through of data science basics using PySpark, MLflow and the Titanic dataset
Alternatives To Databricks Datascience Titanic
Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language
Empty fields are omitted from each entry.

Petastorm | 1,693 | 8 | 5 months ago | 86 | February 03, 2023 | 174 | apache-2.0 | Python
Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark and can be used from pure Python code.

Spark Py Notebooks | 1,515 | a year ago | 9 | other | Jupyter Notebook
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

Spark Iforest | 147 | 4 years ago | 1 | apache-2.0 | Scala
Isolation Forest on Spark

Dampr | 101 | 4 months ago | 9 | July 03, 2019 | other | Python
Python Data Processing library

Phrase At Scale | 84 | 5 years ago | 2 | Python
Detect common phrases in large amounts of text using a data-driven approach. Size of discovered phrases can be arbitrary. Can be used in languages other than English.

Sparkdataset | 28 | 2 years ago | 3 | November 01, 2021 | mit | Jupyter Notebook
Instant search for and access to many datasets in PySpark.

Isarn Sketches Spark | 27 | 2 years ago | 19 | June 20, 2020 | 6 | apache-2.0 | Scala
Routines and data structures for using isarn-sketches idiomatically in Apache Spark

Detecting Malicious Url Machine Learning | 23 | 6 years ago | 1 | Jupyter Notebook

Nyc Taxi Analysis | 17 | 7 years ago | Jupyter Notebook
Analyzing 200 GB of NYC taxi dataset.

Pyspark | 15 | 4 years ago | mit | Python
spark (scala and python)
Topics: Python, Machine Learning, Dataset, Data Science, PySpark