Pyspark Ml

A collection of data science and machine learning problems solved with PySpark and Hadoop.
Alternatives To Pyspark Ml
Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language
Ibis | 3,404 | 2429 | 4 months ago | 68 | December 10, 2023 | 157 | apache-2.0 | Python
The flexibility of Python with the scale and performance of modern SQL.
Devops Python Tools | 709 | 4 months ago | 37 | mit | Python
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Sagemaker Spark | 285 | 28 months ago | 36 | August 26, 2022 | 34 | apache-2.0 | Scala
A Spark library for Amazon SageMaker.
Spark Jupyter Aws | 255 | 7 years ago | 2 | Jupyter Notebook
A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
Aut | 128 | 10 months ago | 27 | November 17, 2022 | 3 | apache-2.0 | Scala
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Spark With Python | 98 | 4 years ago | mit | Jupyter Notebook
Fundamentals of Spark with Python (using PySpark), code examples
Apachespark | 59 | 2 years ago | Python
This repository helps you learn Databricks concepts through examples. It covers the important topics a data engineer needs in real-world work, using PySpark and Spark SQL for development. The course closes with a few case studies.
Big_data | 55 | 4 months ago | mit | Jupyter Notebook
Tutorials on Big Data essentials: Hadoop, MapReduce, Spark.
Spark Training | 52 | 3 years ago | 3 | Jupyter Notebook
Repository used for Spark Trainings
Datapipelines Essentials Python | 45 | a year ago | 1 | apache-2.0 | Python
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
Categories: Jupyter Notebook, Hadoop, Mnist, Imdb, Pyspark