Dataflow Runner

Run templatable playbooks of Hadoop, Spark, and other jobs on Amazon EMR.

Alternatives To Dataflow Runner
| Project Name | Stars | Downloads / Repos Using This / Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
|---|---|---|---|---|---|---|---|---|---|
| Aws Glue Samples | 1,334 | | 8 months ago | | | 37 | mit-0 | Python | AWS Glue code samples |
| Spark Redshift | 514 | 41 | 5 years ago | 10 | November 01, 2016 | 134 | apache-2.0 | Scala | Redshift data source for Apache Spark |
| Agile_data_code_2 | 435 | | a year ago | | | 7 | mit | Jupyter Notebook | Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition |
| Sagemaker Spark | 285 | 2 | 9 months ago | 36 | August 26, 2022 | 34 | apache-2.0 | Scala | A Spark library for Amazon SageMaker. |
| Spark Jupyter Aws | 255 | | 7 years ago | | | 2 | | Jupyter Notebook | A guide on how to set up Jupyter with PySpark painlessly on AWS EC2 clusters, with S3 I/O support |
| Spark Knn Recommender | 113 | | 7 years ago | | | 4 | mit | Python | Item and User-based KNN recommendation algorithms using PySpark |
| Spark Example Project | 106 | | 8 years ago | | | 4 | apache-2.0 | Scala | A Spark WordCountJob example as a standalone SBT project with Specs2 tests, runnable on Amazon EMR |
| Scalding Example Project | 85 | | 10 years ago | | | 3 | apache-2.0 | Scala | The Scalding WordCountJob example as a standalone SBT project with Specs2 tests, runnable on Amazon EMR |
| Awesome Recommendation Engine | 43 | | 6 years ago | | | 1 | apache-2.0 | Scala | The purpose of this tiny project is to put together the know-how I learned in the Big Data Expert course from formacionhadoop.com. The idea is to show how to work with Apache Spark Streaming, Kafka, Mongo, and Spark machine learning algorithms. |
| Udacity Data Engineering | 42 | | 4 years ago | | | 1 | | Jupyter Notebook | Udacity Data Engineering Nano Degree (DEND) |
Categories: Golang, Amazon, Spark, Hadoop, Data Flow, Flink, Golang Application