Awesome Open Source
Search results for spark emr
66 search results found
Spark Nlp
⭐
3,578
State of the Art Natural Language Processing
Spark Jobserver
⭐
2,837
REST job server for Apache Spark
Spark
⭐
1,963
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Goodreads_etl_pipeline
⭐
593
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Sagemaker Spark
⭐
285
A Spark library for Amazon SageMaker.
Beginner_de_project
⭐
276
Beginner data engineering project - batch edition
Spark Jupyter Aws
⭐
255
A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
Aws Glue Data Catalog Client For Apache Hive Metastore
⭐
184
The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog as a central repository to store structural and operational metadata for their data. AWS Glue provides out-of-box integration with Amazon EMR that enables customers to use the AWS Glue Data Catalog as an external Hive Metastore. This is an open-source implementation of the Apache Hive Metastore client on Amazon EMR clusters that uses the AWS Glue Data Catalog
Learning Hadoop And Spark
⭐
160
Companion to the Learning Hadoop and Learning Spark courses on LinkedIn Learning
Emr Serverless Samples
⭐
124
Example code for running Spark and Hive jobs on EMR Serverless.
Variantspark
⭐
121
machine learning for genomic variants
Spark Knn Recommender
⭐
113
Item and User-based KNN recommendation algorithms using PySpark
Spark Example Project
⭐
106
A Spark WordCountJob example as a standalone SBT project with Specs2 tests, runnable on Amazon EMR
Scalding Example Project
⭐
85
The Scalding WordCountJob example as a standalone SBT project with Specs2 tests, runnable on Amazon EMR
Spark_scala_ml_examples
⭐
75
Spark 2.0 Scala Machine Learning examples
Sparksteps
⭐
68
CLI tool to launch Spark jobs on AWS EMR
Terraform Aws Emr Cluster
⭐
67
Terraform module to provision an Elastic MapReduce (EMR) cluster on AWS
Vectorpipe
⭐
60
Convert Vector data to VectorTiles with GeoTrellis.
Sbt Lighter
⭐
55
SBT plugin for Apache Spark on AWS EMR
Emr Bootstrap Spark
⭐
49
AWS bootstrap scripts for Mozilla's flavoured Spark setup.
Edc Mod1 Exercise Igti
⭐
42
Module 1 exercises - EDC Bootcamp - IGTI 2021
Spark Plug
⭐
40
Scala driver for launching Amazon EMR jobs
Mastering Machine Learning On Aws
⭐
35
Mastering Machine Learning on AWS, published by Packt
Telemetry Analysis Service
⭐
33
Telemetry Analysis Service
Spark Flamegraph
⭐
30
Easy CPU Profiling for Apache Spark applications
Basin
⭐
29
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Knn
⭐
23
Spark Knn Recommender
Spark Search
⭐
20
Spark Search - high performance advanced search features based on Apache Lucene
Dataflow Runner
⭐
19
Run templatable playbooks of Hadoop/Spark/et al jobs on Amazon EMR
Spark And Mllib Projects
⭐
18
This repository contains Spark, MLlib, PySpark and Dataframes projects
S3 Inventory Usage Examples
⭐
17
Examples demonstrating how to use Amazon S3 Inventory to analyze your S3 storage using Spark and EMR.
Starting Bigdata Aws
⭐
16
Pyspark Emr
⭐
15
A toolset to streamline running spark python on EMR
Telemetry Streaming
⭐
15
Spark Streaming ETL jobs for Mozilla Telemetry
Sparkemrbootstrap
⭐
14
Files to help create new Spark EMR bootstraps
Aws Emr Examples
⭐
14
Some AWS EMR examples
Pyspark S3 Parquet Example
⭐
13
This repo demonstrates how to load a sample Parquet-formatted file from an AWS S3 bucket. A Python job is then submitted to an Apache Spark instance running on AWS EMR, which uses a SQLContext to create a temporary table from a DataFrame. SQL queries can then be run against the temporary table.
Sparkwarc
⭐
13
Load WARC files into Apache Spark with sparklyr
Geotrellis Landsat Emr Demo
⭐
12
Process landsat imagery on EMR, serve them out to a web application that does NDVI/NDWI on the fly
Spark Emr
⭐
12
spark-emr
Amazon Emr Optimize Data Processing
⭐
12
Optimizing downstream data processing with Amazon Kinesis Data Firehose and Amazon EMR running Apache Spark
Nyc_taxi_pipeline
⭐
12
Design/Implement stream/batch architecture on NYC taxi data | #DE
Project
⭐
11
Emr Demo
⭐
10
Project files for the post: Running PySpark Applications on Amazon EMR: Methods for Interacting with PySpark on Amazon Elastic MapReduce.
Chicago Taxi Trips Analysis
⭐
10
Analysis of City Of Chicago Taxi Trip Dataset Using AWS EMR, Spark, PySpark, Zeppelin and Airbnb's Superset
Spark Boilerplate
⭐
10
A boilerplate for Spark projects, with Docker support for local development and scripts for EMR support.
Spark2demo
⭐
10
Communitydetection Spark Aws
⭐
9
A Spark application, written in Python, that finds strongly connected components using a bi-directional label propagation algorithm. The project was run on a 1.3 GB Twitter network dataset on an AWS EMR cluster.
Cassandra Gdelt Queries
⭐
8
A Cassandra Architecture for GDELT Database 🌍
Til
⭐
8
Today I Learned
Sbt Spark Ec2 Plugin
⭐
8
Sbt plugin to submit Spark jobs
Sparksnake
⭐
8
Improving the development of Spark applications deployed as jobs on AWS services like Glue and EMR
Aws Etl
⭐
7
This is an ETL application on AWS that uses general open sales and customer data, which you can find here: https://github.com/camposvinicius/data/blob/main/A (a zipped file containing several CSV files, to which we apply transformations).
Distcomputing
⭐
6
Emr Scripts
⭐
6
Shell scripts for AWS EMR clusters
Killzombiezeppelinsandsparkshells
⭐
6
Kill those Zombie Zeppelins & Spark Shells !
Sparkov
⭐
6
Markov Chain based fraud detection system in Spark.
Benchmark For Spark
⭐
6
benchmark-for-spark
Spark Sessions
⭐
6
Examples for how to split sets of time based events into sessions using Spark
Android_malware_capstone
⭐
6
Investigation of Android mobile Malware using the SherLock dataset
Alluxio Emr Bootstrap
⭐
5
bootstrap script for Alluxio on EMR
Spark_r_ml_examples
⭐
5
Spark 2.0 R/SparkR Machine Learning examples
Flint
⭐
5
Main repository of the Flint project for Spark and Amazon EMR.
Sparkling Water Emr
⭐
5
Launch Sparkling Water on EMR
Airflow Dags
⭐
5
Ddapp
⭐
5
FULL stack data science project (tech currently utilized: AWS/boto3/EMR/EC2/S3, Python, PySpark (Spark SQL and MLlib), and Flask/Flask RESTPlus)
Copyright 2018-2024 Awesome Open Source. All rights reserved.