A Terraform module to create an Amazon Web Services (AWS) Elastic MapReduce (EMR) cluster.

Usage:

    data "template_file" "emr_configurations" {
      template = "${file("configurations/default.json")}"
    }

    module "emr" {
      source = "github.com/azavea/terraform-aws-emr-cluster?ref=0.1.0"

      name          = "DatarpocCluster"
      vpc_id        = "vpc-20f74844"
      release_label = "emr-5.9.0"

      applications = [
        "Hadoop",
        "Ganglia",
        "Spark",
        "Zeppelin",
      ]

      configurations = "${data.template_file.emr_configurations.rendered}"
      key_name       = "hector"
      subnet_id      = "subnet-e3sdf343"

      instance_groups = [
        {
          name           = "MasterInstanceGroup"
          instance_role  = "MASTER"
          instance_type  = "m3.xlarge"
          instance_count = "1"
        },
        {
          name           = "CoreInstanceGroup"
          instance_role  = "CORE"
          instance_type  = "m3.xlarge"
          instance_count = "1"
          bid_price      = "0.30"
        },
      ]

      bootstrap_name = "runif"
      bootstrap_uri  = "s3://elasticmapreduce/bootstrap-actions/run-if"
      bootstrap_args = []

      log_uri     = "s3n://.../"
      project     = "Something"
      environment = "Staging"
    }
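
The usage example passes an empty `bootstrap_args` list. As a sketch of what non-empty arguments might look like: the `run-if` bootstrap action takes a condition followed by the command to run when that condition holds. The condition and command below are illustrative values chosen for this example, not settings documented by this module.

```hcl
# Fragment of the module "emr" block; all other arguments stay unchanged.
bootstrap_name = "runif"
bootstrap_uri  = "s3://elasticmapreduce/bootstrap-actions/run-if"

# First element: the condition to test on each node.
# Second element: the command to run only where the condition is true.
# Both values are illustrative assumptions.
bootstrap_args = ["instance.isMaster=true", "echo running on master node"]
```

With these values the command would run on the master node only; core nodes skip it.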

Variables:

- `name` - Name of EMR cluster
- `vpc_id` - ID of VPC meant to house cluster
- `release_label` - EMR release version to use (default: `emr-5.8.0`)
- `applications` - A list of EMR release applications (default: `["Spark"]`)
- `configurations` - JSON array of EMR application configurations
- `key_name` - EC2 Key pair name
- `subnet_id` - Subnet used to house the EMR nodes
- `instance_groups` - List of objects for each desired instance group (see the example below)
- `bootstrap_name` - Name for the bootstrap action
- `bootstrap_uri` - S3 URI for the bootstrap action script
- `bootstrap_args` - A list of arguments to the bootstrap action script (default: `[]`)
- `log_uri` - S3 URI of the EMR log destination; must begin with `s3n://` and end with a trailing slash
- `project` - Name of project this cluster is for (default: `Unknown`)
- `environment` - Name of environment this cluster is targeting (default: `Unknown`)

Instance groups example:

    [
      {
        name           = "MasterInstanceGroup"
        instance_role  = "MASTER"
        instance_type  = "m3.xlarge"
        instance_count = 1
      },
      {
        name           = "CoreInstanceGroup"
        instance_role  = "CORE"
        instance_type  = "m3.xlarge"
        instance_count = "1"
        bid_price      = "0.30"
      },
    ]
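
The instance group list can in principle carry more than two entries. The sketch below adds a third group with the `TASK` role, which is EMR's role for nodes that run work but hold no HDFS data. It assumes the module forwards each object unchanged to the underlying EMR cluster resource, which this README does not confirm. The group name and counts are hypothetical. In the Terraform AWS provider, setting `bid_price` on a group requests Spot Instances at that maximum hourly price in USD.

```hcl
# Fragment of the module "emr" block; all other arguments stay unchanged.
instance_groups = [
  {
    name           = "MasterInstanceGroup"
    instance_role  = "MASTER"
    instance_type  = "m3.xlarge"
    instance_count = "1"
  },
  {
    # On-demand core nodes: no bid_price, so HDFS data is not lost
    # to a Spot interruption.
    name           = "CoreInstanceGroup"
    instance_role  = "CORE"
    instance_type  = "m3.xlarge"
    instance_count = "2"
  },
  {
    # Hypothetical extra group: Spot-priced task nodes.
    # Assumes the module accepts the TASK role.
    name           = "TaskInstanceGroup"
    instance_role  = "TASK"
    instance_type  = "m3.xlarge"
    instance_count = "2"
    bid_price      = "0.30"
  },
]
```

Keeping Spot pricing on the task group only is a design choice for this sketch; the README's own example puts `bid_price` on the core group.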

Outputs:

- `id` - The EMR cluster ID
- `name` - The EMR cluster name
- `master_public_dns` - The EMR master public FQDN
- `master_security_group_id` - Security group ID of the master instance(s)
- `slave_security_group_id` - Security group ID of the slave instance(s)
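
As a sketch of how these outputs might be consumed from the calling configuration, assuming the module is instantiated as `module "emr"` as in the usage example: the first block re-exports the master's DNS name, and the second attaches an SSH ingress rule to the master security group. The resource name, output name and CIDR range are hypothetical placeholders, and the module may already manage rules of its own.

```hcl
# Expose the master node's public DNS name from the root module.
output "emr_master_public_dns" {
  value = "${module.emr.master_public_dns}"
}

# Hypothetical rule: allow SSH to the master node from one CIDR range.
# 203.0.113.0/24 is a documentation-only range; substitute a real network.
resource "aws_security_group_rule" "emr_master_ssh" {
  type              = "ingress"
  from_port         = 22
  to_port           = 22
  protocol          = "tcp"
  cidr_blocks       = ["203.0.113.0/24"]
  security_group_id = "${module.emr.master_security_group_id}"
}
```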