Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Data Engineering Interview Questions | 554 | | | | 7 months ago | | | | | |
2000+ data engineer interview questions. | ||||||||||
Sql Scripts | 291 | | | | 9 months ago | | | 2 | mit | Shell |
100+ SQL Scripts - PostgreSQL, MySQL, Google BigQuery, MariaDB, AWS Athena. DevOps / DBA / Analytics / performance engineering. Google BigQuery ML machine learning classification. | ||||||||||
Aws Glue Data Catalog Client For Apache Hive Metastore | 184 | | | | 4 months ago | | | 42 | apache-2.0 | Java |
The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore-compatible metadata repository. Customers can use the Data Catalog as a central repository to store structural and operational metadata for their data. AWS Glue provides out-of-the-box integration with Amazon EMR that enables customers to use the AWS Glue Data Catalog as an external Hive Metastore. This is an open-source implementation of the Apache Hive Metastore client on Amazon EMR clusters that uses the AWS Glue Data Catalog as an external Hive Metastore. It serves as a reference implementation for building a Hive Metastore-compatible client that connects to the AWS Glue Data Catalog. It may be ported to other Hive Metastore-compatible platforms, such as other Hadoop and Apache Spark distributions. | ||||||||||
Emr Serverless Samples | 124 | | | | 3 months ago | | | 6 | mit-0 | Python |
Example code for running Spark and Hive jobs on EMR Serverless. | ||||||||||
Streamx | 95 | | | | 5 years ago | | | 26 | apache-2.0 | Java |
kafka-connect-s3: Ingest data from Kafka into object stores (S3) | ||||||||||
Luigi Warehouse | 73 | | | | 7 years ago | | | | other | Python |
A Luigi-powered analytics / warehouse stack | ||||||||||
Terraform Aws Emr Cluster | 67 | | | | a year ago | | | 3 | apache-2.0 | HCL |
Terraform module to provision an Elastic MapReduce (EMR) cluster on AWS | ||||||||||
Csds Material | 38 | | | | 6 years ago | | | 1 | | Java |
Course material for the Computer Systems for Data Science class at Columbia | ||||||||||
Devops Golang Tools | 33 | | | | 9 months ago | | | | mit | Shell |
DevOps Golang tools | ||||||||||
Aws Account Operator | 30 | 2 | 3 months ago | 46 | April 22, 2021 | 3 | apache-2.0 | Go | ||
Operator to manage a pool of AWS accounts for Hive |