Big Data Systems Intelligence Analytics Labs Summer 2022

Labs for Big Data and Intelligent Analytics

Big-Data-Systems-Intelligence-Analytics-Labs-Summer-2022

DAMG7245

Big Data Systems & Intelligence Analytics Labs - Summer 2022

Requirements

You can choose any of the cloud platforms, but most of the tutorials will be based on AWS!

  • Sign up for an AWS account here.
  • Sign up for a GCP account here.
  • Python 3.7+
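
To confirm the Python requirement before starting the labs, a minimal sketch like the following may help. The helper name `meets_requirement` and the constant `MIN_PYTHON` are illustrative assumptions, not part of the course materials.

```python
import sys

# Minimum version taken from the requirements list above (Python 3.7+).
MIN_PYTHON = (3, 7)


def meets_requirement(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`.

    `meets_requirement` is a hypothetical helper written for this example.
    """
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = sys.version.split()[0]
    if meets_requirement():
        print("Python version OK:", current)
    else:
        print("Python 3.7+ is required, found:", current)
```

Running the script with the interpreter you plan to use for the labs prints whether that interpreter satisfies the requirement.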

Labs Index:

  1. Lab 1: How to set up a data science project with CookieCutter
  2. Lab 2: How to document with Google CodeLabs

Google Cloud Cheat Sheet: https://googlecloudcheatsheet.withgoogle.com/
