Awesome Open Source
Search results for hadoop pyspark
35 search results found
Ibis
⭐
3,404
The flexibility of Python with the scale and performance of modern SQL.
Devops Python Tools
⭐
709
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Gather Deployment
⭐
347
Gathers Python deployment, infrastructure and practices.
Sagemaker Spark
⭐
285
A Spark library for Amazon SageMaker.
Spark Jupyter Aws
⭐
255
A guide on how to set up Jupyter with PySpark painlessly on AWS EC2 clusters, with S3 I/O support.
Aut
⭐
128
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Spark With Python
⭐
98
Fundamentals of Spark with Python (using PySpark), code examples
Apachespark
⭐
59
This repository helps you learn Databricks concepts through examples. It covers the important topics a data engineer needs in real-world work, using PySpark and Spark SQL for development. The course ends with a few case studies.
Big_data
⭐
55
Tutorials on Big Data essentials: Hadoop, MapReduce, Spark.
Spark Training
⭐
52
Repository used for Spark Trainings
Datapipelines Essentials Python
⭐
45
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
Openspark
⭐
39
An out-of-the-box environment for Hadoop/Spark applications
Basin
⭐
29
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Springboard Data Science Immersive
⭐
23
Nyc Taxi Analysis
⭐
17
Analyzing 200 GB of NYC taxi dataset.
Hadoop
⭐
16
Rasppi Cluster
⭐
14
An efficient quick-start tool for building a Raspberry Pi (or Debian-based) cluster with popular ecosystem tools such as Hadoop and Spark
Pyspark K8s Example
⭐
14
Nyc_taxi_pipeline
⭐
12
Design/Implement stream/batch architecture on NYC taxi data | #DE
Py Hadoop Tutorial
⭐
11
Source Material for using Python and Hadoop together
Anaconda
⭐
10
python gift package
Dijkstra Hadoop Spark
⭐
10
Dijkstra Algorithm - Python Hadoop Streaming and Pyspark
Docker Jupyter Spark
⭐
9
Docker image for Jupyter notebooks with PySpark
Hackathonclt2019
⭐
9
Tensorflow Spark Docker
⭐
9
Contains TensorFlow, Hadoop, and Spark to make it easy to run TensorFlow on Spark via Docker.
Rock Health Python
⭐
8
Code for Rock Health Python-for-Hadoop overview
Cloudera_material
⭐
7
Cloudera_Material: study material to help people prepare for the Cloudera CCA Spark and Hadoop Developer Exam (CCA175). Feel free to collaborate.
Spark On K8s
⭐
7
Presents 3 ways to run Spark over containers. This project is recommended for those who want to explore Big Data outside of a Hadoop cluster.
Datascience Playground
⭐
6
A scalable, cloud-ready environment for Data Science using Docker
Big Data Cluster
⭐
6
The goal of this project is to build a docker cluster that gives access to Hadoop, HDFS, Hive, PySpark, Sqoop, Airflow, Kafka, Flume, Postgres, Cassandra, Hue, Zeppelin, Kadmin, Kafka Control Center and pgAdmin. This cluster is solely intended for usage in a development environment. Do not use it to run any production workloads.
Pyspark Ml
⭐
6
A collection of data science and machine learning problems solved using PySpark and Hadoop.
Hadoop Hive Spark Docker
⭐
5
Hadoop-Hive-Spark cluster + Jupyter on Docker
Sparkintro
⭐
5
Spark Traffic
⭐
5
Batch processing of offline traffic big data using Spark
Hackathonclt2018
⭐
5
Related Searches
Java Hadoop (2,130)
Spark Hadoop (1,188)
Hadoop Hdfs (1,095)
Hadoop Mapreduce (852)
Spark Pyspark (773)
Shell Hadoop (772)
Python Hadoop (761)
Hadoop Hive (703)
Python Pyspark (689)
Copyright 2018-2024 Awesome Open Source. All rights reserved.