Awesome Open Source
Search results for hadoop apache spark
49 search results found
Spark
⭐
37,661
Apache Spark - A unified analytics engine for large-scale data processing
Bigdl
⭐
4,728
Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using bigdl-llm
Docker Spark
⭐
1,783
Apache Spark docker image
Dr Elephant
⭐
1,301
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
Livy
⭐
911
Livy is an open source REST interface for interacting with Apache Spark from anywhere
Flintrock
⭐
627
A command-line tool for launching Apache Spark clusters.
Dist Keras
⭐
611
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
Spline
⭐
553
Data Lineage Tracking And Visualization Solution
Spark Jupyter Aws
⭐
255
A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
Sparkrdma
⭐
191
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Learning Hadoop And Spark
⭐
160
Companion to the Learning Hadoop and Learning Spark courses on LinkedIn Learning
Bigdata Playground
⭐
154
A complete example of a big data application using: Kubernetes (kops/AWS), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Hdfs_fdw
⭐
131
PostgreSQL foreign data wrapper for HDFS
Griffon Vm
⭐
129
Griffon Data Science Virtual Machine
Aut
⭐
128
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Spark With Python
⭐
98
Fundamentals of Spark with Python (using PySpark), code examples
Mongo Spark
⭐
93
Example application on how to use mongo-hadoop connector with Spark
Flowman
⭐
85
Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.
Spork
⭐
84
Pig on Apache Spark
Docker Spark
⭐
77
🚢 Docker image for Apache Spark
Euphoria
⭐
74
Euphoria is an open source Java API for creating unified big-data processing flows. It provides an engine independent programming model which can express both batch and stream transformations.
Apachespark
⭐
59
This repository helps you learn Databricks concepts through examples. It covers the important topics a data engineer needs in real-world work, using PySpark and Spark SQL for development. The course ends with a few case studies.
Serverless Spark Workshop
⭐
56
Solution Accelerators for Serverless Spark on GCP, the industry's first auto-scaling and serverless Spark as a service
Datapipelines Essentials Python
⭐
45
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Awesome Tools
⭐
32
Curated list of awesome tools and libraries for specific domains
Netapp Hadoop Nfs Connector
⭐
29
This project provides an NFSv3 connector for Hadoop. Using the connector, Apache Hadoop and Apache Spark can use an NFSv3 server as their storage backend.
Sparkproject
⭐
26
Using Apache Spark in an ArcMap Toolbox
Daflow
⭐
24
Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Sparkucx
⭐
23
A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer
Learn Hadoop And Spark
⭐
22
This repository gathers a curated list of resources for learning Hadoop for free.
Cloud Integration
⭐
21
Spark cloud integration: tests, cloud committers and more
Spark Kubernetes
⭐
19
Apache Spark on Kubernetes
Mmtf Spark
⭐
19
Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.
Spark Distcp
⭐
18
A re-implementation of Hadoop DistCP in Apache Spark
Couchbase Spark Samples
⭐
16
Examples on how to use the Couchbase Spark Connector
Fulgurite
⭐
16
A library to read and write GeoTIFF images using Apache Spark
Bigdata Projects
⭐
14
Student projects in Big Data field.
Spark Jetty Server
⭐
13
Recipes and examples for Apache Spark
Spark Benchmarks
⭐
12
Benchmarking suite for Apache Spark
Docker Spark Native Yarn
⭐
12
Spark
⭐
10
Netflix branches of Apache Spark
Real Time Risk Management System
⭐
9
Finance Group
Redrock V2
⭐
8
RedRock v2 Repository
Bigdata
⭐
8
Coding practice and research on the component technologies of big data pipelines
K8s Bigdata
⭐
8
Apache Spark with HDFS cluster within Kubernetes
Hadoop Hands On
⭐
8
Learning how to tame Big Data with Hadoop and related technologies
Apache Spark Build Pipeline
⭐
7
Docker container equipped with all the tools needed to build Apache Spark and generate RPMs
Diy A Cluster
⭐
6
How to Do-It-Yourself A Cluster for Spark & Hadoop
Bigdata
⭐
6
Beginner's big data study notes, learning roadmap, and technology roadmap
Sparkjavaexamples
⭐
5
Apache Spark Basics - Java Examples
Related Searches
Java Hadoop (2,117)
Spark Hadoop (1,188)
Hadoop Hdfs (1,082)
Hadoop Mapreduce (851)
Shell Hadoop (772)
Python Hadoop (761)
Hadoop Hive (703)
Apache Hadoop (514)
Scala Apache Spark (497)
Copyright 2018-2024 Awesome Open Source. All rights reserved.