Awesome Open Source
Search results for spark hdfs
212 search results found
Spark Emr
⭐
17
Spark Elastic MapReduce bootstrap and runnable examples.
Bidmach_spark
⭐
16
Code to allow running BIDMach on Spark including HDFS integration and lightweight sparse model updates (Kylix).
Spark2 Etl Examples
⭐
16
A project with examples of using a few commonly used data manipulation/processing/transformation APIs in Apache Spark 2.0.0
Spark Cnn
⭐
16
CS848 Final Project (using Spark to speed up CNN)
Hdfs Spark Hive Dev Setup
⭐
15
This repository contains a make script and instructions for setting up a local HDFS + Spark + Hive environment.
Yandex Big Data Engineering
⭐
15
Spark Fits
⭐
15
FITS data source for Spark SQL and DataFrames
Bigdata
⭐
15
Big data study notes for beginners ⭐
Minispark
⭐
15
Java implementation of a mini Spark-like framework named MiniSpark that can run on top of an HDFS cluster. MiniSpark supports operators including Map, FlatMap, MapPair, Reduce, ReduceByKey, Collect, Count, Parallelize, Join and Filter.
Featurestore
⭐
15
Building blocks and patterns for building data prep transformations and feature engineering in Spark.
Cloud Local
⭐
14
Install script for a local 1 node cloud...no excuses folks
Bdp
⭐
14
A Big Data Platform Prototype Project
Sparkphoenix
⭐
14
Spark Example using Phoenix to interact with HBase
Big Data Course
⭐
14
Practice course on Big Data
Copybookinputformat
⭐
14
Using JRecord to build a mapred and mapreduce inputformat for HDFS, MAPREDUCE, PIG, HIVE, Spark, ...
Bigkube
⭐
14
Minikube for big data with Scala and Spark
Bigdata Fun
⭐
14
A complete (distributed) BigData stack, running in containers
Local Hashicorp Stack
⭐
14
Local Hashicorp Stack for DevOps Development without Hypervisor or Cloud
Hdfs Geohex
⭐
13
(Web)Mapping Elephants with Sparks
Spark Playground
⭐
13
Playground for experimenting with Apache Spark
Bigdata_docker
⭐
13
Big Data Docker Data Science Spark Spark3 Hadoop HDFS Scala Python Artificial Intelligence Machine Learning Jupyter Lab Notebook
Taller_sparkr
⭐
12
SparkR workshop for the Jornadas de Usuarios de R (R users conference)
Camus Compressor
⭐
12
Camus Compressor merges files created by Camus and saves them in a compressed format.
Sparkfaultbench
⭐
12
A Spark Reliability Testing Suite
Cmsspark
⭐
12
General purpose framework to run CMS experiment workflows on HDFS/Spark platform
Cloudera Framework
⭐
12
Spark Benchmarks
⭐
12
Benchmarking suite for Apache Spark
Spring Boot Spark Integration Demo
⭐
12
Demo on how to integrate Spring Data JPA, Apache Spark and GraphX with mixed Java and Scala code
Bigdataguide
⭐
11
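I landed a job in the autumn recruitment season through self-study. Self-study is hard, so I want to compile detailed big data development material covering fundamentals | architecture | source code, to help other self-learners avoid detours. For related questions, follow the WeChat official account 大数据老刘 to contact Lao Liu.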
Fm
⭐
11
using FM latent vectors as embedding features
Dcos Jupyterlab Service
⭐
11
JupyterLab Notebook for Mesosphere DC/OS
Spark_mllib_algorithm_1.6.0
⭐
11
Algorithm wrappers for Spark MLlib version 1.6.0
Git Influencer
⭐
11
Insight Data Engineering project: A platform built on HDFS, Spark and Airflow to help you find social influencers in the GitHub network.
Cca175 Exam Preparation
⭐
11
Cloudera CCA175 Spark and Hadoop Developer exam preparation
Spark Tpcds Benchmark
⭐
11
Utility for benchmarking changes in Spark using TPC-DS workloads
Easterbunny
⭐
11
EasterBunny data analysis
Artmosphere
⭐
11
Data Engineering Project at Insight
Sparknow
⭐
11
Deploy Spark on OpenStack. Now!
Literate Computing Hadoop
⭐
11
Literate Computing for Reproducible Infrastructure - Hadoop Practice
Masterdatcom_bdcc_practice
⭐
10
Practice and Workshop on BigData and Cloud Computing using Docker Containers and OpenNebula. HDFS, hadoop and spark+R
Tpcds
⭐
10
TPC-DS benchmarks including data generation with Spark and queries with Spark
Smartfd
⭐
10
SmartFD: Efficient and Scalable Functional Dependency Discovery on Distributed Data-Parallel Platforms
Tis Ansible
⭐
10
TIS deployment script
Bigdata Etl Pipeline
⭐
10
The Data Pipeline and Analytics Stack is a comprehensive solution designed for processing, storing, and visualizing data. Explore a complete data pipeline with all components seamlessly set up and ready to use
Spark On Yarn Cluster
⭐
10
A Procedure To Create A Yarn Cluster Based on Docker, Run Spark, And Do TPC-DS Performance Test.
Bigdata20180301
⭐
10
Introduction to Big Data: course materials
Hadoop On Kubernetes
⭐
10
Hadoop on Kubernetes. Contains the configuration for HDFS and YARN.
Docker Mesos Pyspark Hdfs
⭐
9
example of a simulated multi-node mesos/(py)spark cluster using docker containers
Telecom Streaming
⭐
9
Telecom scenarios implemented with streaming techniques
Hackathonclt2019
⭐
9
Uhp
⭐
9
uhp for ucweb
Dgst
⭐
9
DGST: Efficient and Scalable Generalized Suffix Tree Construction on Apache Spark
Bigdatademo
⭐
9
A demo of Kafka, Spark, Hive, Cassandra, etc. running on Docker. It provides a production-ready environment for any kind of big data project related to the Hadoop ecosystem.
Bigdata Docker
⭐
9
Run Hadoop Cluster within Docker Containers.
Imb Sampling Ros_and_rus
⭐
9
Spark implementations of two data sampling methods (random oversampling and random undersampling) for imbalanced classification datasets
Fastunfolding
⭐
9
Amoeba
⭐
9
Spark2 H2o R Zeppelin
⭐
9
A stack for data mining using Spark2, H2O, R and Zeppelin running on Cloudera Hadoop Distribution
Lambda_poc
⭐
8
example lambda architecture using Kafka, Spark, Cassandra, Hadoop
Spark Yarn Hadoop Cluster Vagrant
⭐
8
Vagrant project to spin up a cluster of 4 nodes with Spark, YARN and Hadoop
Geotrellis Geomesa Template Project
⭐
8
Tutorial with Spark, GeoTrellis and GeoMesa examples
Blaspark
⭐
8
Distributed linear algebra operations using Apache Spark
Geotrellis Ec2 Cluster
⭐
8
Scripts to deploy a GeoTrellis Spark cluster on EC2
Docker Spark Yarn Cluster Mode
⭐
8
Run Spark 2.0.2 on YARN and HDFS inside Docker containers in multi-node cluster mode
Hadoop Hands On
⭐
8
Learning how to tame Big Data with Hadoop and related technologies
Streamsx.sparkmllib
⭐
8
Toolkit for real-time scoring using the Apache Spark MLlib library
Hands On Hadoop
⭐
8
Hadoop, MapReduce, HDFS, Spark, Pig, Hive, HBase, MongoDB, Cassandra, Flume - the list goes on! Over 25 technologies.
Bigdata
⭐
8
Coding practice and research on the component technologies of big data pipelines
Vagrant Jilla Hadoop
⭐
8
Vagrant setup to spin up a Hadoop cluster of VMs
2018 Hadoop
⭐
7
Code resources and discussion of big data development techniques. Growing and improving together.
Spark Kuromoji Tokenizer
⭐
7
Kuromoji Tokenizer for Spark DataFrames
Etl Processes Using Sqoop Hadoop Hive Spark And Scala
⭐
7
I implemented various ETL processes: loading data from MySQL into HDFS using Sqoop, transforming the data using Spark and Scala, performing analytics using Spark and Scala, and loading the data back into HDFS.
Spark Tpc Ds
⭐
7
Spark job for the TPC-DS benchmark
Tidyr.big
⭐
7
Scalable backend for tidyr
Docker Hdfs Alluxio Spark
⭐
7
Docker images and deployment configurations for a cluster of HDFS, Alluxio and Spark, focusing on data locality. Supports OpenShift 3.4, with more coming.
Spark Kubernetes Demo
⭐
7
Spark on Kubernetes for Demo
Inazuma
⭐
7
spark + kuromoji + d3.js = "tweet big data" that anyone can do easily
Example Spark Scala Read And Write From Hdfs
⭐
7
Spark All Pairs Shortest Path
⭐
7
Distributedml
⭐
6
Distributed Machine Learning for Stock Price Prediction
Spark Es Csv
⭐
6
Spark export of HDFS files to JSON or CSV
Easynotes
⭐
6
EasyNotes (简记) - sync with GitBook.
Big Data Stack
⭐
6
Hadoop-based Big Data stack (hdfs, yarn, spark, etc)
Hadoop
⭐
6
Big Data infrastructure: Hadoop + NiFi + Spark + Hive using Docker
Big Data Cluster
⭐
6
The goal of this project is to build a docker cluster that gives access to Hadoop, HDFS, Hive, PySpark, Sqoop, Airflow, Kafka, Flume, Postgres, Cassandra, Hue, Zeppelin, Kadmin, Kafka Control Center and pgAdmin. This cluster is solely intended for usage in a development environment. Do not use it to run any production workloads.
Sparkdatalineagecapture
⭐
6
Capture the logical plan from Spark (SQL)
Big Data Knowledge
⭐
6
📖 A collection of big data knowledge
Docker Single Node Hadoop
⭐
6
This Docker setup creates a single-node Hadoop installation with YARN enabled
Map_reduce Ntua
⭐
6
Lab exercise on MapReduce for the Advanced Topics in Database Systems course at NTUA
Fantasysportsleagues
⭐
6
Implementation of a website that tracks fantasy sports leagues.
Spark Twitter Example
⭐
6
Spark example app that demonstrates, on a broad level, the various aspects of Spark.
Distributable_docker_sql_on_hadoop
⭐
6
Toy Hadoop cluster combining various SQL-on-Hadoop variants
Loganalysis
⭐
6
Log analysis project
Virgo Spark Cluster
⭐
6
Docker Images for the Virgo Spark Cluster. Distribution including HDFS, YARN, Hive, Spark 2.3+
Bigdata Platform
⭐
6
End-to-end big data project that aims to show how to implement the different big data layers, from the infrastructure layer to the end-user one. [HADOOP][Spark][Kafka][Cassandra][Ansible][Jupyter
Abrane
⭐
6
Sahab Cloud Service
Bigdata
⭐
6
Big data study notes for beginners, with a learning roadmap and a technology roadmap
Bigdata Ecosystem Architecture
⭐
6
Life-cycle: Internal working of HDFS, SQOOP, HIVE, SPARK, HBASE, KAFKA with code.
Cluster In A Box
⭐
5
Contains a Dockerised Spark cluster including Cassandra, YARN, HDFS and Zeppelin. For education only.
Hadoopsparkeigenfaces
⭐
5
SVD computation via Hadoop and Spark for Eigenfaces face recognition
101-200 of 212 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.