Awesome Open Source
Search results for scala hdfs
102 search results found
Bigdata Notes
⭐
14,872
A beginner's guide to big data ⭐
Tensorflowonspark
⭐
3,851
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
Sparta
⭐
526
Real Time Analytics and Data Pipelines based on Spark Streaming
Distributed Graph Analytics
⭐
135
Distributed Graph Analytics (DGA) is a compendium of graph analytics written for Bulk-Synchronous-Parallel (BSP) processing frameworks such as Giraph and GraphX. The analytics included are High Betweenness Set Extraction, Weakly Connected Components, Page Rank, Leaf Compression, and Louvain Modularity.
Hsuntzu
⭐
134
HDFS compress tar zip snappy gzip uncompress untar codec hadoop spark
Cobrix
⭐
131
A COBOL parser and Mainframe/EBCDIC data source for Apache Spark
Correlation Approximation
⭐
90
Spark implementation of the Google Correlate algorithm to quickly find highly correlated vectors in huge datasets
Cuesheet
⭐
85
A framework for writing Spark 2.x applications in a pretty way
Scala Hadoop
⭐
70
Using Hadoop with Scala
Sparkplugins
⭐
70
Code and examples of how to write and deploy Apache Spark Plugins with Spark 3.x. Spark plugins allow running custom code on the executors as they are initialized. This also allows extending the Spark metrics system with user-provided monitoring probes.
Stratio Connector Hdfs
⭐
68
(DEPRECATED) HDFS
Sparkmultitool
⭐
66
Tools for Spark that we use on a daily basis
Monix Connect
⭐
60
A set of connectors for Monix. 🔛
Textgrounder
⭐
60
A system for connecting language to space and time.
Sparkstreaming.sessionization
⭐
51
NRT Sessionization with Spark Streaming landing on HDFS and putting live stats in HBase
Akka Persistence Hbase
⭐
49
An HBase backed Journal for Akka's experimental persistence / event-sourcing
Speedo
⭐
49
Parallelizing Stochastic Gradient Descent for Deep Convolutional Neural Networks
Spdt
⭐
46
Streaming Parallel Decision Tree
Locis
⭐
44
Implementation of the paper "A Parallel Spatial Co-location Mining Algorithm Based on MapReduce"
Spark Parquet Thrift Example
⭐
44
Example Spark project using Parquet as a columnar store with Thrift objects.
Neo4j Dbpedia Importer
⭐
43
DBpedia.org RDF to CSV for import into Neo4j
Sparkoscope
⭐
43
Enabling Spark Optimization through Cross-stack Monitoring and Visualization
Seahorse Workflow Executor
⭐
41
Spark Scala Maven Boilerplate Project
⭐
40
A skeleton Scala project with Maven for getting started with Spark
Etl Light
⭐
38
A light Kafka to HDFS/S3 ETL library based on Apache Spark
Xxhadoop
⭐
37
Data analysis using Hadoop, Spark, Storm, Elasticsearch, machine learning, etc. These are my daily notes, code, and demos. Don't fork, just star!
Stream Loader
⭐
30
Components for building stream loaders from Kafka to arbitrary storages
Pucket
⭐
29
Bucketing and partitioning system for Parquet
Starlake
⭐
29
Starlake is an on-premises and cloud ELT/ETL framework for batch and stream processing
Topnotch
⭐
29
A framework for systematically quality controlling big data.
Enceladus
⭐
28
Dynamic Conformance Engine
Snackfs
⭐
27
HDFS-compatible distributed filesystem backed by Cassandra
Sparkhbaseexample
⭐
26
Spark code to analyze HBase Snapshots
Wasp
⭐
25
WASP is a framework for building complex real-time big data applications. It relies on a kind of Kappa/Lambda architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.
Hadooplearning
⭐
25
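A complete set of introductory big data tutorials, including the most basic topics such as CentOS and Maven. The big data material mainly covers HDFS, MapReduce (MR), and YARN.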
Spark Example
⭐
24
Spark MLlib example
Streamingstopgraceful
⭐
23
Example to show how to stop the Spark Streaming Application Gracefully
Spark Workshop
⭐
22
Code examples and docker environment for Spark
Kafka Spark Streaming
⭐
22
Project for reading data from Kafka and writing to Kafka and HBase with Kerberos
Spark_log_data
⭐
21
Flume-to-Spark-Streaming Log Parser
Knn_is
⭐
21
Spark Gdb
⭐
20
A library for parsing and querying an Esri File Geodatabase with Apache Spark.
Offlineesindexgenerator
⭐
19
Offline Elasticsearch index generator
Spark Notes
⭐
18
Notes on anything encountered while writing Spark or Scala
Conductor
⭐
18
Efficient, distributed downloads of large files from S3 to HDFS using Spark.
Spark Emr
⭐
17
Spark Elastic MapReduce bootstrap and runnable examples.
Kiji Express
⭐
17
Salesforce2hadoop
⭐
16
Import Salesforce data into Hadoop HDFS in Avro format
Spark2 Etl Examples
⭐
16
A project with examples of using a few commonly used data manipulation/processing/transformation APIs in Apache Spark 2.0.0
Featurestore
⭐
15
Building blocks and patterns for building data prep transformations and feature engineering in Spark.
Bigdata
⭐
15
Big data study notes for beginners ⭐
Spark Fits
⭐
15
FITS data source for Spark SQL and DataFrames
Bigkube
⭐
14
Minikube for big data with Scala and Spark
Snackfs Release
⭐
14
The GA Release of SnackFS
Sparkphoenix
⭐
14
Spark Example using Phoenix to interact with HBase
Spark Playground
⭐
13
Playground for experimenting with Apache Spark
Bigdata_docker
⭐
13
Big Data Docker Data Science Spark Spark3 Hadoop HDFS Scala Python Artificial Intelligence Machine Learning Jupyter Lab Notebook
Cloud Scale Bwamem
⭐
13
Spark Benchmarks
⭐
12
Benchmarking suite for Apache Spark
Sparkfaultbench
⭐
12
A Spark Reliability Testing Suite
Spring Boot Spark Integration Demo
⭐
12
Demo of how to integrate Spring Data JPA, Apache Spark, and GraphX with mixed Java and Scala code
Easterbunny
⭐
11
EasterBunny data analysis
Spark_mllib_algorithm_1.6.0
⭐
11
Algorithm wrappers for Spark MLlib version 1.6.0
Oculus
⭐
10
Oculus is a Hadoop-based video fingerprinting system using Scala, Scalding, and FFmpeg
Smartfd
⭐
10
SmartFD: Efficient and Scalable Functional Dependency Discovery on Distributed Data-Parallel Platforms
Hbase Mr Pof
⭐
9
A proof-of-concept prototype of a new HBase + Hadoop MapReduce integration
Dgst
⭐
9
DGST: Efficient and Scalable Generalized Suffix Tree Construction on Apache Spark
Imb Sampling Ros_and_rus
⭐
9
Spark implementations of two data sampling methods (random oversampling and random undersampling) for imbalanced classification datasets
Cassandra Summit Demo
⭐
9
Hadoop integration demo for the Cassandra Summit
Fastunfolding
⭐
9
Bigdatademo
⭐
9
A demo of using Kafka, Spark, Hive, Cassandra, etc. with Docker. It produces a production-ready environment for any kind of big data project related to the Hadoop ecosystem.
Telecom Streaming
⭐
9
Telecom scenarios implemented with streaming techniques
Lambda_poc
⭐
8
Example lambda architecture using Kafka, Spark, Cassandra, and Hadoop
Digwords
⭐
8
Cloudera Cca175
⭐
8
CCA Spark and Hadoop Developer Certification
Etl Processes Using Sqoop Hadoop Hive Spark And Scala
⭐
7
I implemented various ETL processes, such as loading data from MySQL to HDFS using Sqoop, transforming the data using Spark and Scala, performing analytics using Spark and Scala, and loading the data back to HDFS.
Spark All Pairs Shortest Path
⭐
7
Inazuma
⭐
7
spark + kuromoji + d3.js = "tweet big data" that anyone can do easily
Example Spark Scala Read And Write From Hdfs
⭐
7
Spark Kuromoji Tokenizer
⭐
7
Kuromoji Tokenizer for Spark DataFrames
Coheel
⭐
6
A library for the automatic detection and disambiguation of knowledge base entity mentions in texts.
Easynotes
⭐
6
EasyNotes (简记) - sync with GitBook.
Spark Es Csv
⭐
6
Spark export of HDFS files to JSON or CSV
Bigdata
⭐
6
Big data study notes for beginners, with a learning roadmap and a technology roadmap
Sparkdatalineagecapture
⭐
6
Capture the logical plan from Spark (SQL)
Camus2kafka
⭐
6
Take Kafka topics that were previously persisted in Hadoop through Camus and push them back into Kafka.
Search Demo
⭐
6
Fantasysportsleagues
⭐
6
Implementation of a website that tracks fantasy sports leagues.
Spark Twitter Example
⭐
6
Spark example app that demonstrates, on a broad level, the various aspects of Spark.
Loganalysis
⭐
6
Log analysis project
Dl4j Demo
⭐
5
Example Spark Scala Read And Write From Hive
⭐
5
Spark By Example
⭐
5
Explore Spark API using a set of comprehensive examples.
Omnidatahouse
⭐
5
Utilities for OMNILab data warehouse.
Hadoop Watcher
⭐
5
Keedio's Hadoop Watcher watches an HDFS path for changes.
Las Vpe Platform
⭐
5
ISEE Video Parsing and Evaluation (VPE) Platform.
Sz Metro
⭐
5
Big data passenger flow analysis system for the Shenzhen Metro
Hadoop Installation
⭐
5
Instructions for setting up Hadoop, HDFS, Java, sbt, Kafka, Scala, Spark, and Flume on Ubuntu 18.04
Cdrec
⭐
5
Cross Domain Recommender
Hadoopio
⭐
5
Scala/Java library to conveniently interact with Avro files stored in Hadoop HDFS.
Related Searches
Scala Sbt (4,178)
Scala Spark (3,279)
Scala Akka (2,120)
Java Scala (1,794)
Scala Play Framework (1,309)
Hadoop Hdfs (1,082)
Plugin Scala (1,079)
Scala Kafka (969)
Scala Functional Programming (942)
Scala Scalajs (887)
1-100 of 102 search results