Awesome Open Source
Search results for scala hadoop
162 search results found
Spark
⭐
37,661
Apache Spark - A unified analytics engine for large-scale data processing
Bigdata Notes
⭐
14,872
A beginner's guide to big data ⭐
Deeplearning4j
⭐
13,597
Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learn...
It_book
⭐
8,543
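This project collects more than a thousand good, commonly used books that I have read or heard about over the years; the book you are looking for may well be here. It covers the Internet industry...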
Bigdl
⭐
4,728
Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using bigdl-llm
Tensorflowonspark
⭐
3,851
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
Bigdataguide
⭐
2,355
Big data learning: learn big data from scratch, with study videos and interview materials for each stage of learning
Szt Bigdata
⭐
2,055
Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟
Kyuubi
⭐
1,849
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Docker Spark
⭐
1,783
Apache Spark docker image
Movie_recommend
⭐
1,441
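A Spark-based movie recommendation system, including a crawler project, a website, a back-end administration system, and a Spark recommendation system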
Carbondata
⭐
1,401
High performance data store solution
Data Algorithms Book
⭐
973
MapReduce, Spark, Java, and Scala for Data Algorithms Book
Livy
⭐
911
Livy is an open source REST interface for interacting with Apache Spark from anywhere
Spline
⭐
553
Data Lineage Tracking And Visualization Solution
Spark Redshift
⭐
514
Redshift data source for Apache Spark
Scoobi
⭐
485
A Scala productivity framework for Hadoop.
Graphx
⭐
353
Former GraphX development repository. GraphX has been merged into Apache Spark; please submit pull requests there.
Sagemaker Spark
⭐
285
A Spark library for Amazon SageMaker.
Sparkonhbase
⭐
277
SparkOnHBase
Parquet4s
⭐
267
Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
Sparkstreaming
⭐
253
Real-time logging implemented with Spark Streaming+Flume+Kafka+HBase+Hadoop+Zookeeper
Bigdata
⭐
219
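A learning path through big data processing technologies (continuously updated...). Bigdata notes --> taking it slowly~ The technologies covered include offline processing, real-time processing, OLAP, and so on, such as hadoop, spark, flink, hive,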
Sparkrdma
⭐
191
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Docker Flink
⭐
157
Apache Flink docker image
Aliyun Emapreduce Datasources
⭐
157
Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.
Bigdata Playground
⭐
154
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Sparktraining
⭐
140
Examples for Spark Training in chinahadoop.cn
Eel Sdk
⭐
140
Big Data Toolkit for the JVM
Logvision
⭐
136
Distributed real-time log analysis and intrusion detection system
Xlearning Xdml
⭐
101
extremely distributed machine learning
Schedoscope
⭐
95
Schedoscope is a scheduling framework for painfree agile development, testing, (re)loading, and monitoring of your datahub, lake, or whatever you choose to call your Hadoop data warehouse these days.
Correlation Approximation
⭐
90
Spark implementation of the Google Correlate algorithm to quickly find highly correlated vectors in huge datasets
Smart Data Lake
⭐
87
Smart Automation Tool for building modern Data Lakes and Data Pipelines
Guacamole
⭐
86
Spark-based variant calling, with experimental support for multi-sample somatic calling (including RNA) and local assembly
Flowman
⭐
85
Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.
Docker Spark
⭐
77
🚢 Docker image for Apache Spark
Waimak
⭐
73
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Scala Hadoop
⭐
70
Using Hadoop with Scala
Sparkplugins
⭐
70
Code and examples of how to write and deploy Apache Spark Plugins with Spark 3.x. Spark plugins allow running custom code on the executors as they are initialized. This also allows extending the Spark metrics system with user-provided monitoring probes.
Spark Gpu
⭐
61
Spark GPU and SIMD Support
Spark Submit Ui
⭐
60
A Play Framework-based application for submitting Spark apps
Textgrounder
⭐
60
A system for connecting language to space and time.
Scalding Tutorial
⭐
55
The Scalding tutorial as a standalone SBT project
Spark Training
⭐
52
Repository used for Spark Trainings
Til
⭐
51
Today I Learned
Scalding Workshop
⭐
48
A half-day workshop on Scalding, the Scala API for Cascading
Hadoop Spark Hive Cluster Docker
⭐
45
hadoop-spark-hive-cluster-docker
Docker Spark Cluster
⭐
44
A Spark cluster setup running on Docker containers
Sparkoscope
⭐
43
Enabling Spark Optimization through Cross-stack Monitoring and Visualization
Neo4j Dbpedia Importer
⭐
43
DBpedia.org RDF to CSV for import into Neo4j
Yuzhouwan
⭐
42
Code Library for My Blog
Scalahadoop
⭐
42
A wrapper for Hadoop in Scala
Hbrdd
⭐
39
A library for bulk-importing data into HBase from Spark
Spark1.52
⭐
38
Spark source code with Chinese annotations
Hia Examples
⭐
38
Hadoop In Action Examples
Xxhadoop
⭐
37
Data Analysis Using Hadoop/Spark/Storm/ElasticSearch/MachineLearning etc. This is My Daily Notes/Code/Demo. Don't fork, Just star !
Sparkdemo
⭐
34
Complete Spark example code (Java, Scala)
Mastering Scala Machine Learning
⭐
32
Mastering-Scala-Machine-Learning
Akkeeper
⭐
31
An easy way to deploy your Akka services to a distributed environment.
Scala Hadoop Example
⭐
31
A translation of the WordCount example from the Hadoop tutorial from Java to Scala.
Enceladus
⭐
28
Dynamic Conformance Engine
Alicloud Hbase Spark Examples
⭐
27
Big Data Analytics With Hadoop 3
⭐
25
Big Data Analytics with Hadoop 3 published by Packt
Wasp
⭐
25
WASP is a framework to build complex real time big data applications. It relies on a kind of Kappa/Lambda architecture mainly leveraging Kafka and Spark. If you need to ingest huge amount of heterogeneous data and analyze them through complex pipelines, this is the framework for you.
Varys
⭐
25
Varys: Efficient Clairvoyant Coflow Scheduler
Hadooplearning
⭐
25
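A complete set of basic big data tutorials, starting from the fundamentals such as CentOS and Maven. The big data material mainly covers hdfs, mr, yarn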
Daflow
⭐
24
Apache Spark-based Data Flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Movies Analytics In Spark And Scala
⭐
24
Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
Practical Data Science With Hadoop And Spark
⭐
23
Sparkucx
⭐
23
A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer
Darwin
⭐
22
Avro Schema Evolution made easy
Cloud Integration
⭐
21
Spark cloud integration: tests, cloud committers and more
Hadoop Scalding Nojartool
⭐
20
Hadoop Tool implementation which enables extreme productivity - running MR jobs on your cluster right from your sbt shell!
Spark Notes
⭐
18
Note anything during writing spark or scala
Spark Distcp
⭐
18
A re-implementation of Hadoop DistCP in Apache Spark
Divolte Examples
⭐
17
Usage examples for Divolte collector
Fastml4j
⭐
16
Fast Scala and nd4j based machine learning framework
Couchbase Spark Samples
⭐
16
Examples on how to use the Couchbase Spark Connector
Fulgurite
⭐
16
A library to read and write GeoTIFF images using Apache Spark
Bigdata
⭐
15
A beginner's big data study notes ⭐
Scamr
⭐
15
A Hadoop map reduce framework for Scala.
Featurestore
⭐
15
Building blocks and patterns for building data prep transformations and feature engineering in Spark.
Workshop
⭐
15
Bigdata Learning
⭐
14
Big data learning, mainly covering Kafka, ZooKeeper, Hive, HBase, Spark
Pulse
⭐
14
phData Pulse application log aggregation and monitoring
Scaldingunit
⭐
14
TDD utils for Scalding developers
Aurasparktraining
⭐
14
Spark Training Example For Aura.cn
Aiqiyi Sparkstreaming
⭐
14
Real-time stream statistics and visualization for iQiyi with Spark Streaming
Yunshu Notes R
⭐
14
A distributed cloud note-taking application with HBase-based big data storage, developed with Spring Boot (back end)
Snackfs Release
⭐
14
The GA Release of SnackFS
Smoke
⭐
14
Run Spark jobs interactively from the web
Spark Jetty Server
⭐
13
Recipes and examples for Apache Spark
Spark Emr
⭐
12
spark-emr
Sparkfaultbench
⭐
12
A Spark Reliability Testing Suite
Nyc_taxi_pipeline
⭐
12
Design/Implement stream/batch architecture on NYC taxi data | #DE
Opensourceteams All
⭐
12
Summary of all projects
Bigdata News
⭐
12
A real-time big data system project for a news website, based on Spark 2.2
Easterbunny
⭐
11
EasterBunny data analysis
Mahout With Scala
⭐
11
mahout-scala-api-samples
Related Searches
Scala Sbt (4,178)
Scala Spark (3,279)
Scala Akka (2,120)
Java Hadoop (2,117)
Java Scala (1,794)
Scala Play Framework (1,309)
Spark Hadoop (1,188)
Hadoop Hdfs (1,082)
Plugin Scala (1,079)
Scala Kafka (969)
1-100 of 162 search results