Awesome Open Source
Search results for spark jar
107 search results found
Spark Jobserver
⭐
2,837
REST job server for Apache Spark
Pkpmspark
⭐
697
Awesome 3D data mining, data analysis & recommendation
Metorikku
⭐
536
A simplified, lightweight ETL Framework based on Apache Spark
Spark Solr
⭐
440
Tools for reading data from Solr as a Spark RDD and indexing objects from Spark into Solr using SolrJ.
Spark Fast Tests
⭐
385
Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)
Connectors
⭐
383
This library allows Scala and Java-based projects (including Apache Flink, Apache Hive, Apache Beam, and PrestoDB) to read from and write to Delta Lake.
Spark Jobserver
⭐
348
REST job server for Spark. Note that this is *not* the mainline open source version. For that, go to https://github.com/spark-jobserver/spark-jobserver. This fork now serves as a semi-private repo for Ooyala.
Sparklint
⭐
293
A tool for monitoring and tuning Spark jobs for efficiency.
Transport
⭐
288
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Apache Hive, and Presto.
Sagemaker Spark
⭐
285
A Spark library for Amazon SageMaker.
Jpmml Sparkml
⭐
265
Java library and command-line application for converting Apache Spark ML pipelines to PMML
Sparkrdma
⭐
191
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Sbt Spark Package
⭐
137
Sbt plugin for Spark packages
Cobrix
⭐
131
A COBOL parser and Mainframe/EBCDIC data source for Apache Spark
Spark Ts Examples
⭐
110
Spark TS Examples
Ispark
⭐
104
An Apache Spark-shell backend for IPython
Sbt Spark Submit
⭐
86
sbt plugin for spark-submit
Splash
⭐
86
Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
Guacamole
⭐
86
Spark-based variant calling, with experimental support for multi-sample somatic calling (including RNA) and local assembly
Spark Llap
⭐
82
Geosparktemplateproject
⭐
55
Template projects for GeoSpark, GeoSpark-SQL, GeoSpark-Viz
Sparkstreamingapps
⭐
51
A spark sbt blueprint to build your own spark apps off of (for cloud native runtime, see the kube/spark examples)
Spark Scala Maven Example
⭐
50
Example Maven configuration for a Spark, Scala project
Hydra Spark
⭐
46
Sparkjobserverclient
⭐
45
Java Client of the Spark Job Server implementing the arranged Rest APIs
Simr
⭐
45
Spark In MapReduce (SIMR) - launching Spark applications on existing Hadoop MapReduce infrastructure
Spark Cluster Deployment
⭐
43
Automates Spark standalone cluster tasks with Puppet and Fabric.
Product Category Predict
⭐
43
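Product category prediction. Uses the Spring Boot framework and the Spark MLlib machine learning framework, with TF-IDF and Bayes algorithms, to train a product category prediction model. The model automatically predicts a product's category from its name. The project exposes a RESTful interface.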
Sqoop On Spark
⭐
42
Sqoop on Apache Spark Engine
Dblink
⭐
38
Distributed Bayesian Entity Resolution in Apache Spark
Sope
⭐
37
Apache Spark ETL Utilities
Rdf2x
⭐
35
RDF2X converts big RDF datasets to the relational database model, CSV, JSON and ElasticSearch.
Gatk Protected
⭐
34
Obsolete/Legacy GATK repository -- go to https://github.com/broadinstitute/gatk instead
Streamliner Starter
⭐
33
Starter project for building MemSQL Streamliner Pipelines
Cipher
⭐
33
Unstructured video data computation based on HDFS and Spark
Docker Spark Submit
⭐
32
Docker image to submit Spark applications
Spark Job Rest
⭐
32
Alicloud Hbase Spark Examples
⭐
27
Spark Yarn
⭐
25
Launch Spark clusters on YARN
Spark Cassandra Example
⭐
25
Example usage of spark cassandra connector
Sbt Spark
⭐
24
Simple SBT plugin to configure Spark applications
Spark Example
⭐
24
spark mllib example
Sparkucx
⭐
23
A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer
Streamliner Examples
⭐
23
Example code for building your own MemSQL Streamliner Pipelines
Gospark
⭐
23
Go bindings for Apache Spark
Dac
⭐
23
A Distributed Associative Classifier for Apache Spark, mirror of
Book Examples
⭐
21
Examples from Learning Hadoop 2 (Packt Publishing, 2015)
Knn_is
⭐
21
Spark_log_data
⭐
21
Flume-to-Spark-Streaming Log Parser
Idocuments
⭐
20
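A collection of documents related to Java development, covering basic system services (big data, stream computing, NoSQL, etc.), technical terms, JAR packages, and development tools; continuously updated...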
Mapr Sparkml Streaming Uber
⭐
20
Spark Udf
⭐
18
Spark Emr
⭐
17
Spark Elastic MapReduce bootstrap and runnable examples.
Address Index Data
⭐
16
Spark2 Etl Examples
⭐
16
A project with examples of using a few commonly used data manipulation/processing/transformation APIs in Apache Spark 2.0.0
Tiledb Spark
⭐
15
Spark interface to the TileDB storage manager
Dl4j_coursera
⭐
15
Spark Example Project
⭐
15
A Spark WordCount example as a standalone SBT project
Itemset Mining
⭐
15
Probabilistic Itemset Mining
Copybookinputformat
⭐
14
Using JRecord to build a mapred and mapreduce inputformat for HDFS, MAPREDUCE, PIG, HIVE, Spark, ...
Sparksql Scalapb Test
⭐
14
Test for SparkSQL ScalaPB
Sparkatscale
⭐
12
SparkAtScale
Kira
⭐
12
Kira is an astronomy image processing toolkit implemented with Apache Spark.
Octopufs
⭐
11
The OctopuFS library helps manage cloud storage, ADLSgen2 specifically. It lets you operate on files (moving, copying, setting ACLs) in a very efficient manner. Designed to work on Databricks, but should work on any other platform as well.
Spark Pubsub
⭐
11
Google Cloud Pubsub connector for Spark Streaming
Intelqatcodec
⭐
11
Gcore Spark
⭐
11
Implementation of the G-CORE graph query language on Spark
Spark Jobserver.g8
⭐
11
giter8 template for Spark Jobserver
Spark2demo
⭐
10
Spark Project Template
⭐
10
Template of a Spark project, with IDEA support, bundling with assembly, examples, ...
Spark Cass
⭐
10
Spark and Cassandra docker image and test files
Amazon S3 Tagging Spark Util
⭐
10
Kafka Sparkstreaming Hbase
⭐
10
Sparksql Stats
⭐
10
A basic framework, built on the PySpark library, that uses SparkSql to connect to a MySQL database and run statistical analysis on the data
Spark Boilerplate
⭐
10
A boilerplate for spark projects with docker support for local development and scripts for emr support.
Spark Submitter Console
⭐
9
A web application for submitting Spark applications
Kudusparklyr
⭐
9
A Kudu extension for Sparklyr
Bpmn.ai Ui
⭐
9
Easy setup and control of your bpmn.ai data flow
Bigdatabench Spark
⭐
9
BigDataBench Spark workloads
Spark Kafka Sink
⭐
9
A Kafka metric sink for Apache Spark
Product Relation Mining
⭐
9
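Product association mining. Uses the Spring Boot framework and the Spark MLlib machine learning framework, with the FP-Growth algorithm, to analyze users' shopping cart data and mine association relationships between products.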
Spark Kudu Up And Running
⭐
9
Spark on Kudu up and running samples
Bigpetstore Data Generator
⭐
9
Data Generator for BigPetStore
Spark Statsd
⭐
7
Divolte Spark
⭐
7
Utilities for using data created by Divolte collector in Spark, Spark Streaming and PySpark
Sensoranalytics
⭐
7
Example Spark Scala Read And Write From Hdfs
⭐
7
Tidyr.big
⭐
7
Scalable backend for tidyr
Incubator Rocketmq Externals
⭐
6
Spark Python Scala Udf
⭐
6
Demonstrates calling a Scala UDF from Python using spark-submit with an EGG and JAR
Spark_codebase
⭐
6
Collection of Spark core, streaming, sql, mllib examples & applications with base line unit tests
Avrotoparquet
⭐
6
Command line converter for Apache Avro to Apache Parquet file formats
Sbt Spark Plugin
⭐
6
A simple Sbt plugin used to fill spark-submit jar list
Errorsnrt
⭐
6
NRT detection of web-traffic anomalies using Flume, Spark Streaming and Impala
Mshift
⭐
6
MongoDB to Redshift data transfer using Apache Spark.
Spark Streaming Example
⭐
5
Example for a spark-streaming application using Kafka, Cassandra and Stateful Stream Processing
Spark Docker
⭐
5
Run Hadoop and Spark clusters in Docker
Geo_pyspark
⭐
5
Spark Statestore Rocksdb
⭐
5
Spark Statestore using RocksDB
Geomesa Twitter
⭐
5
Collect and ingest Twitter streaming data into GeoMesa
1-100 of 107 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.