Spark As Service Using Embedded Server

This application serves as a Spark 2.1 REST service provider using an embedded, Reactive-Streams-based, fully asynchronous HTTP server (akka-http).

1. Central Idea

I wanted to build an interactive REST API service on top of my Apache Spark application that serves use cases like:

- Load a trained model into the SparkSession and quickly run predictions for a user-supplied query.
- Keep your big data cached in the cluster and give users an endpoint to query it.
- Run recurring Spark queries with varying parameters.

As you can see, the core of the application is not a web application or browser interaction, but a REST service that performs big-data cluster computation on Apache Spark.

2. Akka-HTTP as an apt fit

With Akka-HTTP, you normally don't build your application on top of Akka-HTTP; you build your application on top of whatever makes sense and use Akka-HTTP merely for the HTTP integration needs. So I found Akka-HTTP to be the right fit for the use cases mentioned above.

3. Architecture

3.1 To demo this, I've configured the following four routes:

  1. homepage - http://localhost:8001 - says "hello world"
  2. version - http://localhost:8001/version - queries the shared SparkSession and reports the Spark version
  3. activeStreams - http://localhost:8001/activeStreams - reports how many Spark streams are currently active
  4. count - http://localhost:8001/count - runs a sample Spark job that counts the number of elements in a sequence
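
As a rough illustration of how four such routes could be wired to one shared SparkSession with the Akka-HTTP routing DSL, here is a minimal sketch. It is not the project's actual source: the object name RoutesSketch, the helper countJob, the response strings and the 1-to-1000 sequence are all assumptions made for the example. The host, port, master and application name are taken from the defaults listed in this README.

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.server.Route
import akka.stream.ActorMaterializer
import org.apache.spark.sql.SparkSession

// Hypothetical object; names and response texts are illustrative only.
object RoutesSketch {

  // Sample Spark job: counts the elements of a small in-memory sequence.
  def countJob(spark: SparkSession): Long =
    spark.sparkContext.parallelize(1 to 1000).count()

  // All routes share the same SparkSession instance.
  def routes(spark: SparkSession): Route =
    pathEndOrSingleSlash {
      get { complete("hello world") }
    } ~
    path("version") {
      get { complete(s"spark version: ${spark.version}") }
    } ~
    path("activeStreams") {
      get { complete(s"active streams: ${spark.streams.active.length}") }
    } ~
    path("count") {
      // complete takes its argument by name, so the job runs once per request.
      get { complete(s"count: ${countJob(spark)}") }
    }

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("SparkAsRestService")
    implicit val materializer: ActorMaterializer = ActorMaterializer()

    val spark = SparkSession.builder()
      .master("local")
      .appName("SparkAsRestService")
      .getOrCreate()

    // The running actor system keeps the JVM alive after binding.
    Http().bindAndHandle(routes(spark), "localhost", 8001)
  }
}
```

Note that in this sketch the Spark job blocks the request-handling thread. A real service would probably run long jobs in a Future on a dedicated dispatcher.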

[Figure: routing of an HttpRequest]

4. Building

It uses Scala 2.11, Spark 2.1 and Akka-HTTP. Build it with Maven:

mvn clean install

5. Execution

We can start the application as a stand-alone jar like this:

mvn exec:java

5.1 cmd-line-args

Optionally, you can provide configuration parameters such as the Spark master or the Akka-HTTP port on the command line. To see the list of configurable parameters, run either of:

mvn exec:java -Dexec.args="--help"
mvn exec:java -Dexec.args="-h"

Help content will look something like this:

This application comes as Spark2.1-REST-Service-Provider using an embedded,
Reactive-Streams-based, fully asynchronous HTTP server (i.e., using akka-http).
So, this application needs config params like AkkaWebPort to bind to, SparkMaster
and SparkAppName

Usage: spark-submit spark-as-service-using-embedded-server.jar [options]
  -h, --help
  -m, --master <master_url>                    spark://host:port, mesos://host:port, yarn, or local. Default: local
  -n, --name <name>                            A name of your application. Default: SparkAsRestService
  -p, --akkaHttpPort <portnumber>              Port where akka-http is binded. Default: 8001
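
To make the option handling above concrete, here is a small dependency-free sketch of how these three options and their defaults could be parsed in Scala. It is not the project's parser, which may well use a library instead. AppArgs and ArgsSketch are hypothetical names; only the flags and default values come from the help text.

```scala
// Hypothetical holder for the options; defaults mirror the help text above.
final case class AppArgs(
  master: String = "local",
  name: String = "SparkAsRestService",
  akkaHttpPort: Int = 8001,
  help: Boolean = false
)

object ArgsSketch {

  // Consumes the argument list left to right, accumulating values into AppArgs.
  @scala.annotation.tailrec
  def parse(args: List[String], acc: AppArgs = AppArgs()): AppArgs = args match {
    case Nil =>
      acc
    case ("-h" | "--help") :: rest =>
      parse(rest, acc.copy(help = true))
    case ("-m" | "--master") :: value :: rest =>
      parse(rest, acc.copy(master = value))
    case ("-n" | "--name") :: value :: rest =>
      parse(rest, acc.copy(name = value))
    case ("-p" | "--akkaHttpPort") :: value :: rest =>
      parse(rest, acc.copy(akkaHttpPort = value.toInt))
    case unknown :: _ =>
      // Also reached when a flag is given without its value.
      throw new IllegalArgumentException(s"Unknown or incomplete option: $unknown")
  }

  def main(args: Array[String]): Unit =
    println(parse(args.toList))
}
```

Any option that is not supplied keeps its default, so running with no arguments yields local, SparkAsRestService and 8001.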

5.2 Tweak Default cmd-line args

There are two ways to change the default parameter values:

  1. Update the src/main/resources/application.conf file directly, then rebuild and run.
  2. Pass them on the command line: mvn exec:java -Dexec.args="--master <master> --name <spark-app-name> --akkaHttpPort <port-to-which-akka-should-listen-to>"
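
One way to think about how the two options interact is that command-line values are layered on top of the file defaults. The sketch below shows that layering with the Typesafe Config library, which Akka itself uses for application.conf. It is an assumption about how such precedence could be implemented, not a description of this project's code. The key names (app.spark.master, app.spark.name, app.akkaHttpPort) are hypothetical, and the inline string stands in for the real application.conf.

```scala
import com.typesafe.config.{Config, ConfigFactory}
import scala.collection.JavaConverters._

// Hypothetical object; key names are illustrative, not the project's real keys.
object ConfigPrecedenceSketch {

  // Stand-in for src/main/resources/application.conf.
  val fileDefaults: Config = ConfigFactory.parseString(
    """
      |app.spark.master = "local"
      |app.spark.name = "SparkAsRestService"
      |app.akkaHttpPort = 8001
      |""".stripMargin)

  // Values supplied on the command line win; anything missing falls back to the file.
  def effective(cliOverrides: Map[String, String]): Config =
    ConfigFactory.parseMap(cliOverrides.asJava).withFallback(fileDefaults)

  def main(args: Array[String]): Unit = {
    val conf = effective(Map("app.akkaHttpPort" -> "9000"))
    println(conf.getString("app.spark.master")) // falls back to the file default
    println(conf.getInt("app.akkaHttpPort"))    // taken from the override
  }
}
```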
