Summingbird Example

Example of SummingBird in hybrid mode
Alternatives To Summingbird Example
Project NameStarsDownloadsRepos Using ThisPackages Using ThisMost Recent CommitTotal ReleasesLatest ReleaseOpen IssuesLicenseLanguage
Cqrs Manager For Distributed Reactive Services278
4 years ago4apache-2.0Java
Experimental CQRS and Event Sourcing service
Kukulcan91
2 years agoapache-2.0Scala
A REPL for Apache Kafka
Apps30
2 months ago73February 01, 20238Scala
Thimble19
2 years ago12February 09, 2020epl-2.0Clojure
Clojure sandbox for Streaming Data Platforms
Messenjer8
5 years ago1apache-2.0Clojure
A simple project to demonstrate building a streaming messaging app with Clojure, Kafka, Re-Frame, Pedestal, and Component. This corresponds to a walkthrough published on LambdaStew.com
Franzy Examples8
7 years ago2epl-1.0Clojure
Examples and more for Franzy, a suite of Kafka libraries including producers, consumers, admin, validation, config, and more.
Rotera8
4 years ago5otherHaskell
Persistent rotating queue
Streampunk7
10 months ago1apache-2.0Java
Polyglot Programmable Shell for Kafka
Scoop7
5 years agoClojure
Speaking Kafka's protocol from Clojure REPL
Kc Repl3
6 months ago10epl-1.0Clojure
An interactive, command line tool for exploring Kafka clusters
Alternatives To Summingbird Example
Select To Compare


Alternative Project Comparisons
Readme

summingbird-example

Example of SummingBird in hybrid mode.

I'm currently improving it to be used as a production-ready bootstrap for any job.

See logged issues for expected improvements. Feel free as well to add any other!

Prerequisite

  • Get Kafka 0.7.2 here:

Unzip it anywhere and follow instructions until step 4: http://kafka.apache.org/07/quickstart.html

To test it, run a producer and a consumer in two separate shell windows and write anything to the producer shell window. You should see what was sent in the consumer shell window.

Note: the example is currently incompatible with Kafka 0.8.x due to KafkaSpout being available only for version 0.7.x. Some examples are available but not production ready.

On OSX you can use brew to get it:

   brew install memcached

Installation

  • First get summingbird from my cloned repository: sdjamaa/summingbird This clone uses version 0.8.0 of Storehaus library as there is a compatibility issue with Memcached store.

  • Build the project using the following command:

    cd summingbird
   ./sbt update compile
  • Compile summingbird-example project:
   cd summingbird-example
   ./sbt update compile

Configuration

Configuration strings are located in src/main/resources/application.properties file.

All configuration strings are retrieved from package objects.

  • Change all paths for Scalding. Storm and memcached configuration values should be the same (default local mode).

Running everything

  • Start ZooKeeper, Kafka server and a Kafka producer: follow instructions from step 2 and step 3 (http://kafka.apache.org/07/quickstart.html)

  • Start memcached service: go to memcached folder and run memcached command

  • Start example "service":

    ./sbt "company-hybrid-example/run --local"

This starts Storm and Scalding platforms

  • Start example console in another term window to send data and test Kafka (this will be soon replaced by a real test producer)
    ./sbt "company-hybrid-example/console"

You can now send messages from your Kafka producer which will be consumed by Summingbird.

To test the Storm consumer, type in the Scala REPL:

    scala> import com.company.summingbird.client._
    import com.company.summingbird.client._
    scala> HybridClient.stormLookup("timestamp") // This tests the Storm store linked to the ClientStore
    res1: Option[Long] = Some(2)
    scala> HybridClient.processHadoop // This tests the Storm store linked to the ClientStore
    scala> HybridClient.lookup("timestamp") // This inserts a file containing 2 lines with timestamp and 1 line with another json key/value pair
    [... lot of logs ...]
    scala> HybridClient.lookup("timestamp") // This queries the ClientStore to retrieved merged values between Storm and Scalding stores
    res2: Option[Long] = Some(4)

To test the Scalding consumer, type in the Scala REPL (to be replaced by a real consumer App as well)

    scala> HybridClient.processHadoop
    [... some logging ...]
    scala> HybridClient.hadoopLookup
    Results : Right((BatchID.11636597,ReaderFn(<function1>)))
    scala> ScaldingRunner.queryFiles()
    14/04/02 00:36:35 INFO compress.CodecPool: Got brand-new decompressor
    14/04/02 00:36:35 INFO compress.CodecPool: Got brand-new decompressor
    14/04/02 00:36:35 INFO compress.CodecPool: Got brand-new decompressor
    lolstamp : 1
    timestamp : 2

Troubleshooting

For Storm: make sure everything is already launched (Kafka + memcached)

For Scalding: change directories in Scalding package object and create a file to be consumed (in a JSON format) somewhere.

Popular Kafka Projects
Popular Repl Projects
Popular Data Processing Categories

Get A Weekly Email With Trending Projects For These Categories
No Spam. Unsubscribe easily at any time.
Scala
Kafka
Repl
Memcached
Storm
Kafka Producer