
Bigdata Playground

The aim is to create a Batch/Streaming/ML/WebApp stack where you can test your jobs locally or submit them to the YARN resource manager. We use Docker to build the environment and Docker Compose to provision it with the required components (next step: Kubernetes). Along with the infrastructure, we check that it works with 4 projects that simply probe that everything is working as expected. The boilerplate is based on a sample flight search web application.

Installation

If you are on macOS, you can use a package manager such as brew to install sbt:

$ brew install sbt

For other systems, refer to the manual installation instructions on the sbt website: http://www.scala-sbt.org/0.13/tutorial/Manual-Installation.html

If you are on macOS, you can also use brew to install Maven:

$ brew install maven

For other systems, refer to the installation instructions on the Maven website: https://maven.apache.org/install.html

Install Docker by following the instructions for macOS, Linux, or Windows.
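The build and provisioning commands in this README call sbt, mvn, npm, docker and docker-compose. As a minimal sketch (the `missing_tools` helper is hypothetical and not part of this repository), the following checks that each of those tools is on your PATH before you start:

```shell
#!/bin/sh
# Hypothetical helper, not part of the repository:
# print the name of each given tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools invoked by the commands in this README.
missing=$(missing_tools sbt mvn npm docker docker-compose)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All build tools found."
fi
```

This only confirms that the executables exist; it does not check their versions.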

docker network create vnet
npm install yarn -g
cd webapp && yarn && cd client && yarn && cd ../server && yarn && cd ../ && npm run build:dev && cd ../
cd batch/spark && sbt clean package assembly && cd ../..

cd batch/hadoop && mvn clean package && cd ../..
cd streaming/spark && sbt clean assembly && cd ../..
cd streaming/flink && sbt clean assembly && cd ../..
cd streaming/storm && mvn clean package && cd ../..
cd docker
docker-compose -f mongo.yml -f zookeeper.yml -f kafka.yml -f hadoop-hbase.yml -f flink.yml up -d
docker-compose -f dev/webapp.yml up -d
docker-compose -f dev/batch-spark.yml up -d
docker-compose -f dev/batch-hadoop.yml up -d
docker-compose -f dev/streaming-spark.yml up -d
docker-compose -f dev/streaming-flink.yml up -d
docker-compose -f dev/streaming-storm.yml up -d

Create your Twitter app at https://apps.twitter.com, then export its credentials before starting the ML job:

export TWITTER_CONSUMER_KEY=<TWITTER_CONSUMER_KEY>
export TWITTER_CONSUMER_SECRET=<TWITTER_CONSUMER_SECRET>
export TWITTER_CONSUMER_ACCESS_TOKEN=<TWITTER_CONSUMER_ACCESS_TOKEN>
export TWITTER_CONSUMER_ACCESS_TOKEN_SECRET=<TWITTER_CONSUMER_ACCESS_TOKEN_SECRET>
docker-compose -f dev/ml-spark.yml up -d
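The `dev/ml-spark.yml` step comes right after the four Twitter exports, so it presumably depends on them. As a minimal sketch under that assumption (the `unset_vars` helper is hypothetical and not part of this repository), you could check that none of them is unset or empty before running `docker-compose`:

```shell
#!/bin/sh
# Hypothetical helper, not part of the repository:
# print the name of each listed variable that is unset or empty.
unset_vars() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# The four variables exported in the step above.
missing=$(unset_vars TWITTER_CONSUMER_KEY TWITTER_CONSUMER_SECRET \
  TWITTER_CONSUMER_ACCESS_TOKEN TWITTER_CONSUMER_ACCESS_TOKEN_SECRET)
if [ -n "$missing" ]; then
  echo "Set these variables before starting dev/ml-spark.yml:" $missing
fi
```

The check only looks at whether each variable has a value; it does not validate the credentials against Twitter.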

Interactions / OnGoing

Contributing

Pull requests are welcome.

Support

Please raise tickets for issues and improvements at https://github.com/Chabane/bigdata-playground/issues

License

This example is released under version 2.0 of the Apache License.
