Project Name | Stars | Most Recent Commit | Open Issues | License | Language | Description |
---|---|---|---|---|---|---|
Asakusafw | 113 | 2 years ago | 41 | apache-2.0 | Java | Asakusa Framework |
Spark Sandbox | 27 | 4 years ago | | | Scala | A playground for Spark jobs. |
Predictiveanalatics Using Dl4j And Apachespark | 26 | 6 years ago | | | Scala | Predictive analytics using deepLearning4j and Spark |
Oreilly 2018 | 23 | 5 years ago | 2 | | Scala | |
Spark Cassandra Collabfiltering | 22 | a year ago | | apache-2.0 | Java | Collaborative filtering with MLLib on Spark based on data in Cassandra |
Mastering Apache Spark 2x | 19 | 2 months ago | | mit | Scala | Mastering Apache Spark 2x, published by Packt |
Apache Spark 2 Data Processing And Real Time Analytics | 18 | 2 months ago | 2 | mit | Scala | Master complex big data processing, stream analytics, and machine learning with Apache Spark |
Large Scale Machine Learning With Spark | 18 | 5 months ago | | mit | | Code repository for Large Scale Machine Learning with Spark by Packt |
Opentsdb Spark | 17 | 7 years ago | | mit | XSLT | Access OpenTSDB data from Spark |
Setting Up Spark | 16 | 6 years ago | | | Scala | Quick guide for setting up Spark project running on a local cluster |

This tutorial can be run either in spark-shell or in an IDE (IntelliJ or Scala IDE for Eclipse).
The setup steps are below.
Java (JDK 1.8+) must be installed on your machine before you proceed with the steps below.
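A quick way to confirm the Java prerequisite is to inspect the output of `java -version`. The following is a minimal sketch, not part of the original guide: the helper name `java_major_version` is hypothetical, and the parsing assumes the usual quoted version string on the first line of the output.

```shell
# Hypothetical helper: extract the major Java version from a version string.
# Handles the legacy "1.x" scheme (1.8.0_181) and the newer one (11.0.2).
java_major_version() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;      # drop the legacy "1." prefix
  esac
  echo "${v%%[.-]*}"         # keep everything before the first '.' or '-'
}

if command -v java >/dev/null 2>&1; then
  # `java -version` writes to stderr; the version is the quoted part of line 1.
  raw=$(java -version 2>&1 | head -n 1 | sed 's/.*"\(.*\)".*/\1/')
  major=$(java_major_version "$raw")
  if [ "$major" -ge 8 ] 2>/dev/null; then
    echo "Java $raw found (major version $major)"
  else
    echo "Java $raw found, but JDK 1.8+ is required"
  fi
else
  echo "Java not found on PATH; install JDK 1.8+ first"
fi
```

If the first line of `java -version` is not the version line on your system, the check falls through to the "required" message, so treat a negative result as a prompt to look at the raw output.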
Download Spark 2.3.1 from http://spark.apache.org/downloads.html
Direct download link: https://www.apache.org/dyn/closer.lua/spark/spark-2.3.1/spark-2.3.1-bin-hadoop2.7.tgz
On Linux or macOS:
tar -zxvf spark-2.3.1-bin-hadoop2.7.tgz
export PATH=$PATH:/<path_to_downloaded_spark>/spark-2.3.1-bin-hadoop2.7/bin
On Windows:
Unzip spark-2.3.1-bin-hadoop2.7.tgz
Add the Spark bin directory to Path: ...\spark-2.3.1-bin-hadoop2.7\bin
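On Linux or macOS, the PATH step can be followed by a check that the shell actually resolves `spark-shell`. This is a sketch under one assumption: the archive was extracted into `$HOME`. The guide does not say where to extract it, so set `SPARK_HOME` to your own location.

```shell
# Assumption: Spark was extracted into $HOME; override SPARK_HOME otherwise.
SPARK_HOME="${SPARK_HOME:-$HOME/spark-2.3.1-bin-hadoop2.7}"
export SPARK_HOME
export PATH="$PATH:$SPARK_HOME/bin"

# Report whether the shell can now resolve spark-shell.
if command -v spark-shell >/dev/null 2>&1; then
  echo "spark-shell found at $(command -v spark-shell)"
else
  echo "spark-shell not found; check that $SPARK_HOME/bin exists"
fi
```

The `export` lines only affect the current terminal session. To keep the setting, they could be added to a shell startup file such as `~/.bashrc`.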
When pasting larger sections of code into spark-shell, enter paste mode first (press Ctrl-D when done to run the pasted code):
scala> :paste
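As an alternative to paste mode, a longer snippet can be saved to a file and passed to spark-shell with the `-i` option. The sketch below is illustrative only: the file name `wordcount.scala` and the tiny word count inside it are hypothetical and do not come from the tutorial.

```shell
# Hypothetical script file; any name or path works.
cat > wordcount.scala <<'EOF'
// `sc` is the SparkContext that spark-shell creates automatically.
val words = sc.parallelize(Seq("spark", "shell", "spark"))
words.map(w => (w, 1)).reduceByKey(_ + _).collect().foreach(println)
// Leave the REPL once the script has run.
sys.exit(0)
EOF

# Run the script only if spark-shell is available on PATH.
if command -v spark-shell >/dev/null 2>&1; then
  spark-shell -i wordcount.scala
fi
```

Without the final `sys.exit(0)`, spark-shell stays open at the `scala>` prompt after running the file, which can be handy for inspecting the results.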
If you prefer to use an IDE over spark-shell, follow the steps below.
You can use either IntelliJ or Scala IDE for Eclipse.
Have the following downloaded before the session
Nice to have
hadoop fs -copyToLocal /strata-nyc/transferlearning.tgz .
tar -zxvf transferlearning.tgz
pip install tensorflow
pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.3.0-cp27-none-linux_x86_64.whl
pip install numpy scipy
pip install scikit-learn
pip install pillow
pip install h5py
pip install keras
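After the installs finish, it can help to confirm that every package imports cleanly. The following is a minimal sketch: the helper `check_py_module` and the `PYTHON` variable are hypothetical names, and the default interpreter `python` is an assumption (the wheel above targets Python 2.7, as its `cp27` tag indicates).

```shell
# Interpreter to check; override with PYTHON=python3 or a virtualenv path.
PYTHON="${PYTHON:-python}"

# Hypothetical helper: succeeds if $PYTHON can import the given module.
check_py_module() {
  "$PYTHON" -c "import $1" >/dev/null 2>&1
}

# Import names differ from pip names for two packages:
# scikit-learn is imported as sklearn, pillow as PIL.
for mod in tensorflow numpy scipy sklearn PIL h5py keras; do
  if check_py_module "$mod"; then
    echo "ok       $mod"
  else
    echo "MISSING  $mod"
  fi
done
```

Any line marked MISSING points to a package that needs to be reinstalled for the interpreter named in `PYTHON`.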