How to choose which version of Spark is used in HDP 2.5?
HDP 2.5 ships two versions of Spark: Spark 1.6 and Spark 2.0. I don't know how to specify which version of Spark should be used. Can anyone advise me how to do that? Is it done through the Ambari admin console?
Also, I would like to submit jobs to Spark 2.0 directly from my application instead of using spark-submit. What should I specify as the master URL when building the new SparkSession?
Here is an example for a user who submits jobs using spark-submit under /usr/bin:
- Navigate to a host where Spark 2.0 is installed.
- Change to the Spark2 client directory: `cd /usr/hdp/current/spark2-client/`
- Set the `SPARK_MAJOR_VERSION` environment variable to 2: `export SPARK_MAJOR_VERSION=2`
- Run the Spark Pi example (note that the `yarn-cluster` master syntax is deprecated in Spark 2.x in favor of `--master yarn --deploy-mode cluster`):

`./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --num-executors 1 --driver-memory 512m --executor-memory 512m --executor-cores 1 examples/jars/spark-examples*.jar 10`
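For the second part of the question (submitting from an application rather than via spark-submit), a minimal sketch in Scala could look like the following. This assumes the application runs on a machine with the Hadoop client configuration available (i.e. `HADOOP_CONF_DIR` or `YARN_CONF_DIR` pointing at the cluster's config files) and the Spark 2 jars on the classpath; the app name is a placeholder. When launching directly from an application, the master URL is simply `"yarn"` and the job runs in client deploy mode; cluster deploy mode still requires spark-submit.

```scala
import org.apache.spark.sql.SparkSession

object MyApp {
  def main(args: Array[String]): Unit = {
    // In Spark 2.x the master URL for YARN is just "yarn";
    // the deploy mode (client here, since we launch in-process)
    // is a separate setting, not part of the master URL.
    val spark = SparkSession.builder()
      .appName("MyApp")          // placeholder name
      .master("yarn")
      .getOrCreate()

    // Trivial sanity check that the session works.
    val count = spark.range(100).count()
    println(s"count = $count")

    spark.stop()
  }
}
```

If both Spark versions are installed on the host, make sure the Spark 2 client libraries (from `/usr/hdp/current/spark2-client/`) are the ones on your application's classpath, otherwise you will silently link against Spark 1.6.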