When developing with Spark you will sooner or later need to write a spark-submit script. As a beginner myself, here is a brief explanation of spark-submit.
vi wordcount.sh
/usr/local/spark/bin/spark-submit \
--class cn.spark.study.core.wordCountCluster \
--num-executors 3 \
--driver-memory 100m \
--executor-memory 100m \
--executor-cores 3 \
--master spark://192.168.1.107:7077 \
/usr/local/SparkTest-0.0.1-SNAPSHOT-jar-with-dependencies.jar
The script invokes spark-submit with the following parameters: --class specifies the main class to run; --num-executors tells Spark how many executors to launch; --driver-memory sets the driver's memory; --executor-memory sets the memory for each executor; --executor-cores sets the number of cores per executor; --master points at the Spark standalone cluster (the spark:// URL); finally, /usr/local/SparkTest-0.0.1-SNAPSHOT-jar-with-dependencies.jar is the path to the application jar.
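One easy mistake in multi-line scripts like this is the line-continuation backslash: every line except the last must end with \ (with nothing after it, not even a trailing space), and the final line must not have one. A minimal sketch of the rule, using echo as a stand-in for spark-submit:

```shell
#!/bin/sh
# The backslashes join these three lines into a single command.
# Each backslash must be the very last character on its line.
echo \
  --class cn.spark.study.core.wordCountCluster \
  --num-executors 3
```

If a backslash is missing (or the last line keeps one), the shell splits the command in two, and spark-submit sees only part of its arguments.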
By default wordcount.sh is not executable; grant execute permission with chmod 777 wordcount.sh (chmod +x wordcount.sh is sufficient and grants fewer permissions).
Then run it: ./wordcount.sh
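Before actually submitting the job, it can be worth checking the script with bash -n, which parses the file for shell syntax errors without executing it, and confirming the execute bit is set. A small sketch of such a pre-flight check:

```shell
#!/bin/sh
# Parse wordcount.sh for syntax errors without running it.
bash -n wordcount.sh && echo "syntax OK"
# Confirm the execute permission is set before invoking ./wordcount.sh
[ -x wordcount.sh ] && echo "executable"
```

This catches broken line continuations or unclosed quotes early, before a failed submission to the cluster.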