First, configure the history server and event logging:
1. Add to spark-env.sh:
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=18080 -Dspark.history.retainedApplications=3 -Dspark.history.fs.logDirectory=hdfs://xiaoqi0:9000/sparkeventlog"
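Before starting the history server, the HDFS directory must already exist. A minimal setup sketch, assuming the paths from the config above:

# Create the event-log directory on HDFS (the history server does not create it)
hdfs dfs -mkdir -p hdfs://xiaoqi0:9000/sparkeventlog
# Start the history server; it picks up SPARK_HISTORY_OPTS from spark-env.sh
/usr/local/app/spark-2.2.0-bin-hadoop2.7/sbin/start-history-server.sh
# The UI is then reachable at http://xiaoqi0:18080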
2. Add to spark-defaults.conf:
spark.eventLog.enabled=true
spark.eventLog.dir=hdfs://xiaoqi0:9000/sparkeventlog
spark.eventLog.compress=true
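With these defaults in place, an ordinary spark-submit run writes its event log automatically. For comparison with the REST submission below, a sketch (the standalone master port 7077 is an assumption, not taken from the original config):

/usr/local/app/spark-2.2.0-bin-hadoop2.7/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://xiaoqi0:7077 \
  /usr/local/app/spark-2.2.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.2.0.jar 100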
3. Submit a job through the standalone master's REST API (port 6066):
curl -X POST http://xiaoqi0:6066/v1/submissions/create --header "Content-Type:application/json;charset=UTF-8" --data '{
  "action" : "CreateSubmissionRequest",
  "appArgs" : [ "100" ],
  "appResource" : "file:/usr/local/app/spark-2.2.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.2.0.jar",
  "clientSparkVersion" : "2.2.0",
  "environmentVariables" : {
    "SPARK_ENV_LOADED" : "1"
  },
  "mainClass" : "org.apache.spark.examples.SparkPi",
  "sparkProperties" : {
    "spark.jars" : "file:/usr/local/app/spark-2.2.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.2.0.jar",
    "spark.driver.supervise" : "false",
    "spark.app.name" : "MyJob",
    "spark.eventLog.enabled" : "true",
    "spark.eventLog.dir" : "hdfs://xiaoqi0:9000/sparkeventlog",
    "spark.submit.deployMode" : "client",
    "spark.master" : "spark://xiaoqi0:6066"
  }
}'
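On success the server replies with JSON containing a submissionId, which the status and kill calls below need. The response looks roughly like this (values illustrative):

{
  "action" : "CreateSubmissionResponse",
  "message" : "Driver successfully submitted as driver-20151008145126-0000",
  "serverSparkVersion" : "2.2.0",
  "submissionId" : "driver-20151008145126-0000",
  "success" : true
}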
Check the submission status:
curl http://spark-cluster-ip:6066/v1/submissions/status/driver-20151008145126-0000
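To wait for a job to finish, the status endpoint can be polled. A small shell sketch (driver ID and host as in the examples here; driverState is the field reported by the REST API):

#!/bin/sh
DRIVER_ID=driver-20151008145126-0000
while true; do
  # Extract the driverState line from the pretty-printed JSON response
  STATE=$(curl -s http://spark-cluster-ip:6066/v1/submissions/status/$DRIVER_ID | grep driverState)
  echo "$STATE"
  case "$STATE" in
    *SUBMITTED*|*RUNNING*) sleep 5 ;;   # still in flight, keep polling
    *) break ;;                         # FINISHED / FAILED / KILLED / ERROR
  esac
done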
Kill a submission:
curl -X POST http://spark-cluster-ip:6066/v1/submissions/kill/driver-20151008145126-0000
In testing, even though the sparkeventlog directory is already specified in the config files, if spark.eventLog.dir is not set in the curl submission request, event logs are still written to the local default directory /tmp/spark-events. When it is set, logs go to the specified directory, so jobs submitted via spark-submit and via the REST API end up in the same directory, and the history server picks up both.
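To verify, check that event logs land in the configured directory after a run, and that the history server lists the application:

hdfs dfs -ls hdfs://xiaoqi0:9000/sparkeventlog
# Completed applications then show up in the UI at http://xiaoqi0:18080,
# or via the history server's REST API:
curl http://xiaoqi0:18080/api/v1/applications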