Linux Spark cluster:
spark-master
spark-worker-1
Running Spark from a Windows development environment, connected to the Linux cluster.
In debug mode, once the conf is set up, the connection and execution succeed. Cross-environment joint debugging can look as though it is failing, but it actually works — open the master's web UI and check the running processes and their logs to confirm. The executor launch command comes out as follows:
"/usr/lib/jvm/java-8-openjdk-amd64//bin/java" \
"-cp" "/mnt/geoSpark/*:/spark/conf/:/spark/jars/*:/etc/hadoop/:/opt/hadoop-3.2.1/share/hadoop/common/lib/*:/opt/hadoop-3.2.1/share/hadoop/common/*:/opt/hadoop-3.2.1/share/hadoop/hdfs/:/opt/hadoop-3.2.1/share/hadoop/hdfs/lib/*:/opt/hadoop-3.2.1/share/hadoop/hdfs/*:/opt/hadoop-3.2.1/share/hadoop/mapreduce/lib/*:/opt/hadoop-3.2.1/share/hadoop/mapreduce/*:/opt/hadoop-3.2.1/share/hadoop/yarn/:/opt/hadoop-3.2.1/share/hadoop/yarn/lib/*:/opt/hadoop-3.2.1/share/hadoop/yarn/*" \
"-Xmx1024M" "-Dspark.driver.port=53110" \ "org.apache.spark.executor.CoarseGrainedExecutorBackend" \
"--driver-url" "spark://CoarseGrainedScheduler@172.16.0.162:53110" \
"--executor-id" "0" \
"--hostname" "192.168.240.12" \
"--cores" "96" \
"--app-id" "app-20220120140458-0039" \
"--worker-url" "spark://Worker@192.168.240.12:42339"
Spark in IDEA on Windows acts as the driver. Execution log:
2022-01-20 14:04:58,428 INFO worker.ExecutorRunner: Launch command: "/usr/lib/jvm/java-8-openjdk-amd64//bin/java" "-cp" "/mnt/geoSpark/*:/spark/conf/:/spark/jars/*:/etc/hadoop/:/opt/hadoop-3.2.1/share/hadoop/common/lib/*:/opt/hadoop-3.2.1/share/hadoop/common/*:/opt/hadoop-3.2.1/share/hadoop/hdfs/:/opt/hadoop-3.2.1/share/hadoop/hdfs/lib/*:/opt/hadoop-3.2.1/share/hadoop/hdfs/*:/opt/hadoop-3.2.1/share/hadoop/mapreduce/lib/*:/opt/hadoop-3.2.1/share/hadoop/mapreduce/*:/opt/hadoop-3.2.1/share/hadoop/yarn/:/opt/hadoop-3.2.1/share/hadoop/yarn/lib/*:/opt/hadoop-3.2.1/share/hadoop/yarn/*" "-Xmx1024M" "-Dspark.driver.port=53110" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@172.16.0.162:53110" "--executor-id" "0" "--hostname" "192.168.240.12" "--cores" "96" "--app-id" "app-20220120140458-0039" "--worker-url" "spark://Worker@192.168.240.12:42339"
2022-01-20 14:05:32,318 INFO worker.Worker: Asked to kill executor app-20220120140458-0039/0
2022-01-20 14:05:32,318 INFO worker.ExecutorRunner: Runner thread for executor app-20220120140458-0039/0 interrupted
2022-01-20 14:05:32,319 INFO worker.ExecutorRunner: Killing process!
2022-01-20 14:05:32,755 INFO worker.Worker: Executor app-20220120140458-0039/0 finished with state KILLED exitStatus 143
2022-01-20 14:05:32,756 INFO shuffle.ExternalShuffleBlockResolver: Clean up non-shuffle and non-RDD files associated with the finished executor 0
2022-01-20 14:05:32,756 INFO shuffle.ExternalShuffleBlockResolver: Executor is not registered (appId=app-20220120140458-0039, execId=0)
2022-01-20 14:05:32,756 INFO shuffle.ExternalShuffleBlockResolver: Application app-20220120140458-0039 removed, cleanupLocalDirs = true
2022-01-20 14:05:32,756 INFO worker.Worker: Cleaning up local directories for application app-20220120140458-0039
2022-01-20 14:06:08,352 INFO worker.Worker: Asked to launch executor app-20220120140608-0040/0 for testSpark
2022-01-20 14:06:08,355 INFO spark.SecurityManager: Changing view acls to: root
2022-01-20 14:06:08,355 INFO spark.SecurityManager: Changing modify acls to: root
2022-01-20 14:06:08,355 INFO spark.SecurityManager: Changing view acls groups to:
2022-01-20 14:06:08,355 INFO spark.SecurityManager: Changing modify acls groups to:
2022-01-20 14:06:08,355 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
The worker side acts as the executor; its execution logs can be viewed under its logs directory.
The Spark worker's default web UI port is 8081; I changed it to 6081, though it is best to keep the default ports where possible.
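The port change above is made per worker in `spark-env.sh`; a minimal sketch:

```shell
# In $SPARK_HOME/conf/spark-env.sh on each worker node.
# The worker's web UI defaults to 8081; this moves it to 6081.
export SPARK_WORKER_WEBUI_PORT=6081
```

Restart the worker daemon afterwards for the new port to take effect.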
TODO (2022-01-23):
How to view the logs from the command line.
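A starting point for the TODO, assuming the default standalone layout: the worker daemon logs to `$SPARK_HOME/logs`, and each executor's stdout/stderr sit under `$SPARK_HOME/work/<app-id>/<executor-id>/`. The `/spark` path and the app id below are taken from the logs above; adjust to your install.

```shell
#!/bin/sh
# Sketch: where to look for logs on a worker node (paths assume /spark,
# as in the executor classpath above).
SPARK_HOME=${SPARK_HOME:-/spark}
APP_ID=app-20220120140608-0040     # app id from the worker log above

# Worker daemon log:
#   tail -f "$SPARK_HOME"/logs/spark-*-org.apache.spark.deploy.worker.Worker-*.out
# Executor stdout/stderr for one app:
#   tail -n 100 "$SPARK_HOME/work/$APP_ID/0/stderr"
ls "$SPARK_HOME/work/$APP_ID" 2>/dev/null || echo "no work dir for $APP_ID on this host"
```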