1. Introduction
Starting with version 2.3.0, Spark supports Kubernetes as a native resource scheduler. Spark now supports the following four cluster managers:
- Standalone Deploy Mode
- Apache Mesos
- Hadoop YARN
- Kubernetes
Kubernetes support is still an experimental feature and has the following prerequisites:
- Spark 2.3+
- Kubernetes 1.6+
- Permission to create, read, update, and delete pods
- DNS configured in the Kubernetes cluster
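A quick way to sanity-check these prerequisites from a workstation with cluster access (a sketch; the DNS check assumes the cluster runs kube-dns or CoreDNS in the kube-system namespace, which is the default for most installers):

```shell
# Server version must be 1.6 or newer
kubectl version --short

# Verify that the current identity can create and delete pods
kubectl auth can-i create pods
kubectl auth can-i delete pods

# Confirm cluster DNS is running (assumes kube-dns/CoreDNS in kube-system)
kubectl get pods -n kube-system -l k8s-app=kube-dns
```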
As with traditional Spark deployments, jobs are submitted via spark-submit; pointing the master at the Kubernetes API server address is enough to have the Kubernetes scheduler schedule the Spark job. After submission, a driver pod starts first; the driver then communicates with Kubernetes to launch a set of executor pods that run the job. When the job finishes, all executor pods are deleted, but the driver pod is retained in the Completed state, consuming no memory or CPU; all logs and results can be found in the driver pod.
Reposted from https://blog.csdn.net/cloudvtech
2. Running a Spark Job on Kubernetes
1. Download spark-2.3.0-bin-hadoop2.7.tgz
wget http://archive.apache.org/dist/spark/spark-2.3.0/spark-2.3.0-bin-hadoop2.7.tgz
2. Build the Docker image and push it to the private registry
cd spark-2.3.0-bin-hadoop2.7
docker build -t 192.168.56.10:5000/spark:2.3.0 -f kubernetes/dockerfiles/spark/Dockerfile .
docker push 192.168.56.10:5000/spark:2.3.0
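To confirm the push succeeded, the registry can be queried directly (this assumes 192.168.56.10:5000 is a plain-HTTP Docker Registry v2, as the push above implies; these are standard Registry v2 API endpoints):

```shell
# List the repositories held by the private registry
curl http://192.168.56.10:5000/v2/_catalog

# List the tags available for the spark repository
curl http://192.168.56.10:5000/v2/spark/tags/list
```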
3. Create a service account for Spark and bind it to the edit ClusterRole
kubectl create serviceaccount spark
kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
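Whether the binding grants the driver enough rights can be checked by impersonating the service account (the built-in edit ClusterRole includes pod create and delete permissions):

```shell
# Both commands should print "yes" if the binding took effect
kubectl auth can-i create pods --as=system:serviceaccount:default:spark
kubectl auth can-i delete pods --as=system:serviceaccount:default:spark
```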
4. Submit the Pi example job
Note: the original command had inline # comments after each line-continuation backslash, which breaks shell line continuation; the notes are moved above the command here.
# --master:    Kubernetes API server address
# --name:      prefix for the driver/executor pod names
# spark.executor.instances: number of executor pods
# spark.kubernetes.authenticate.driver.serviceAccountName: Kubernetes service account to use
# spark.kubernetes.container.image: Docker image used by the driver and executors
# final argument: path of the job jar inside the Docker image
bin/spark-submit \
  --master k8s://https://192.168.56.10:6443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.kubernetes.container.image=192.168.56.10:5000/spark:2.3.0 \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar
5. Job execution
[root@k8s-install-node ~]# kubectl get pods --all-namespaces -o wide | grep spark | grep -v Completed
default spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-driver 1/1 Running 0 7s 10.244.61.202 k8s-01 <none>
default spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-exec-1 1/1 Running 0 2s 10.244.165.201 k8s-03 <none>
default spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-exec-2 1/1 Running 0 2s 10.244.179.10 k8s-02 <none>
Driver logs:
kubectl logs spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-driver
++ id -u
+ myuid=0
++ id -g
+ mygid=0
++ getent passwd 0
+ uidentry=root:x:0:0:root:/root:/bin/ash
+ '[' -z root:x:0:0:root:/root:/bin/ash ']'
+ SPARK_K8S_CMD=driver
+ '[' -z driver ']'
+ shift 1
+ SPARK_CLASSPATH=':/opt/spark/jars/*'
+ sed 's/[^=]*=\(.*\)/\1/g'
+ grep SPARK_JAVA_OPT_
+ env
+ readarray -t SPARK_JAVA_OPTS
+ '[' -n /opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar ']'
+ SPARK_CLASSPATH=':/opt/spark/jars/*:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar'
+ '[' -n '' ']'
+ case "$SPARK_K8S_CMD" in
+ CMD=(${JAVA_HOME}/bin/java "${SPARK_JAVA_OPTS[@]}" -cp "$SPARK_CLASSPATH" -Xms$SPARK_DRIVER_MEMORY -Xmx$SPARK_DRIVER_MEMORY -Dspark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS $SPARK_DRIVER_CLASS $SPARK_DRIVER_ARGS)
+ exec /sbin/tini -s -- /usr/lib/jvm/java-1.8-openjdk/bin/java -Dspark.driver.port=7078 -Dspark.kubernetes.executor.podNamePrefix=spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8 -Dspark.kubernetes.container.image=192.168.56.10:5000/spark:2.3.0 -Dspark.app.id=spark-8f8a389851e7483e8de1850eb1418856 -Dspark.executor.instances=2 -Dspark.jars=/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar,/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar -Dspark.kubernetes.authenticate.driver.serviceAccountName=spark -Dspark.submit.deployMode=cluster -Dspark.app.name=spark-pi -Dspark.driver.host=spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-driver-svc.default.svc -Dspark.driver.blockManager.port=7079 -Dspark.kubernetes.driver.pod.name=spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-driver -Dspark.master=k8s://https://192.168.56.10:6443 -cp ':/opt/spark/jars/*:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar' -Xms1g -Xmx1g -Dspark.driver.bindAddress=10.244.61.202 org.apache.spark.examples.SparkPi
2018-09-06 10:38:20 INFO SparkContext:54 - Running Spark version 2.3.0
2018-09-06 10:38:20 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-09-06 10:38:20 INFO SparkContext:54 - Submitted application: Spark Pi
2018-09-06 10:38:20 INFO SecurityManager:54 - Changing view acls to: root
2018-09-06 10:38:20 INFO SecurityManager:54 - Changing modify acls to: root
2018-09-06 10:38:20 INFO SecurityManager:54 - Changing view acls groups to:
2018-09-06 10:38:20 INFO SecurityManager:54 - Changing modify acls groups to:
2018-09-06 10:38:20 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
2018-09-06 10:38:21 INFO Utils:54 - Successfully started service 'sparkDriver' on port 7078.
2018-09-06 10:38:21 INFO SparkEnv:54 - Registering MapOutputTracker
2018-09-06 10:38:21 INFO SparkEnv:54 - Registering BlockManagerMaster
2018-09-06 10:38:21 INFO BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2018-09-06 10:38:21 INFO BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2018-09-06 10:38:21 INFO DiskBlockManager:54 - Created local directory at /tmp/blockmgr-ea53142c-5b84-4958-b2de-187cbad3f64b
2018-09-06 10:38:21 INFO MemoryStore:54 - MemoryStore started with capacity 408.9 MB
2018-09-06 10:38:21 INFO SparkEnv:54 - Registering OutputCommitCoordinator
2018-09-06 10:38:21 INFO log:192 - Logging initialized @1888ms
2018-09-06 10:38:21 INFO Server:346 - jetty-9.3.z-SNAPSHOT
2018-09-06 10:38:21 INFO Server:414 - Started @1978ms
2018-09-06 10:38:21 INFO AbstractConnector:278 - Started ServerConnector@7b573144{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2018-09-06 10:38:21 INFO Utils:54 - Successfully started service 'SparkUI' on port 4040.
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@34645867{/jobs,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4fcee388{/jobs/json,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6f80fafe{/jobs/job,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@f9879ac{/jobs/job/json,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@37f21974{/stages,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5f4d427e{/stages/json,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6e521c1e{/stages/stage,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@303e3593{/stages/stage/json,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4ef27d66{/stages/pool,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@362a019c{/stages/pool/json,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1d9bec4d{/storage,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5c48c0c0{/storage/json,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@10c8f62{/storage/rdd,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@674c583e{/storage/rdd/json,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@25f7391e{/environment,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3f23a3a0{/environment/json,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5ab14cb9{/executors,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5fb97279{/executors/json,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@439a8f59{/executors/threadDump,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@61861a29{/executors/threadDump/json,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@31024624{/static,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5d25e6bb{/,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@ce5a68e{/api,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7c041b41{/jobs/job/kill,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7f69d591{/stages/stage/kill,null,AVAILABLE,@Spark}
2018-09-06 10:38:21 INFO SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-driver-svc.default.svc:4040
2018-09-06 10:38:21 INFO SparkContext:54 - Added JAR /opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar at spark://spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-driver-svc.default.svc:7078/jars/spark-examples_2.11-2.3.0.jar with timestamp 1536230301601
2018-09-06 10:38:21 WARN KubernetesClusterManager:66 - The executor's init-container config map is not specified. Executors will therefore not attempt to fetch remote or submitted dependencies.
2018-09-06 10:38:21 WARN KubernetesClusterManager:66 - The executor's init-container config map key is not specified. Executors will therefore not attempt to fetch remote or submitted dependencies.
2018-09-06 10:38:22 INFO Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 7079.
2018-09-06 10:38:22 INFO NettyBlockTransferService:54 - Server created on spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-driver-svc.default.svc:7079
2018-09-06 10:38:22 INFO BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2018-09-06 10:38:22 INFO BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-driver-svc.default.svc, 7079, None)
2018-09-06 10:38:22 INFO BlockManagerMasterEndpoint:54 - Registering block manager spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-driver-svc.default.svc:7079 with 408.9 MB RAM, BlockManagerId(driver, spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-driver-svc.default.svc, 7079, None)
2018-09-06 10:38:22 INFO BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-driver-svc.default.svc, 7079, None)
2018-09-06 10:38:22 INFO BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-driver-svc.default.svc, 7079, None)
2018-09-06 10:38:22 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@16073fa8{/metrics/json,null,AVAILABLE,@Spark}
2018-09-06 10:38:23 INFO KubernetesClusterSchedulerBackend:54 - Requesting a new executor, total executors is now 0
2018-09-06 10:38:23 INFO KubernetesClusterSchedulerBackend:54 - Requesting a new executor, total executors is now 0
2018-09-06 10:38:25 INFO KubernetesClusterSchedulerBackend:54 - Executor pod spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-exec-2 ready, launched at k8s-02 as IP 10.244.179.10.
2018-09-06 10:38:25 INFO KubernetesClusterSchedulerBackend:54 - Executor pod spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-exec-1 ready, launched at k8s-03 as IP 10.244.165.201.
2018-09-06 10:38:26 INFO KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint:54 - Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.244.179.10:47982) with ID 2
2018-09-06 10:38:27 INFO KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint:54 - Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.244.165.201:51378) with ID 1
2018-09-06 10:38:27 INFO KubernetesClusterSchedulerBackend:54 - SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
2018-09-06 10:38:27 INFO BlockManagerMasterEndpoint:54 - Registering block manager 10.244.179.10:44581 with 408.9 MB RAM, BlockManagerId(2, 10.244.179.10, 44581, None)
2018-09-06 10:38:27 INFO BlockManagerMasterEndpoint:54 - Registering block manager 10.244.165.201:46685 with 408.9 MB RAM, BlockManagerId(1, 10.244.165.201, 46685, None)
2018-09-06 10:38:27 INFO SparkContext:54 - Starting job: reduce at SparkPi.scala:38
2018-09-06 10:38:27 INFO DAGScheduler:54 - Got job 0 (reduce at SparkPi.scala:38) with 2 output partitions
2018-09-06 10:38:27 INFO DAGScheduler:54 - Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
2018-09-06 10:38:27 INFO DAGScheduler:54 - Parents of final stage: List()
2018-09-06 10:38:27 INFO DAGScheduler:54 - Missing parents: List()
2018-09-06 10:38:27 INFO DAGScheduler:54 - Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
2018-09-06 10:38:27 INFO MemoryStore:54 - Block broadcast_0 stored as values in memory (estimated size 1832.0 B, free 408.9 MB)
2018-09-06 10:38:27 INFO MemoryStore:54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 1181.0 B, free 408.9 MB)
2018-09-06 10:38:27 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-driver-svc.default.svc:7079 (size: 1181.0 B, free: 408.9 MB)
2018-09-06 10:38:27 INFO SparkContext:54 - Created broadcast 0 from broadcast at DAGScheduler.scala:1039
2018-09-06 10:38:27 INFO DAGScheduler:54 - Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34) (first 15 tasks are for partitions Vector(0, 1))
2018-09-06 10:38:27 INFO TaskSchedulerImpl:54 - Adding task set 0.0 with 2 tasks
2018-09-06 10:38:27 INFO TaskSetManager:54 - Starting task 0.0 in stage 0.0 (TID 0, 10.244.165.201, executor 1, partition 0, PROCESS_LOCAL, 7865 bytes)
2018-09-06 10:38:27 INFO TaskSetManager:54 - Starting task 1.0 in stage 0.0 (TID 1, 10.244.179.10, executor 2, partition 1, PROCESS_LOCAL, 7865 bytes)
2018-09-06 10:38:28 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on 10.244.165.201:46685 (size: 1181.0 B, free: 408.9 MB)
2018-09-06 10:38:28 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on 10.244.179.10:44581 (size: 1181.0 B, free: 408.9 MB)
2018-09-06 10:38:28 INFO TaskSetManager:54 - Finished task 0.0 in stage 0.0 (TID 0) in 615 ms on 10.244.165.201 (executor 1) (1/2)
2018-09-06 10:38:28 INFO TaskSetManager:54 - Finished task 1.0 in stage 0.0 (TID 1) in 613 ms on 10.244.179.10 (executor 2) (2/2)
2018-09-06 10:38:28 INFO TaskSchedulerImpl:54 - Removed TaskSet 0.0, whose tasks have all completed, from pool
2018-09-06 10:38:28 INFO DAGScheduler:54 - ResultStage 0 (reduce at SparkPi.scala:38) finished in 0.872 s
2018-09-06 10:38:28 INFO DAGScheduler:54 - Job 0 finished: reduce at SparkPi.scala:38, took 0.939725 s
Pi is roughly 3.145395726978635
2018-09-06 10:38:28 INFO AbstractConnector:318 - Stopped Spark@7b573144{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2018-09-06 10:38:28 INFO SparkUI:54 - Stopped Spark web UI at http://spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-driver-svc.default.svc:4040
2018-09-06 10:38:28 INFO KubernetesClusterSchedulerBackend:54 - Shutting down all executors
2018-09-06 10:38:28 INFO KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint:54 - Asking each executor to shut down
2018-09-06 10:38:28 INFO KubernetesClusterSchedulerBackend:54 - Closing kubernetes client
2018-09-06 10:38:28 INFO MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2018-09-06 10:38:28 INFO MemoryStore:54 - MemoryStore cleared
2018-09-06 10:38:28 INFO BlockManager:54 - BlockManager stopped
2018-09-06 10:38:28 INFO BlockManagerMaster:54 - BlockManagerMaster stopped
2018-09-06 10:38:28 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2018-09-06 10:38:28 INFO SparkContext:54 - Successfully stopped SparkContext
2018-09-06 10:38:28 INFO ShutdownHookManager:54 - Shutdown hook called
2018-09-06 10:38:28 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-4d2effad-dbde-4076-b357-ad581cec98d6
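As noted in the introduction, the driver pod is retained in the Completed state after the job finishes. The result can be pulled from its logs, and the pod deleted once it is no longer needed (the pod name below is the one from this particular run):

```shell
# Extract the result line from the completed driver pod
kubectl logs spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-driver | grep "Pi is roughly"

# Remove the completed driver pod when it is no longer needed
kubectl delete pod spark-pi-5b0a5f65b7f832929c83a6c2aa4346a8-driver
```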