Setting Up Spark on Kubernetes

Requirements

Kubernetes cluster version 1.15
Spark version 2.4.5
A self-hosted Docker registry that is already trusted by Docker on the Kubernetes cluster nodes
JDK 1.8
Scala 2.12

Setting up Kubernetes

  • For single-machine testing, you can use minikube
  • On every Kubernetes node, install the JDK and Scala, and configure JAVA_HOME and SCALA_HOME

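Setting up a private Docker image registry

  • Prepare a separate machine with Docker installed, or pick one of the machines in the Kubernetes cluster
  • Configure the registry as an insecure Docker registry; this is needed on the registry host itself and on every node in the Kubernetes cluster
    # Edit /etc/docker/daemon.json so that it contains the following
    {
    	"insecure-registries": ["${external IP of the registry host}:5000"]
    }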
    
  • Restart Docker; this is needed on the registry host and on every machine in the Kubernetes cluster
    systemctl restart docker.service
    
  • Start the private Docker registry
    docker run -d --restart=always --name registry-test -p 5000:5000 registry:2
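
The registry steps above can be sketched as two small shell helpers: one renders the daemon.json content for a given registry address, the other builds the URL of the registry's catalog endpoint so you can check that the registry is reachable. The function names are hypothetical and exist only for illustration; `/v2/_catalog` is part of the Docker Registry HTTP API V2 served by the registry:2 image.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative and not part of Docker.

# Print a daemon.json body that trusts one insecure (plain HTTP) registry.
render_daemon_json() {
  local registry="$1"   # e.g. 10.211.55.24:5000
  printf '{\n  "insecure-registries": ["%s"]\n}\n' "$registry"
}

# Print the catalog URL of a registry, which lists the repositories it holds.
registry_catalog_url() {
  local registry="$1"
  printf 'http://%s/v2/_catalog\n' "$registry"
}
```

For example, `render_daemon_json 10.211.55.24:5000 | sudo tee /etc/docker/daemon.json` would write the file (overwriting any existing settings, so merge by hand if the file already has content), and after restarting Docker, `curl "$(registry_catalog_url 10.211.55.24:5000)"` should return a JSON list of repositories.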
    

Building the Spark Docker images

  • Download Spark on a machine that already has Docker installed
    wget http://apache.communilink.net/spark/spark-2.4.5/spark-2.4.5-bin-hadoop2.7.tgz
    
  • Build the Spark Docker image
    tar -zxvf spark-2.4.5-bin-hadoop2.7.tgz
    cd spark-2.4.5-bin-hadoop2.7
    # Usage: ./bin/docker-image-tool.sh -r ${your registry address} -t ${your image tag} build
    ./bin/docker-image-tool.sh -r 10.211.55.24:5000 -t v1 build
    
  • Push the built image to the self-hosted Docker registry
    # Option 1
    docker push ${your image}
    # Option 2
    ./bin/docker-image-tool.sh -r 10.211.55.24:5000 -t v1 push
    # Option 3: copy the image to the registry host and push from there
    docker save 10.211.55.24:5000/spark:v1 > tmp.tar
    scp tmp.tar ${your registry host}:${path}
    ssh ${your registry host}
    docker load < ${path}/tmp.tar
    docker push 10.211.55.24:5000/spark:v1
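
To make the image naming above explicit, here is a minimal sketch of how the full image reference follows from the `-r` and `-t` values passed to docker-image-tool.sh. The helper name `spark_image_ref` is hypothetical; the `repo/spark:tag` layout matches the `10.211.55.24:5000/spark:v1` name used in the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compose the image reference from the repository
# (-r), the tag (-t), and an image name that defaults to "spark".
spark_image_ref() {
  local repo="$1" tag="$2" name="${3:-spark}"
  printf '%s/%s:%s\n' "$repo" "$name" "$tag"
}
```

With this, Option 1 becomes `docker push "$(spark_image_ref 10.211.55.24:5000 v1)"`, and the same string is what `spark.kubernetes.container.image` should point to when submitting.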
    

Creating a serviceaccount and the matching RBAC for Spark programs

  • Create the account and role binding that the Spark application needs in order to run
    kubectl create serviceaccount spark
    kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
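
Before submitting a job, it can help to confirm that the binding took effect. The sketch below uses `kubectl auth can-i` with impersonation to ask whether the service account may perform the pod operations a driver typically needs. The function names and the list of verbs are assumptions made for illustration, and impersonation requires that your own kubectl user is allowed to use `--as`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names and the verb list are illustrative.

# Build the user name Kubernetes assigns to a service account.
sa_subject() {
  local namespace="$1" account="$2"
  printf 'system:serviceaccount:%s:%s\n' "$namespace" "$account"
}

# Ask the API server whether the service account may manage pods.
check_sa_can_manage_pods() {
  local namespace="$1" account="$2" subject verb
  subject="$(sa_subject "$namespace" "$account")"
  for verb in get list watch create delete; do
    printf '%s pods: ' "$verb"
    kubectl auth can-i "$verb" pods --namespace "$namespace" --as "$subject"
  done
}
```

Running `check_sa_can_manage_pods default spark` on a machine with cluster access should print `yes` for each verb if the clusterrolebinding above is in place.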
    

Running the Spark example SparkPi

  • Run the Spark submit script on the Kubernetes master
  • Start the proxy on the Kubernetes master
    kubectl proxy
    
  • Spark submit script
    bin/spark-submit \
      --master k8s://http://127.0.0.1:8001 \
      --deploy-mode cluster \
      --name spark-pi \
      --class org.apache.spark.examples.SparkPi \
      --conf spark.executor.instances=5 \
      --conf spark.kubernetes.container.image=${your registry address}/spark:v1 \
      --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
      /opt/spark/examples/jars/spark-examples_2.11-2.4.5.jar
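
After submitting, the result of SparkPi appears in the driver pod's log. The following is a minimal sketch of two helpers: one builds the `--master` value for an API server reached through `kubectl proxy`, the other follows the log of the newest driver pod. The function names are hypothetical, and the `spark-role=driver` label selector is an assumption about how Spark labels its driver pods; check `kubectl get pods --show-labels` if it does not match.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and defaults are illustrative.

# Build the --master value for an API server exposed by kubectl proxy.
proxy_master_url() {
  local host="${1:-127.0.0.1}" port="${2:-8001}"
  printf 'k8s://http://%s:%s\n' "$host" "$port"
}

# Follow the log of the most recently created driver pod.
# Assumes driver pods carry the label spark-role=driver.
follow_driver_log() {
  local namespace="${1:-default}" pod
  pod="$(kubectl get pods --namespace "$namespace" -l spark-role=driver \
          --sort-by=.metadata.creationTimestamp -o name | tail -n 1)"
  kubectl logs -f --namespace "$namespace" "$pod"
}
```

For example, `--master "$(proxy_master_url)"` reproduces the value used in the script above, and `follow_driver_log default` streams the driver output, which should include a line starting with `Pi is roughly` once the job finishes.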
    

Problems you may encounter

  • Spark cannot connect to the Kubernetes apiserver (more precisely, Java cannot connect to the apiserver)
    • Error
      Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: Failed to start websocket
        	at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onFailure(WatchConnectionManager.java:212)
        	at okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:571)
        	at okhttp3.internal.ws.RealWebSocket$2.onFailure(RealWebSocket.java:221)
        	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:215)
        	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
        	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        	at java.lang.Thread.run(Thread.java:748)
        	Suppressed: java.lang.Throwable: waiting here
        		at io.fabric8.kubernetes.client.utils.Utils.waitUntilReady(Utils.java:134)
        		at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.waitUntilReady(WatchConnectionManager.java:350)
        		at io.fabric8.kubernetes.client.dsl.base.BaseOperation.watch(BaseOperation.java:759)
        		at io.fabric8.kubernetes.client.dsl.base.BaseOperation.watch(BaseOperation.java:738)
        		at io.fabric8.kubernetes.client.dsl.base.BaseOperation.watch(BaseOperation.java:69)
        		at org.apache.spark.deploy.k8s.submit.Client$$anonfun$run$1.apply(KubernetesClientApplication.scala:140)
        		at org.apache.spark.deploy.k8s.submit.Client$$anonfun$run$1.apply(KubernetesClientApplication.scala:140)
        		at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2542)
        		at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:140)
        		at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$5.apply(KubernetesClientApplication.scala:250)
        		at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$5.apply(KubernetesClientApplication.scala:241)
        		at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2543)
        		at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:241)
        		at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:204)
        		at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
        		at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
        		at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
        		at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        		at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
        		at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
        		at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
        Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
        	at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
        	at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1959)
        	at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:328)
        	at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:322)
        	at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1614)
        	at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216)
        	at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1052)
        	at sun.security.ssl.Handshaker.process_record(Handshaker.java:987)
        	at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1072)
        	at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
        	at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
        	at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
        	at okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:319)
        	at okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:283)
        	at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:168)
        	at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:257)
        	at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:135)
        	at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:114)
        	at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
        	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        	at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
        	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        	at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
        	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        	at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:126)
        	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        	at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:119)
        	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        	at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68)
        	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        	at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createHttpClient$3(HttpClientUtils.java:112)
        	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        	at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:254)
        	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:200)
        	... 4 more
        Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
        	at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:397)
        	at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:302)
        	at sun.security.validator.Validator.validate(Validator.java:260)
        	at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324)
        	at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:229)
        	at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:124)
        	at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1596)
        	... 39 more
        Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
        	at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141)
        	at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126)
        	at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:280)
        	at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:392)
        	... 45 more
      
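    • Cause
      The Kubernetes cluster uses its own self-signed CA, which Java does not trust by default
    • Solutions
      • A widely circulated fix is to import the Kubernetes CA certificate into Java's trust store; I have not managed to get this to work yet
      • Another option is to start an HTTP server with kubectl proxy and pass its URL as the --master of spark-submit
  • Permission problem
    • Error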
         Exception in thread "main" org.apache.spark.SparkException: External scheduler cannot be instantiated
         	at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2794)
         	at org.apache.spark.SparkContext.<init>(SparkContext.scala:493)
         	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520)
         	at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935)
         	at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:926)
         	at scala.Option.getOrElse(Option.scala:121)
         	at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:926)
         	at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
         	at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
         	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
         	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
         	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
         	at java.lang.reflect.Method.invoke(Method.java:498)
         	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
         	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
         	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
         	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
         	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
         	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
         	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
         	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
         Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://kubernetes.default.svc/api/v1/namespaces/default/pods/spark-pi-1573016872026-driver. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods "spark-pi-1573016872026-driver" is forbidden: User "system:serviceaccount:default:default" cannot get resource "pods" in API group "" in the namespace "default".
         	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:478)
         	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:415)
         	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:381)
         	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:344)
         	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:313)
         	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:296)
         	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:801)
         	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:218)
         	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:185)
         	at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator$$anonfun$1.apply(ExecutorPodsAllocator.scala:57)
         	at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator$$anonfun$1.apply(ExecutorPodsAllocator.scala:55)
         	at scala.Option.map(Option.scala:146)
         	at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.<init>(ExecutorPodsAllocator.scala:55)
         	at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:89)
         	at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2788)
         	... 20 more
      
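    • Solution
      Create a serviceaccount and RBAC binding for the Spark application, and pass the account to the driver with spark.kubernetes.authenticate.driver.serviceAccountName (the error above shows the driver running as default:default)
  • The Java class of the example program cannot be found
    • Cause
      For a Spark program running on Kubernetes, local jars are looked up inside the image
    • Solution
      Put the jar on an HTTP server or in HDFS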
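
For the certificate problem in the first item, the two usual approaches can be sketched as follows. This is untested against the cluster described here, so treat it as a starting point rather than a confirmed fix. The function names are hypothetical; `spark.kubernetes.authenticate.submission.caCertFile` is a Spark configuration property for the CA file used when submitting, and `keytool -importcert` is the standard JDK command for adding a certificate to a trust store. The CA path and API server port shown in the usage note are assumptions that depend on how the cluster was installed.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names are illustrative.

# Print spark-submit arguments for an HTTPS master with an explicit CA file.
https_master_args() {
  local apiserver="$1" ca_file="$2"   # apiserver is host:port
  printf '%s %s ' '--master' "k8s://https://$apiserver"
  printf '%s %s\n' '--conf' \
    "spark.kubernetes.authenticate.submission.caCertFile=$ca_file"
}

# Import a CA certificate into a Java trust store.
# "changeit" is the default password of the JDK's cacerts file.
import_k8s_ca() {
  local ca_file="$1" keystore="$2"
  keytool -importcert -noprompt -trustcacerts \
    -alias kubernetes-ca -file "$ca_file" \
    -keystore "$keystore" -storepass changeit
}
```

On a kubeadm-style cluster the CA is often at /etc/kubernetes/pki/ca.crt and the API server listens on port 6443, so one might try `import_k8s_ca /etc/kubernetes/pki/ca.crt "$JAVA_HOME/jre/lib/security/cacerts"` or splice the output of `https_master_args ${apiserver IP}:6443 /etc/kubernetes/pki/ca.crt` into the spark-submit command. Client credentials (a certificate or token) may also be needed once the proxy is no longer in the path.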

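Unresolved issues

  • For now the master can only be specified by running an HTTP server with kubectl proxy and pointing spark-submit's --master at its HTTP URL; pointing --master directly at the Kubernetes HTTPS URL does not work yet
  • The SparkPi jar is currently built into the Spark image, and the jar path in the submit script is the path inside the image
    • Fetching the jar from an HTTP server or HDFS does not work yet
  • I am still working out how to let Kubernetes access my data storage system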