Deploying Presto on Kubernetes

Approach:

  • Build on the Hive deployment from the previous article
  • A Presto cluster has two node types, Coordinator and Worker; each node's type is set through a container environment variable
  • Leave node.id unset in each node's node.properties; if a node dies, Kubernetes restarts it as a brand-new node

1. Environment

[root@master-0 ~]# kubectl get nodes -o wide
NAME       STATUS    ROLES     AGE       VERSION           EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION          CONTAINER-RUNTIME
master-0   Ready     master    14d       v1.9.2+coreos.0   <none>        CentOS Linux 7 (Core)   3.10.0-862.el7.x86_64   docker://1.13.1
worker-0   Ready     <none>    14d       v1.9.2+coreos.0   <none>        CentOS Linux 7 (Core)   3.10.0-862.el7.x86_64   docker://1.13.1
worker-1   Ready     <none>    14d       v1.9.2+coreos.0   <none>        CentOS Linux 7 (Core)   3.10.0-862.el7.x86_64   docker://1.13.1
[root@master-0 ~]# kubectl get svc -o wide
NAME                          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                          AGE       SELECTOR
hadoop-dn-service             ClusterIP   None            <none>        9000/TCP,50010/TCP,50075/TCP                     19h       app=hadoop-dn
hadoop-nn-service             ClusterIP   None            <none>        9000/TCP,50070/TCP                               19h       app=hadoop-nn
hadoop-ui-service             NodePort    10.233.21.71    <none>        8088:32295/TCP,50070:31127/TCP                   19h       app=hadoop-nn
hive-metadata-mysql-service   NodePort    10.233.23.56    <none>        3306:31470/TCP                                   41m       app=hive-metadata-mysql
hive-service                  NodePort    10.233.60.239   <none>        10000:30717/TCP,10002:30001/TCP,9083:32335/TCP   41m       app=hive
kubernetes                    ClusterIP   10.233.0.1      <none>        443/TCP                                          14d       <none>

2. Building the Image

Presto has no official image, so I built my own based on CentOS 7.5 and Presto 0.208. The Dockerfile:

FROM 192.168.101.88:5000/base/centos:7.5.1804
MAINTAINER leichen.china@gmail.com

ADD jdk-8u151-linux-x64.tar.gz /opt
ADD presto-server-0.208.tar.gz /opt

ENV PRESTO_HOME /opt/presto-server-0.208
ENV JAVA_HOME /opt/jdk1.8.0_151
ENV PATH $JAVA_HOME/bin:$PATH

Build command: docker build -t 192.168.101.88:5000/dmcop2/presto-server:dm-0.208 .

3. Deploying Presto

  • Bootstrap script and Presto configuration files
apiVersion: v1
kind: ConfigMap
metadata:
  name: presto-config-cm
  labels:
    app: presto-coordinator
data:
  bootstrap.sh: |-
    #!/bin/bash

    cd /root/bootstrap

    mkdir -p $PRESTO_HOME/etc/catalog

    cat ./node.properties > $PRESTO_HOME/etc/node.properties
    cat ./jvm.config > $PRESTO_HOME/etc/jvm.config
    cat ./config.properties > $PRESTO_HOME/etc/config.properties
    cat ./log.properties > $PRESTO_HOME/etc/log.properties

    sed -i 's/${COORDINATOR_NODE}/'$COORDINATOR_NODE'/g' $PRESTO_HOME/etc/config.properties

    for cfg in ../catalog/*; do
      cat $cfg > $PRESTO_HOME/etc/catalog/${cfg##*/}
    done

    $PRESTO_HOME/bin/launcher run --verbose
  node.properties: |-
    node.environment=production
    node.data-dir=/var/presto/data
  jvm.config: |-
    -server
    -Xmx16G
    -XX:+UseG1GC
    -XX:G1HeapRegionSize=32M
    -XX:+UseGCOverheadLimit
    -XX:+ExplicitGCInvokesConcurrent
    -XX:+HeapDumpOnOutOfMemoryError
    -XX:+ExitOnOutOfMemoryError
  config.properties: |-
    coordinator=${COORDINATOR_NODE}
    node-scheduler.include-coordinator=true
    http-server.http.port=8080
    query.max-memory=10GB
    query.max-memory-per-node=1GB
    query.max-total-memory-per-node=2GB
    discovery-server.enabled=true
    discovery.uri=http://presto-coordinator-service:8080
  log.properties: |-
    com.facebook.presto=INFO

Notes:

1. When the bootstrap script runs, it copies the configuration files into place, then sets the node type according to the COORDINATOR_NODE environment variable.

2. discovery.uri in config.properties is set to the Service name of the coordinator.
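The placeholder substitution in bootstrap.sh can be tried in isolation. The sketch below uses a temporary file in place of $PRESTO_HOME/etc/config.properties:

```shell
# Standalone sketch of the placeholder substitution in bootstrap.sh,
# using a temporary file instead of $PRESTO_HOME/etc/config.properties.
tmpcfg=$(mktemp)
printf 'coordinator=${COORDINATOR_NODE}\n' > "$tmpcfg"

COORDINATOR_NODE=true    # the Deployment sets this to "true" or "false"
# Single quotes keep ${COORDINATOR_NODE} literal in the sed pattern
sed -i 's/${COORDINATOR_NODE}/'$COORDINATOR_NODE'/g' "$tmpcfg"

cat "$tmpcfg"    # -> coordinator=true
```

The same one-liner works for both node types because the value comes from the pod's environment, not from the script itself.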

  • Hive connector configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: presto-catalog-config-cm
  labels:
    app: presto-coordinator
data:
  hive.properties: |-
    connector.name=hive-hadoop2
    hive.metastore.uri=thrift://hive-service:9083

Notes:

1. hive.properties points the connector at the Hive Service name and metastore port.

2. The file is mounted into the pod's catalog directory.
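This mount works together with the catalog copy loop in bootstrap.sh, which is how hive.properties ends up in $PRESTO_HOME/etc/catalog. A standalone sketch, with temporary directories standing in for the /root/catalog mount and the target directory:

```shell
# Standalone sketch of the catalog copy loop from bootstrap.sh; temp dirs
# stand in for the /root/catalog mount and $PRESTO_HOME/etc/catalog.
src=$(mktemp -d)
dst=$(mktemp -d)
printf 'connector.name=hive-hadoop2\n' > "$src/hive.properties"

for cfg in "$src"/*; do
  # ${cfg##*/} strips the directory part, keeping only the file name
  cat "$cfg" > "$dst/${cfg##*/}"
done

ls "$dst"    # -> hive.properties
```

Adding another connector is then just a matter of adding another key to the presto-catalog-config-cm ConfigMap.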

  • Deploying the Presto containers and Services
apiVersion: apps/v1
kind: Deployment
metadata:
  name: presto-coordinator
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: presto-coordinator
  template:
    metadata:
      labels:
        app: presto-coordinator
    spec:
      containers:
        - name: presto-coordinator
          image: 192.168.101.88:5000/dmcop2/presto-server:dm-0.208
          command: ["bash", "-c", "chmod +x /root/bootstrap/bootstrap.sh && /root/bootstrap/bootstrap.sh"]
          ports:
            - name: http-coord
              containerPort: 8080
              protocol: TCP
          env:
            - name: COORDINATOR_NODE
              value: "true"
          volumeMounts:
            - name: presto-config-volume
              mountPath: /root/bootstrap
            - name: presto-catalog-config-volume
              mountPath: /root/catalog
            - name: presto-data-volume
              mountPath: /var/presto/data
          readinessProbe:
            initialDelaySeconds: 10
            periodSeconds: 5
            httpGet:
              path: /v1/cluster
              port: http-coord
      volumes:
        - name: presto-config-volume
          configMap:
            name: presto-config-cm
        - name: presto-catalog-config-volume
          configMap:
            name: presto-catalog-config-cm
        - name: presto-data-volume
          emptyDir: {}
---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: presto-coordinator
  name: presto-coordinator-service
spec:
  ports:
    - port: 8080
      targetPort: http-coord
      name: http-coord
  selector:
    app: presto-coordinator
  type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: presto-worker
spec:
  replicas: 2
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: presto-worker
  template:
    metadata:
      labels:
        app: presto-worker
    spec:
      initContainers:
        - name: wait-coordinator
          image:  192.168.101.88:5000/dmcop2/presto-server:dm-0.208
          command: ["bash", "-c", "until curl -sf http://presto-coordinator-service:8080/ui/; do echo 'waiting for coordinator started...'; sleep 2; done;"]
      containers:
        - name: presto-worker
          image: 192.168.101.88:5000/dmcop2/presto-server:dm-0.208
          command: ["bash", "-c", "chmod +x /root/bootstrap/bootstrap.sh && /root/bootstrap/bootstrap.sh"]
          ports:
            - name: http-coord
              containerPort: 8080
              protocol: TCP
          env:
            - name: COORDINATOR_NODE
              value: "false"
          volumeMounts:
            - name: presto-config-volume
              mountPath: /root/bootstrap
            - name: presto-catalog-config-volume
              mountPath: /root/catalog
            - name: presto-data-volume
              mountPath: /var/presto/data
          readinessProbe:
            initialDelaySeconds: 10
            periodSeconds: 5
            exec:
              command: ["bash", "-c", "curl -s http://presto-coordinator-service:8080/v1/node | tr ',' '\n' | grep -s $(hostname -i)"]
      volumes:
        - name: presto-config-volume
          configMap:
            name: presto-config-cm
        - name: presto-catalog-config-volume
          configMap:
            name: presto-catalog-config-cm
        - name: presto-data-volume
          emptyDir: {}

Notes:

1. The Coordinator and Workers are two separate Deployments, but they share the same ConfigMap for configuration.

2. The bootstrap.sh script rewrites the configuration at startup based on the COORDINATOR_NODE environment variable.

3. The Coordinator's readiness probe is an HTTP GET on /v1/cluster; each Worker's probe queries the Coordinator's /v1/node endpoint and checks whether its own IP appears in the node list.

4. A NodePort Service provides external access.
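The parsing step of the worker readiness probe can be exercised without a cluster. Here a hypothetical sample of the coordinator's /v1/node response replaces the real curl call, and the pod IP is hard-coded instead of using $(hostname -i):

```shell
# Local sketch of the worker readiness check; SAMPLE is a hypothetical
# /v1/node response (the real probe fetches it with curl from the
# coordinator Service), and MY_IP stands in for $(hostname -i).
SAMPLE='[{"uri":"http://10.233.64.12:8080"},{"uri":"http://10.233.65.9:8080"}]'
MY_IP=10.233.64.12

# Split on commas so each field lands on its own line, then grep for our IP
if echo "$SAMPLE" | tr ',' '\n' | grep -qs "$MY_IP"; then
  echo "worker registered"
else
  echo "worker not registered yet"
fi
```

The probe only goes ready once the worker has announced itself to the coordinator's discovery service, so a worker pod is not counted as available before it can actually take splits.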

4. Testing Presto

  • Accessing the Web UI
[root@master-0 presto]# kubectl get svc
NAME                          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                          AGE
hadoop-dn-service             ClusterIP   None            <none>        9000/TCP,50010/TCP,50075/TCP                     19h
hadoop-nn-service             ClusterIP   None            <none>        9000/TCP,50070/TCP                               19h
hadoop-ui-service             NodePort    10.233.21.71    <none>        8088:32295/TCP,50070:31127/TCP                   19h
hive-metadata-mysql-service   NodePort    10.233.23.56    <none>        3306:31470/TCP                                   1h
hive-service                  NodePort    10.233.60.239   <none>        10000:30717/TCP,10002:30001/TCP,9083:32335/TCP   1h
kubernetes                    ClusterIP   10.233.0.1      <none>        443/TCP                                          14d
presto-coordinator-service    NodePort    10.233.50.222   <none>        8080:30418/TCP                                   39s

(screenshot: Presto Web UI)

  • Testing a Presto client connection
[root@master-0 presto]# wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.208/presto-cli-0.208-executable.jar
[root@master-0 presto]# chmod +x presto-cli-0.208-executable.jar 
[root@master-0 presto]# ./presto-cli-0.208-executable.jar --server 192.168.112.240:30418 --catalog hive --schema default
presto:default> select * from abc;
 a 
---
 1 
(1 row)

Query 20180907_030117_00002_bmkxf, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0:03 [1 rows, 2B] [0 rows/s, 0B/s]

presto:default> 

(screenshot: query in the Presto CLI)

  • Scaling
[root@master-0 presto]# kubectl get deployment
NAME                  DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
hive                  1         1         1            1           1h
hive-metadata-mysql   1         1         1            1           1h
presto-coordinator    1         1         1            1           21m
presto-worker         2         2         2            2           21m
[root@master-0 presto]# kubectl scale deployment presto-worker --replicas=3
deployment "presto-worker" scaled
[root@master-0 presto]# kubectl get deployment
NAME                  DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
hive                  1         1         1            1           1h
hive-metadata-mysql   1         1         1            1           1h
presto-coordinator    1         1         1            1           23m
presto-worker         3         3         3            3           23m

(screenshot: cluster after scaling)

5. Caveats

  • node.id is not set in the configuration files, and the logs report its value as null. Everything has run normally in my tests so far, but I am not sure whether this hides a latent problem.
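If the null node.id ever does cause trouble, one possible workaround (an untested sketch, not part of the original setup) is to have bootstrap.sh derive an id from the pod hostname, which is unique per pod; the snippet below only prints the node.properties line it would append:

```shell
# Hypothetical sketch: derive a node.id from the pod hostname. Presto
# expects an alphanumeric id (dashes/underscores are commonly seen in
# examples), so any other characters are stripped.
NODE_ID=$(hostname | tr 'A-Z' 'a-z' | tr -cd 'a-z0-9_-')
echo "node.id=$NODE_ID"
```

Because Deployment pods get fresh hostnames on recreation, this still matches the article's design of replaced pods registering as new nodes.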
