Hive LLAP Testing and Analysis

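An LLAP service is long-running: once started, it keeps its resources and does not release them. A cluster can run a single LLAP service or several. When an LLAP service is submitted, you specify the queue it runs in. Each LLAP service has a unique name, and users specify which LLAP service a job goes to when they submit it.

Generating the LLAP service package

Any user can run the LLAP service generator. Running it only generates, from the given parameters, the programs and configuration needed to run LLAP.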

hive --service llap --name llap-demo --instances 1 --cache 128m --executors 3 --iothreads 2 --size 1024m --xmx 512m --queue default --loglevel INFO

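Key parameters

Parameter   Description
service     llap. Invokes Hive's llap service; this value is fixed.
name        Name of the LLAP service. It must be unique: every LLAP service needs a different name. LLAP uses ZooKeeper for service discovery, and a starting LLAP service registers itself under the corresponding ZooKeeper path.
instances   Number of containers.
cache       Cache size.
executors   Number of executor threads in one container. Each executor thread processes one task.
iothreads   Number of IO threads. IO threads are separate from executor threads: they read the data and prepare it in the columnar format the executor threads need.
size        Memory size of the container, that is, the container size requested from the ResourceManager.
xmx         Heap size of the container.
queue       Queue the LLAP service is submitted to.
loglevel    Log level of the container.

All parameters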
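The memory-related parameters (size, xmx, cache) have to fit together. The following is a minimal sketch of a sanity check, assuming the cache is allocated off-heap so that heap plus cache must fit inside the container. The helper names `parse_size` and `check_memory_budget` are hypothetical, and the accepted suffixes are an assumption based on the values used in this article (128m, 512m, 1024m); Hive's own generator may validate differently.

```python
import re

# Binary multipliers for the size suffixes assumed here.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert a size such as '128m' or '1g' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([kmg]?)b?", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def check_memory_budget(size, xmx, cache):
    """Return the bytes left in the container after heap and cache.

    Assumption: the cache lives outside the JVM heap, so both must fit
    within the container size requested from YARN.
    """
    headroom = parse_size(size) - parse_size(xmx) - parse_size(cache)
    if headroom < 0:
        raise ValueError("heap (--xmx) plus cache (--cache) exceeds --size")
    return headroom


if __name__ == "__main__":
    # Values from the command above.
    print(parse_size("128m"))
    print(check_memory_budget("1024m", "512m", "128m") // 1024 ** 2, "MB headroom")
```

With the values from the command above, 128m corresponds to 134217728 bytes, and 384 MB of the 1024 MB container remain for everything other than heap and cache.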

usage: llap
 -a,--args <args>                                         java arguments to the llap instance
 -auxhive,--auxhive <auxhive>                             whether to package the Hive aux jars (true by default)
 -b,--service-am-container-mb <b>                         The size of the service AppMaster container in MB
 -c,--cache <cache>                                       cache size per instance
 -d,--directory <directory>                               Temp directory for jars etc.
 -e,--executors <executors>                               executor per instance
 -H,--help                                                Print help information
 -h,--auxhbase <auxhbase>                                 whether to package the HBase jars (true by default)
    --health-init-delay-secs <health-init-delay-secs>     Delay in seconds after which health percentage is monitored (Default:
                                                          400)
    --health-percent <health-percent>                     Percentage of running containers after which LLAP application is
                                                          considered healthy (Default: 80)
    --health-time-window-secs <health-time-window-secs>   Time window in seconds (after initial delay) for which LLAP application
                                                          is allowed to be in unhealthy state before being killed (Default: 300)
    --hiveconf <property=value>                           Use value for given property. Overridden by explicit parameters
 -i,--instances <instances>                               Specify the number of instances to run this on
 -j,--auxjars <auxjars>                                   additional jars to package (by default, JSON SerDe jar is packaged if
                                                          available)
    --javaHome <javaHome>                                 Path to the JRE/JDK. This should be installed at the same location on all
                                                          cluster nodes ($JAVA_HOME, java.home by default)
 -l,--loglevel <loglevel>                                 log levels for the llap instance
    --logger <logger>                                     logger for llap instance ([RFA], query-routing, console
 -n,--name <name>                                         Cluster name for YARN registry
    --output <output>                                     Output directory for the generated scripts
 -q,--queue <queue>                                       The queue within which LLAP will be started
 -s,--size <size>                                         container size per instance
    --service-appconfig-global <property=value>           Property (key=value) to be set in the global section of the Service
                                                          appConfig
    --service-default-keytab                              try to set default settings for Service AM keytab; mostly for dev testing
    --service-keytab <service-keytab>                     Service AM keytab file name inside service-keytab-dir
    --service-keytab-dir <service-keytab-dir>             Service AM keytab directory on HDFS (where the headless user keytab is
                                                          stored by Service keytab installation, e.g. .yarn/keytabs/llap)
    --service-placement <service-placement>               Service placement policy; see YARN documentation at
                                                          https://issues.apache.org/jira/browse/YARN-1042. This is unnecessary if
                                                          LLAP is going to take more than half of the YARN capacity of a node.
    --service-principal <service-principal>               Service AM principal; should be the user running the cluster, e.g.
                                                          hive@EXAMPLE.COM
 -t,--iothreads <iothreads>                               executor per instance
 -w,--xmx <xmx>                                           working memory size
 -z,--startImmediately                                    immediately start the cluster

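Generated files

After the command runs, it creates a directory such as "llap-yarn-29Sep2021", suffixed with the current date. The directory contains three files:

  • Yarnfile: the YARN Service definition file.
  • run.sh: the script that starts the LLAP service.
  • llap-29Sep2021.tar.gz: the package with the jars used by the LLAP service.

Yarnfile

The contents of the Yarnfile are as follows: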

{
  "name": "llap-demo",
  "version": "1.0.0",
  "queue": "",
  "configuration": {
    "properties": {
      "yarn.service.rolling-log.include-pattern": ".*\\.done",
      "yarn.component.placement.policy" : "4",
      "yarn.container.health.threshold.percent": "80",
      "yarn.container.health.threshold.window.secs": "300",
      "yarn.container.health.threshold.init.delay.secs": "400"
    }
  },
  "components": [
    {
      "name": "llap",
      "number_of_containers": 1,
      "launch_command": "$LLAP_DAEMON_BIN_HOME/llapDaemon.sh start &> $LLAP_DAEMON_TMP_DIR/shell.out",
      "artifact": {
        "id": ".yarn/package/LLAP/llap-29Sep2021.tar.gz",
        "type": "TARBALL"
      },
      "resource": {
        "cpus": 1,
        "memory": "1024"
      },
      "configuration": {
        "env": {
          "JAVA_HOME": "/Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk/Contents/Home",
          "LLAP_DAEMON_HOME": "$PWD/lib/",
          "LLAP_DAEMON_TMP_DIR": "$PWD/tmp/",
          "LLAP_DAEMON_BIN_HOME": "$PWD/lib/bin/",
          "LLAP_DAEMON_CONF_DIR": "$PWD/lib/conf/",
          "LLAP_DAEMON_LOG_DIR": "<LOG_DIR>",
          "LLAP_DAEMON_LOGGER": "query-routing",
          "LLAP_DAEMON_LOG_LEVEL": "INFO",
          "LLAP_DAEMON_HEAPSIZE": "512",
          "LLAP_DAEMON_PID_DIR": "$PWD/lib/app/run/",
          "LLAP_DAEMON_LD_PATH": "/usr/local/hadoop/lib/native",
          "LLAP_DAEMON_OPTS": " -Dhttp.maxConnections=4 ",

          "APP_ROOT": "<WORK_DIR>/app/install/",
          "APP_TMP_DIR": "<WORK_DIR>/tmp/"
        }
      }
    }
  ],
  "kerberos_principal" : {
    "principal_name" : "",
    "keytab" : ""
  },
  "quicklinks": {
    "LLAP Daemon JMX Endpoint": "http://llap-0.${SERVICE_NAME}.${USER}.${DOMAIN}:15002/jmx"
  }
}
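The resource block of each component determines what a container requests from YARN. A minimal sketch that reads those values from a Yarnfile follows. The embedded JSON is a trimmed copy of the file above, the function names `summarize` and `total_request` are hypothetical, and the totals cover only the listed components, not the service AppMaster container.

```python
import json

# Trimmed copy of the generated Yarnfile; only the fields used here are kept.
YARNFILE = """
{
  "name": "llap-demo",
  "queue": "",
  "components": [
    {
      "name": "llap",
      "number_of_containers": 1,
      "resource": {"cpus": 1, "memory": "1024"},
      "configuration": {"env": {"LLAP_DAEMON_HEAPSIZE": "512"}}
    }
  ]
}
"""


def summarize(spec):
    """Return (name, containers, vcores, memory MB, heap MB) per component."""
    rows = []
    for comp in spec["components"]:
        res = comp["resource"]
        env = comp.get("configuration", {}).get("env", {})
        rows.append((
            comp["name"],
            comp["number_of_containers"],
            res["cpus"],
            int(res["memory"]),  # stored as a string in the Yarnfile
            int(env.get("LLAP_DAEMON_HEAPSIZE", 0)),
        ))
    return rows


def total_request(spec):
    """Total vcores and memory (MB) requested across all component containers."""
    rows = summarize(spec)
    vcores = sum(count * cpus for _, count, cpus, _, _ in rows)
    memory = sum(count * mem for _, count, _, mem, _ in rows)
    return vcores, memory


spec = json.loads(YARNFILE)
print(summarize(spec))
print(total_request(spec))
```

To inspect a real file, replace the embedded string with the contents of the generated Yarnfile, for example by calling `json.load` on the opened file.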

run.sh

run.sh first stops the service, then destroys it, uploads the package to HDFS, and launches the service again.

#!/bin/bash -e

BASEDIR=$(dirname $0)
yarn app -stop llap-demo
yarn app -destroy llap-demo
hdfs dfs -mkdir -p .yarn/package/LLAP
hdfs dfs -copyFromLocal -f $BASEDIR/llap-29Sep2021.tar.gz .yarn/package/LLAP
yarn app -launch llap-demo $BASEDIR/Yarnfile

llap-${CREATE_DATE}.tar.gz

Extracting llap-${CREATE_DATE}.tar.gz yields the following contents.

bin

Contains the scripts that run the service:

  • llap-daemon-env.sh
  • llapDaemon.sh
  • runLlapDaemon.sh
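
config.json

All parameters used to generate the service are stored in this file in JSON format.

conf

The generated configuration directory. The LLAP parameters are in llap-daemon-site.xml. The directory contains: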
core-site.xml			hive-site.xml			llap-udfs.lst
hadoop-metrics2.properties	llap-daemon-log4j2.properties	tez-site.xml
hdfs-site.xml			llap-daemon-site.xml		yarn-site.xml

llap-daemon-site.xml

As the file shows, the parameters passed on the command line became LLAP service configuration properties.

<?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration>
<property><name>hive.llap.daemon.service.hosts</name><value>@llap-demo</value><final>false</final><source>CLI direct</source></property>
<property><name>hive.llap.io.memory.size</name><value>134217728</value><final>false</final><source>CLI direct</source></property>
<property><name>hive.llap.daemon.yarn.container.mb</name><value>1024</value><final>false</final><source>CLI direct</source></property>
<property><name>hive.llap.io.threadpool.size</name><value>2</value><final>false</final><source>CLI direct</source></property>
<property><name>hive.llap.daemon.num.executors</name><value>3</value><final>false</final><source>CLI direct</source></property>
<property><name>hive.llap.daemon.memory.per.instance.mb</name><value>512</value><final>false</final><source>CLI direct</source></property>
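To make the correspondence between command-line flags and generated properties explicit, here is a minimal sketch that parses such a file and prints each flag next to the property it populated. The flag-to-property mapping is inferred from the matching values in this article, not taken from Hive's source, and the embedded XML is a trimmed, closed copy of the excerpt above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of llap-daemon-site.xml with the closing tag added.
SITE_XML = """<configuration>
<property><name>hive.llap.daemon.service.hosts</name><value>@llap-demo</value></property>
<property><name>hive.llap.io.memory.size</name><value>134217728</value></property>
<property><name>hive.llap.daemon.yarn.container.mb</name><value>1024</value></property>
<property><name>hive.llap.io.threadpool.size</name><value>2</value></property>
<property><name>hive.llap.daemon.num.executors</name><value>3</value></property>
<property><name>hive.llap.daemon.memory.per.instance.mb</name><value>512</value></property>
</configuration>"""

# Inferred from the values in this run (an assumption, not Hive's documented mapping).
FLAG_TO_PROPERTY = {
    "--name": "hive.llap.daemon.service.hosts",
    "--cache": "hive.llap.io.memory.size",
    "--size": "hive.llap.daemon.yarn.container.mb",
    "--iothreads": "hive.llap.io.threadpool.size",
    "--executors": "hive.llap.daemon.num.executors",
    "--xmx": "hive.llap.daemon.memory.per.instance.mb",
}


def load_properties(xml_text):
    """Parse a Hadoop-style configuration file into a name -> value dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


props = load_properties(SITE_XML)
for flag, key in FLAG_TO_PROPERTY.items():
    print(f"{flag:12} -> {key} = {props[key]}")
```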
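
lib

The lib directory contains the jars needed to run LLAP.

Running the service

Run run.sh. A new application then appears in the ResourceManager.
[Screenshot: the LLAP application in the ResourceManager UI]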

Port    Parameter                               Meaning
15002   hive.llap.daemon.web.port               LLAP daemon web UI port.
15003   hive.llap.daemon.output.service.port    LLAP daemon output service port.
15004   hive.llap.management.rpc.port           RPC port for LLAP daemon management service.
15551   hive.llap.daemon.yarn.shuffle.port      YARN shuffle port for LLAP-daemon-hosted shuffle.
0       hive.llap.daemon.rpc.port               The LLAP daemon RPC port.
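A quick way to see which of these fixed ports are reachable on a host is a TCP connection probe. The sketch below is only an illustration: the helper names `is_port_open` and `probe` are hypothetical, and the RPC port is left out because a value of 0 means the actual port is chosen when the daemon starts.

```python
import socket

# Fixed ports from the table above; the RPC port (0) is assigned at startup.
LLAP_PORTS = {
    15002: "hive.llap.daemon.web.port",
    15003: "hive.llap.daemon.output.service.port",
    15004: "hive.llap.management.rpc.port",
    15551: "hive.llap.daemon.yarn.shuffle.port",
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


def probe(host="localhost"):
    """Check every fixed LLAP port on the given host."""
    return {port: is_port_open(host, port) for port in LLAP_PORTS}


if __name__ == "__main__":
    for port, is_open in probe().items():
        print(port, LLAP_PORTS[port], "open" if is_open else "closed")
```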

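LLAP web

[Screenshot: LLAP daemon web UI]

LLAP ZooKeeper

As the output below shows, each LLAP service has a directory under /llap-unsecure for the current user. Its workers directory holds two kinds of znodes, slot znodes and worker znodes, with one of each per container. The slot znode contains a UUID. The worker znode contains information about the LLAP container, including "registry.unique.id":"34850c09-d8b1-415b-8572-139456d476fc", which matches the content of the slot znode.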

[zk: localhost:2181(CONNECTED) 6] get /llap-unsecure/user-houzhizhen/llap-demo/workers/slot-0000000000 
34850c09-d8b1-415b-8572-139456d476fc
[zk: localhost:2181(CONNECTED) 7] get /llap-unsecure/user-houzhizhen/llap-demo/workers/worker-0000000026 
{"type":"JSONServiceRecord","external":[{"api":"services","addressType":"uri","protocolType":"webui","addresses":[{"uri":"http://localhost:15002"}]}],"internal":[{"api":"llap","addressType":"host/port","protocolType":"hadoop/IPC","addresses":[{"host":"localhost","port":"46480"}]},{"api":"llapmng","addressType":"host/port","protocolType":"hadoop/IPC","addresses":[{"host":"localhost","port":"15004"}]},{"api":"shuffle","addressType":"host/port","protocolType":"tcp","addresses":[{"host":"localhost","port":"15551"}]},{"api":"llapoutputformat","addressType":"host/port","protocolType":"hadoop/IPC","addresses":[{"host":"localhost","port":"15003"}]}],"hive.llap.daemon.container.id":"container_1632897605333_0007_01_000002","hive.llap.daemon.yarn.container.mb":"2048","hive.llap.auto.auth":"false","hive.llap.io.allocator.mmap":"false","hive.llap.io.use.lrfu":"true","hive.llap.io.memory.size":"134217728","hive.llap.management.rpc.port":"15004","hive.llap.allow.permanent.fns":"true","hive.llap.daemon.rpc.port":"46480","hive.llap.daemon.web.ssl":"false","hive.llap.auto.max.input.size":"10737418240","hive.llap.io.lrfu.lambda":"1.0E-6","hive.llap.daemon.nm.address":"localhost:38742","llap.daemon.metrics.sessionid":"40fc27da-f0d3-458b-9059-d46c8dc32132","hive.llap.auto.enforce.vectorized":"true","hive.llap.daemon.service.refresh.interval.sec":"60s","hive.llap.io.orc.time.counters":"true","hive.llap.auto.max.output.size":"1073741824","hive.llap.io.allocator.direct":"true","registry.unique.id":"34850c09-d8b1-415b-8572-139456d476fc","hive.llap.daemon.web.port":"15002","hive.llap.object.cache.enabled":"true","hive.llap.execution.mode":"all","hive.llap.daemon.yarn.shuffle.port":"15551","hive.llap.daemon.output.service.port":"15003","hive.llap.daemon.download.permanent.fns":"false","hive.llap.io.memory.mode":"cache","hive.llap.daemon.task.scheduler.wait.queue.size":"10","hive.llap.daemon.memory.per.instance.mb":"1024","hive.llap.auto.enforce.tree":"true","hive.llap.io.threadpool.size":"2",
"hive.llap.daemon.service.hosts":"@llap-demo","hive.llap.auto.enforce.stats":"true","hive.llap.auto.allow.uber":"false","hive.llap.daemon.num.executors":"1"}
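The relationship between the two znodes can be checked programmatically once their contents have been fetched. The following sketch works on a trimmed copy of the worker record above. It only parses JSON and does not talk to ZooKeeper, and the function name `endpoints` is hypothetical.

```python
import json

# Content of the slot znode shown above.
SLOT = "34850c09-d8b1-415b-8572-139456d476fc"

# Trimmed copy of the worker znode content shown above.
WORKER = """
{"type":"JSONServiceRecord",
 "external":[{"api":"services","addressType":"uri","protocolType":"webui",
              "addresses":[{"uri":"http://localhost:15002"}]}],
 "internal":[{"api":"llap","addressType":"host/port","protocolType":"hadoop/IPC",
              "addresses":[{"host":"localhost","port":"46480"}]},
             {"api":"llapmng","addressType":"host/port","protocolType":"hadoop/IPC",
              "addresses":[{"host":"localhost","port":"15004"}]}],
 "registry.unique.id":"34850c09-d8b1-415b-8572-139456d476fc",
 "hive.llap.daemon.service.hosts":"@llap-demo"}
"""


def endpoints(record):
    """Map each API name to its first advertised address."""
    result = {}
    for entry in record.get("external", []) + record.get("internal", []):
        addr = entry["addresses"][0]
        # URI-style addresses carry "uri"; host/port-style carry both fields.
        result[entry["api"]] = addr.get("uri") or f'{addr["host"]}:{addr["port"]}'
    return result


record = json.loads(WORKER)
print("slot matches worker:", record["registry.unique.id"] == SLOT)
print(endpoints(record))
```

In this record the llap endpoint is advertised on port 46480 rather than a fixed port, consistent with hive.llap.daemon.rpc.port being 0.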

Hive test

Add the following configuration to hive-site.xml. Note that hive.llap.daemon.service.hosts must be "@" followed by ${LLAP_SERVICE_NAME}.

<property>
<name>hive.execution.engine</name>
<value>tez</value>
</property>
<property>
<name>hive.llap.execution.mode</name>
<value>all</value>
</property>
<property>
<name>hive.execution.mode</name>
<value>llap</value>
</property>
<property>
<name>hive.llap.daemon.service.hosts</name>
<value>@llap-demo</value>
</property>
<property>
<name>hive.zookeeper.quorum</name>
<value>zk_ip:zk_port</value>
</property>
<property>
<name>hive.llap.daemon.memory.per.instance.mb</name>
<value>2048</value>
</property>
<property>
<name>hive.llap.daemon.num.executors</name>
<value>2</value>
</property>
<property>
<name>hive.server2.tez.default.queues</name>
<value>root.default</value>
</property>
<property>
<name>hive.server2.tez.initialize.default.sessions</name>
<value>true</value>
</property>
<property>
<name>hive.server2.tez.sessions.per.default.queue</name>
<value>2</value>
</property>
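A mismatch between the client configuration and the LLAP service name is easy to introduce by hand. The sketch below checks a hive-site.xml against the expected values. The function name `check_client_config` is hypothetical, the embedded XML is a shortened copy of the properties above wrapped in a configuration element, and the set of checked keys is an illustrative choice, not a complete validation.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the properties above, wrapped in <configuration>.
HIVE_SITE = """<configuration>
<property><name>hive.execution.engine</name><value>tez</value></property>
<property><name>hive.llap.execution.mode</name><value>all</value></property>
<property><name>hive.execution.mode</name><value>llap</value></property>
<property><name>hive.llap.daemon.service.hosts</name><value>@llap-demo</value></property>
</configuration>"""

# Values used in this article for running queries in LLAP.
EXPECTED = {
    "hive.execution.engine": "tez",
    "hive.execution.mode": "llap",
    "hive.llap.execution.mode": "all",
}


def check_client_config(xml_text, service_name):
    """Return a list of problems; an empty list means the checked keys match."""
    root = ET.fromstring(xml_text)
    props = {p.findtext("name"): p.findtext("value") for p in root.iter("property")}
    # The service hosts value is "@" followed by the LLAP service name.
    expected = {**EXPECTED, "hive.llap.daemon.service.hosts": "@" + service_name}
    return [
        f"{key}: expected {want!r}, found {props.get(key)!r}"
        for key, want in expected.items()
        if props.get(key) != want
    ]


print(check_client_config(HIVE_SITE, "llap-demo"))
print(check_client_config(HIVE_SITE, "llap-other"))
```

The first call reports no problems; the second reports the service hosts mismatch for the hypothetical name llap-other.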

LLAP test

We test with TPC-DS and run query1.sql three times.

use tpcds_bin_partitioned_orc_2;
source query1.sql;
source query1.sql;
source query1.sql;

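[Screenshots: results of the three runs]
The first run took 7.47 seconds, the second 4.31 seconds, and the third 4.05 seconds. The later runs are faster because, after the first run, LLAP has cached some of the raw data in off-heap memory.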

Non-LLAP test

set hive.execution.mode=container;
set hive.llap.execution.mode=none;
use tpcds_bin_partitioned_orc_2;
source query1.sql;
source query1.sql;
source query1.sql;

To keep the comparison fair, the LLAP application was killed before this test so that its resources were released.
[Screenshots: runs 1, 2 and 3]
Each of the three runs took about 11 seconds.
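The two sets of timings can be put side by side with a little arithmetic. The sketch below uses the measured LLAP times and treats the non-LLAP time as exactly 11 seconds, which is an approximation of "about 11 seconds"; the resulting ratios are therefore rough.

```python
# Measured LLAP runs, in seconds.
llap_runs = [7.47, 4.31, 4.05]
# Non-LLAP runs took "about 11 seconds"; 11.0 is an approximation.
non_llap_run = 11.0


def speedup(baseline, measured):
    """How many times faster `measured` is than `baseline`."""
    return baseline / measured


warm_runs = llap_runs[1:]                      # runs that benefit from the cache
warm_avg = sum(warm_runs) / len(warm_runs)

print(f"cold LLAP run vs. non-LLAP:  {speedup(non_llap_run, llap_runs[0]):.2f}x")
print(f"warm LLAP runs vs. non-LLAP: {speedup(non_llap_run, warm_avg):.2f}x")
print(f"warm vs. cold LLAP run:      {speedup(llap_runs[0], warm_avg):.2f}x")
```

Under these assumptions, the first LLAP run is roughly 1.5 times faster than the non-LLAP runs, and the cached runs are roughly 2.6 times faster.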

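Summary of issues

  1. There is no way to specify the number of CPU vcores for a container. The executors parameter controls how many compute threads are started inside the container once it is running; it does not control how much CPU is requested from the ResourceManager. The CPU requested from the ResourceManager is controlled by the following setting in the generated Yarnfile: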
"resource": {
        "cpus": 1,
        "memory": "1024"
      },
  2. Two LLAP services cannot be started on the same server.
    Because

Other issues

  1. User-defined jars.