Common YARN configuration options

Configuration related to YARN log aggregation

yarn.log-aggregation-enable

share/doc/hadoop/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
The default value is false:

  <property>
    <description>Whether to enable log aggregation. Log aggregation collects
      each container's logs and moves these logs onto a file-system, for e.g.
      HDFS, after the application completes. Users can configure the
      "yarn.nodemanager.remote-app-log-dir" and
      "yarn.nodemanager.remote-app-log-dir-suffix" properties to determine
      where these logs are moved to. Users can access the logs via the
      Application Timeline Server.
    </description>
    <name>yarn.log-aggregation-enable</name>
    <value>false</value>
  </property>
  
  <property>
    <description>Where to aggregate logs to.</description>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/tmp/logs</value>
  </property>
  
  <property>
    <description>The remote log dir will be created at 
      {yarn.nodemanager.remote-app-log-dir}/${user}/{thisParam}
    </description>
    <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
    <value>logs</value>
  </property>

Explanation:
Note that the aggregation target directory is on the HDFS file system.

Example settings:

yarn.log-aggregation-enable                  true        After the application finishes, collect (aggregate) each container's local logs
yarn.nodemanager.remote-app-log-dir          /app-logs   HDFS directory where the aggregated logs are stored
yarn.nodemanager.remote-app-log-dir-suffix   logs        Suffix of the aggregated log path; the full path is ${remote-app-log-dir}/${user}/${thisParam}
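
As a sketch, the example settings above would be placed in yarn-site.xml like this (the /app-logs path is only the illustrative value from the table):

  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/app-logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
    <value>logs</value>
  </property>

With these values, an application's aggregated logs end up under /app-logs/${user}/logs/application_xxx on HDFS.
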
Executor logs while a YARN application is running

At runtime, the log files are stored under ${yarn.nodemanager.log-dirs}/application_${appid}.

  <property>
    <description>
      Where to store container logs. An application's localized log directory
      will be found in ${yarn.nodemanager.log-dirs}/application_${appid}.
      Individual containers' log directories will be below this, in directories 
      named container_{$contid}. Each container directory will contain the files
      stderr, stdin, and syslog generated by that container.
    </description>
    <name>yarn.nodemanager.log-dirs</name>
    <value>${yarn.log.dir}/userlogs</value>
  </property>

If it is configured like this instead:

<property>
    <!-- where container logs are stored while the application is running -->
    <name>yarn.nodemanager.log-dirs</name>
    <value>file:/mnt/ddb/2/hadoop/nm</value>
</property>

then the runtime executor logs are stored under:

root@xxx:/mnt/ddb/2/hadoop/nm/application_1471515078641_0007# ls
container_1471515078641_0007_01_000001  container_1471515078641_0007_01_000002  container_1471515078641_0007_01_000003

Note: container_1471515078641_0007_01_000001 is the first container that the RM allocated to application_1471515078641_0007, i.e. the container in which the AM runs.
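
While the application is still running, a container's log files can be inspected directly on the NodeManager host; a sketch using the AM container from the listing above (file names follow the yarn.nodemanager.log-dirs description):

# list the AM container's log directory on the node
ls /mnt/ddb/2/hadoop/nm/application_1471515078641_0007/container_1471515078641_0007_01_000001
# follow its stderr while the application is running
tail -f /mnt/ddb/2/hadoop/nm/application_1471515078641_0007/container_1471515078641_0007_01_000001/stderr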

After the executors finish running, the logs are aggregated onto HDFS.

By default they are stored under /tmp/logs/${user}/logs:

drwxrwx---   - root supergroup          0 2016-08-18 18:29 /tmp/logs/root/logs/application_1471515078641_0002
drwxrwx---   - root supergroup          0 2016-08-18 19:10 /tmp/logs/root/logs/application_1471515078641_0003
drwxrwx---   - root supergroup          0 2016-08-18 19:17 /tmp/logs/root/logs/application_1471515078641_0004

For example, with the default configuration, the container logs produced by a Spark application can be fetched with:

hadoop fs -get /tmp/logs/root/logs/application_1653740407738_0020  ./

After the application finishes, the logs can also be viewed with the following command:

yarn logs -applicationId <id>

See also https://www.cnblogs.com/caoweixiong/p/13634188.html
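
The same command can be narrowed to a single container with -containerId; a sketch (the container ID below is illustrative):

# the container ID is illustrative
yarn logs -applicationId application_1653740407738_0020 -containerId container_1653740407738_0020_01_000002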

Fetching the YARN AM's logs

yarn logs -applicationId application_1480922439133_0845_02 -am ALL

Default configuration values for the virtual-memory check

  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>true</value>
  </property>
  <property>
    <description>Ratio between virtual memory to physical memory when 
    setting memory limits for containers. Container allocations are
    expressed in terms of physical memory, and virtual memory usage
    is allowed to exceed this allocation by this ratio.
    </description>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>

An excerpt from hadoop/logs/yarn-root-nodemanager-{hostname}.log:

2020-10-29 13:29:42,852 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Starting resource-monitoring for container_1600421369116_0008_01_000003
2020-10-29 13:29:42,911 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 56539 for container-id container_1600421369116_0008_01_000002: 377.0 MB of 2 GB physical memory used; 3.1 GB of 4.2 GB virtual memory used
2020-10-29 13:29:42,951 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 56643 for container-id container_1600421369116_0008_01_000003: 371.7 MB of 2 GB physical memory used; 3.1 GB of 4.2 GB virtual memory used
2020-10-29 13:29:42,980 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 56413 for container-id container_1600421369116_0008_01_000001: 368.9 MB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual memory used
2020-10-29 13:29:46,040 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 56539 for container-id container_1600421369116_0008_01_000002: 377.0 MB of 2 GB physical memory used; 3.1 GB of 4.2 GB virtual memory used
2020-10-29 13:29:46,082 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 56643 for container-id container_1600421369116_0008_01_000003: 367.2 MB of 2 GB physical memory used; 3.1 GB of 4.2 GB virtual memory used
2020-10-29 13:29:46,111 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 56413 for container-id container_1600421369116_0008_01_000001: 368.9 MB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual memory used
2020-10-29 13:29:46,885 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_1600421369116_0008_01_000004
2020-10-29 13:29:49,175 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 56539 for container-id container_1600421369116_0008_01_000002: 377.0 MB of 2 GB physical memory used; 3.1 GB of 4.2 GB virtual memory used
2020-10-29 13:29:49,219 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 56643 for container-id container_1600421369116_0008_01_000003: 367.2 MB of 2 GB physical memory used; 3.1 GB of 4.2 GB virtual memory used
2020-10-29 13:29:49,250 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 56413 for container-id container_1600421369116_0008_01_000001: 369.0 MB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual memory used

The line "367.2 MB of 2 GB physical memory used; 3.1 GB of 4.2 GB virtual memory used" means:
the container was allocated 2 GB of physical memory, of which 367.2 MB is in use.
The virtual-memory limit is 2 GB × 2.1 = 4.2 GB, and 3.1 GB of that 4.2 GB is in use.
If yarn.nodemanager.vmem-check-enabled is left at its default value true, the container will be killed as soon as its virtual memory usage exceeds 4.2 GB.
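
If containers keep getting killed for exceeding the virtual-memory limit, the usual yarn-site.xml adjustments are to disable the check or to raise the ratio. A minimal sketch (the 4.0 ratio is only an illustrative value):

  <!-- Option 1: disable the virtual-memory check entirely -->
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>

  <!-- Option 2: keep the check, but allow 4 GB of virtual memory per GB of physical memory (illustrative) -->
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4.0</value>
  </property>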

YARN cache directories
yarn/local/usercache/hadoop/filecache]$ ll
total 0
drwxr-xr-x 3 hadoop hadoop 59 Apr  1 14:45 10
drwxr-xr-x 3 hadoop hadoop 40 Apr  1 14:45 11
drwxr-xr-x 3 hadoop hadoop 59 Apr  1 14:50 12
drwxr-xr-x 3 hadoop hadoop 59 Apr  1 19:56 14
drwxr-xr-x 3 hadoop hadoop 40 Apr  1 19:57 17
drwxr-xr-x 3 hadoop hadoop 59 Apr  1 20:00 18
drwxr-xr-x 3 hadoop hadoop 40 Apr  3 11:10 23
drwxr-xr-x 3 hadoop hadoop 40 Apr  3 11:10 25
drwxr-xr-x 3 hadoop hadoop 59 Apr 12 10:47 28
drwxr-xr-x 3 hadoop hadoop 40 Apr 12 16:37 33
yarn/local/usercache/hadoop/filecache/33]$ ll
total 0
drwx------ 3 hadoop hadoop 183 Apr 12 16:37 __spark_conf__.zip

The NodeManager uses a round-robin policy to spread the three kinds of localized resources (PUBLIC, PRIVATE, and APPLICATION) across the directories listed in yarn.nodemanager.local-dirs. Within each directory, resources are laid out as follows:

  • PUBLIC resources: stored under ${yarn.nodemanager.local-dirs}/filecache/; each resource is placed in its own directory named with a random integer, with permissions 0755.
  • PRIVATE resources: stored under ${yarn.nodemanager.local-dirs}/usercache/${user}/filecache/; each resource is placed in its own directory named with a random integer, with permissions 0710.
  • APPLICATION resources: stored under ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/${appid}/filecache/; each resource is placed in its own directory named with a random integer, with permissions 0710.

The working directory of a Container is ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/${appid}/${containerid}; it mainly holds symbolic links to the jar files and dictionary files.

./nm-local-dir/
|-- filecache		// PUBLIC resources
|   `-- 10			// each resource in its own directory named with a random integer
|-- nmPrivate
|   |-- application_xxxx_xxx
|   |   |-- container_xxx_xxx_xxx_xx_xxxx
|   |   |-- container_xxx_xxx_xxx_xx_xxxx	// private per-container data (launch script, token file, pid file)
|   |   |   |-- container_xxx_xxx_xxx_xx_xxxx.pid
|   |   |   |-- container_xxx_xxx_xxx_xx_xxxx.tokens
|   |   |   `-- launch_container.sh
|   |-- application_xxxx_xxx
|   `-- application_xxxx_xxx
`-- usercache
    |-- userXxx
    |   |-- appcache		// APPLICATION resources
    |   `-- filecache		// PRIVATE resources
    |-- userXxx
    |   |-- appcache
    |   `-- filecache
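
A quick way to locate a container's launch script and localized files on a node is to search the local dirs; a sketch, where nm-local-dir stands for whatever yarn.nodemanager.local-dirs points to:

# find every launch script under the NM local dir (the tree above shows them under nmPrivate)
find ./nm-local-dir -name launch_container.sh
# list the per-container application cache directories
find ./nm-local-dir/usercache -maxdepth 4 -type d -name "container_*"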

See also:
https://www.cnblogs.com/shuofxz/p/17383011.html
