Apache Hadoop 3.x HDFS Disk Balancer


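1. Overview

Diskbalancer is a command-line tool that distributes data evenly across all disks of a DataNode. It is different from the Balancer, which takes care of cluster-wide data balancing. Data can become unevenly spread across the disks of a node for several reasons, such as a large number of writes and deletes or a disk replacement. The tool runs against a given DataNode and moves blocks from one disk to another.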

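2. Architecture

Disk Balancer works by creating a plan and then executing that plan on the DataNode. A plan is a set of statements that describe how much data should move between two disks. It is composed of multiple move steps, and each move step has a source disk, a destination disk, and the number of bytes to move. A plan can be executed against an operational DataNode. Disk Balancer should not interfere with other processes, because it throttles how much data is copied every second. Note that Disk Balancer is not enabled by default on a cluster: dfs.disk.balancer.enabled must be set to true in hdfs-site.xml.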
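To make the plan structure concrete, here is a minimal Python sketch of a plan as a list of move steps. The class names, disk paths, and the `min_seconds` helper are hypothetical and chosen only for illustration; they are not the format of the real plan file. The estimate also assumes that 1 MB means 1024 * 1024 bytes and that the bandwidth cap applies to the plan as a whole.

```python
from dataclasses import dataclass
from typing import List

GIB = 1024 ** 3


@dataclass(frozen=True)
class MoveStep:
    """One move step: a source disk, a destination disk, and a byte count."""
    source_disk: str
    dest_disk: str
    bytes_to_move: int


@dataclass
class Plan:
    """A plan for one DataNode, made up of move steps (illustrative model)."""
    datanode: str
    steps: List[MoveStep]

    def total_bytes(self) -> int:
        """Total number of bytes the plan asks to move."""
        return sum(step.bytes_to_move for step in self.steps)

    def min_seconds(self, bandwidth_mb_per_sec: int) -> float:
        """Rough lower bound on copy time at the given bandwidth cap.

        Assumption: 1 MB = 1024 * 1024 bytes and steps run one after another.
        """
        return self.total_bytes() / (bandwidth_mb_per_sec * 1024 * 1024)


# Hypothetical disks on the node used in this article.
plan = Plan(
    datanode="hadoop-01",
    steps=[
        MoveStep("/data/disk1", "/data/disk3", 20 * GIB),
        MoveStep("/data/disk2", "/data/disk3", 10 * GIB),
    ],
)

print(plan.total_bytes() // GIB)  # GiB to move in total
print(plan.min_seconds(10))       # seconds at a 10 MB/s cap
```

Under these assumptions, moving 30 GiB at a 10 MB/s cap takes at least 3072 seconds, which is why a plan can run for a long time on a busy node.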

3. Commands

The following sections describe the commands that Disk Balancer supports and how to use them.

Plan

The plan command can be run against a given DataNode:

# sudo -u hdfs hdfs diskbalancer -plan hadoop-01

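The command accepts the generic options.

The plan command also has a set of parameters that let the user control the output and execution of the plan: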

| COMMAND_OPTION | Description |
| --- | --- |
| -out | Allows user to control the output location of the plan file. |
| -bandwidth | Since datanode is operational and might be running other jobs, diskbalancer limits the amount of data moved per second. This parameter allows user to set the maximum bandwidth to be used. This is not required to be set since diskBalancer will use the default bandwidth if this is not specified. |
| -thresholdPercentage | Since we operate against a snapshot of datanode, the move operations have a tolerance percentage to declare success. If user specifies 10% and move operation is say 20GB in size, if we can move 18GB that operation is considered successful. This is to accommodate the changes in datanode in real time. This parameter is not needed and a default is used if not specified. |
| -maxerror | Max error allows users to specify how many block copy operations must fail before we abort a move step. Once again, this is not a needed parameter and a system default is used if not specified. |
| -v | Verbose mode, specifying this parameter forces the plan command to print out a summary of the plan on stdout. |
| -fs | Specifies the namenode to use. If not specified, the default from config is used. |
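The tolerance rule described for -thresholdPercentage is simple arithmetic, and the short Python sketch below reproduces the 20GB / 18GB example. The function name and its parameters are hypothetical; this only illustrates the rule as stated in the table and is not the DataNode's actual implementation.

```python
GB = 10 ** 9


def move_step_succeeded(planned_bytes: int, moved_bytes: int,
                        tolerance_percent: int) -> bool:
    """Return True if the moved amount is within the tolerance of the plan.

    With a tolerance of 10%, moving at least 90% of the planned bytes
    counts as success.
    """
    required_bytes = planned_bytes * (100 - tolerance_percent) / 100
    return moved_bytes >= required_bytes


# The example from the table: a 20GB move with a 10% tolerance.
print(move_step_succeeded(20 * GB, 18 * GB, 10))  # 18GB is enough
print(move_step_succeeded(20 * GB, 17 * GB, 10))  # 17GB falls short
```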

The plan command writes two output files: <nodename>.before.json, which captures the state of the cluster before diskbalancer runs, and <nodename>.plan.json.

# sudo -u hdfs hdfs diskbalancer -plan hadoop-01 -bandwidth 100 -thresholdPercentage 2

 

Execute

The execute command takes a plan file and executes it against the DataNode for which the plan was generated.

# sudo -u hdfs hdfs diskbalancer -execute /system/diskbalancer/2021-Jun-16-11-16-55/hadoop-01.plan.json

 

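It executes the plan by reading the DataNode's address from the plan file. When Disk Balancer executes the plan, this is only the start of an asynchronous process that can take a long time. The query command can therefore be used to get the current status of the execution.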
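Because execution is asynchronous, a script that needs to wait for a plan has to poll the query command. The Python sketch below shows one way to do that. The helper names are hypothetical, and the code assumes that the query output contains the text PLAN_DONE once the plan has finished; check the actual output of your Hadoop version before relying on that marker.

```python
import subprocess
import time
from typing import Callable, List


def query_command(datanode: str, verbose: bool = False) -> List[str]:
    """Build the argv for `hdfs diskbalancer -query <datanode> [-v]`."""
    argv = ["hdfs", "diskbalancer", "-query", datanode]
    if verbose:
        argv.append("-v")
    return argv


def run_query(datanode: str) -> str:
    """Run the query command and return its stdout (needs a Hadoop client)."""
    result = subprocess.run(query_command(datanode), capture_output=True,
                            text=True, check=True)
    return result.stdout


def wait_for_plan(datanode: str,
                  query: Callable[[str], str] = run_query,
                  done_marker: str = "PLAN_DONE",  # assumed status text
                  interval_sec: float = 30.0,
                  max_polls: int = 120) -> bool:
    """Poll the query command until the done marker appears.

    Returns True if the marker was seen, False if max_polls was exhausted.
    """
    for attempt in range(max_polls):
        if done_marker in query(datanode):
            return True
        if attempt < max_polls - 1:
            time.sleep(interval_sec)
    return False


# Example call on a machine with a configured Hadoop client:
# finished = wait_for_plan("hadoop-01")
```

Passing the query function as a parameter keeps the polling loop independent of the CLI, so it can be exercised with a stub before it is pointed at a real cluster.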

| COMMAND_OPTION | Description |
| --- | --- |
| -skipDateCheck | Skip date check and force execute the plan. |

Query

The query command gets the current status of diskbalancer from a DataNode.

# sudo -u hdfs hdfs diskbalancer -query hadoop-01

 

| COMMAND_OPTION | Description |
| --- | --- |
| -v | Verbose mode. Prints out the status of individual moves. |

# sudo -u hdfs hdfs diskbalancer -query hadoop-01 -v

Cancel

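The cancel command cancels a running plan. Restarting the DataNode has the same effect as the cancel command, because plan information on the DataNode is transient.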

# hdfs diskbalancer -cancel /system/diskbalancer/2021-Jun-16-11-16-55/hadoop-01.plan.json

or

# hdfs diskbalancer -cancel planID -node hadoop-01

The planID can be read from the DataNode using the query command.

Report

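The report command provides a detailed report of the specified node(s), or of the top nodes that would benefit from running Disk Balancer. Nodes can be specified either by a host file or by a comma-separated list of nodes.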

# sudo -u hdfs hdfs diskbalancer -fs hdfs://cluster -report -node <file://> | [<DataNodeID|IP|Hostname>,...]

 

or

# sudo -u hdfs hdfs diskbalancer -fs hdfs://cluster -report -top topnum

 

4. Settings

Disk Balancer settings can be controlled through hdfs-site.xml:

| Setting | Description |
| --- | --- |
| dfs.disk.balancer.enabled | Controls whether diskbalancer is enabled for a cluster. If it is not enabled, any execute command will be rejected by the datanode. The default value is false. |
| dfs.disk.balancer.max.disk.throughputInMBperSec | Controls the maximum disk bandwidth consumed by diskbalancer while copying data. If a value such as 10 MB is specified, diskbalancer will copy only 10 MB/s on average. The default value is 10 MB/s. |
| dfs.disk.balancer.max.disk.errors | Sets the maximum number of errors that can be ignored for a specific move between two disks before it is abandoned. For example, if a plan has 3 pairs of disks to copy between and the first pair encounters more than 5 errors, the first copy is abandoned and the second copy in the plan is started. The default value is 5. |
| dfs.disk.balancer.block.tolerance.percent | Specifies when a copy step has reached a good enough value. For example, if you specify 10%, getting within 10% of the target value is good enough. |
| dfs.disk.balancer.plan.threshold.percent | The percentage threshold for volume data density in a plan. If the absolute value of a volume's data density exceeds this threshold on a node, the disks corresponding to those volumes should be balanced in the plan. The default value is 10. |
| dfs.disk.balancer.plan.valid.interval | Maximum amount of time a disk balancer plan is valid. Supports the following suffixes (case insensitive) to specify the time: ms (millis), s (sec), m (min), h (hour), d (day), for example 2s, 2m, 1h. If no suffix is specified, milliseconds are assumed. The default value is 1d. |
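The suffix rule for dfs.disk.balancer.plan.valid.interval can be illustrated with a small Python sketch that converts such a value to milliseconds. The function name is hypothetical, and the code mirrors only the rule stated in the table (case-insensitive ms, s, m, h, d, with milliseconds assumed when no suffix is given); it is not Hadoop's own parser.

```python
import re

# Milliseconds per unit for the suffixes listed in the settings table.
_UNIT_TO_MS = {
    "ms": 1,
    "s": 1000,
    "m": 60 * 1000,
    "h": 60 * 60 * 1000,
    "d": 24 * 60 * 60 * 1000,
}


def interval_to_millis(value: str) -> int:
    """Convert a value such as '2s', '1h' or '1d' to milliseconds.

    A bare number is treated as milliseconds. Suffixes are case insensitive.
    """
    match = re.fullmatch(r"(\d+)(ms|s|m|h|d)?", value.strip(),
                         flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unsupported interval: {value!r}")
    number, suffix = match.groups()
    return int(number) * _UNIT_TO_MS[(suffix or "ms").lower()]


print(interval_to_millis("1d"))   # the default plan validity
print(interval_to_millis("2m"))
print(interval_to_millis("500"))  # no suffix, so milliseconds
```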
