When running a cluster balance on HDFS, you may see WARN messages like the following:
23/03/23 10:01:42 WARN balancer.Dispatcher: Failed to move blk_1082562934_8824015 with size=12571895 from 10.10.10.10:1019:DISK to 10.10.10.10:1019:DISK through 10.10.10.10:1019
java.io.IOException: Got error, status=ERROR, status message Not able to receive block 1082562934 from /10.10.10.10:43716 because threads quota is exceeded., reportedBlock move is failed
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:128)
at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.receiveResponse(Dispatcher.java:462)
at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:393)
at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$3100(Dispatcher.java:235)
at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:1146)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
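The cause is that the DataNode's quota of concurrent balance threads is used up, so it refuses to receive another block.
The relevant parameter is dfs.datanode.balance.max.concurrent.moves.
Its default is 100 in the hdfs-default.xml referenced at the end; check the value for your own Hadoop version.
(Its description reads: Maximum number of threads for Datanode balancer pending moves. This value is reconfigurable via the "dfsadmin -reconfig" command)
This does not affect the balance as a whole: only the individual block move fails while the balance run continues, so the warning can be ignored.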
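If you do want to get rid of it:
Raise this parameter in the HDFS configuration and restart for it to take effect. Since the description above says the value is reconfigurable via "dfsadmin -reconfig", a restart may not be needed on versions that support that.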
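As a rough sketch of that change, the property could be set as below. This assumes the setting lives in the DataNodes' hdfs-site.xml. The value 200 is only an illustrative number, not a recommendation; pick one that suits your disks and network.

```xml
<!-- hdfs-site.xml on the DataNodes (assumed location) -->
<property>
  <name>dfs.datanode.balance.max.concurrent.moves</name>
  <!-- 200 is a hypothetical example value, not a tuned recommendation -->
  <value>200</value>
</property>
```

A higher value lets each DataNode accept more block moves at once, at the cost of more disk and network load while the balance runs.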
References
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml