HDFS块恢复流程初级版

1.先看几个线程栈

1.没有修改代码时走localRack -> nextRack -> Random时的流程

"RedundancyMonitor" #48 daemon prio=5 os_prio=0 tid=0x00007f925ec14800 nid=0x10544 runnable [0x00007f5a2491d000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.hadoop.net.NetworkTopology.countNumOfAvailableNodes(NetworkTopology.java:682)
        at org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:533)
        at org.apache.hadoop.hdfs.net.DFSNetworkTopology.chooseRandomWithStorageTypeTwoTrial(DFSNetworkTopology.java:122)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseDataNode(BlockPlacementPolicyDefault.java:901)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:798)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:766)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseFromNextRack(BlockPlacementPolicyDefault.java:709)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:685)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalStorage(BlockPlacementPolicyDefault.java:633)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyRackFaultTolerant.chooseOnce(BlockPlacementPolicyRackFaultTolerant.java:218)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyRackFaultTolerant.chooseTargetInOrder(BlockPlacementPolicyRackFaultTolerant.java:94)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:438)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:310)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:149)
        at org.apache.hadoop.hdfs.server.blockmanagement.ErasureCodingWork.chooseTargets(ErasureCodingWork.java:62)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReconstructionWorkForBlocks(BlockManager.java:1956)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeBlockReconstructionWork(BlockManager.java:1908)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:4872)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4739)
        at java.lang.Thread.run(Thread.java:745)

2.修改代码加大每个机架允许的最大dn数为2时的流程 进入localRack

"RedundancyMonitor" #48 daemon prio=5 os_prio=0 tid=0x00007f0c19b56800 nid=0xca27 runnable [0x00007ed3e1115000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.hadoop.net.NetworkTopology.countNumOfAvailableNodes(NetworkTopology.java:678)
        at org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:533)
        at org.apache.hadoop.hdfs.net.DFSNetworkTopology.chooseRandomWithStorageTypeTwoTrial(DFSNetworkTopology.java:122)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseDataNode(BlockPlacementPolicyDefault.java:903)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:800)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:768)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:675)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalStorage(BlockPlacementPolicyDefault.java:635)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyRackFaultTolerant.chooseOnce(BlockPlacementPolicyRackFaultTolerant.java:220)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyRackFaultTolerant.chooseTargetInOrder(BlockPlacementPolicyRackFaultTolerant.java:96)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:440)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:310)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:149)
        at org.apache.hadoop.hdfs.server.blockmanagement.ErasureCodingWork.chooseTargets(ErasureCodingWork.java:62)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReconstructionWorkForBlocks(BlockManager.java:1956)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeBlockReconstructionWork(BlockManager.java:1908)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:4872)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4739)
        at java.lang.Thread.run(Thread.java:745)

3.修改代码后直接从全局选dn时的线程栈

"RedundancyMonitor" #48 daemon prio=5 os_prio=0 tid=0x00007f9629e39800 nid=0x3c69 runnable [0x00007f5df1d02000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseSourceDatanodes(BlockManager.java:2394)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.scheduleReconstruction(BlockManager.java:2027)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReconstructionWorkForBlocks(BlockManager.java:1931)
        - locked <0x00007f658c2668c0> (a org.apache.hadoop.hdfs.server.blockmanagement.LowRedundancyBlocks)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeBlockReconstructionWork(BlockManager.java:1908)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:4872)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4739)
        at java.lang.Thread.run(Thread.java:745)

现在有几个问题需要解决:
1.这是在选原有的健康节点中选一个(选择已有的块的一台),还是选一个新的节点?
2.EC选择的时候,和副本选择到底有什么区别?
3.副本是怎样块恢复的?EC是怎么样块恢复的?

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值