An exception encountered with the Hadoop distcp command, and how to fix it

1 Exception message
Caused by: java.io.IOException: Mismatch in length of source:hdfs://xxx and target:hdfs://xxx

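2 Cause
The file being copied from the remote cluster has not been closed and is still open for writing, so the length reported for the source can differ from the number of bytes actually copied.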

3 Solution

1) Check the file status
hdfs fsck hdfs://10.10.10.10:8020/flume/xxx/xxxxxxxx/day=2018-03-12/xxx.2018-03-12.1520841735508


Connecting to namenode via http://xxx:50070/fsck?ugi=hadoop&path=%2Fflume%2Fxxx%xxx%2Fday%3D2018-03-12%xxx.2018-03-12.1520841735508
FSCK started by hadoop (auth:SIMPLE) from /139.5.108.244 for path /flume/xxx/xxx/day=2018-03-12/xxx.2018-03-12.1520841735508 at Tue Mar 13 16:12:39 CST 2018
Status: HEALTHY
Total size: 0 B (Total open files size: 420 B)
Total dirs: 0
Total files: 0
Total symlinks: 0 (Files currently being written: 1)
Total blocks (validated): 0 (Total open file blocks (not validated): 1)
Minimally replicated blocks: 0
Over-replicated blocks: 0
Under-replicated blocks: 0
Mis-replicated blocks: 0
Default replication factor: 3
Average block replication: 0.0
Corrupt blocks: 0
Missing replicas: 0
Number of data-nodes: 8
Number of racks: 1
FSCK ended at Tue Mar 13 16:12:39 CST 2018 in 2 milliseconds


The filesystem under path '/flume/xxx/xxx/day=2018-03-12/xxx.2018-03-12.1520841735508' is HEALTHY
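
The check above can be scripted. The following is a minimal sketch, not part of the original procedure: the helper names has_open_files and list_open_files are hypothetical, and both read fsck output on standard input. The first matches the "Files currently being written" summary line as it appears in the output above; the second assumes the per-file listing printed by fsck with the -files and -openforwrite options, where open files are tagged OPENFORWRITE and the path is the first field. The output format may differ between Hadoop versions.

```shell
# Sketch only: both helpers read fsck output on stdin, so they can be tried without a cluster.

# Exit status 0 if the fsck summary reports at least one file still being written.
# Matches the summary line shown above, e.g.
#   "Total symlinks: 0 (Files currently being written: 1)"
has_open_files() {
    grep -Eq 'Files currently being written: [1-9][0-9]*'
}

# Print the paths of files tagged OPENFORWRITE in the per-file listing.
# Assumes the path is the first whitespace-separated field (no spaces in paths).
list_open_files() {
    awk '/OPENFORWRITE/ { print $1 }'
}
```

For example, piping "hdfs fsck /flume/xxx/xxx/day=2018-03-12 -files -openforwrite" into list_open_files should list the files in that directory that still need their lease recovered before running distcp.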


2) Close the file
hdfs debug recoverLease -path hdfs://10.10.10.10:8020/flume/xxx/xxx/day=2018-03-12/xxx.2018-03-12.1520841735508

This may fail:
recoverLease returned false.
Giving up on recoverLease for hdfs://10.10.10.10:8020/flume/xxx/xxx/day=2018-03-12/xxx.2018-03-12.1520841735508 after 1 try.

Running the same command again succeeded:
recoverLease SUCCEEDED on hdfs://10.10.10.10:8020/flume/xxx/xxx/day=2018-03-12/xxx.2018-03-12.1520841735508
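
Because the first attempt can return false, it may help to retry automatically. The recoverLease subcommand accepts a -retries option for this; the sketch below instead wraps the command in a small loop so the pause between attempts is configurable. The function name recover_lease and its default values are assumptions for illustration, and success is detected by the "recoverLease SUCCEEDED" message shown above rather than by the exit code.

```shell
# Hypothetical wrapper: run `hdfs debug recoverLease` up to max_tries times,
# pausing between attempts, and stop at the first success.
# Usage: recover_lease <path> [max_tries] [delay_seconds]
recover_lease() {
    target=$1
    max_tries=${2:-5}     # assumed default, not from the original post
    delay=${3:-10}        # assumed default pause in seconds
    try=1
    while [ "$try" -le "$max_tries" ]; do
        # The tool prints "recoverLease SUCCEEDED on <path>" once the lease is recovered.
        if hdfs debug recoverLease -path "$target" -retries 1 2>&1 \
                | grep -q 'recoverLease SUCCEEDED'; then
            echo "lease recovered after $try attempt(s)"
            return 0
        fi
        # Pause only if another attempt will follow.
        if [ "$try" -lt "$max_tries" ]; then
            sleep "$delay"
        fi
        try=$((try + 1))
    done
    echo "lease not recovered after $max_tries attempt(s)" >&2
    return 1
}
```

A call such as "recover_lease hdfs://10.10.10.10:8020/flume/xxx/xxx/day=2018-03-12/xxx.2018-03-12.1520841735508 3 5" would try up to three times, five seconds apart.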

3) Check the status again (rerun the hdfs fsck command from step 1)
Connecting to namenode via http://xxx:50070/fsck?ugi=hadoop&path=%2Fflume%2Fxxx%xxx%2Fday%3D2018-03-12%xxx.2018-03-12.1520841735508
FSCK started by hadoop (auth:SIMPLE) from /xx.xx.xx.xx for path /flume/xxx/xxx/day=2018-03-12/xxx.2018-03-12.1520841735508 at Tue Mar 13 16:19:57 CST 2018
.Status: HEALTHY
Total size: 838 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 1 (avg. block size 838 B)
Minimally replicated blocks: 1 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 8
Number of racks: 1
FSCK ended at Tue Mar 13 16:19:57 CST 2018 in 2 milliseconds


The filesystem under path '/flume/xxx/xxx/day=2018-03-12/xxx.2018-03-12.1520841735508' is HEALTHY

4) Run the copy again
hadoop distcp -bandwidth 15 -m 50 -pb hdfs://10.10.10.10:8020/flume/xxx/xxx/day=2018-03-12 /flume/xxx/xxx/day=2018-03-12

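* Note: when using this command, it is best to set a bandwidth limit (-bandwidth, in MB/s per map) and a maximum number of simultaneous copies (-m). I set neither the first time I migrated data and copied several months of data in one go, which pushed network traffic over the limit.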
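
Since the -bandwidth limit applies to each map rather than to the job as a whole, the worst-case aggregate rate is the product of the two options. A small sketch of that arithmetic, using a hypothetical helper name:

```shell
# Upper bound on aggregate DistCp throughput in MB/s:
# per-map limit (-bandwidth) multiplied by the maximum number of maps (-m).
# Usage: distcp_peak_mb_per_s <bandwidth_per_map> <max_maps>
distcp_peak_mb_per_s() {
    echo $(( $1 * $2 ))
}

# Values from the command above: -bandwidth 15 -m 50
distcp_peak_mb_per_s 15 50   # prints 750
```

So the command above can use up to 750 MB/s in total if all 50 maps run at once; lower either value if the link between the clusters cannot sustain that.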
 
---------------------  
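Author: TOMSCUT  
Source: CSDN  
Original: https://blog.csdn.net/qq_29829081/article/details/79605028  
Copyright notice: This is an original article by the author; please include a link to the original post when reposting.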
