Handling Corrupt Blocks and Missing Replicas in Hadoop

A physical disk failure on a Hadoop cluster can damage some of the stored blocks, after which fsck reports corrupt blocks or missing replicas. Here is how to handle the problem:

1. Check the status:
hdfs fsck /

This may take a while to complete.
.........Status: CORRUPT
 Total size:    110507203084214 B
 Total dirs:    258577
 Total files:   4144709
 Total symlinks:                0 (Files currently being written: 1)
 Total blocks (validated):      3929730 (avg. block size 28120813 B)
  ********************************
  CORRUPT FILES:        26
  MISSING BLOCKS:       26
  MISSING SIZE:         27262976 B
  CORRUPT BLOCKS:       26
  ********************************
 Minimally replicated blocks:   3929704 (99.99934 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     2.0117457
 Corrupt blocks:                26
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          30
 Number of racks:               1
FSCK ended at Tue Apr 18 10:04:42 CST 2017 in 84054 milliseconds
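
If you only need the list of affected files rather than the full report, fsck can print just the corrupt blocks and the files they belong to. A minimal sketch, assuming a Hadoop 2.x-or-later client (check hdfs fsck -help on your version for the exact flags):

# Print only the corrupt blocks and their file paths
hdfs fsck / -list-corruptfileblocks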

2. How to fix it

Use hdfs fsck / to list the corrupt files. If the missing blocks cannot be brought back online (see the reference procedure in step 3), the affected files are unrecoverable and can only be deleted:

hdfs fsck / -delete   (this deletes only the files that contain corrupt blocks; healthy files are untouched)
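
If you would rather keep the damaged files for later inspection than delete them outright, fsck also supports a -move flag. A sketch of that alternative:

# Move the corrupted files into /lost+found on HDFS instead of deleting them
hdfs fsck / -move

# Examine what was moved
hdfs dfs -ls /lost+found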

3. Reference procedure (quoted from Stack Overflow):
You can use
hdfs fsck /
to determine which files are having problems. Look through the output for missing or corrupt blocks (ignore under-replicated blocks for now). This command is really verbose, especially on a large HDFS filesystem, so I normally get down to the meaningful output with
hdfs fsck / | egrep -v '^\.+$' | grep -v eplica
which ignores lines with nothing but dots and lines talking about replication.
Once you find a file that is corrupt, run
hdfs fsck /path/to/corrupt/file -locations -blocks -files
Use that output to determine where the blocks might live. If the file is larger than your block size, it might have multiple blocks.
You can use the reported block numbers to search the datanode and namenode logs for the machine or machines on which the blocks lived. Try looking for filesystem errors on those machines: missing mount points, a datanode that is not running, a filesystem that was reformatted or reprovisioned. If you can find a problem that way and bring the block back online, the file will be healthy again.
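
A concrete way to run that search from the shell is sketched below. The block ID blk_1073741825, the log directory /var/log/hadoop-hdfs, and the data directory /data/dfs are placeholders; substitute the block ID reported by fsck and the paths from your own deployment (the log location and dfs.datanode.data.dir vary by distribution):

# On each suspect DataNode, grep the HDFS logs for the reported block ID
grep -r "blk_1073741825" /var/log/hadoop-hdfs/

# Check whether the block file is still present on the local disks
find /data/dfs -name "blk_1073741825*" 2>/dev/null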

Lather, rinse, and repeat until all files are healthy or you exhaust all alternatives looking for the blocks.
Once you determine what happened and you cannot recover any more blocks, just use the
hdfs dfs -rm /path/to/file/with/permanently/missing/blocks
command to get your HDFS filesystem back to healthy so you can start tracking new errors as they occur.
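
To clean up many unrecoverable files in one pass, you can feed the fsck listing into a loop. This is only a sketch: the output format of -list-corruptfileblocks (typically a block ID followed by the file path) varies between Hadoop versions, so review corrupt_files.txt by hand before deleting anything:

# Collect the corrupt file paths (the last field of each "blk_..." line)
hdfs fsck / -list-corruptfileblocks 2>/dev/null | awk '/^blk_/ {print $NF}' | sort -u > corrupt_files.txt

# After reviewing the list, remove the files to bring fsck back to HEALTHY
while read -r f; do
  hdfs dfs -rm "$f"
done < corrupt_files.txt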

References:
http://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files
http://blog.csdn.net/d6619309/article/details/51595884