CDH: Missing Blocks and Under-Replicated Blocks

Note:
CDH 6.3.1

1. Problem Description

As shown in the screenshot below, a freshly installed CDH cluster reports both missing blocks and under-replicated blocks.
[Screenshot: Cloudera Manager health alerts showing missing and under-replicated blocks]

2. Solution

2.1 Finding the Missing Blocks

sudo -u hdfs hadoop fsck / -files -blocks

Test record:

FSCK started by hdfs (auth:SIMPLE) from /10.31.1.123 for path / at Fri Nov 27 09:55:15 CST 2020
/ <dir>
/tmp <dir>
/tmp/.cloudera_health_monitoring_canary_files <dir>
/tmp/.cloudera_health_monitoring_canary_files/.canary_file_2020_11_19-14_47_30.2b73c248cbe340c5 0 bytes, replicated: replication=3, 0 block(s):  OK

/tmp/.cloudera_health_monitoring_canary_files/.canary_file_2020_11_19-14_48_30.f8637315419b1c8e 0 bytes, replicated: replication=3, 0 block(s):  OK

/tmp/hive <dir>
/tmp/hive/anonymous <dir>
/tmp/hive/anonymous/5245f213-70eb-45d2-970f-1db033fbafad <dir>
/tmp/hive/anonymous/5245f213-70eb-45d2-970f-1db033fbafad/_tmp_space.db <dir>
/tmp/hive/anonymous/83ac2b81-64c8-4a4e-832f-c1f933cea7eb <dir>
/tmp/hive/anonymous/83ac2b81-64c8-4a4e-832f-c1f933cea7eb/_tmp_space.db <dir>
/tmp/hive/anonymous/83ac2b81-64c8-4a4e-832f-c1f933cea7eb/_tmp_space.db/Values__Tmp__Table__1 <dir>
/tmp/hive/anonymous/83ac2b81-64c8-4a4e-832f-c1f933cea7eb/_tmp_space.db/Values__Tmp__Table__1/data_file 22 bytes, replicated: replication=3, 1 block(s):  OK
0. BP-157751563-10.31.1.123-1605413961809:blk_1073749568_8744 len=22 Live_repl=3

/tmp/hive/hive <dir>
/tmp/hive/hive/0988ee1b-25eb-43a4-b8cd-d0d7e43a457e <dir>
/tmp/hive/hive/0988ee1b-25eb-43a4-b8cd-d0d7e43a457e/_tmp_space.db <dir>
/tmp/hive/hive/11709f3e-a78b-4f6c-9d85-214474b0a566 <dir>
/tmp/hive/hive/11709f3e-a78b-4f6c-9d85-214474b0a566/_tmp_space.db <dir>
/tmp/hive/hive/190fef6f-a879-4fe3-8861-510701ce0c62 <dir>
/tmp/hive/hive/190fef6f-a879-4fe3-8861-510701ce0c62/_tmp_space.db <dir>
/tmp/hive/hive/1d83563c-8b79-4fea-9dec-22a3d1e0454b <dir>
/tmp/hive/hive/1d83563c-8b79-4fea-9dec-22a3d1e0454b/_tmp_space.db <dir>
/tmp/hive/hive/4f781ef4-b9da-4515-84f4-ff8037dd6e23 <dir>
/tmp/hive/hive/4f781ef4-b9da-4515-84f4-ff8037dd6e23/_tmp_space.db <dir>
**(a large amount of output omitted)**
/user/yarn <dir>
/user/yarn/mapreduce <dir>
/user/yarn/mapreduce/mr-framework <dir>
/user/yarn/mapreduce/mr-framework/3.0.0-cdh6.3.1-mr-framework.tar.gz 235053931 bytes, replicated: replication=3, 2 block(s):  OK
0. BP-157751563-10.31.1.123-1605413961809:blk_1073766370_25546 len=134217728 Live_repl=3
1. BP-157751563-10.31.1.123-1605413961809:blk_1073766371_25547 len=100836203 Live_repl=3


Status: CORRUPT
 Number of data-nodes:  4
 Number of racks:               1
 Total dirs:                    2108
 Total symlinks:                0

Replicated Blocks:
 Total size:    128560763438 B
 Total files:   2187
 Total blocks (validated):      3112 (avg. block size 41311299 B)
  ********************************
  UNDER MIN REPL'D BLOCKS:      1814 (58.29049 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES:        1814
  MISSING BLOCKS:       1814
  MISSING SIZE:         1519384904 B
  ********************************
 Minimally replicated blocks:   1298 (41.70951 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     1.2512853
 Missing blocks:                1814
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Blocks queued for replication: 0

Erasure Coded Block Groups:
 Total size:    0 B
 Total files:   0
 Total block groups (validated):        0
 Minimally erasure-coded block groups:  0
 Over-erasure-coded block groups:       0
 Under-erasure-coded block groups:      0
 Unsatisfactory placement block groups: 0
 Average block group size:      0.0
 Missing block groups:          0
 Corrupt block groups:          0
 Missing internal blocks:       0
 Blocks queued for replication: 0
FSCK ended at Fri Nov 27 09:55:15 CST 2020 in 47 milliseconds


The filesystem under path '/' is CORRUPT

Filtering the output for MISSING entries shows that all of the missing blocks belong to files under the oozie share-lib directory:
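The filtering described above can be reproduced with a simple pipeline (a sketch; the `hdfs` user and the `/` path match this cluster, adjust them for your environment, and `2>/dev/null` just hides fsck's progress dots):

```shell
# Re-run fsck and keep only the lines flagged MISSING,
# which also catches the MISSING BLOCKS / MISSING SIZE summary counters.
sudo -u hdfs hdfs fsck / -files -blocks 2>/dev/null | grep 'MISSING'
```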

/user/oozie/share/lib/lib_20201115122055/distcp/hadoop-distcp.jar 4038448 bytes, replicated: replication=3, 1 block(s):  MISSING 1 blocks of total size 4038448 B
0. BP-157751563-10.31.1.123-1605413961809:blk_1073741831_1007 len=4038448 MISSING!
/user/oozie/share/lib/lib_20201115122055/distcp/netty-all-4.1.17.Final.jar 3780056 bytes, replicated: replication=3, 1 block(s):  MISSING 1 blocks of total size 3780056 B
0. BP-157751563-10.31.1.123-1605413961809:blk_1073741829_1005 len=3780056 MISSING!
/user/oozie/share/lib/lib_20201115122055/distcp/oozie-sharelib-distcp-5.1.0-cdh6.3.1.jar 12759 bytes, replicated: replication=3, 1 block(s):  MISSING 1 blocks of total size 12759 B
0. BP-157751563-10.31.1.123-1605413961809:blk_1073741826_1002 len=12759 MISSING!
/user/oozie/share/lib/lib_20201115122055/distcp/oozie-sharelib-distcp.jar 12759 bytes, replicated: replication=3, 1 block(s):  MISSING 1 blocks of total size 12759 B
0. BP-157751563-10.31.1.123-1605413961809:blk_1073741830_1006 len=12759 MISSING!
/user/oozie/share/lib/lib_20201115122055/git/HikariCP-2.6.1.jar 133942 bytes, replicated: replication=3, 1 block(s):  MISSING 1 blocks of total size 133942 B
0. BP-157751563-10.31.1.123-1605413961809:blk_1073741832_1008 len=133942 MISSING!
**(some output omitted)**
0. BP-157751563-10.31.1.123-1605413961809:blk_1073743627_2803 len=18161 MISSING!
/user/oozie/share/lib/lib_20201115122055/sqoop/xz-1.6.jar 103131 bytes, replicated: replication=3, 1 block(s):  MISSING 1 blocks of total size 103131 B
0. BP-157751563-10.31.1.123-1605413961809:blk_1073743633_2809 len=103131 MISSING!
/user/oozie/share/lib/lib_20201115122055/sqoop/zookeeper.jar 1543701 bytes, replicated: replication=3, 1 block(s):  MISSING 1 blocks of total size 1543701 B
0. BP-157751563-10.31.1.123-1605413961809:blk_1073743638_2814 len=1543701 MISSING!
  MISSING BLOCKS:       1814
  MISSING SIZE:         1519384904 B

2.2 Fixing the Under-Replicated Oozie Blocks

Check the block information of one of the missing files:

[root@hp1 ~]# hadoop dfs -ls /user/oozie/share/lib/lib_20201115122055/hive2/jcodings-1.0.18.jar  
WARNING: Use of this script to execute dfs is deprecated.
WARNING: Attempting to execute replacement "hdfs dfs" instead.

-rwxrwxr-x   3 oo
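To dig deeper into a single suspect file, fsck can also be pointed at just that path; `-locations` additionally prints which DataNodes should hold each replica (a sketch; the jar path is the one from this cluster's listing, substitute your own):

```shell
# fsck on one file lists its block IDs, lengths, live replica count and locations
sudo -u hdfs hdfs fsck \
    /user/oozie/share/lib/lib_20201115122055/hive2/jcodings-1.0.18.jar \
    -files -blocks -locations
```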

This problem arose because, right after the cluster was initialized, it had only two DataNodes, so the replication factor was lowered to 2 (dfs.replication=2) and the corresponding HDFS health checks were suppressed, which is why the alerts appear now. A third DataNode has since been added, so the cluster can be restored to the normal 3-DataNode, 3-replica configuration.
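Before changing anything, it is worth confirming the default replication factor the HDFS client currently sees (a sketch; on CDH, Cloudera Manager manages this value, so the setting in CM remains authoritative):

```shell
# Print the effective client-side default replication factor
hdfs getconf -confKey dfs.replication
```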

2.2.1 Set the replication factor back to 3

sudo -u hdfs hadoop fs -setrep -R 3 /
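Once `-setrep` completes, individual files can be spot-checked: the `%r` format specifier of `hdfs dfs -stat` prints a file's replication factor (a sketch; the jar path is one of the files from the fsck output above):

```shell
# %r expands to the replication factor of the given file; expect 3 after setrep
sudo -u hdfs hdfs dfs -stat %r \
    /user/yarn/mapreduce/mr-framework/3.0.0-cdh6.3.1-mr-framework.tar.gz
```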

2.2.2 Delete the corrupt blocks:

# Delete the corrupt/missing blocks
sudo -u hdfs hadoop fsck / -delete
# List the corrupt file blocks
sudo -u hdfs hadoop fsck -list-corruptfileblocks

Test record:

[root@hp1 ~]# sudo -u hdfs hadoop fsck -list-corruptfileblocks
WARNING: Use of this script to execute fsck is deprecated.
WARNING: Attempting to execute replacement "hdfs fsck" instead.

Connecting to namenode via http://hp1:9870/fsck?ugi=hdfs&listcorruptfileblocks=1&path=%2F
The filesystem under path '/' has 0 CORRUPT files
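As a final check, a full fsck should now end with the HEALTHY verdict instead of CORRUPT (a sketch; grepping for the fixed status line fsck prints at the end of its report):

```shell
# A clean namespace ends with: The filesystem under path '/' is HEALTHY
sudo -u hdfs hdfs fsck / 2>/dev/null | grep "The filesystem under path"
```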

The missing-block alerts are now resolved:
[Screenshot: Cloudera Manager with the missing/under-replicated block alerts cleared]

References

1.https://www.cnblogs.com/hqt0731/articles/8804924.html
