1. View the nodes, HDFS capacity, and missing data blocks. Command: hdfs dfsadmin -report (the older form hadoop dfsadmin -report still works but is deprecated).
This command quickly shows which DataNodes are down, how much HDFS capacity there is and how much of it is used, and the disk usage of each node.
The NameNode web UI exposes the same information, but the output of this command is better suited to scripted monitoring of DFS usage (a monitoring sketch follows the sample report below). Sample output:
[root@linux01 ~]# hdfs dfsadmin -report
Configured Capacity: 54716792832 (50.96 GB)
Present Capacity: 43419418624 (40.44 GB)
DFS Remaining: 43259793408 (40.29 GB)
DFS Used: 159625216 (152.23 MB)
DFS Used%: 0.37%
Under replicated blocks: 2
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Pending deletion blocks: 0

Live datanodes (3):
Name: 192.168.133.201:50010 (linux01)
Hostname: linux01
Decommission Status : Normal
Configured Capacity: 18238930944 (16.99 GB)
DFS Used: 53207040 (50.74 MB)
Non DFS Used: 4242960384 (3.95 GB)
DFS Remaining: 13942763520 (12.99 GB)
DFS Used%: 0.29%
DFS Remaining%: 76.45%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Jun 15 13:11:19 CST 2021

Name: 192.168.133.202:50010 (linux02)
Hostname: linux02
Decommission Status : Normal
Configured Capacity: 18238930944 (16.99 GB)
DFS Used: 53207040 (50.74 MB)
Non DFS Used: 3524919296 (3.28 GB)
DFS Remaining: 14660804608 (13.65 GB)
DFS Used%: 0.29%
DFS Remaining%: 80.38%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Jun 15 13:11:19 CST 2021

Name: 192.168.133.203:50010 (linux03)
Hostname: linux03
Decommission Status : Normal
Configured Capacity: 18238930944 (16.99 GB)
DFS Used: 53211136 (50.75 MB)
Non DFS Used: 3529494528 (3.29 GB)
DFS Remaining: 14656225280 (13.65 GB)
DFS Used%: 0.29%
DFS Remaining%: 80.36%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Jun 15 13:11:17 CST 2021
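As a minimal sketch of the scripted monitoring mentioned above (assuming a Linux host with bash and the hdfs client on the PATH; the exact report wording can differ slightly between Hadoop versions):

#!/usr/bin/env bash
# Sketch: pull the dead-node count and overall usage out of `hdfs dfsadmin -report`.
report=$(hdfs dfsadmin -report 2>/dev/null)

# "Dead datanodes (N):" is typically only printed when N > 0, so default to 0.
dead=$(echo "$report" | sed -n 's/^Dead datanodes (\([0-9]\+\)).*/\1/p')
dead=${dead:-0}

# Overall "DFS Used%" from the summary at the top of the report.
used_pct=$(echo "$report" | awk -F': ' '/^DFS Used%/{print $2; exit}')

echo "DFS Used%: ${used_pct}"
if [ "$dead" -gt 0 ]; then
    echo "WARNING: ${dead} dead DataNode(s) reported"
fi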
2. Check the health of the file system: hdfs fsck <path>
Usage: hdfs fsck <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]
-move          move corrupted files to /lost+found
-delete        delete corrupted files
-openforwrite  print files currently opened for write
-files         print the names of the files being checked
-blocks        print the block report (use together with -files)
-locations     print the location of every block (use together with -files -blocks)
-racks         print the network topology of the block locations (use together with -files -blocks)
hadoop fsck /
This command checks the health of the entire file system. Note that it only reports problems and does not restore missing block replicas itself; that repair is handled asynchronously by a separate NameNode thread.
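For example (a sketch; the summary labels can vary slightly between Hadoop versions), a quick scripted check of the summary, plus a more detailed per-block listing using the flags above:

# summary only: replication problems, corrupt blocks, and the final HEALTHY/CORRUPT verdict
hdfs fsck / | grep -E 'Under-replicated blocks|Corrupt blocks|Missing replicas|The filesystem under path'

# detailed view: every file, its blocks, and which DataNodes hold each replica
hdfs fsck / -files -blocks -locations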
3. If Hadoop cannot recover the blocks automatically, the only remaining option is to delete the files that contain the corrupted blocks:
hdfs fsck / -delete
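Before deleting anything, it is usually safer to confirm which files are affected and try to salvage them first. A cautious sequence (using -move from the option list above; -list-corruptfileblocks is a standard fsck flag not shown in that list):

# list the files that contain corrupt blocks
hdfs fsck / -list-corruptfileblocks

# optionally move the damaged files to /lost+found instead of deleting them outright
hdfs fsck / -move

# delete the corrupt files only once the data is confirmed unrecoverable
hdfs fsck / -delete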