[color=red][size=medium]Hadoop provides an fsck utility for checking the health of files in HDFS.[/size][/color]
% hadoop fsck /
.................
Status: HEALTHY
Total size: 2928057882 B
Total dirs: 271
Total files: 173 (Files currently being written: 1)
Total blocks (validated): 194 (avg. block size 15093081 B) (Total open file blocks (not validated): 1)
Minimally replicated blocks: 194 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 194 (100.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 1.0
Corrupt blocks: 0
Missing replicas: 388 (200.0 %)
Number of data-nodes: 1
Number of racks: 1
The filesystem under path '/' is HEALTHY
Note that fsck retrieves all of its information from the namenode; it
does not communicate with any datanodes to retrieve block data. The status is
HEALTHY even though every block is under-replicated, because no block is
missing or corrupt.
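The replica counts in the summary can be checked by hand. This is a minimal sketch, and the function name missing_replicas is made up for illustration: each block wants 3 replicas (the default replication factor) but only 1 exists, so each of the 194 blocks is short by 2. The 200.0 % figure appears to be the missing count relative to the 194 replicas that were found, which is an assumption based on this output, not on the fsck documentation.

```shell
# missing_replicas BLOCKS TARGET FOUND
# Hypothetical helper: number of replicas still needed when every block
# has FOUND replicas but the target replication factor is TARGET.
missing_replicas() {
  echo $(( $1 * ($2 - $3) ))
}

# Values taken from the summary above: 194 blocks, target 3, found 1.
missing_replicas 194 3 1   # prints 388
```

The same arithmetic gives 2 missing replicas for a single-block file, e.g. missing_replicas 1 3 1.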
[color=red][size=medium]Finding the blocks for a file.[/size][/color]
$ hadoop fsck /user/root/hbase/counters/c03349790e6588512d66f3afc25a999b/.oldlogs/hlog.1329514389203 -files -blocks -racks
/user/root/hbase/counters/c03349790e6588512d66f3afc25a999b/.oldlogs/hlog.1329514389203 124 bytes, 1 block(s): Under replicated blk_-162790574863232271_2270. Target Replicas is 3 but found 1 replica(s).
0. blk_-162790574863232271_2270 len=124 repl=1 [/default-rack/10.42.197.92:50010]
Status: HEALTHY
Total size: 124 B
Total dirs: 0
Total files: 1
Total blocks (validated): 1 (avg. block size 124 B)
Minimally replicated blocks: 1 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 1 (100.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 1.0
Corrupt blocks: 0
Missing replicas: 2 (200.0 %)
Number of data-nodes: 1
Number of racks: 1
The filesystem under path '/user/root/hbase/counters/c03349790e6588512d66f3afc25a999b/.oldlogs/hlog.1329514389203' is HEALTHY
• The -files option prints one line per file with its name, size, number of blocks, and
health (for example, whether any blocks are missing or under-replicated).
• The -blocks option shows information about each block in the file, one line per
block.
• The -racks option displays the rack location and the datanode addresses for each
block.
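To pull the datanode addresses out of that output in a script, something like the following may help. It is only a sketch: the function name list_block_locations is hypothetical, and it assumes the block lines look exactly like the one shown above (an index, the block name, then a bracketed list of /rack/host:port entries). Other Hadoop versions may format these lines differently.

```shell
# list_block_locations (hypothetical helper): reads the output of
# "hadoop fsck <path> -files -blocks -racks" on stdin and prints one line
# per replica in the form: block_name rack datanode_address
list_block_locations() {
  awk '
    /^[0-9]+\. blk_/ {
      block = $2
      # The bracketed list at the end of the line holds the replica locations.
      start = index($0, "[")
      stop = index($0, "]")
      if (start == 0 || stop <= start) next
      locs = substr($0, start + 1, stop - start - 1)
      n = split(locs, parts, /, */)
      for (i = 1; i <= n; i++) {
        # Each entry looks like /default-rack/10.42.197.92:50010;
        # split it at the last slash into rack and datanode address.
        slash = 0
        for (j = length(parts[i]); j > 0; j--) {
          if (substr(parts[i], j, 1) == "/") { slash = j; break }
        }
        print block, substr(parts[i], 1, slash - 1), substr(parts[i], slash + 1)
      }
    }
  '
}

# Usage against a live cluster (not run here):
#   hadoop fsck /some/file -files -blocks -racks | list_block_locations
```

Lines that do not start with a block index (the file line and the summary) are ignored, so the whole fsck output can be piped in unchanged.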