1. Hadoop stores files in HDFS as blocks. The following commands show which datanodes hold a file's blocks and where each block is located:
[root@master softpackage]# hadoop fs -put scala-2.10.4.tgz /
[root@master softpackage]# hadoop fsck /scala-2.10.4.tgz -files -locations -blocks
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Connecting to namenode via http://master:50070/fsck?ugi=root&files=1&locations=1&blocks=1&path=%2Fscala-2.10.4.tgz
FSCK started by root (auth:SIMPLE) from /192.168.86.133 for path /scala-2.10.4.tgz at Fri Jun 09 11:14:14 EDT 2017
/scala-2.10.4.tgz 29937534 bytes, 1 block(s): Under replicated BP-1810807976-192.168.86.133-1496888566245:blk_1073741829_1005. Target Replicas is 3 but found 2 replica(s).
0. BP-1810807976-192.168.86.133-1496888566245:blk_1073741829_1005 len=29937534 repl=2 [DatanodeInfoWithStorage[192.168.86.132:50010,DS-ead6ac48-ce41-4133-9552-ec5ca51a6204,DISK], DatanodeInfoWithStorage[192.168.86.134:50010,DS-5059d0f7-4e64-4554-aa92-375a1fe573b8,DISK]]
Status: HEALTHY
Total size:                    29937534 B
Total dirs:                    0
Total files:                   1
Total symlinks:                0
Total blocks (validated):      1 (avg. block size 29937534 B)
Minimally replicated blocks:   1 (100.0 %)
Over-replicated blocks:        0 (0.0 %)
Under-replicated blocks:       1 (100.0 %)
Mis-replicated blocks:         0 (0.0 %)
Default replication factor:    3
Average block replication:     2.0
Corrupt blocks:                0
Missing replicas:              1 (33.333332 %)
Number of data-nodes:          2
Number of racks:               1
FSCK ended at Fri Jun 09 11:14:14 EDT 2017 in 2 milliseconds
The filesystem under path '/scala-2.10.4.tgz' is HEALTHY
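The block line in the fsck output above carries the block pool ID, the block name, the generation stamp, the block length, the replica count, and the datanode addresses. As a rough sketch (not part of Hadoop; the regular expressions and the helper name `parse_block_line` are assumptions made for illustration and are only checked against the line shown above), these fields could be pulled out like this:

```python
import re

# Block line copied verbatim from the fsck output above.
BLOCK_LINE = (
    "0. BP-1810807976-192.168.86.133-1496888566245:blk_1073741829_1005 "
    "len=29937534 repl=2 "
    "[DatanodeInfoWithStorage[192.168.86.132:50010,"
    "DS-ead6ac48-ce41-4133-9552-ec5ca51a6204,DISK], "
    "DatanodeInfoWithStorage[192.168.86.134:50010,"
    "DS-5059d0f7-4e64-4554-aa92-375a1fe573b8,DISK]]"
)

# Assumed layout: <pool>:<block>_<generation stamp> len=<bytes> repl=<count>
BLOCK_RE = re.compile(
    r"(?P<pool>BP-[\w.-]+):(?P<block>blk_\d+)_(?P<genstamp>\d+)\s+"
    r"len=(?P<length>\d+)\s+repl=(?P<repl>\d+)"
)
# Each replica location starts with "DatanodeInfoWithStorage[<ip>:<port>,".
DATANODE_RE = re.compile(r"DatanodeInfoWithStorage\[([\d.]+):(\d+),")


def parse_block_line(line):
    """Hypothetical helper: extract block metadata from one fsck block line."""
    match = BLOCK_RE.search(line)
    if match is None:
        raise ValueError("not an fsck block line")
    return {
        "pool": match.group("pool"),
        "block": match.group("block"),
        "genstamp": int(match.group("genstamp")),
        "length": int(match.group("length")),
        "repl": int(match.group("repl")),
        "datanodes": [host for host, _port in DATANODE_RE.findall(line)],
    }


info = parse_block_line(BLOCK_LINE)
print(info["block"], info["datanodes"])
```

The block name (`blk_1073741829`) and the pool ID are the two values needed to find the block file on a datanode's local disk.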
Then log in to one of the datanodes and locate the block files on disk:
[root@slave2 subdir0]# du -sh *
224K    blk_1073741825
4.0K    blk_1073741825_1001.meta
4.0K    blk_1073741827
4.0K    blk_1073741827_1003.meta
4.0K    blk_1073741828
4.0K    blk_1073741828_1004.meta
29M     blk_1073741829
232K    blk_1073741829_1005.meta
[root@slave2 subdir0]# pwd
/opt/hadoop/dfs/data/current/BP-1810807976-192.168.86.133-1496888566245/current/finalized/subdir0/subdir0
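Instead of walking down to the `subdir0/subdir0` directory by hand, the block file can be searched for by name. The following is a minimal sketch, assuming the datanode data directory is `/opt/hadoop/dfs/data` as in the `pwd` output above; the function name `find_block_files` is hypothetical and not part of any Hadoop tool:

```python
import os


def find_block_files(data_dir, block_name):
    """Return the block file and its .meta checksum file found under data_dir."""
    matches = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            is_block = name == block_name
            # Meta files are named <block>_<generation stamp>.meta
            is_meta = name.startswith(block_name + "_") and name.endswith(".meta")
            if is_block or is_meta:
                matches.append(os.path.join(root, name))
    return sorted(matches)


if __name__ == "__main__":
    # Assumed data directory, taken from the pwd output above.
    for path in find_block_files("/opt/hadoop/dfs/data", "blk_1073741829"):
        print(path)
```

Run on slave2, this should list the 29M `blk_1073741829` file and its `blk_1073741829_1005.meta` companion shown in the `du -sh` listing.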
Reference:
http://www.myexception.cn/database/1997522.html