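After the NameNode is formatted, it creates an FSImage file in the current subdirectory of the directory specified by the dfs.name.dir parameter. This file records the initial file system metadata. As the cluster runs, the number of files keeps growing, and if you need to count the files in HDFS, walking the namespace through the shell or the API is cumbersome. HDFS is mostly used for offline data processing, so if some delay is acceptable, the statistics can be gathered by parsing the FSImage file directly; the results then reflect the state as of the last checkpoint. Below is the FSImage layout as recorded by tracing the source code. The file is binary and cannot be opened with an ordinary text editor. The Hadoop source version is 1.0.4.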
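To make the counting idea concrete, here is a minimal sketch that reads only the fixed-size start of an fsimage file and returns the namespace item count. It assumes the 1.0.4 header layout traced later in this post (an int layout version, an int namespaceID, then a long item count, all big-endian as written by DataOutputStream). The class FsImageItemCount and its methods are hypothetical names for illustration, not part of Hadoop. The value is written from numItemsInTree(), so it counts directories (including the root) as well as files.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical helper for illustration; not part of Hadoop.
class FsImageItemCount {

    // Assumed header (Hadoop 1.0.4): int layoutVersion, int namespaceID, long numItemsInTree.
    static long itemCount(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header); // big-endian by default, like DataOutputStream
        buf.getInt();                             // layout version, ignored here
        buf.getInt();                             // namespaceID, ignored here
        return buf.getLong();                     // inodes in the tree, root included
    }

    // Reads just the first 16 bytes of the file, so large images are not loaded into memory.
    static long itemCount(String path) {
        try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
            byte[] header = new byte[16];
            in.readFully(header);
            return itemCount(header);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: FsImageItemCount <path-to-fsimage>");
            return;
        }
        System.out.println(itemCount(args[0]));
    }
}
```

Run against a freshly formatted image, this should report a single item, the root directory. Splitting files from directories would require walking the inode records that follow the header.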
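When formatting, the -format argument is passed to the NameNode; after argument parsing, the corresponding format function is executed. The call flow is fairly simple. Here is the log produced by formatting: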
13/08/17 15:42:17 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = ts08/192.168.0.43
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 1.0.4
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct 3 05:13:58 UTC 2012
************************************************************/
Re-format filesystem in C:\hadoop\hname ? (Y or N) Y
13/08/17 15:42:21 INFO util.GSet: VM type = 32-bit
13/08/17 15:42:21 INFO util.GSet: 2% max memory = 1.27125 MB
13/08/17 15:42:21 INFO util.GSet: capacity = 2^18 = 262144 entries
13/08/17 15:42:21 INFO util.GSet: recommended=262144,actual=262144
13/08/17 15:42:21 INFO namenode.FSNamesystem: fsOwner=Administrator
13/08/17 15:42:21 INFO namenode.FSNamesystem: supergroup=supergroup
13/08/17 15:42:21 INFO namenode.FSNamesystem: isPermissionEnabled=true
13/08/17 15:42:21 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
13/08/17 15:42:21 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
13/08/17 15:42:21 INFO namenode.NameNode: Caching file names occuring more than 10 times
13/08/17 15:42:21 INFO common.Storage: Image file of size 119 saved in 0 seconds.
13/08/17 15:42:21 INFO common.Storage: Storage directory C:\hadoop\hname has been successfully formatted.
13/08/17 15:42:21 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ts08/192.168.0.43
************************************************************/
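As the log shows, a total of 119 bytes were written to the FSImage file. Tracing the code gives the structure of that file, shown in the table below: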
Initial FSImage structure | |||||||
Offset | Length (bytes) | Element | Data type | Value | Source file | Source line | Code |
0 | 4 | LAYOUT_VERSION | int | -32 | FSImage.java | 1048 | out.writeInt(FSConstants.LAYOUT_VERSION); |
4 | 4 | namespaceID | int | 1114775793 | FSImage.java | 1049 | out.writeInt(namespaceID); |
8 | 8 | INodeDirectoryWithQuota nsCount | long | 1 | FSImage.java | 1050 | out.writeLong(fsDir.rootDir.numItemsInTree()); |
16 | 8 | generationStamp | long | xxxxxxxx | FSImage.java | 1051 | out.writeLong(fsNamesys.getGenerationStamp()); |
Root INode | |||||||
24 | 2 | nameLen | short | 0 | FSImage.java | 1350 | out.writeShort(nameLen); |
26 | 2 | replication | short | 0 | FSImage.java | 1367 | out.writeShort(0); // replication |
28 | 8 | ModificationTime | long | xxxxxxxx | FSImage.java | 1368 | out.writeLong(node.getModificationTime()); |
36 | 8 | access time | long | 0 | FSImage.java | 1369 | out.writeLong(0); // access time |
44 | 8 | preferred block size | long | 0 | FSImage.java | 1370 | out.writeLong(0); // preferred block size |
52 | 4 | # of blocks | int | -1 | FSImage.java | 1371 | out.writeInt(-1); // # of blocks |
56 | 8 | NameSpace quota | long | -1 | FSImage.java | 1372 | out.writeLong(node.getNsQuota()); |
64 | 8 | disk space quota | long | -1 | FSImage.java | 1373 | out.writeLong(node.getDsQuota()); |
72 | 1 | username length | vint (1 byte here) | 13 | Text.java | 411 | WritableUtils.writeVInt(out, length); |
73 | 13 | username | byte[] | administrator | Text.java | 412 | out.write(bytes.array(), 0, length); |
86 | 1 | groupname length | vint (1 byte here) | 10 | Text.java | 411 | WritableUtils.writeVInt(out, length); |
87 | 10 | groupname | byte[] | supergroup | Text.java | 412 | out.write(bytes.array(), 0, length); |
97 | 2 | permission | short | 493 (octal 0755) | PermissionStatus.java | 111 | permission.write(out); |
99 | 4 | paths in lease | int | 0 | FSNamesystem.java | 5443 | out.writeInt(leaseManager.countPath()); |
103 | 4 | currentId | int | 0 | DelegationTokenSecretManager.java | 120 | out.writeInt(currentId); |
107 | 4 | allKeys | int | 0 | DelegationTokenSecretManager.java | 244 | out.writeInt(allKeys.size()); |
111 | 4 | delegationTokenSequenceNumber | int | 0 | DelegationTokenSecretManager.java | 122 | out.writeInt(delegationTokenSequenceNumber); |
115 | 4 | CurrentTokens | int | 0 | DelegationTokenSecretManager.java | 123 | saveCurrentTokens(out); |
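
The layout in the table can be checked with a small reader. The following is a sketch under stated assumptions, not Hadoop's own loader: it handles only a freshly formatted image whose single inode is the root directory, it assumes name lengths fit in a one-byte vint (true for lengths up to 127), and the class InitialFsImage with its sample() builder is hypothetical. The generation stamp and modification time in the sample are placeholders, since the table does not give their values.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical reader for the 119-byte image described in the table; not part of Hadoop.
class InitialFsImage {
    int layoutVersion, namespaceId, numBlocks, pathsInLease, bytesRead;
    long numItems, generationStamp, modificationTime, nsQuota, dsQuota;
    short permission;
    String user, group;

    // Text is a vint length followed by UTF-8 bytes; only the one-byte vint case is handled.
    static String readShortText(ByteBuffer buf) {
        int len = buf.get();
        if (len < 0) throw new UnsupportedOperationException("multi-byte vint not handled");
        byte[] utf8 = new byte[len];
        buf.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    static void putShortText(ByteBuffer buf, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        buf.put((byte) utf8.length).put(utf8);
    }

    static InitialFsImage parse(byte[] image) {
        ByteBuffer buf = ByteBuffer.wrap(image);   // big-endian, like DataOutputStream
        InitialFsImage img = new InitialFsImage();
        img.layoutVersion = buf.getInt();          // offset 0
        img.namespaceId = buf.getInt();            // offset 4
        img.numItems = buf.getLong();              // offset 8
        img.generationStamp = buf.getLong();       // offset 16
        short nameLen = buf.getShort();            // offset 24, 0 for the root
        buf.position(buf.position() + nameLen);    // skip the (empty) root name
        buf.getShort();                            // replication, 0 for a directory
        img.modificationTime = buf.getLong();
        buf.getLong();                             // access time
        buf.getLong();                             // preferred block size
        img.numBlocks = buf.getInt();              // -1 marks a directory
        img.nsQuota = buf.getLong();
        img.dsQuota = buf.getLong();
        img.user = readShortText(buf);
        img.group = readShortText(buf);
        img.permission = buf.getShort();
        img.pathsInLease = buf.getInt();           // files under construction
        for (int i = 0; i < 4; i++) buf.getInt();  // four delegation token counters
        img.bytesRead = buf.position();
        return img;
    }

    // Builds the bytes from the table's values; the two timestamps are placeholders (0).
    static byte[] sample() {
        ByteBuffer b = ByteBuffer.allocate(119);
        b.putInt(-32).putInt(1114775793).putLong(1L).putLong(0L);
        b.putShort((short) 0).putShort((short) 0);
        b.putLong(0L).putLong(0L).putLong(0L).putInt(-1).putLong(-1L).putLong(-1L);
        putShortText(b, "administrator");
        putShortText(b, "supergroup");
        b.putShort((short) 493);
        b.putInt(0).putInt(0).putInt(0).putInt(0).putInt(0);
        return b.array();
    }

    public static void main(String[] args) {
        InitialFsImage img = parse(sample());
        System.out.println(img.user + " " + img.group + " "
                + Integer.toOctalString(img.permission) + " " + img.bytesRead);
    }
}
```

The sample fills its 119-byte buffer exactly, and the reader consumes all 119 bytes, which agrees with the "Image file of size 119" line in the format log. An image that contains files would need extra handling, because the record of a file inode carries block entries rather than the two quota fields.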