Introduction
In a Hadoop cluster, the NameNode manages the metadata. So where is that metadata stored?
If it lived only on disk, access would be far too slow; if it lived only in memory, it would be unsafe.
So the metadata is kept in memory, with a backup file, fsimage, on disk. This guards against losing the metadata on a power failure.
Now, as the in-memory metadata grows, must fsimage be constantly rewritten to match? That would again be inefficient, yet the changes have to be persisted somehow. The answer is to append each operation to a separate log file, edits.
Loading fsimage and edits together into memory yields the complete metadata.
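To make the idea concrete, here is a minimal sketch of reconstructing a namespace from a checkpoint plus an append-only edit log. This is hypothetical toy code for illustration, not the actual NameNode implementation; the function and operation names are made up.

```python
# Toy sketch: start from the checkpointed image (fsimage), then replay
# each appended operation (edits) in order to rebuild the full namespace.

def load_metadata(fsimage: dict, edits: list) -> dict:
    namespace = dict(fsimage)      # snapshot loaded from disk
    for op, path in edits:         # operations appended since the checkpoint
        if op in ("CREATE", "MKDIR"):
            namespace[path] = op
        elif op == "DELETE":
            namespace.pop(path, None)
    return namespace

fsimage = {"/tmp": "MKDIR"}
edits = [("CREATE", "/tmp/a.txt"), ("DELETE", "/tmp/a.txt"), ("MKDIR", "/user")]
print(load_metadata(fsimage, edits))  # {'/tmp': 'MKDIR', '/user': 'MKDIR'}
```

The point is that the on-disk state is never rewritten in place: fsimage stays fixed between checkpoints, and only cheap appends go to edits.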
Merging
Once the metadata grows large, loading it into memory becomes slow, so fsimage and edits also need to be merged periodically. This merge work is best delegated to someone else.
In a typical cluster, the SecondaryNameNode handles it.
A merge can be triggered after a fixed period of time, or once the operation count reaches a threshold.
Since I am running Hadoop 3.1.3, I looked up that version's hdfs-default.xml:
<property>
<name>dfs.namenode.checkpoint.period</name>
<value>3600s</value>
<description>The number of seconds between two periodic checkpoints.
Support multiple time unit suffix(case insensitive), as described in dfs.heartbeat.interval.</description>
</property>
<property>
<name>dfs.namenode.checkpoint.txns</name>
<value>1000000</value>
<description>The Secondary NameNode or CheckpointNode will create a checkpoint of the namespace every 'dfs.namenode.checkpoint.txns' transactions,
regardless of whether 'dfs.namenode.checkpoint.period' has expired.</description>
</property>
<property>
<name>dfs.namenode.checkpoint.check.period</name>
<value>60s</value>
<description> The SecondaryNameNode and CheckpointNode will poll the NameNode every 'dfs.namenode.checkpoint.check.period' seconds to query the number of uncheckpointed transactions.
Support multiple time unit suffix(case insensitive), as described in dfs.heartbeat.interval.</description>
</property>
By default, a checkpoint is made every hour, or once the transaction count reaches 1,000,000. The transaction count is polled once a minute.
On another note:
Looking at my own Hadoop metadata, some edits files are indeed rolled hourly, while others are irregular because of differing startup and shutdown times. But none of them exceeds 1 MB. Is there a size limit as well? I don't know; if the default really were 1 MB, that would be far too small.
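The two triggers combine into a simple either-or condition, polled once per check period. A toy sketch of that logic (hypothetical code, with defaults taken from the hdfs-default.xml values above):

```python
# Hypothetical sketch of the checkpoint trigger: a checkpoint fires when
# either the time period or the transaction threshold is reached,
# whichever comes first. Defaults mirror dfs.namenode.checkpoint.period
# (3600s) and dfs.namenode.checkpoint.txns (1000000).

def should_checkpoint(elapsed_s: int, txns_since_last: int,
                      period_s: int = 3600,
                      txn_limit: int = 1_000_000) -> bool:
    return elapsed_s >= period_s or txns_since_last >= txn_limit

print(should_checkpoint(3700, 10))        # True: the hour is up
print(should_checkpoint(120, 1_000_001))  # True: transaction threshold hit
print(should_checkpoint(120, 10))         # False: neither condition met
```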
As for HA: there we no longer need a SecondaryNameNode for periodic checking and merging, because the standby NameNode takes over that job:
Note that, in an HA cluster, the Standby NameNodes also performs checkpoints of the namespace state, and thus it is not necessary to run a Secondary NameNode, CheckpointNode, or BackupNode in an HA cluster. In fact, to do so would be an error. This also allows one who is reconfiguring a non-HA-enabled HDFS cluster to be HA-enabled to reuse the hardware which they had previously dedicated to the Secondary NameNode.
Inspection
I deleted every file except those under tmp:
Finally, let's examine a few files:
First, the current maximum operation (transaction) ID is 29325.
Running
hdfs oev -p XML -i edits_0000000000000029321-0000000000000029322 -o /opt/module/hadoop-3.1.3/edits.xml
produces the XML dump of edits_0000000000000029321-0000000000000029322:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<EDITS>
<EDITS_VERSION>-64</EDITS_VERSION>
<RECORD>
<OPCODE>OP_START_LOG_SEGMENT</OPCODE>
<DATA>
<TXID>29321</TXID>
</DATA>
</RECORD>
<RECORD>
<OPCODE>OP_END_LOG_SEGMENT</OPCODE>
<DATA>
<TXID>29322</TXID>
</DATA>
</RECORD>
</EDITS>
These two records simply mark the start and end of a log segment.
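Dumps like this are plain XML, so the standard library can pick them apart. A small sketch that extracts each record's opcode and transaction ID from the output above:

```python
# Parse an `hdfs oev -p XML` dump and list (opcode, txid) pairs,
# using the two-record edits file shown above as the input.
import xml.etree.ElementTree as ET

edits_xml = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<EDITS>
<EDITS_VERSION>-64</EDITS_VERSION>
<RECORD>
<OPCODE>OP_START_LOG_SEGMENT</OPCODE>
<DATA><TXID>29321</TXID></DATA>
</RECORD>
<RECORD>
<OPCODE>OP_END_LOG_SEGMENT</OPCODE>
<DATA><TXID>29322</TXID></DATA>
</RECORD>
</EDITS>"""

root = ET.fromstring(edits_xml)
records = [(r.findtext("OPCODE"), int(r.findtext("DATA/TXID")))
           for r in root.iter("RECORD")]
print(records)  # [('OP_START_LOG_SEGMENT', 29321), ('OP_END_LOG_SEGMENT', 29322)]
```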
Viewing edits_inprogress_0000000000000029325 the same way:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<EDITS>
<EDITS_VERSION>-64</EDITS_VERSION>
<RECORD>
<OPCODE>OP_START_LOG_SEGMENT</OPCODE>
<DATA>
<TXID>29325</TXID>
</DATA>
</RECORD>
</EDITS>
Again, a log-segment operation. Its TXID is also the current maximum transaction ID, and this is the one file the SecondaryNameNode does not have. In effect, it records the operations appended since Hadoop was last restarted.
We can run
hdfs oiv -p XML -i fsimage_0000000000000029324 -o /opt/module/hadoop-3.1.3/fsimage_new.xml
to inspect the fsimage:
<INodeSection>
<lastInodeId>22648</lastInodeId>
<numInodes>27</numInodes>
<inode>
<id>16385</id>
<type>DIRECTORY</type>
<name/>
<mtime>1629345954854</mtime>
<permission>root:supergroup:0777</permission>
<nsquota>9223372036854775807</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22554</id>
<type>DIRECTORY</type>
<name>tmp</name>
<mtime>1629289108967</mtime>
<permission>ocean:supergroup:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22555</id>
<type>DIRECTORY</type>
<name>hadoop-yarn</name>
<mtime>1629280562345</mtime>
<permission>ocean:supergroup:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22556</id>
<type>DIRECTORY</type>
<name>staging</name>
<mtime>1629289107248</mtime>
<permission>ocean:supergroup:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22557</id>
<type>DIRECTORY</type>
<name>history</name>
<mtime>1629280562367</mtime>
<permission>ocean:supergroup:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22558</id>
<type>DIRECTORY</type>
<name>done</name>
<mtime>1629289269870</mtime>
<permission>ocean:supergroup:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22559</id>
<type>DIRECTORY</type>
<name>done_intermediate</name>
<mtime>1629289110600</mtime>
<permission>ocean:supergroup:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22573</id>
<type>DIRECTORY</type>
<name>ocean</name>
<mtime>1629289107248</mtime>
<permission>ocean:supergroup:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22574</id>
<type>DIRECTORY</type>
<name>.staging</name>
<mtime>1629290164292</mtime>
<permission>ocean:supergroup:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22580</id>
<type>DIRECTORY</type>
<name>logs</name>
<mtime>1629289108990</mtime>
<permission>ocean:ocean:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22581</id>
<type>DIRECTORY</type>
<name>ocean</name>
<mtime>1629289108992</mtime>
<permission>ocean:ocean:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22582</id>
<type>DIRECTORY</type>
<name>logs-tfile</name>
<mtime>1629290147310</mtime>
<permission>ocean:ocean:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22583</id>
<type>DIRECTORY</type>
<name>application_1629280557686_0001</name>
<mtime>1629289238916</mtime>
<permission>ocean:ocean:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22584</id>
<type>DIRECTORY</type>
<name>ocean</name>
<mtime>1629290169798</mtime>
<permission>ocean:supergroup:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22608</id>
<type>FILE</type>
<name>job_1629280557686_0001-1629289108111-ocean-hadoop%2Dmapreduce%2Dclient%2Djobclient%2D3.1.3%2D-1629289231294-10-1-SUCCEEDED-default-1629289112345.jhist</name>
<replication>3</replication>
<mtime>1629289231358</mtime>
<atime>1629289231340</atime>
<preferredBlockSize>134217728</preferredBlockSize>
<permission>ocean:supergroup:0777</permission>
<blocks>
<block>
<id>1073744990</id>
<genstamp>4174</genstamp>
<numBytes>55176</numBytes>
</block>
</blocks>
<storagePolicyId>0</storagePolicyId>
</inode>
<inode>
<id>22609</id>
<type>FILE</type>
<name>job_1629280557686_0001_conf.xml</name>
<replication>3</replication>
<mtime>1629289231386</mtime>
<atime>1629289231366</atime>
<preferredBlockSize>134217728</preferredBlockSize>
<permission>ocean:supergroup:0777</permission>
<blocks>
<block>
<id>1073744991</id>
<genstamp>4175</genstamp>
<numBytes>216419</numBytes>
</block>
</blocks>
<storagePolicyId>0</storagePolicyId>
</inode>
<inode>
<id>22610</id>
<type>FILE</type>
<name>hadoop102_35246</name>
<replication>3</replication>
<mtime>1629289238515</mtime>
<atime>1629289238458</atime>
<preferredBlockSize>134217728</preferredBlockSize>
<permission>ocean:ocean:0777</permission>
<blocks>
<block>
<id>1073744992</id>
<genstamp>4176</genstamp>
<numBytes>172209</numBytes>
</block>
</blocks>
<storagePolicyId>0</storagePolicyId>
</inode>
<inode>
<id>22611</id>
<type>FILE</type>
<name>hadoop103_33694</name>
<replication>3</replication>
<mtime>1629289238912</mtime>
<atime>1629289238844</atime>
<preferredBlockSize>134217728</preferredBlockSize>
<permission>ocean:ocean:0777</permission>
<blocks>
<block>
<id>1073744993</id>
<genstamp>4177</genstamp>
<numBytes>352999</numBytes>
</block>
</blocks>
<storagePolicyId>0</storagePolicyId>
</inode>
<inode>
<id>22612</id>
<type>DIRECTORY</type>
<name>2021</name>
<mtime>1629289269870</mtime>
<permission>ocean:supergroup:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22613</id>
<type>DIRECTORY</type>
<name>08</name>
<mtime>1629289269870</mtime>
<permission>ocean:supergroup:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22614</id>
<type>DIRECTORY</type>
<name>18</name>
<mtime>1629289269870</mtime>
<permission>ocean:supergroup:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22615</id>
<type>DIRECTORY</type>
<name>000000</name>
<mtime>1629290169798</mtime>
<permission>ocean:supergroup:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22632</id>
<type>DIRECTORY</type>
<name>application_1629280557686_0002</name>
<mtime>1629290170700</mtime>
<permission>ocean:ocean:0777</permission>
<nsquota>-1</nsquota>
<dsquota>-1</dsquota>
</inode>
<inode>
<id>22645</id>
<type>FILE</type>
<name>job_1629280557686_0002-1629290147211-ocean-hadoop%2Dmapreduce%2Dclient%2Djobclient%2D3.1.3%2D-1629290163180-10-1-SUCCEEDED-default-1629290150116.jhist</name>
<replication>3</replication>
<mtime>1629290163238</mtime>
<atime>1629290163222</atime>
<preferredBlockSize>134217728</preferredBlockSize>
<permission>ocean:supergroup:0777</permission>
<blocks>
<block>
<id>1073745012</id>
<genstamp>4196</genstamp>
<numBytes>54938</numBytes>
</block>
</blocks>
<storagePolicyId>0</storagePolicyId>
</inode>
<inode>
<id>22646</id>
<type>FILE</type>
<name>job_1629280557686_0002_conf.xml</name>
<replication>3</replication>
<mtime>1629290163262</mtime>
<atime>1629290163244</atime>
<preferredBlockSize>134217728</preferredBlockSize>
<permission>ocean:supergroup:0777</permission>
<blocks>
<block>
<id>1073745013</id>
<genstamp>4197</genstamp>
<numBytes>216417</numBytes>
</block>
</blocks>
<storagePolicyId>0</storagePolicyId>
</inode>
<inode>
<id>22647</id>
<type>FILE</type>
<name>hadoop103_33694</name>
<replication>3</replication>
<mtime>1629290170343</mtime>
<atime>1629290170309</atime>
<preferredBlockSize>134217728</preferredBlockSize>
<permission>ocean:ocean:0777</permission>
<blocks>
<block>
<id>1073745014</id>
<genstamp>4198</genstamp>
<numBytes>203114</numBytes>
</block>
</blocks>
<storagePolicyId>0</storagePolicyId>
</inode>
<inode>
<id>22648</id>
<type>FILE</type>
<name>hadoop102_35246</name>
<replication>3</replication>
<mtime>1629290170698</mtime>
<atime>1629290170675</atime>
<preferredBlockSize>134217728</preferredBlockSize>
<permission>ocean:ocean:0777</permission>
<blocks>
<block>
<id>1073745015</id>
<genstamp>4199</genstamp>
<numBytes>281375</numBytes>
</block>
</blocks>
<storagePolicyId>0</storagePolicyId>
</inode>
</INodeSection>
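A dump like the one above can get long, so it helps to summarize it programmatically. A sketch that counts inodes by type, shown here on a tiny fabricated sample (the same code applies to the full INodeSection above):

```python
# Summarize an `hdfs oiv -p XML` fsimage dump: count inodes by type.
# The sample input is a trimmed-down, fabricated INodeSection.
import xml.etree.ElementTree as ET
from collections import Counter

sample = """<INodeSection>
<numInodes>3</numInodes>
<inode><id>16385</id><type>DIRECTORY</type><name/></inode>
<inode><id>22554</id><type>DIRECTORY</type><name>tmp</name></inode>
<inode><id>22608</id><type>FILE</type><name>job.xml</name></inode>
</INodeSection>"""

root = ET.fromstring(sample)
counts = Counter(inode.findtext("type") for inode in root.iter("inode"))
print(dict(counts))  # {'DIRECTORY': 2, 'FILE': 1}
```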
Apart from the core files, none of my own files or directories remain.
We can inspect the other fsimage in the same way.
Note the two corresponding numbers (another merge happened while I was writing this post, so the numbers differ from the earlier screenshot):
This effectively keeps the last two versions. But why are there two fsimage files at all?
<property>
<name>dfs.namenode.num.checkpoints.retained</name>
<value>2</value>
<description> The number of image checkpoint files (fsimage_*) that will be retained by the NameNode and Secondary NameNode in their storage directories.
All edit logs (stored on edits_* files) necessary to recover an up-to-date namespace from the oldest retained checkpoint will also be retained.</description>
</property>
The default is indeed 2.
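The retention policy itself is easy to picture: sort the checkpoint files by the transaction ID embedded in their names and keep only the newest N. A hypothetical sketch:

```python
# Toy sketch of fsimage retention: keep only the newest N checkpoint
# files, ordered by the transaction ID in the filename suffix.

def retained(fsimage_names: list, num_retained: int = 2) -> list:
    # names look like fsimage_0000000000000029324
    by_txid = sorted(fsimage_names, key=lambda n: int(n.rsplit("_", 1)[1]))
    return by_txid[-num_retained:]

print(retained(["fsimage_0000000000000029320",
                "fsimage_0000000000000029322",
                "fsimage_0000000000000029324"]))
# ['fsimage_0000000000000029322', 'fsimage_0000000000000029324']
```

As the description quoted above notes, the edit logs needed to roll forward from the oldest retained checkpoint are kept as well, so the older image remains usable for recovery.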