The NameNode's fsimage and edits files

Introduction

In a Hadoop cluster, the NameNode manages the metadata. So where is that metadata stored?

If it lived only on disk, access would be too slow; if it lived only in memory, it would not be safe.

So the metadata is kept in memory, with a backup file, fsimage, on disk. This guards against losing the metadata on a power failure.

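When the in-memory metadata changes, does fsimage have to be resynchronized every time? That would be very inefficient. But the changes still have to be persisted, so each new operation is appended to a file called edits.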

Loading fsimage and edits into memory together yields the complete metadata.
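The idea of "snapshot plus log" can be sketched as a toy model in Python. This is not how HDFS implements it: the dictionary-shaped `fsimage`, the `(txid, op, path)` tuples and the `mkdir`/`delete` operation names are all assumptions made purely for illustration.

```python
def load_namespace(fsimage, edits):
    """Rebuild a toy in-memory namespace from a snapshot plus an edit log."""
    namespace = set(fsimage["paths"])   # start from the snapshot on disk
    last_txid = fsimage["last_txid"]    # last transaction the snapshot covers
    for txid, op, path in sorted(edits):
        if txid <= last_txid:
            continue                    # already reflected in the snapshot
        if op == "mkdir":
            namespace.add(path)
        elif op == "delete":
            namespace.discard(path)
        last_txid = txid
    return namespace, last_txid


# Hypothetical data: a snapshot taken at transaction 2, then two later edits.
fsimage = {"last_txid": 2, "paths": ["/", "/tmp"]}
edits = [(3, "mkdir", "/data"), (4, "delete", "/tmp")]
namespace, last_txid = load_namespace(fsimage, edits)
print(sorted(namespace), last_txid)
```

The longer the log, the more work this replay loop has to do, which is the motivation for merging the two files periodically.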

Merging

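As edits accumulates, loading the metadata into memory becomes slow, because every logged operation has to be replayed on top of fsimage. So fsimage and edits also need to be merged periodically, and this work is best done by someone other than the NameNode.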

In an ordinary cluster, the SecondaryNameNode can do it.

A merge can be triggered either when a certain amount of time has passed or when the number of operations reaches a threshold.

Since I am using Hadoop 3.1.3, I looked up that version's hdfs-default.xml:

<property>
  <name>dfs.namenode.checkpoint.period</name>
  <value>3600s</value>
  <description>The number of seconds between two periodic checkpoints. 
Support multiple time unit suffix(case insensitive), as described in dfs.heartbeat.interval.</description>
</property>

<property>
  <name>dfs.namenode.checkpoint.txns</name>
  <value>1000000</value>
<description>The Secondary NameNode or CheckpointNode will create a checkpoint of the namespace every 'dfs.namenode.checkpoint.txns' transactions, 
regardless of whether 'dfs.namenode.checkpoint.period' has expired.</description>
</property>

<property>
  <name>dfs.namenode.checkpoint.check.period</name>
  <value>60s</value>
<description> The SecondaryNameNode and CheckpointNode will poll the NameNode every 'dfs.namenode.checkpoint.check.period' seconds to query the number of uncheckpointed transactions. 
Support multiple time unit suffix(case insensitive), as described in dfs.heartbeat.interval.</description>
</property>

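By default, a checkpoint is made every hour, or as soon as the number of transactions reaches 1,000,000. The transaction count is polled once a minute.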
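The trigger condition described by these three properties can be written down as a small Python sketch. It only models the decision rule stated in the property descriptions; the function and constant names are made up for illustration and are not Hadoop APIs.

```python
# Defaults copied from the hdfs-default.xml excerpt above.
CHECKPOINT_PERIOD_S = 3600       # dfs.namenode.checkpoint.period
CHECKPOINT_TXNS = 1_000_000      # dfs.namenode.checkpoint.txns
CHECK_PERIOD_S = 60              # dfs.namenode.checkpoint.check.period


def should_checkpoint(seconds_since_last, uncheckpointed_txns):
    """Return True when either the time limit or the txn limit is reached."""
    return (seconds_since_last >= CHECKPOINT_PERIOD_S
            or uncheckpointed_txns >= CHECKPOINT_TXNS)


# The condition is evaluated once per CHECK_PERIOD_S seconds.
print(should_checkpoint(3600, 10))          # an hour has passed
print(should_checkpoint(120, 1_000_000))    # txn threshold reached early
print(should_checkpoint(120, 500))          # neither limit reached
```

Because the check itself only runs once a minute, the transaction threshold can be overshot slightly before a checkpoint actually starts.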

On the other hand:

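[Screenshot]
Judging by my Hadoop metadata, some edits files were indeed rolled at one-hour intervals, while others are irregular because the cluster was started and stopped at different times. None of the edits files exceeds 1 MB, though. Is that a size limit as well? I don't know. If the default really were 1 MB, it would be far too small.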

As for HA, we no longer need a SecondaryNameNode to do the periodic checking and merging; the Standby NameNode does that job:

Note that, in an HA cluster, the Standby NameNodes also performs checkpoints of the namespace state, and thus it is not necessary to run a Secondary NameNode, CheckpointNode, or BackupNode in an HA cluster. In fact, to do so would be an error. This also allows one who is reconfiguring a non-HA-enabled HDFS cluster to be HA-enabled to reuse the hardware which they had previously dedicated to the Secondary NameNode.

Inspecting the files

I deleted every file except tmp:

[Screenshot]
Finally, let's look at a few files:

[Screenshot]
First, the current highest operation (transaction) ID is 29325.

Running

hdfs oev -p XML -i edits_0000000000000029321-0000000000000029322 -o /opt/module/hadoop-3.1.3/edits.xml

produces the XML form of edits_0000000000000029321-0000000000000029322:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<EDITS>
  <EDITS_VERSION>-64</EDITS_VERSION>
  <RECORD>
    <OPCODE>OP_START_LOG_SEGMENT</OPCODE>
    <DATA>
      <TXID>29321</TXID>
    </DATA>
  </RECORD>
  <RECORD>
    <OPCODE>OP_END_LOG_SEGMENT</OPCODE>
    <DATA>
      <TXID>29322</TXID>
    </DATA>
  </RECORD>
</EDITS>

These two operations simply open and close a log segment.
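Once the edits file is in XML form, it is easy to process with a script. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the XML is the output shown above with the `<?xml ...?>` declaration left out, and `list_records` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

EDITS_XML = """<EDITS>
  <EDITS_VERSION>-64</EDITS_VERSION>
  <RECORD>
    <OPCODE>OP_START_LOG_SEGMENT</OPCODE>
    <DATA>
      <TXID>29321</TXID>
    </DATA>
  </RECORD>
  <RECORD>
    <OPCODE>OP_END_LOG_SEGMENT</OPCODE>
    <DATA>
      <TXID>29322</TXID>
    </DATA>
  </RECORD>
</EDITS>"""


def list_records(xml_text):
    """Return (txid, opcode) pairs in document order."""
    root = ET.fromstring(xml_text)
    return [(int(rec.findtext("DATA/TXID")), rec.findtext("OPCODE"))
            for rec in root.iter("RECORD")]


records = list_records(EDITS_XML)
for txid, opcode in records:
    print(txid, opcode)
```

To run it against a real file, `ET.parse("edits.xml").getroot()` could replace `ET.fromstring(...)`.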

Viewing edits_inprogress_0000000000000029325 in the same way:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<EDITS>
  <EDITS_VERSION>-64</EDITS_VERSION>
  <RECORD>
    <OPCODE>OP_START_LOG_SEGMENT</OPCODE>
    <DATA>
      <TXID>29325</TXID>
    </DATA>
  </RECORD>
</EDITS>

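This is again a log-segment operation, and its TXID is the current highest transaction ID. This is the one file the SecondaryNameNode does not have. In effect, it holds the operations added since Hadoop was restarted.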
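The file names seen so far already encode the transaction range: a finished segment carries its first and last TXID, and the in-progress segment carries only its first. A small Python sketch that reads the range back out of a name, assuming only the two name patterns that appear in this post (`segment_range` is a hypothetical helper):

```python
import re

FINALIZED = re.compile(r"^edits_(\d+)-(\d+)$")
IN_PROGRESS = re.compile(r"^edits_inprogress_(\d+)$")


def segment_range(filename):
    """Return (first_txid, last_txid); last_txid is None while in progress."""
    m = FINALIZED.match(filename)
    if m:
        return int(m.group(1)), int(m.group(2))
    m = IN_PROGRESS.match(filename)
    if m:
        return int(m.group(1)), None
    raise ValueError(f"not an edits segment name: {filename}")


print(segment_range("edits_0000000000000029321-0000000000000029322"))
print(segment_range("edits_inprogress_0000000000000029325"))
```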

We can use

hdfs oiv -p XML -i fsimage_0000000000000029324 -o /opt/module/hadoop-3.1.3/fsimage_new.xml

to view the fsimage:

<INodeSection>
		<lastInodeId>22648</lastInodeId>
		<numInodes>27</numInodes>
		<inode>
			<id>16385</id>
			<type>DIRECTORY</type>
			<name/>
			<mtime>1629345954854</mtime>
			<permission>root:supergroup:0777</permission>
			<nsquota>9223372036854775807</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22554</id>
			<type>DIRECTORY</type>
			<name>tmp</name>
			<mtime>1629289108967</mtime>
			<permission>ocean:supergroup:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22555</id>
			<type>DIRECTORY</type>
			<name>hadoop-yarn</name>
			<mtime>1629280562345</mtime>
			<permission>ocean:supergroup:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22556</id>
			<type>DIRECTORY</type>
			<name>staging</name>
			<mtime>1629289107248</mtime>
			<permission>ocean:supergroup:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22557</id>
			<type>DIRECTORY</type>
			<name>history</name>
			<mtime>1629280562367</mtime>
			<permission>ocean:supergroup:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22558</id>
			<type>DIRECTORY</type>
			<name>done</name>
			<mtime>1629289269870</mtime>
			<permission>ocean:supergroup:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22559</id>
			<type>DIRECTORY</type>
			<name>done_intermediate</name>
			<mtime>1629289110600</mtime>
			<permission>ocean:supergroup:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22573</id>
			<type>DIRECTORY</type>
			<name>ocean</name>
			<mtime>1629289107248</mtime>
			<permission>ocean:supergroup:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22574</id>
			<type>DIRECTORY</type>
			<name>.staging</name>
			<mtime>1629290164292</mtime>
			<permission>ocean:supergroup:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22580</id>
			<type>DIRECTORY</type>
			<name>logs</name>
			<mtime>1629289108990</mtime>
			<permission>ocean:ocean:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22581</id>
			<type>DIRECTORY</type>
			<name>ocean</name>
			<mtime>1629289108992</mtime>
			<permission>ocean:ocean:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22582</id>
			<type>DIRECTORY</type>
			<name>logs-tfile</name>
			<mtime>1629290147310</mtime>
			<permission>ocean:ocean:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22583</id>
			<type>DIRECTORY</type>
			<name>application_1629280557686_0001</name>
			<mtime>1629289238916</mtime>
			<permission>ocean:ocean:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22584</id>
			<type>DIRECTORY</type>
			<name>ocean</name>
			<mtime>1629290169798</mtime>
			<permission>ocean:supergroup:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22608</id>
			<type>FILE</type>
			<name>job_1629280557686_0001-1629289108111-ocean-hadoop%2Dmapreduce%2Dclient%2Djobclient%2D3.1.3%2D-1629289231294-10-1-SUCCEEDED-default-1629289112345.jhist</name>
			<replication>3</replication>
			<mtime>1629289231358</mtime>
			<atime>1629289231340</atime>
			<preferredBlockSize>134217728</preferredBlockSize>
			<permission>ocean:supergroup:0777</permission>
			<blocks>
				<block>
					<id>1073744990</id>
					<genstamp>4174</genstamp>
					<numBytes>55176</numBytes>
				</block>
			</blocks>
			<storagePolicyId>0</storagePolicyId>
		</inode>
		<inode>
			<id>22609</id>
			<type>FILE</type>
			<name>job_1629280557686_0001_conf.xml</name>
			<replication>3</replication>
			<mtime>1629289231386</mtime>
			<atime>1629289231366</atime>
			<preferredBlockSize>134217728</preferredBlockSize>
			<permission>ocean:supergroup:0777</permission>
			<blocks>
				<block>
					<id>1073744991</id>
					<genstamp>4175</genstamp>
					<numBytes>216419</numBytes>
				</block>
			</blocks>
			<storagePolicyId>0</storagePolicyId>
		</inode>
		<inode>
			<id>22610</id>
			<type>FILE</type>
			<name>hadoop102_35246</name>
			<replication>3</replication>
			<mtime>1629289238515</mtime>
			<atime>1629289238458</atime>
			<preferredBlockSize>134217728</preferredBlockSize>
			<permission>ocean:ocean:0777</permission>
			<blocks>
				<block>
					<id>1073744992</id>
					<genstamp>4176</genstamp>
					<numBytes>172209</numBytes>
				</block>
			</blocks>
			<storagePolicyId>0</storagePolicyId>
		</inode>
		<inode>
			<id>22611</id>
			<type>FILE</type>
			<name>hadoop103_33694</name>
			<replication>3</replication>
			<mtime>1629289238912</mtime>
			<atime>1629289238844</atime>
			<preferredBlockSize>134217728</preferredBlockSize>
			<permission>ocean:ocean:0777</permission>
			<blocks>
				<block>
					<id>1073744993</id>
					<genstamp>4177</genstamp>
					<numBytes>352999</numBytes>
				</block>
			</blocks>
			<storagePolicyId>0</storagePolicyId>
		</inode>
		<inode>
			<id>22612</id>
			<type>DIRECTORY</type>
			<name>2021</name>
			<mtime>1629289269870</mtime>
			<permission>ocean:supergroup:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22613</id>
			<type>DIRECTORY</type>
			<name>08</name>
			<mtime>1629289269870</mtime>
			<permission>ocean:supergroup:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22614</id>
			<type>DIRECTORY</type>
			<name>18</name>
			<mtime>1629289269870</mtime>
			<permission>ocean:supergroup:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22615</id>
			<type>DIRECTORY</type>
			<name>000000</name>
			<mtime>1629290169798</mtime>
			<permission>ocean:supergroup:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22632</id>
			<type>DIRECTORY</type>
			<name>application_1629280557686_0002</name>
			<mtime>1629290170700</mtime>
			<permission>ocean:ocean:0777</permission>
			<nsquota>-1</nsquota>
			<dsquota>-1</dsquota>
		</inode>
		<inode>
			<id>22645</id>
			<type>FILE</type>
			<name>job_1629280557686_0002-1629290147211-ocean-hadoop%2Dmapreduce%2Dclient%2Djobclient%2D3.1.3%2D-1629290163180-10-1-SUCCEEDED-default-1629290150116.jhist</name>
			<replication>3</replication>
			<mtime>1629290163238</mtime>
			<atime>1629290163222</atime>
			<preferredBlockSize>134217728</preferredBlockSize>
			<permission>ocean:supergroup:0777</permission>
			<blocks>
				<block>
					<id>1073745012</id>
					<genstamp>4196</genstamp>
					<numBytes>54938</numBytes>
				</block>
			</blocks>
			<storagePolicyId>0</storagePolicyId>
		</inode>
		<inode>
			<id>22646</id>
			<type>FILE</type>
			<name>job_1629280557686_0002_conf.xml</name>
			<replication>3</replication>
			<mtime>1629290163262</mtime>
			<atime>1629290163244</atime>
			<preferredBlockSize>134217728</preferredBlockSize>
			<permission>ocean:supergroup:0777</permission>
			<blocks>
				<block>
					<id>1073745013</id>
					<genstamp>4197</genstamp>
					<numBytes>216417</numBytes>
				</block>
			</blocks>
			<storagePolicyId>0</storagePolicyId>
		</inode>
		<inode>
			<id>22647</id>
			<type>FILE</type>
			<name>hadoop103_33694</name>
			<replication>3</replication>
			<mtime>1629290170343</mtime>
			<atime>1629290170309</atime>
			<preferredBlockSize>134217728</preferredBlockSize>
			<permission>ocean:ocean:0777</permission>
			<blocks>
				<block>
					<id>1073745014</id>
					<genstamp>4198</genstamp>
					<numBytes>203114</numBytes>
				</block>
			</blocks>
			<storagePolicyId>0</storagePolicyId>
		</inode>
		<inode>
			<id>22648</id>
			<type>FILE</type>
			<name>hadoop102_35246</name>
			<replication>3</replication>
			<mtime>1629290170698</mtime>
			<atime>1629290170675</atime>
			<preferredBlockSize>134217728</preferredBlockSize>
			<permission>ocean:ocean:0777</permission>
			<blocks>
				<block>
					<id>1073745015</id>
					<genstamp>4199</genstamp>
					<numBytes>281375</numBytes>
				</block>
			</blocks>
			<storagePolicyId>0</storagePolicyId>
		</inode>
	</INodeSection>

Apart from the system files, none of the files or directories I created myself remain.
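A dump this long is easier to check with a script than by eye. Below is a minimal Python sketch that counts inodes by type and sums the block sizes. The embedded XML is a trimmed three-inode excerpt of the output above (so the counts are for the excerpt, not for the full image), and `summarize` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Trimmed excerpt: the root directory, /tmp, and one job configuration file.
INODE_XML = """<INodeSection>
  <inode>
    <id>16385</id>
    <type>DIRECTORY</type>
    <name/>
  </inode>
  <inode>
    <id>22554</id>
    <type>DIRECTORY</type>
    <name>tmp</name>
  </inode>
  <inode>
    <id>22609</id>
    <type>FILE</type>
    <name>job_1629280557686_0001_conf.xml</name>
    <blocks>
      <block>
        <id>1073744991</id>
        <genstamp>4175</genstamp>
        <numBytes>216419</numBytes>
      </block>
    </blocks>
  </inode>
</INodeSection>"""


def summarize(xml_text):
    """Count inodes per type and add up the bytes of all blocks."""
    root = ET.fromstring(xml_text)
    types = Counter(inode.findtext("type") for inode in root.iter("inode"))
    total_bytes = sum(int(b.findtext("numBytes")) for b in root.iter("block"))
    return types, total_bytes


types, total_bytes = summarize(INODE_XML)
print(dict(types), total_bytes)
```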

We can view the other fsimage in the same way.

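Note the two corresponding numbers (another merge happened while I was writing this post, so the numbers differ from the earlier screenshot):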

[Screenshot]

In effect, the last two versions are kept. So why are there two fsimage files?

<property>
  <name>dfs.namenode.num.checkpoints.retained</name>
  <value>2</value>
<description> The number of image checkpoint files (fsimage_*) that will be retained by the NameNode and Secondary NameNode in their storage directories. 
All edit logs (stored on edits_* files) necessary to recover an up-to-date namespace from the oldest retained checkpoint will also be retained.</description>
</property>

The default is 2.
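The retention rule can be illustrated with a short Python sketch that picks the newest images out of a directory listing. It only mimics the rule in the property description and is not Hadoop's actual purging code; `retained_checkpoints` is a hypothetical helper, and apart from fsimage_0000000000000029324 the file names in the listing are made-up examples.

```python
def retained_checkpoints(filenames, num_retained=2):
    """Return the num_retained fsimage files with the highest txid."""
    images = [f for f in filenames
              if f.startswith("fsimage_") and not f.endswith(".md5")]
    images.sort(key=lambda f: int(f.split("_")[1]))   # order by txid
    return images[-num_retained:]


# Hypothetical directory listing with three checkpoints and one checksum file.
listing = [
    "fsimage_0000000000000029320",
    "fsimage_0000000000000029322",
    "fsimage_0000000000000029324",
    "fsimage_0000000000000029324.md5",
    "edits_inprogress_0000000000000029325",
]
kept = retained_checkpoints(listing)
print(kept)
```

Keeping two images rather than one means an older checkpoint is still available if the newest one turns out to be unusable.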
