hadoop 2.x-HDFS HA --Part I: abstraction

  below is the outline of this article, which is based on hadoop-2.5.1:

1.abstraction
 1.1 what is HDFS HA
 1.2 how to implement
 1.3 other HAs
2.installation
 2.1 manual failover
 2.2 auto failover
3.conclusion

 

1.abstraction

 1.1 what is HDFS HA

  in the hadoop distributed file system there are several processes distinguished by role, e.g. namenode, secondary namenode, datanode, backup node etc. the most important role is the namenode, which maintains the file system namespace, assigns blocks, and tracks the heartbeats of the datanodes.

  before hadoop-2.x, the secondary namenode only merges the fs image with the edits from the namenode (nn), to speed up the startup of the nn. the backup node does the same job but keeps the namespace in memory. so in fact there is only one 'brain' in the system, which is very different from the master/slave (hot spare, also named standby) design of some database systems. if something goes wrong (e.g. too many files cause an OOME in the nn), the nn comes under heavy pressure and responds slowly, or even 'plays dead' and does not respond to any request.

  so HA (high availability) naturally comes in here: it keeps the system running as healthily as possible through failures (hardware faults, software bugs etc). when a namenode fails to respond or goes down, the other one takes over immediately, and this switch is transparent to all clients. so it is simple and low in cost.

  1.2 how to implement

   in distributed coordination systems there is a term, the "most part" (usually called the majority or quorum), which means more than half of all nodes. if N is an odd number, the formula math.int((N+1)/2) gives the so-called 'most part', e.g. 2 out of 3. if N is an even number this formula does not apply: the majority is N/2+1, so an even-sized group tolerates no more failures than the odd-sized group with one node fewer. therefore an odd number of nodes is normally used to build a coordination system.
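
   the 'most part' arithmetic can be checked with a few lines of python. this is only a sketch of the counting rule, not code from hadoop; the function names majority and tolerated_failures are made up for this example.

```python
def majority(n: int) -> int:
    """Smallest node count that is more than half of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many nodes may fail while a majority is still reachable."""
    return n - majority(n)


# Compare odd and even group sizes.
for n in (3, 4, 5, 6):
    print(f"N={n} majority={majority(n)} tolerated failures={tolerated_failures(n)}")
```

   for odd N the result of majority(N) equals (N+1)/2. going from 3 to 4 nodes raises the majority from 2 to 3 but still tolerates only one failure, which is why the extra even node buys nothing.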

   of course, hadoop's HA (hdfs/namenode HA) uses this idea to build a coordinated service: the journal nodes (which store the edit logs), see figure 1 below:



             figure 1

  you can see that the number of nns does not need to be odd, BUT the number of journal nodes should be. when the active nn writes to the journal nodes, the write operation succeeds if the 'most part' of the nodes succeed; else it fails.
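
  the write rule can be sketched as below. this is a toy model under assumptions, not the real hadoop client: each journal node is represented by a hypothetical callable that returns True on success, and the nodes are contacted one after another, while the real implementation talks to them in parallel.

```python
from typing import Callable, Sequence

# A hypothetical stand-in for "send this edit to one journal node".
JournalSender = Callable[[bytes], bool]


def quorum_write(journals: Sequence[JournalSender], edit: bytes) -> bool:
    """Return True if a majority of journal nodes acknowledged the edit."""
    acks = 0
    for send in journals:
        try:
            if send(edit):
                acks += 1
        except OSError:
            # An unreachable node simply does not count as an ack.
            pass
    return acks >= len(journals) // 2 + 1


def healthy(edit: bytes) -> bool:
    return True


def down(edit: bytes) -> bool:
    raise ConnectionError("journal node unreachable")


print(quorum_write([healthy, healthy, down], b"mkdir /a"))  # 2 of 3 acks
print(quorum_write([healthy, down, down], b"mkdir /a"))     # 1 of 3 acks
```

  with three journal nodes the first call succeeds because two acks form a majority, and the second one fails because a single ack does not.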

  so the journal nodes make up the 'coordination system' for the namenodes. this mechanism is also named the "quorum journal manager" (QJM).
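
  the namenodes find the journal nodes through the dfs.namenode.shared.edits.dir property, whose value is a qjournal:// URI listing every journal node. a small helper that builds such a value is sketched below. the host names jn1..jn3 and the nameservice id mycluster are placeholders, 8485 is the default journal node rpc port, and the odd-number check is this sketch's own policy, not something hadoop enforces.

```python
from typing import Sequence


def shared_edits_uri(journal_hosts: Sequence[str], nameservice: str, port: int = 8485) -> str:
    """Build a value for dfs.namenode.shared.edits.dir."""
    # Policy of this sketch only: insist on an odd number of journal nodes.
    if len(journal_hosts) % 2 == 0:
        raise ValueError("use an odd number of journal nodes")
    hosts = ";".join(f"{host}:{port}" for host in journal_hosts)
    return f"qjournal://{hosts}/{nameservice}"


# Placeholder host names and nameservice id.
print(shared_edits_uri(["jn1", "jn2", "jn3"], "mycluster"))
```

  the printed string is what goes into hdfs-site.xml as the property value.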

  then the standby nn reads the new edits promptly and may merge them with its existing fs image to reduce the amount of edits. so the active nn and the standby one can be regarded as 'the same'.
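
  how the standby stays 'the same' can be sketched with a toy model. it assumes the namespace is just a set of paths and each edit is a (txid, op, path) tuple; the real edit log has many more operation types, and tail_edits is a name made up for this example.

```python
from typing import Iterable, Set, Tuple

Edit = Tuple[int, str, str]  # (transaction id, operation, path)


def tail_edits(namespace: Set[str], last_txid: int, edits: Iterable[Edit]) -> int:
    """Apply edits newer than last_txid in order; return the new last txid."""
    for txid, op, path in sorted(edits):
        if txid <= last_txid:
            continue  # already applied on this standby
        if op == "create":
            namespace.add(path)
        elif op == "delete":
            namespace.discard(path)
        last_txid = txid
    return last_txid


# The standby already holds the state up to txid 1.
standby_namespace = {"/a"}
journal = [(1, "create", "/a"), (2, "create", "/b"), (3, "delete", "/a")]
last = tail_edits(standby_namespace, 1, journal)
print(last, sorted(standby_namespace))
```

  after the call the standby has applied transactions 2 and 3, so its namespace matches what the active nn wrote to the journal.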

  1.3 other HAs

   a.Facebook AvatarNode

   b.Linux HA

   c.IP failover

 

 

ref:

jira:High Availability Framework for HDFS NN

HDFS High Availability Using the Quorum Journal Manager

hadoop 2.x-HDFS HA --Part II: installation 
