Below is the outline of this article, which is based on hadoop-2.5.1:
1.abstraction
1.1 what is HDFS HA
1.2 how to implement
1.3 other HAs
2.installation
2.1 manual failover
2.2 auto failover
3.conclusion
1.abstraction
1.1 what is HDFS HA
In the Hadoop Distributed File System (HDFS), processes are distinguished by role, e.g. NameNode, SecondaryNameNode, DataNode, BackupNode, etc. The most important role is the namenode, which maintains the namespace service, resource assignment, heartbeat detection, etc.
Before hadoop-2.x, the secondary namenode only merged the fsimage with the edits from the namenode (nn) to speed up nn startup, and the backup node did the same thing but in memory. So in fact there was only one 'brain' in the system, which is very different from the master/slave (hot spare, also called standby) design used by some database systems. If something went wrong (e.g. too many files causing an OutOfMemoryError in the nn), the nn would come under heavy pressure and respond slowly, or even 'play dead' and stop responding to requests altogether.
This is where HA (high availability) comes in: it keeps the system running as healthily as possible in the face of failures (hardware faults, software bugs, etc.). When a namenode fails to respond or goes down, the other one takes over immediately, and the switch is transparent to all clients. So it is simple and low-cost.
1.2 how to implement
In distributed coordination systems there is a term 'most part' (usually called the majority, or quorum), which means more than half of all the nodes. For N nodes the 'most part' is floor(N/2)+1; if N is an odd number, this is the same as (N+1)/2. If N is an even number, (N+1)/2 is not an integer and the 'most part' becomes N/2+1, so the extra node does not let the system survive any more failures than N-1 nodes would. Therefore an odd number of nodes is normally used to build the coordination system.
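To make the arithmetic concrete, here is a minimal sketch of the 'most part' calculation. The function names `majority` and `tolerated_failures` are made up for illustration and are not part of Hadoop.

```python
def majority(n: int) -> int:
    """Smallest number of nodes that is strictly more than half of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many nodes may fail while a majority is still reachable."""
    return n - majority(n)


# Print the majority and the tolerated failures for small cluster sizes.
for n in range(1, 8):
    print(f"N={n} majority={majority(n)} tolerated_failures={tolerated_failures(n)}")
```

Comparing N=3 with N=4 (or N=5 with N=6) in the printed lines shows that the even size needs a larger majority but tolerates the same number of failures, which is why odd sizes are preferred.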
Of course, Hadoop's HA (HDFS/namenode HA) uses this idea to build a coordinated service: the journal nodes (which hold the edit logs). See figure 1 below:
figure 1
As you can see, the number of nns does not need to be odd, but the number of journal nodes should be. When the active nn writes to the journal nodes, the write operation succeeds if the 'most part' of the nodes succeed; otherwise it fails.
So the journal nodes make up the 'coordination system' for the namenodes. This mechanism is also named the "quorum journal manager".
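The write rule for the journal nodes can be sketched as a toy simulation. This is only an illustration of the 'most part' idea and not the real quorum journal manager code; the `JournalNode` class, its `up` flag and the `quorum_write` function are all assumptions invented for this example.

```python
from typing import List


class JournalNode:
    """Toy stand-in for a journal node; `up` simulates whether it is reachable."""

    def __init__(self, name: str, up: bool = True) -> None:
        self.name = name
        self.up = up
        self.edits: List[str] = []

    def append(self, edit: str) -> bool:
        """Store the edit and acknowledge it, or return False if the node is down."""
        if not self.up:
            return False
        self.edits.append(edit)
        return True


def quorum_write(nodes: List[JournalNode], edit: str) -> bool:
    """Send the edit to every node; succeed only if a majority acknowledge it."""
    acks = sum(1 for node in nodes if node.append(edit))
    return acks >= len(nodes) // 2 + 1


nodes = [JournalNode("jn1"), JournalNode("jn2"), JournalNode("jn3", up=False)]
ok_one_down = quorum_write(nodes, "mkdir /a")   # 2 of 3 acknowledge
nodes[1].up = False
ok_two_down = quorum_write(nodes, "mkdir /b")   # only 1 of 3 acknowledges
print(ok_one_down, ok_two_down)
```

With three journal nodes the write still succeeds when one node is down, but fails once two are down. Note that in this toy the failed write is still left behind on jn1; a real system has to deal with such leftovers, which the sketch ignores.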
The standby nn then reads the edits promptly and may merge them with the existing fsimage to reduce the amount of edits. So the active nn and the standby one can be thought of as 'the same'.
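How a standby keeps up with the active nn can be pictured with the following toy model. It is a rough sketch under simplifying assumptions: the namespace is just a set of paths, the journal is an in-memory list ordered by transaction id, and the `StandbyNameNode` class with its `tail_edits` and `checkpoint` methods is hypothetical, not Hadoop's actual API.

```python
from typing import List, Set, Tuple

Edit = Tuple[int, str, str]  # (transaction id, operation, path)


class StandbyNameNode:
    """Toy standby: replays edits into memory and can snapshot them as an fsimage."""

    def __init__(self) -> None:
        self.namespace: Set[str] = set()
        self.last_applied_txid = 0
        self.fsimage: Set[str] = set()
        self.fsimage_txid = 0

    def tail_edits(self, journal: List[Edit]) -> int:
        """Apply every edit newer than the last applied one; return how many were applied."""
        applied = 0
        for txid, op, path in journal:
            if txid <= self.last_applied_txid:
                continue  # already replayed earlier
            if op == "mkdir":
                self.namespace.add(path)
            elif op == "delete":
                self.namespace.discard(path)
            self.last_applied_txid = txid
            applied += 1
        return applied

    def checkpoint(self) -> None:
        """Fold the replayed edits into a new fsimage so older edits are no longer needed."""
        self.fsimage = set(self.namespace)
        self.fsimage_txid = self.last_applied_txid


journal: List[Edit] = [(1, "mkdir", "/a"), (2, "mkdir", "/b")]
standby = StandbyNameNode()
standby.tail_edits(journal)
journal.append((3, "delete", "/a"))
standby.tail_edits(journal)   # only transaction 3 is new
standby.checkpoint()
print(sorted(standby.namespace), standby.fsimage_txid)
```

Because the standby remembers the last transaction id it applied, each call replays only the new edits, and after a checkpoint the fsimage already contains everything up to that transaction.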
1.3 other HAs
b.Linux HA
c.ip failover
ref:
jira:High Availability Framework for HDFS NN