I came across this tip on the mailing list and am recording it here for future reference:
Couple of things that one can do:
1. dfs.name.dir should have at least two locations, one on the local
disk and one on NFS. This means that all transactions are
synchronously logged into two places.
2. Create a virtual IP, say name.xx.com, that points to the real
machine name of the machine on which the namenode runs.
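
(Note: item 1 can be illustrated with a configuration fragment. This is only a sketch: both directory paths are hypothetical placeholders, and the property would go in the cluster's HDFS site configuration file, which I believe was hadoop-site.xml in releases of that era. dfs.name.dir accepts a comma-separated list of directories, and the namenode keeps a copy of its metadata in each one.)

```xml
<!-- Sketch only: both paths below are hypothetical placeholders. -->
<property>
  <name>dfs.name.dir</name>
  <!-- Comma-separated list: a local disk first, then an NFS mount. -->
  <value>/data/hadoop/name,/mnt/nfs/hadoop/name</value>
</property>
```
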
If the namenode machine burns, then change the virtual IP to point to
a new machine. Copy the namenode metadata from the NFS location to the
local disk on this new machine. Then start namenode on this new
machine.
Done!
-dhruba
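
(Note: the recovery step described in the reply, copying the namenode metadata from the NFS location to the local disk of the replacement machine, could be scripted roughly as follows. This is a minimal sketch under my own assumptions, not something from the thread: the function name and both directory arguments are hypothetical, and it deliberately refuses to overwrite a non-empty target. Repointing the virtual IP and starting the namenode are not covered.)

```python
import shutil
from pathlib import Path
from typing import List


def restore_namenode_metadata(nfs_copy: str, local_name_dir: str) -> List[str]:
    """Copy the namenode metadata tree from the NFS copy to a local directory.

    Both arguments are hypothetical paths chosen by the operator.
    Returns the sorted relative paths of the files found in the target.
    """
    src = Path(nfs_copy)
    dst = Path(local_name_dir)

    # The NFS copy must exist, otherwise there is nothing to recover from.
    if not src.is_dir():
        raise FileNotFoundError(f"NFS metadata copy not found: {src}")

    # Refuse to clobber an existing, non-empty name directory.
    if dst.exists() and any(dst.iterdir()):
        raise FileExistsError(f"refusing to overwrite non-empty directory: {dst}")

    # copytree creates missing parent directories and preserves file metadata.
    shutil.copytree(src, dst, dirs_exist_ok=True)

    return sorted(str(p.relative_to(dst)) for p in dst.rglob("*") if p.is_file())
```

It would be called on the new machine with the NFS directory and the local directory listed in dfs.name.dir, before the namenode is started there.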
On Mon, Nov 10, 2008 at 12:24 AM, Goel, Ankur <ankur.goel@corp.aol.com> wrote:
Hi Folks,
I am looking for some advice on some of the ways / techniques
that people are using to get around namenode failures (both disk and
host).
We have a small cluster with several jobs scheduled for periodic
execution on the same host where the name server runs. What we would like to
have is an automatic failover mechanism in Hadoop so that a secondary
namenode automatically takes the role of a master.
I can move this discussion to a JIRA if people are interested.
Thanks
-Ankur