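After a long break, I can finally get back to studying big data. Today's topic is automatic failover for an HDFS cluster. This part assumes you are already familiar with ZooKeeper and with HDFS HA using QJM.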


Apache ZooKeeper is a highly available service for maintaining small amounts of coordination data, notifying clients of changes in that data, and monitoring clients for failures. The implementation of automatic HDFS failover relies on ZooKeeper for the following things:

  • Failure detection - each of the NameNode machines in the cluster maintains a persistent session in ZooKeeper. If the machine crashes, the ZooKeeper session will expire, notifying the other NameNode that a failover should be triggered.

  • Active NameNode election - ZooKeeper provides a simple mechanism to exclusively elect a node as active. If the current active NameNode crashes, another node may take a special exclusive lock in ZooKeeper indicating that it should become the next active.


The ZKFailoverController (ZKFC) is a ZooKeeper client that also monitors and manages the state of the NameNode. Each machine that runs a NameNode also runs a ZKFC, which is responsible for:

  • Health monitoring - the ZKFC pings its local NameNode on a periodic basis with a health-check command. So long as the NameNode responds in a timely fashion with a healthy status, the ZKFC considers the node healthy. If the node has crashed, frozen, or otherwise entered an unhealthy state, the health monitor will mark it as unhealthy.

  • ZooKeeper session management - when the local NameNode is healthy, the ZKFC holds a session open in ZooKeeper. If the local NameNode is active, it also holds a special “lock” znode. This lock uses ZooKeeper’s support for “ephemeral” nodes; if the session expires, the lock node will be automatically deleted.

  • ZooKeeper-based election - if the local NameNode is healthy, and the ZKFC sees that no other node currently holds the lock znode, it will itself try to acquire the lock. If it succeeds, then it has “won the election”, and is responsible for running a failover to make its local NameNode active. The failover process is similar to a manual failover: first, the previous active is fenced if necessary, and then the local NameNode transitions to active state.
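
The lock-based election described above can be imitated locally as a toy model. In this sketch a directory stands in for the ephemeral lock znode, because `mkdir` is atomic and only one contender can create it. `LOCK_DIR`, `try_acquire`, `expire_session`, and `current_active` are hypothetical names for illustration only; the real ZKFC talks to ZooKeeper and does none of this on the local filesystem.

```shell
# Toy model of the ZKFC election: a directory plays the lock znode.
# All names here are hypothetical and not part of Hadoop or ZooKeeper.
LOCK_DIR="${LOCK_DIR:-$(mktemp -d)/lock}"

# Try to take the lock for the given node; succeeds only if nobody holds it.
try_acquire() {
    local node="$1"
    if mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "$node" > "$LOCK_DIR/owner"
        return 0
    fi
    return 1
}

# Model a session expiry: the lock disappears together with its owner.
expire_session() {
    local node="$1"
    if [ -f "$LOCK_DIR/owner" ] && [ "$(cat "$LOCK_DIR/owner")" = "$node" ]; then
        rm -rf "$LOCK_DIR"
    fi
}

# Print the current lock holder, or "none" if the lock is free.
current_active() {
    if [ -f "$LOCK_DIR/owner" ]; then
        cat "$LOCK_DIR/owner"
    else
        echo "none"
    fi
}
```

Calling `try_acquire nn1`, then `try_acquire nn2`, shows the second contender being refused until `expire_session nn1` frees the lock, which mirrors how the standby only takes over after the active node's session is gone.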



[hadoop@hadoop01 hadoop]$ pwd
[hadoop@hadoop01 hadoop]$ vi hdfs-site.xml

<!-- Put site-specific property overrides in this file. -->


  <!-- add start 20160712 -->







  <!-- add end 20160712 -->

  <!-- add start 20160713 -->


  <!-- add end 20160713 -->

    <!-- add start 20160623 -->
            <!-- modify start 20160627
            <value>1</value>  -->
            <!-- modify start 20170712
            <!-- modify end 20170712-->
            <!-- modify end 20160627 -->
    <!-- add end  20160623 -->

    <!-- add start 20160627 -->
    <!-- add end by  20160627 -->
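
The property elements of the listing above are not visible here. For orientation, Hadoop's HDFS HA documentation enables automatic failover through the `dfs.ha.automatic-failover.enabled` property in hdfs-site.xml. The sketch below only prints that one fragment so it can be pasted inside `<configuration>`; the function name `print_auto_failover_property` is hypothetical, and the rest of the author's file is not reconstructed.

```shell
# Print the property that the Hadoop HA documentation uses to switch on
# automatic failover. The function name is hypothetical; the property
# name and value come from the Hadoop documentation.
print_auto_failover_property() {
    cat <<'EOF'
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
EOF
}
```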


[hadoop@hadoop01 hadoop]$ pwd
[hadoop@hadoop01 hadoop]$ vi core-site.xml

<!-- Put site-specific property overrides in this file. -->

    <!--add start 20160623 -->
            <!-- modify start 20160627
            <value>hdfs://localhost:9000</value>   -->
            <!--modify start 20160712
            <value>hdfs://hadoop01:9000</value> -->
            <!-- modify end 20160712 -->
            <!-- modify end -->
    <!--add end 20160623 -->

    <!-- add start 20160712 -->
    <!-- add end 20160712 -->

    <!-- add start 20160713 -->

    <!-- add end 20160713 -->

    <!--add start 20160627 -->
    <!--add end by 20160627 -->
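
The property elements of this listing are not visible either. According to the Hadoop HA documentation, the ZKFC finds its ZooKeeper ensemble through `ha.zookeeper.quorum` in core-site.xml, given as a comma-separated list of host:port pairs. The sketch below builds that fragment from a port and a host list. The function name `print_zk_quorum_property` is hypothetical, and port 2181 in the usage line is an assumption (it is ZooKeeper's default client port, but the author's zoo.cfg is not shown).

```shell
# Build the ha.zookeeper.quorum property from a port and a list of hosts.
# Usage: print_zk_quorum_property PORT HOST...
# The function name is hypothetical; the property name is from the Hadoop docs.
print_zk_quorum_property() {
    local port="$1"
    shift
    local quorum="" host
    for host in "$@"; do
        # Append "host:port", inserting a comma between entries.
        quorum="${quorum:+$quorum,}${host}:${port}"
    done
    cat <<EOF
<property>
  <name>ha.zookeeper.quorum</name>
  <value>${quorum}</value>
</property>
EOF
}
```

For the three servers used in this post, a call could look like `print_zk_quorum_property 2181 hadoop01 hadoop02 hadoop03`.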

3. Copy the hdfs-site.xml and core-site.xml files configured on hadoop01 to the hadoop02 and hadoop03 servers.

[hadoop@hadoop01 hadoop]$ pwd
[hadoop@hadoop01 hadoop]$ scp hdfs-site.xml core-site.xml hadoop02:$PWD
hdfs-site.xml                                                             100% 2973     2.9KB/s   00:00   
core-site.xml                                                             100% 1906     1.9KB/s   00:00   
[hadoop@hadoop01 hadoop]$ scp hdfs-site.xml core-site.xml hadoop03:$PWD
hdfs-site.xml                                                             100% 2973     2.9KB/s   00:00   
core-site.xml                                                             100% 1906     1.9KB/s   00:00   
[hadoop@hadoop01 hadoop]$



[zookeeper@hadoop01 ~]$ zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /home/zookeeper/zookeeper-3.4.8/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED


[zookeeper@hadoop02 ~]$ zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /home/zookeeper/zookeeper-3.4.8/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED


[zookeeper@hadoop03 ~]$ zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /home/zookeeper/zookeeper-3.4.8/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
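
"STARTED" only says the process was launched, so it can help to confirm that each server has actually joined the ensemble. `zkServer.sh status` reports a `Mode:` line (leader, follower, or standalone); the small helper below, with the hypothetical name `zk_mode`, extracts that value from the command's output.

```shell
# Extract the role from `zkServer.sh status` output read on stdin.
# zk_mode is a hypothetical helper name, not part of ZooKeeper.
zk_mode() {
    sed -n 's/^Mode: *//p'
}

# On a live node one might run:
#   zkServer.sh status 2>&1 | zk_mode
```

With three servers, one node would normally be expected to report leader and the other two follower.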


[hadoop@hadoop01 ~]$ hdfs zkfc -formatZK

16/07/03 10:25:13 INFO tools.DFSZKFailoverController: Failover controller configured for NameNode NameNode at hadoop01/
16/07/03 10:25:14 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
16/07/03 10:25:14 INFO zookeeper.ZooKeeper: Client environment:host.name=hadoop01
16/07/03 10:25:14 INFO zookeeper.ZooKeeper: Client environment:java.version=1.8.0_92
16/07/03 10:25:14 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
16/07/03 10:25:14 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/java/jdk1.8.0_92/jre



Proceed formatting /hadoop-ha/mycluster? (Y or N) Y

16/07/03 10:25:21 INFO ha.ActiveStandbyElector: Recursively deleting /hadoop-ha/mycluster from ZK...
16/07/03 10:25:21 INFO ha.ActiveStandbyElector: Successfully deleted /hadoop-ha/mycluster from ZK.
16/07/03 10:25:22 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/mycluster in ZK.
16/07/03 10:25:22 INFO zookeeper.ZooKeeper: Session: 0x155ae929cb60000 closed
16/07/03 10:25:22 INFO zookeeper.ClientCnxn: EventThread shut down
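
To double-check that the format step created the znode, the children of /hadoop-ha can be listed with the ZooKeeper command-line client. This is a sketch under assumptions: the helper name `znode_children` is hypothetical, it assumes the client prints the result of `ls` as a bracketed list such as `[mycluster]` (as the 3.4.x client does), and hadoop01:2181 in the usage line assumes the default client port.

```shell
# Turn the bracketed list printed by the ZooKeeper CLI `ls` command
# (for example "[a, b]") into one child name per line.
# znode_children is a hypothetical helper name.
znode_children() {
    sed -n 's/^\[\(.*\)\]$/\1/p' | tail -n 1 | tr -d ' ' | tr ',' '\n'
}

# On a live node one might run:
#   zkCli.sh -server hadoop01:2181 ls /hadoop-ha 2>&1 | znode_children
```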

6. Run start-all.sh to start the HDFS HA cluster. The two NameNode servers come up in the Active and Standby states respectively.

[hadoop@hadoop01 ~]$ start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [hadoop01 hadoop02]
hadoop02: starting namenode, ......
hadoop01: starting namenode, ......
hadoop02: starting datanode, ......
hadoop01: starting datanode, ......
hadoop03: starting datanode, ......
Starting journal nodes [hadoop01 hadoop02 hadoop03]
hadoop01: starting journalnode, ......
hadoop02: starting journalnode, ......
hadoop03: starting journalnode, ......
Starting ZK Failover Controllers on NN hosts [hadoop01 hadoop02]
hadoop02: starting zkfc, ......
hadoop01: starting zkfc, ......
starting yarn daemons
starting resourcemanager, ......
hadoop01: starting nodemanager, ......
hadoop02: starting nodemanager, ......
hadoop03: starting nodemanager, ......



[hadoop@hadoop01 ~]$ jps
2882 NameNode
4034 Jps
2995 DataNode
3466 ResourceManager
3578 NodeManager
3371 DFSZKFailoverController
3213 JournalNode


[hadoop@hadoop02 ~]$ jps
2791 NameNode
2955 JournalNode
3068 DFSZKFailoverController
3421 Jps
3166 NodeManager
2862 DataNode


[hadoop@hadoop03 ~]$ jps
2946 NodeManager
3159 Jps
2792 DataNode
2863 JournalNode
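
`jps` shows that the daemons are running but not which NameNode is active. The state can be queried from the command line with `hdfs haadmin -getServiceState`. The wrapper below is a sketch: `print_nn_states` is a hypothetical name, and the IDs nn1 and nn2 in the usage line are assumptions, since the NameNode IDs from this cluster's hdfs-site.xml are not shown.

```shell
# Print the HA state of each NameNode ID passed as an argument.
# print_nn_states is a hypothetical helper; nn1/nn2 are assumed IDs, so
# use the values of dfs.ha.namenodes.<nameservice> from your own config.
print_nn_states() {
    local id state
    for id in "$@"; do
        # Report "unreachable" if the haadmin call fails for this ID.
        state="$(hdfs haadmin -getServiceState "$id" 2>/dev/null)" || state="unreachable"
        echo "$id: $state"
    done
}

# On a cluster node one might run:
#   print_nn_states nn1 nn2
```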







[hadoop@hadoop01 ~]$ jps|grep NameNode
nnnn  NameNode
[hadoop@hadoop01 ~]$ kill -9 nnnn
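
After the active NameNode is killed, the takeover is not instantaneous, so a small polling loop is a convenient way to watch for it. This is a sketch, not part of Hadoop: `wait_for_active` is a hypothetical helper built on `hdfs haadmin -getServiceState`, and nn2 in the usage line is an assumed NameNode ID for hadoop02.

```shell
# Poll a NameNode ID until it reports "active" or the attempts run out.
# Usage: wait_for_active NN_ID [ATTEMPTS] [DELAY_SECONDS]
# wait_for_active is a hypothetical helper name.
wait_for_active() {
    local id="$1" attempts="${2:-30}" delay="${3:-2}" i=0 state
    while [ "$i" -lt "$attempts" ]; do
        # A failed query is treated the same as "not active yet".
        state="$(hdfs haadmin -getServiceState "$id" 2>/dev/null)" || state=""
        if [ "$state" = "active" ]; then
            echo "$id became active after $i retries"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "$id is still not active after $attempts attempts" >&2
    return 1
}

# On a cluster node one might run:
#   wait_for_active nn2 30 2
```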




Check the hadoop02 server: its NameNode has automatically switched to Active.



[hadoop@hadoop01 ~]$ hadoop-daemon.sh start namenode

starting namenode, logging to /home/hadoop/hadoop-2.7.2//logs/hadoop-hadoop-namenode-hadoop01.out