namenode-ha
Notes from hands-on testing of the NameNode HA feature in Hadoop 2.2.0.
Configuration
To enable NameNode HA, add the following configuration properties to hdfs-site.xml and core-site.xml.
hdfs-site.xml:
```xml
<!-- Start NameNode HA -->
<property>
  <name>dfs.nameservices</name>
  <value>cluster1</value>
  <description>Comma-separated list of nameservices.</description>
</property>

<property>
  <name>dfs.ha.namenodes.cluster1</name>
  <value>nn1,nn2</value>
  <description>
    The NameNodes contained in this nameservice.
    The prefix for a given nameservice, contains a comma-separated
    list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE).
  </description>
</property>

<property>
  <name>dfs.namenode.rpc-address.cluster1.nn1</name>
  <value>master:8020</value>
  <description>
    The RPC address for this NameNode id.
    RPC address that handles all clients requests. In the case of
    HA/Federation where multiple namenodes exist, the name service id
    is added to the name, e.g. dfs.namenode.rpc-address.EXAMPLENAMESERVICE.
    The value of this property will take the form of nn-host1:rpc-port.
  </description>
</property>

<property>
  <name>dfs.namenode.rpc-address.cluster1.nn2</name>
  <value>slave0:8020</value>
  <description>
    RPC address that handles all clients requests (see nn1 above).
  </description>
</property>

<property>
  <!-- note: the nameservice id here must match dfs.nameservices;
       the original had "cluster" instead of "cluster1" -->
  <name>dfs.namenode.http-address.cluster1.nn1</name>
  <value>10.1.93.42:50070</value>
  <description>
    The HTTP address each NameNode serves its web UI on.
    The address and the base port where the dfs namenode web ui will listen on.
  </description>
</property>

<property>
  <name>dfs.namenode.http-address.cluster1.nn2</name>
  <value>10.1.93.43:50070</value>
  <description>
    The address and the base port where the dfs namenode web ui will listen on.
  </description>
</property>

<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://10.1.93.42:8485;10.1.93.43:8485;10.1.93.44:8485/cluster1</value>
  <description>
    The list of servers providing the journal service; the trailing
    /cluster1 names the journal directory for this nameservice.
    A directory on shared storage between the multiple namenodes in an
    HA cluster. This directory will be written by the active and read by
    the standby in order to keep the namespaces synchronized. This
    directory does not need to be listed in dfs.namenode.edits.dir above.
    It should be left empty in a non-HA cluster.
  </description>
</property>

<property>
  <name>dfs.client.failover.proxy.provider.cluster1</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  <description>
    The class DFS clients use to locate the current active NameNode.
  </description>
</property>

<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
  <description>
    Fencing methods executed during a state transition; sshfence and
    shell scripts are currently supported. The listed methods are tried
    one by one until a method succeeds; otherwise the failover times
    out and aborts.
  </description>
</property>

<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/opt/home/yuanlinsi/.ssh/id_rsa</value>
</property>

<property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <value>10000</value>
  <description>In milliseconds; the default is 30000.</description>
</property>

<property>
  <name>ha.zookeeper.session-timeout.ms</name>
  <value>10000</value>
</property>

<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/opt/linsiyuan/hadoop-journalnode-data</value>
</property>
```
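The opening of this section says core-site.xml also needs changes, but only hdfs-site.xml is shown above. The usual core-site.xml piece is pointing the default filesystem at the logical nameservice, so clients resolve the active NameNode through the failover proxy provider instead of a fixed host. A minimal sketch for the cluster1 nameservice configured above (an assumption about the intended setup, not shown in the original):

```xml
<!-- core-site.xml: address the cluster by its logical nameservice -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://cluster1</value>
</property>
```

With this in place, a path like hdfs://cluster1/tmp works regardless of which of nn1/nn2 is currently active.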
Deployment
Start the JournalNodes
Start the journalnode service on the nodes listed in dfs.namenode.shared.edits.dir. If every node runs a JournalNode, you can start them all at once with hadoop-daemons.sh; otherwise run hadoop-daemon.sh on each designated node:
```shell
# on a single node
./sbin/hadoop-daemon.sh start journalnode
# on all slave nodes at once
./sbin/hadoop-daemons.sh start journalnode
```
Initialize the NameNodes
For a newly created HDFS cluster, first format the primary NameNode:
```shell
hdfs namenode -format
```
On the primary NameNode, initialize the shared edits directory and start the namenode:
```shell
hdfs namenode -initializeSharedEdits
./sbin/hadoop-daemon.sh start namenode
```
On the standby NameNode, first sync the metadata from the primary with hdfs namenode -bootstrapStandby, then start the namenode:
```shell
hdfs namenode -bootstrapStandby
./sbin/hadoop-daemon.sh start namenode
```
Start the DataNodes on all nodes:
```shell
./sbin/hadoop-daemons.sh start datanode
```
Switching the NameNode state
In HA mode, a NameNode comes up in standby state. Use the following commands to query the state and switch a NameNode to active:
```shell
# Query NameNode state
./bin/hdfs haadmin -getServiceState nn1
./bin/hdfs haadmin -getServiceState nn2

# Query NameNode state under a specific nameservice
./bin/hdfs haadmin -ns cluster1 -getServiceState nn1
./bin/hdfs haadmin -ns cluster1 -getServiceState nn2

# Fail over between the NameNodes of a nameservice; the failover runs
# FROM the first NameNode TO the second, i.e. the second becomes active:
# ./bin/hdfs haadmin -ns [nameservice] -failover [from-nn] [to-nn]
./bin/hdfs haadmin -ns cluster1 -failover nn1 nn2
```
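The manual checks above can be scripted. A minimal sketch of the decision logic: given the two states that `hdfs haadmin -getServiceState` would report, print the failover command to run. The `choose_failover_cmd` helper is a hypothetical illustration, not part of Hadoop, and the states would in practice come from the haadmin calls above:

```shell
# choose_failover_cmd: hypothetical helper, not a Hadoop command.
# Arguments are the reported states of nn1 and nn2 ("active"/"standby");
# it prints the haadmin failover command that makes the standby active.
choose_failover_cmd() {
  nn1_state="$1"
  nn2_state="$2"
  if [ "$nn1_state" = "active" ] && [ "$nn2_state" = "standby" ]; then
    # fail over FROM nn1 TO nn2: nn2 becomes the new active
    echo "hdfs haadmin -ns cluster1 -failover nn1 nn2"
  elif [ "$nn2_state" = "active" ] && [ "$nn1_state" = "standby" ]; then
    echo "hdfs haadmin -ns cluster1 -failover nn2 nn1"
  else
    echo "refusing: need exactly one active and one standby" >&2
    return 1
  fi
}

# example: nn1 is active, so the command promotes nn2
choose_failover_cmd active standby
```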
Automatic Failover
With the setup above, when the active NameNode dies, the active/standby switch must be performed by hand, which is clearly impractical in production. Hadoop 2.2.0 also provides automatic failover based on ZooKeeper.
Configuration
To enable automatic failover, add the following properties on top of the configuration above:
hdfs-site.xml:
```xml
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
  <description>
    Whether automatic failover is enabled. See the HDFS High
    Availability documentation for details on automatic HA configuration.
  </description>
</property>
```

core-site.xml:

```xml
<property>
  <name>ha.zookeeper.quorum</name>
  <value>10.1.93.42:2188,10.1.93.43:2188,10.1.93.44:2188</value>
  <description>
    A list of ZooKeeper server addresses, separated by commas, that are
    to be used by the ZKFailoverController in automatic failover.
  </description>
</property>
```
Startup
After stopping the cluster and adding the properties above, bring it back up as follows.
On the primary NameNode, start the namenode:
```shell
# For a brand-new cluster only: -format and -initializeSharedEdits
# wipe existing metadata; skip them when restarting an existing cluster.
hdfs namenode -format
hdfs namenode -initializeSharedEdits
./sbin/hadoop-daemon.sh start namenode
```
Start the zkfc:
```shell
hdfs zkfc -formatZK
./sbin/hadoop-daemon.sh start zkfc
```
On the standby NameNode, start the namenode:
```shell
hdfs namenode -bootstrapStandby
./sbin/hadoop-daemon.sh start namenode
```
Start the zkfc:
```shell
./sbin/hadoop-daemon.sh start zkfc
```
Check the NameNode states:
```shell
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
```
Testing
NameNode failures were simulated in two ways:
- killing the NameNode process
- power-cycling the server running the active NameNode
In both cases the NameNode state switched over successfully, and communication with clients and DataNodes remained normal.
The difference is failover latency: when the active NameNode process is killed, its lock in ZooKeeper is released promptly (the ZKFC on that host is still running, notices the dead NameNode, and quits the election), whereas after a power loss the host's ZKFC dies with it, so the standby must wait for the ZooKeeper session to time out before it can take over. Failover therefore takes noticeably longer in the power-loss scenario.
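The timing difference can be captured in a toy model: a killed process leaves its ZKFC alive to release the lock almost immediately, while a silent (powered-off) host forces ZooKeeper to wait out the full session timeout, ha.zookeeper.session-timeout.ms (10000 ms in the configuration above). The function below is an illustrative sketch, not a measurement:

```shell
# Toy model of worst-case failure-detection delay before the standby
# can acquire the ZooKeeper lock. Values are illustrative assumptions.
SESSION_TIMEOUT_MS=10000   # ha.zookeeper.session-timeout.ms from the config above

detection_delay_ms() {
  case "$1" in
    process-killed)
      # local ZKFC survives, detects the dead NN, releases the lock at once
      echo 0 ;;
    power-loss)
      # host (and its ZKFC) goes silent; ZK must wait out the session timeout
      echo "$SESSION_TIMEOUT_MS" ;;
    *)
      echo "unknown failure mode: $1" >&2
      return 1 ;;
  esac
}

detection_delay_ms process-killed   # prints 0
detection_delay_ms power-loss       # prints 10000
```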
See the official documentation at http://hadoop.apache.org.