Install ZooKeeper first.
Hadoop HA
1、Cluster plan
| host | HDFS | Yarn | JournalNode | ZK | HA |
| --- | --- | --- | --- | --- | --- |
| bigdata111 | NameNode, SecondaryNameNode | ResourceManager | | QuorumPeerMain | |
| bigdata112 | DataNode | NodeManager | JournalNode | QuorumPeerMain | NameNode |
| bigdata113 | DataNode | NodeManager | JournalNode | QuorumPeerMain | ResourceManager |
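Both the ZooKeeper ensemble and the JournalNode set (QJM) are majority-based quorums: n members survive at most ⌊(n-1)/2⌋ failures. Note that the plan above runs three ZooKeeper servers but only two JournalNodes, so losing a single JournalNode halts the edit-log quorum; three JournalNodes is the usual minimum. The arithmetic:

```python
def quorum_tolerance(n: int) -> int:
    """Failures a majority-based quorum of n members can survive."""
    majority = n // 2 + 1   # smallest strict majority
    return n - majority

print(quorum_tolerance(3))  # 3 ZooKeeper servers -> tolerates 1 failure
print(quorum_tolerance(2))  # 2 JournalNodes      -> tolerates 0 failures
```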
2、core-site.xml
<configuration>
<property>
<!-- NameNode address (logical HA nameservice) -->
<name>fs.defaultFS</name>
<value>hdfs://namespace_ha</value>
</property>
<property>
<!-- Directory for HDFS temporary data -->
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop-3.2.1/tmp</value>
</property>
<property>
<!-- ZooKeeper quorum used for HA -->
<name>ha.zookeeper.quorum</name>
<value>bigdata111:2181,bigdata112:2181,bigdata113:2181</value>
</property>
</configuration>
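Hadoop loads each *-site.xml file as flat name/value pairs. As a rough illustration of that structure (standard-library Python, not Hadoop's actual Configuration class):

```python
import xml.etree.ElementTree as ET

def parse_site_xml(text: str) -> dict:
    """Parse a Hadoop *-site.xml fragment into a {name: value} dict."""
    root = ET.fromstring(text)
    return {p.findtext("name"): p.findtext("value")
            for p in root.iter("property")}

core_site = """
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://namespace_ha</value></property>
  <property><name>ha.zookeeper.quorum</name>
    <value>bigdata111:2181,bigdata112:2181,bigdata113:2181</value></property>
</configuration>
"""
conf = parse_site_xml(core_site)
print(conf["fs.defaultFS"])  # the logical nameservice, not a real host
```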
3、hdfs-site.xml
<configuration>
<property>
<name>dfs.nameservices</name>
<value>namespace_ha</value>
</property>
<property>
<name>dfs.ha.namenodes.namespace_ha</name>
<value>namenode1,namenode2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.namespace_ha.namenode1</name>
<value>bigdata111:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.namespace_ha.namenode1</name>
<value>bigdata111:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.namespace_ha.namenode2</name>
<value>bigdata112:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.namespace_ha.namenode2</name>
<value>bigdata112:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://bigdata112:8485;bigdata113:8485/namespace_ha</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/opt/hadoop-3.2.1/journal</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.namespace_ha</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
<property>
<!-- HDFS replication factor; default 3 -->
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<!-- Whether HDFS permission checking is enabled; default true.
     (dfs.permissions is the old, deprecated name for this key.) -->
<name>dfs.permissions.enabled</name>
<value>true</value>
</property>
</configuration>
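On the client side, ConfiguredFailoverProxyProvider maps the logical nameservice to the configured NameNode RPC addresses and retries each in order until it reaches the active one. A hypothetical Python sketch of that resolution order (not the real HDFS client code; `is_active` stands in for an RPC probe):

```python
# Property names mirror the hdfs-site.xml above.
CONF = {
    "dfs.nameservices": "namespace_ha",
    "dfs.ha.namenodes.namespace_ha": "namenode1,namenode2",
    "dfs.namenode.rpc-address.namespace_ha.namenode1": "bigdata111:9000",
    "dfs.namenode.rpc-address.namespace_ha.namenode2": "bigdata112:9000",
}

def candidate_addresses(conf: dict, ns: str) -> list:
    """All RPC addresses a client would try, in configured order."""
    ids = conf[f"dfs.ha.namenodes.{ns}"].split(",")
    return [conf[f"dfs.namenode.rpc-address.{ns}.{i}"] for i in ids]

def resolve_active(conf, ns, is_active):
    """Return the first NameNode that reports active (simulated probe)."""
    for addr in candidate_addresses(conf, ns):
        if is_active(addr):
            return addr
    raise RuntimeError("no active NameNode")

# If bigdata111's NameNode is down, the client fails over to bigdata112.
print(resolve_active(CONF, "namespace_ha",
                     lambda a: a.startswith("bigdata112")))
```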
Note: the two data directories above must be separate paths: dfs.journalnode.edits.dir (/opt/hadoop-3.2.1/journal) holds the JournalNode edit logs, while hadoop.tmp.dir (/opt/hadoop-3.2.1/tmp) holds the NameNode/DataNode data.
4、mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
5、yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yarn_ha</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>bigdata111</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>bigdata113</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>bigdata111:2181,bigdata112:2181,bigdata113:2181</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
6、Test
(1) Start ZooKeeper (on all three nodes)
zkServer.sh start
(2) Start the JournalNodes (on bigdata112 and bigdata113)
hdfs --daemon start journalnode
(In Hadoop 3.x, hadoop-daemon.sh start journalnode still works but is deprecated.)
(3) Format the NameNode (on bigdata111)
The JournalNodes and ZooKeeper must be running before formatting.
hdfs namenode -format
(4) Copy the NameNode metadata
Copy the tmp directory on bigdata111 to bigdata112:
scp -r /opt/hadoop-3.2.1/tmp root@bigdata112:/opt/hadoop-3.2.1/
(5) Format the HA state in ZooKeeper (run on bigdata111). This creates the nameservice's znode under /hadoop-ha in ZooKeeper.
hdfs zkfc -formatZK
(6) Start the cluster
start-all.sh
Check with jps on each host that the running daemons match the cluster plan.
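As a hypothetical checklist (not a Hadoop tool), a few lines of Python can diff the daemons the plan expects against jps output:

```python
def missing_daemons(expected: set, jps_output: str) -> set:
    """Daemons in `expected` that jps did not report."""
    running = {line.split(None, 1)[1] for line in jps_output.splitlines()
               if " " in line}
    return expected - running

# e.g. bigdata112, per the plan in section 1:
expected_112 = {"DataNode", "NodeManager", "JournalNode",
                "QuorumPeerMain", "NameNode"}
sample = "2101 DataNode\n2345 QuorumPeerMain\n2567 Jps"
print(missing_daemons(expected_112, sample))
```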
To test failover, kill the NameNode process on bigdata111 (kill -9 with the pid from jps). bigdata112 then switches from standby to active, which you can confirm on its NameNode web UI (port 50070 as configured above).
Common problems:
(1) The NameNode fails to start
The log shows:
org.apache.hadoop.hdfs.qjournal.protocol.JournalNotFormattedException: Journal Storage Directory /mnt/data1/hadoop/dfs/journal/hdfscluster not formatted
Fix: run the following on the NameNode:
hdfs namenode -initializeSharedEdits
(2) Formatting the NameNode fails
Could not format one or more JournalNodes. 1 exceptions thrown:
192.168.128.113:8485: Directory /opt/hadoop-3.2.1/journal/hadoopha is in an inconsistent state: Can't format the storage directory because the current directory is not empty.
Fix: the /opt/hadoop-3.2.1/journal/ directory is inconsistent across the JournalNode hosts. Delete the data under that directory on every JournalNode, then format again.
NameNode Federation
HBase HA