Hadoop HA Configuration and Pitfalls

To run Hadoop in HA mode, first set up ZooKeeper.
Configure zoo.cfg:

tickTime=2000
maxClientCnxns=0
initLimit=50
syncLimit=5
dataDir=/home/zfy/zookeeper-3.4.13/zkdata
dataLogDir=/home/zfy/zookeeper-3.4.13/zkdatalog
clientPort=2182
server.1=node1664:12888:13888
server.2=node1665:12888:13888
server.3=node1666:12888:13888

Under the zkdata directory on each machine, create a myid file containing that server's number from the server.N lines above: for example, write 1 on node1664 and 2 on node1665.
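A quick sketch of this step plus the ZooKeeper startup it enables (paths come from the zoo.cfg above; zkServer.sh is the stock ZooKeeper launcher):

echo 1 > /home/zfy/zookeeper-3.4.13/zkdata/myid   # on node1664 (server.1)
echo 2 > /home/zfy/zookeeper-3.4.13/zkdata/myid   # on node1665 (server.2)
echo 3 > /home/zfy/zookeeper-3.4.13/zkdata/myid   # on node1666 (server.3)

# start ZooKeeper on every node, then confirm a leader was elected
/home/zfy/zookeeper-3.4.13/bin/zkServer.sh start
/home/zfy/zookeeper-3.4.13/bin/zkServer.sh status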

1. Configuration
hdfs-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/zfy/hadoop-3.1.3/data/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/zfy/hadoop-3.1.3/data/data</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/zfy/hadoop-3.1.3/data/jn</value>
    </property>
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>

    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>node1664:1214</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>node1664:11214</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>node1665:1214</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>node1665:11214</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://node1664:8485;node1665:8485;node1666:8485/mycluster</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/zfy/hadoop-3.1.3/.ssh/id_rsa</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
</configuration>

Specify the dir paths for the DataNode, NameNode, and JournalNode; the corresponding directories must be created by hand (see the sketch below).
dfs.nameservices is set to mycluster.
Then pick which nodes act as NameNodes: of the three machines 64, 65, and 66, nodes 64 and 65 were chosen, so dfs.ha.namenodes.mycluster lists two values, one per NameNode. For each of those two NameNodes, configure the matching dfs.namenode.rpc-address.mycluster.* and dfs.namenode.http-address.mycluster.* entries.
dfs.namenode.shared.edits.dir is the edits directory shared between the NameNodes, expressed as the JournalNode quorum URI.
The remaining properties can be copied as-is; note that dfs.ha.fencing.ssh.private-key-files must point to the SSH private key used for fencing (here kept under the Hadoop install directory).
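A minimal sketch of that manual directory creation, using the paths from the dfs.*.dir values above (run each mkdir on the hosts that play that role):

mkdir -p /home/zfy/hadoop-3.1.3/data/name   # dfs.namenode.name.dir, on the NameNode hosts
mkdir -p /home/zfy/hadoop-3.1.3/data/data   # dfs.datanode.data.dir, on the DataNode hosts
mkdir -p /home/zfy/hadoop-3.1.3/data/jn     # dfs.journalnode.edits.dir, on all three JournalNode hosts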

core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/zfy/hadoop-3.1.3/data</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>node1664:2182,node1665:2182,node1666:2182</value>
    </property>
</configuration>

fs.defaultFS must match the dfs.nameservices value in hdfs-site.xml.
ha.zookeeper.quorum must use the clientPort configured in ZooKeeper (2182 above).
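With fs.defaultFS set this way, clients address the logical nameservice instead of a single NameNode; a quick check once the cluster is up (see the startup steps below):

hdfs dfs -mkdir -p hdfs://mycluster/tmp   # resolves to whichever NameNode is currently active
hdfs dfs -ls hdfs://mycluster/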

yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->

<property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>cluster1</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>node1664</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>node1665</value>
</property>
<property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>node1664:5088</value>
</property>
<property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>node1665:5088</value>
</property>
<property>
    <name>hadoop.zk.address</name>
    <value>node1664:2182,node1665:2182,node1666:2182</value>
</property>
</configuration>

Here you need to specify the ZooKeeper client port (hadoop.zk.address) and the ResourceManager hosts, which in this setup are the same machines as the NameNodes.
mapred-site.xml can be configured just as in a normal (non-HA) Hadoop setup.
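For completeness, a minimal mapred-site.xml sketch with the standard (non-HA-specific) setting:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>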

2. Startup
Start a JournalNode on each of the three nodes:
hdfs --daemon start journalnode

Then format one of the NameNodes (nn1 here) and start it, since the standby copies its initial metadata from a running NameNode:
hdfs namenode -format
hdfs --daemon start namenode

On the other, not-yet-formatted NameNode (nn2), run:
hdfs namenode -bootstrapStandby

Once the NameNodes are initialized, format the HA state in ZooKeeper:
$HADOOP_HOME/bin/hdfs zkfc -formatZK

After that the cluster can be started (the ZKFC daemon runs on each NameNode host):
start-dfs.sh
hdfs --daemon start zkfc
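To confirm the HA state after startup, a quick check with the standard admin commands (nn1/nn2 and rm1/rm2 are the IDs configured above; the YARN checks assume start-yarn.sh has been run too):

hdfs haadmin -getServiceState nn1   # expect one of "active" / "standby"
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2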

Problems encountered:
The NameNode/DataNode directory paths in hdfs-site.xml must use a single slash after the scheme (file:/home/..., as above); with file:// the next segment would be parsed as a URI authority rather than part of the path.
If ZooKeeper fails to restart, make sure the configured dataDir contains only the myid file; delete everything else (stale snapshots and logs).
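A sketch of that cleanup, assuming the dataDir path from zoo.cfg (stop ZooKeeper first; only the top-level entries other than myid are removed):

/home/zfy/zookeeper-3.4.13/bin/zkServer.sh stop
find /home/zfy/zookeeper-3.4.13/zkdata -mindepth 1 -maxdepth 1 ! -name myid -exec rm -rf {} +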

Official documentation references:
https://hadoop.apache.org/docs/r3.1.3/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
https://hadoop.apache.org/docs/r3.1.3/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html
