Hadoop 3.1.1 High-Availability (HA) Cluster Installation Notes

Environment Preparation

1. Server Overview

hostname  ip              role
nn01      192.168.56.101  name node
nn02      192.168.56.102  name node
dn01      192.168.56.103  data node
dn02      192.168.56.104  data node
dn03      192.168.56.105  data node

Role assignment:

                 nn01  nn02  dn01  dn02  dn03
NameNode          ✓     ✓
DataNode                      ✓     ✓     ✓
ResourceManager   ✓     ✓
NodeManager                   ✓     ✓     ✓
Zookeeper         ✓     ✓     ✓     ✓     ✓
journalnode       ✓     ✓     ✓     ✓     ✓
zkfc              ✓     ✓

Run the following commands on all five servers.

# Add host entries
[root@nn01 ~]# vim /etc/hosts

192.168.56.101 nn01
192.168.56.102 nn02
192.168.56.103 dn01
192.168.56.104 dn02
192.168.56.105 dn03
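The five entries above can be generated from a single host list so that the identical block is appended to /etc/hosts on every node. A minimal sketch (the lists mirror the table above; the final append is left commented out):

```shell
# Build the /etc/hosts block from a single host list (mirrors the table above).
hosts="nn01 nn02 dn01 dn02 dn03"
base="192.168.56"
first=101

entries=""
i=0
for h in $hosts; do
  entries="${entries}${base}.$((first + i)) ${h}
"
  i=$((i + 1))
done

printf '%s' "$entries"
# As root on each node: printf '%s' "$entries" >> /etc/hosts
```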

# Stop and disable the firewall
[root@nn01 ~]# systemctl stop firewalld && systemctl disable firewalld
[root@nn01 ~]# setenforce 0

# Set SELINUX to disabled
[root@nn01 ~]# vim /etc/selinux/config

SELINUX=disabled

# Reboot the server
[root@nn01 ~]# reboot

2. JDK Installation

# Configure environment variables (assumes the JDK is already unpacked at /opt/java/jdk1.8.0_172)
[root@nn01 ~]# vim /etc/profile

# Append at the end of the file
# Java Environment Path
export JAVA_HOME=/opt/java/jdk1.8.0_172
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

# Reload the profile
source /etc/profile

3. Passwordless SSH Login

#Run on nn01

#Generate a key pair; press Enter at every prompt. The keys are created under ~/.ssh
[root@nn01 ~]# ssh-keygen -t rsa

#Copy the public key to each node's home directory (shown here for nn01; repeat for nn02, dn01, dn02, dn03)
[root@nn01 .ssh]# scp /root/.ssh/id_rsa.pub root@nn01:~
[root@nn01 .ssh]# cat ~/id_rsa.pub >> /root/.ssh/authorized_keys

##Run on nn02, dn01, dn02, dn03 (after the public key has been copied over)
[root@nn02 ~]# mkdir -p ~/.ssh
[root@nn02 ~]# cd .ssh/
[root@nn02 .ssh]# cat ~/id_rsa.pub >> /root/.ssh/authorized_keys
[root@nn02 .ssh]# vim /etc/ssh/sshd_config
#Root login may be disabled here; since these notes operate as root, make sure it is enabled
PermitRootLogin yes
PubkeyAuthentication yes
#Restart sshd if sshd_config was changed
[root@nn02 .ssh]# systemctl restart sshd

Passwordless login (by both IP and hostname) must work as follows: 1) the NameNodes can log in to every DataNode without a password; 2) each NameNode can log in to itself; 3) the two NameNodes can log in to each other; 4) each DataNode can log in to itself; 5) the DataNodes do not need passwordless login to the NameNodes or to the other DataNodes.

In the same way, configure passwordless login from nn02 to nn01, dn01, dn02, and dn03.
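The per-host copy/append steps are repetitive; a sketch that builds the full command list for all five targets, printed as a dry run (drop the final printf and execute the commands to apply them):

```shell
# Generate the key-distribution commands for every target host (dry run).
targets="nn01 nn02 dn01 dn02 dn03"

cmds=""
for t in $targets; do
  cmds="${cmds}scp /root/.ssh/id_rsa.pub root@${t}:~
ssh root@${t} 'mkdir -p ~/.ssh && cat ~/id_rsa.pub >> /root/.ssh/authorized_keys'
"
done

printf '%s' "$cmds"
```

The standard `ssh-copy-id root@<host>` utility achieves the same result in one step per host.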

Installing Zookeeper

mkdir -p /opt/zookeeper/
cd /opt/zookeeper/
#assumes zookeeper-3.4.13.tar.gz has already been downloaded here
tar -zxvf zookeeper-3.4.13.tar.gz
cd zookeeper-3.4.13/conf/
cp zoo_sample.cfg zoo.cfg
vim zoo.cfg 

zoo.cfg

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/data/zookeeper
# keep transaction logs on a separate path (the directory is created below)
dataLogDir=/opt/data/logs/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=nn01:2888:3888
server.2=nn02:2888:3888
server.3=dn01:2888:3888
server.4=dn02:2888:3888
server.5=dn03:2888:3888
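The server.N lines here and the per-host myid files written below must agree; deriving both from one ordered list keeps them from drifting apart. A sketch:

```shell
# Derive the zoo.cfg server.N lines from the ordered host list.
zk_hosts="nn01 nn02 dn01 dn02 dn03"

zk_lines=""
id=1
for h in $zk_hosts; do
  zk_lines="${zk_lines}server.${id}=${h}:2888:3888
"
  id=$((id + 1))
done

printf '%s' "$zk_lines"   # append to zoo.cfg
```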

Basic settings:
tickTime
The basic heartbeat unit, in milliseconds; almost every timeout in ZK is a multiple of this value.
initLimit
A count of ticks: the time allowed for followers to synchronize with the leader after a leader election finishes. If there are many followers, or the leader holds a great deal of data, synchronization takes longer and this value should be raised accordingly. It is also the maximum wait (setSoTimeout) for followers and observers when they begin syncing the leader's data.
syncLimit
A count of ticks, easy to confuse with the one above: the maximum wait for follower/observer interaction with the leader after synchronization has completed, i.e. the timeout for normal request forwarding, pings, and other message exchanges.
dataDir
Where the in-memory database snapshots are stored. If no transaction-log path (dataLogDir) is specified, the transaction logs also land here; it is recommended to keep the two on different devices.

clientPort
The port on which ZK listens for client connections.

server.serverid=host:peerport:electionport

server: fixed literal
serverid: the ID of each server (must be between 1 and 255, and unique per machine)
host: hostname
peerport: port for follower-to-leader communication (heartbeats and synchronization)
electionport: leader-election port

#Create directories
mkdir -p /opt/data/zookeeper
mkdir -p /opt/data/logs/zookeeper
touch /opt/data/zookeeper/myid



#Copy to the other hosts (shown here for nn02; repeat for dn01, dn02, dn03)
scp -r /opt/zookeeper root@nn02:/opt/
scp -r /opt/data/zookeeper root@nn02:/opt/data/
scp -r /opt/data/logs/zookeeper root@nn02:/opt/data/logs/

#On nn01
echo 1 > /opt/data/zookeeper/myid

#On nn02
echo 2 > /opt/data/zookeeper/myid

#On dn01
echo 3 > /opt/data/zookeeper/myid

#On dn02
echo 4 > /opt/data/zookeeper/myid

#On dn03
echo 5 > /opt/data/zookeeper/myid
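Equivalently, the myid value can be computed from the hostname using the same ordered list as zoo.cfg; a sketch (myid_for is a hypothetical helper, not part of Zookeeper):

```shell
# Map a hostname to its Zookeeper id (same order as the server.N lines).
myid_for() {
  id=1
  for h in nn01 nn02 dn01 dn02 dn03; do
    if [ "$h" = "$1" ]; then
      echo "$id"
      return 0
    fi
    id=$((id + 1))
  done
  return 1
}

# On each node: myid_for "$(hostname)" > /opt/data/zookeeper/myid
```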
#Add environment variables (append to /etc/profile on every node)
export ZOOKEEPER_HOME=/opt/zookeeper/zookeeper-3.4.13
export PATH=$ZOOKEEPER_HOME/bin:$PATH

source /etc/profile

Installing Hadoop

1. Download Hadoop

mkdir -p /opt/hadoop/
cd /opt/hadoop
#assumes hadoop-3.1.1.tar.gz has already been downloaded here
tar -xf hadoop-3.1.1.tar.gz

##Set environment variables (append to /etc/profile)
export HADOOP_HOME=/opt/hadoop/hadoop-3.1.1
#HADOOP_PREFIX is deprecated in Hadoop 3.x (it triggers the warning seen in the output later); kept for compatibility
export HADOOP_PREFIX=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_INSTALL=$HADOOP_HOME

# Create directories
mkdir -p /opt/data/logs/hadoop
mkdir -p /opt/data/hadoop/hdfs/nn
mkdir -p /opt/data/hadoop/hdfs/dn
mkdir -p /opt/data/hadoop/hdfs/jn

Edit the configuration file: /opt/hadoop/hadoop-3.1.1/etc/hadoop/hadoop-env.sh

## Add at the top of the file; size the JVM heap to match your servers
export JAVA_HOME=/opt/java/jdk1.8.0_172
export HADOOP_NAMENODE_OPTS=" -Xms1024m -Xmx1024m -XX:+UseParallelGC"
export HADOOP_DATANODE_OPTS=" -Xms512m -Xmx512m"
export HADOOP_LOG_DIR=/opt/data/logs/hadoop

/opt/hadoop/hadoop-3.1.1/etc/hadoop/core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <!-- Set the HDFS nameservice to mycluster -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>

    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/data/hadoop/tmp</value>
    </property>

    <!-- Zookeeper quorum addresses -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>nn01:2181,nn02:2181,dn01:2181,dn02:2181,dn03:2181</value>
    </property>

    <!-- Timeout for Hadoop's Zookeeper sessions -->
    <property>
        <name>ha.zookeeper.session-timeout.ms</name>
        <value>30000</value>
        <description>ms</description>
    </property>

    <property>
        <name>fs.trash.interval</name>
        <value>1440</value>
    </property>

</configuration>
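A malformed site file otherwise only surfaces when a daemon starts. A quick well-formedness check before distributing the configs, as a sketch (assumes python3 is on the PATH; check_xml is a hypothetical helper):

```shell
# Report whether an XML file parses cleanly.
check_xml() {
  if python3 -c 'import sys, xml.etree.ElementTree as ET; ET.parse(sys.argv[1])' "$1" 2>/dev/null; then
    echo "OK: $1"
  else
    echo "BROKEN: $1"
  fi
}

# check_xml /opt/hadoop/hadoop-3.1.1/etc/hadoop/core-site.xml
```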

/opt/hadoop/hadoop-3.1.1/etc/hadoop/hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>

<!-- Timeout for communication within the JournalNode cluster -->
<property>
    <name>dfs.qjournal.start-segment.timeout.ms</name>
    <value>60000</value>
</property>

    <!-- The HDFS nameservice, mycluster; must match core-site.xml.
         dfs.ha.namenodes.[nameservice id] assigns a unique identifier to each
         NameNode in the nameservice: a comma-separated list of NameNode IDs by
         which the DataNodes recognize all of the NameNodes. Here the nameservice
         ID is "mycluster" and the NameNode IDs are "nn01" and "nn02".
    -->
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>

    <!-- mycluster has two NameNodes: nn01 and nn02 -->
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn01,nn02</value>
    </property>

    <!-- RPC address of nn01 -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn01</name>
        <value>nn01:8020</value>
    </property>

    <!-- RPC address of nn02 -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn02</name>
        <value>nn02:8020</value>
    </property>

     <!-- HTTP address of nn01 -->
    <property>
        <name>dfs.namenode.http-address.mycluster.nn01</name>
        <value>nn01:50070</value>
    </property>

    <!-- HTTP address of nn02 -->
    <property>
        <name>dfs.namenode.http-address.mycluster.nn02</name>
        <value>nn02:50070</value>
    </property>

    <!-- Shared storage for the NameNode edit log, i.e. the JournalNode list.
         URL format: qjournal://host1:port1;host2:port2;host3:port3/journalId
         Using the nameservice as the journalId is recommended; the default port is 8485 -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://nn01:8485;nn02:8485;dn01:8485;dn02:8485;dn03:8485/mycluster</value>
    </property>

    <!-- Client-side failover implementation -->
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

    <!-- Fencing methods; multiple methods are separated by newlines, one per line -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
            sshfence
            shell(/bin/true)
        </value>
    </property>
    </property>

  <property>
     <name>dfs.permissions.enabled</name>
     <value>false</value>
  </property>

    <property>
        <name>dfs.support.append</name>
        <value>true</value>
    </property>

    <!-- The sshfence fencing method requires passwordless SSH -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>

    <!-- Replication factor -->
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>


    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/opt/data/hadoop/hdfs/nn</value>
    </property>

    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/opt/data/hadoop/hdfs/dn</value>
    </property>

    <!-- Where the JournalNode stores its data on local disk -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/opt/data/hadoop/hdfs/jn</value>
    </property>

    <!-- Enable automatic NameNode failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>

    <!-- Enable WebHDFS -->
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>

    <!-- Timeout for the sshfence SSH connection -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>

    <property>
        <name>ha.failover-controller.cli-check.rpc-timeout.ms</name>
        <value>60000</value>
    </property>

</configuration>

/opt/hadoop/hadoop-3.1.1/etc/hadoop/mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>

    <!-- MapReduce JobHistory server address -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>nn01:10020</value>
    </property>

    <!-- JobHistory server web UI address -->
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>nn01:19888</value>
    </property>

    <property>
      <name>mapreduce.application.classpath</name>
      <value>
          /opt/hadoop/hadoop-3.1.1/etc/hadoop,
          /opt/hadoop/hadoop-3.1.1/share/hadoop/common/*,
          /opt/hadoop/hadoop-3.1.1/share/hadoop/common/lib/*,
          /opt/hadoop/hadoop-3.1.1/share/hadoop/hdfs/*,
          /opt/hadoop/hadoop-3.1.1/share/hadoop/hdfs/lib/*,
          /opt/hadoop/hadoop-3.1.1/share/hadoop/mapreduce/*,
          /opt/hadoop/hadoop-3.1.1/share/hadoop/mapreduce/lib/*,
          /opt/hadoop/hadoop-3.1.1/share/hadoop/yarn/*,
          /opt/hadoop/hadoop-3.1.1/share/hadoop/yarn/lib/*
      </value>
    </property>

</configuration>

/opt/hadoop/hadoop-3.1.1/etc/hadoop/yarn-site.xml

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>

<!-- Site specific YARN configuration properties -->
    <!-- Enable ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>

    <!-- The RM cluster id -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>yrc</value>
    </property>

    <!-- Logical IDs of the RMs -->
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>

    <!-- Hostname of each RM -->
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>nn01</value>
    </property>

    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>nn02</value>
    </property>

    <!-- Zookeeper quorum addresses -->
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>nn01:2181,nn02:2181,dn01:2181,dn02:2181,dn03:2181</value>
    </property>

    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>

    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>86400</value>
    </property>

    <!-- Enable automatic recovery -->
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>

    <!-- Store ResourceManager state in the Zookeeper cluster -->
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
</configuration>

/opt/hadoop/hadoop-3.1.1/etc/hadoop/workers

dn01
dn02
dn03

Add at the top of /opt/hadoop/hadoop-3.1.1/sbin/start-dfs.sh and sbin/stop-dfs.sh:

HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_ZKFC_USER=root
HDFS_JOURNALNODE_USER=root
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root

Add at the top of /opt/hadoop/hadoop-3.1.1/sbin/start-yarn.sh and sbin/stop-yarn.sh:

YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn 
YARN_NODEMANAGER_USER=root

Copy to the other machines

scp -r /opt/data root@nn02:/opt/
scp -r /opt/data root@dn01:/opt/
scp -r /opt/data root@dn02:/opt/
scp -r /opt/data root@dn03:/opt/


scp -r /opt/hadoop/hadoop-3.1.1 root@nn02:/opt/hadoop/
scp -r /opt/hadoop/hadoop-3.1.1 root@dn01:/opt/hadoop/
scp -r /opt/hadoop/hadoop-3.1.1 root@dn02:/opt/hadoop/
scp -r /opt/hadoop/hadoop-3.1.1 root@dn03:/opt/hadoop/
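The scp pairs above can be generated from one target list; a sketch, printed as a dry run rather than executed:

```shell
# Build the distribution commands for every target host (dry run).
dist_targets="nn02 dn01 dn02 dn03"

dist_cmds=""
for t in $dist_targets; do
  dist_cmds="${dist_cmds}scp -r /opt/data root@${t}:/opt/
scp -r /opt/hadoop/hadoop-3.1.1 root@${t}:/opt/hadoop/
"
done

printf '%s' "$dist_cmds"
```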

Startup

Startup order: Zookeeper -> JournalNode -> format the NameNode -> initialize the HA state in Zookeeper (zkfc) -> NameNode -> DataNode -> ResourceManager -> NodeManager.

1. Start Zookeeper

On nn01, nn02, dn01, dn02, dn03:

zkServer.sh start

2. Start the JournalNodes

On nn01, nn02, dn01, dn02, dn03:

hadoop-daemon.sh start journalnode

(hadoop-daemon.sh is deprecated in Hadoop 3.x; `hdfs --daemon start journalnode` is the current form, but the old script still works and only prints a warning.)

3. Format the NameNode

On nn01:

hadoop namenode -format

(`hdfs namenode -format` is the non-deprecated form of the same command.)

Copy the metadata generated on nn01 to the other nodes (only nn02, the standby NameNode, actually needs it; alternatively, run `hdfs namenode -bootstrapStandby` on nn02):

scp -r /opt/data/hadoop/hdfs/nn/* root@nn02:/opt/data/hadoop/hdfs/nn/

scp -r /opt/data/hadoop/hdfs/nn/* root@dn01:/opt/data/hadoop/hdfs/nn/

scp -r /opt/data/hadoop/hdfs/nn/* root@dn02:/opt/data/hadoop/hdfs/nn/

scp -r /opt/data/hadoop/hdfs/nn/* root@dn03:/opt/data/hadoop/hdfs/nn/

4. Format ZKFC

Important: run this only on a NameNode node (nn01)

hdfs zkfc -formatZK

5. Start HDFS

Important: run this on a NameNode node (nn01)

start-dfs.sh

6. Start YARN

Run this on either of the two ResourceManager nodes, e.g.

nn02

start-yarn.sh

If the standby node's ResourceManager does not come up, start it manually: yarn-daemon.sh start resourcemanager (or, in the Hadoop 3.x form, yarn --daemon start resourcemanager)

7. Start the MapReduce job history server

mr-jobhistory-daemon.sh start historyserver

8. Check status

Check the state of the master nodes:

hdfs haadmin -getServiceState nn01
hdfs haadmin -getServiceState nn02
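`hdfs haadmin -getServiceState` prints a single word (active or standby), which makes it easy to script. A sketch that picks out the active NameNode; the hdfs call is replaced by a stub (get_state) here so the logic stands alone — substitute the real command in a live cluster:

```shell
# Stub standing in for: hdfs haadmin -getServiceState <nn-id>
get_state() {
  case "$1" in
    nn01) echo "active" ;;
    nn02) echo "standby" ;;
  esac
}

active_nn=""
for nn in nn01 nn02; do
  if [ "$(get_state "$nn")" = "active" ]; then
    active_nn="$nn"
  fi
done
echo "active NameNode: $active_nn"
```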


[root@nn01 hadoop]# hdfs haadmin -getServiceState nn01
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX.
2018-09-27 11:06:58,892 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
[root@nn01 hadoop]#
[root@nn01 hadoop]#
[root@nn01 hadoop]#
[root@nn01 hadoop]# hdfs haadmin -getServiceState nn02
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX.
2018-09-27 11:07:02,217 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[root@nn01 hadoop]#



[root@nn01 hadoop]# yarn rmadmin -getServiceState rm1
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX.
2018-09-27 11:07:45,112 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[root@nn01 hadoop]#
[root@nn01 hadoop]#
[root@nn01 hadoop]#
[root@nn01 hadoop]# yarn rmadmin -getServiceState rm2
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX.
2018-09-27 11:07:48,350 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
[root@nn01 hadoop]#

Check via the web UI

##HDFS (port 50070 as configured above; the Hadoop 3.x default would be 9870)
http://192.168.56.101:50070/
http://192.168.56.102:50070/

#YARN (on the currently active ResourceManager)
http://192.168.56.102:8088/cluster

Reposted from: https://my.oschina.net/orrin/blog/2209345
