Software Preparation
Components
Component | Package | URL |
---|---|---|
CentOS | CentOS 7.4 (64-bit) | 3 Alibaba Cloud servers, pay-as-you-go (release the resources when done); password: Root@123! |
JDK | jdk-8u151-linux-x64.tar.gz | https://www.oracle.com/technetwork/java/javase/downloads/index.html |
Zookeeper | zookeeper-3.4.5-cdh5.15.1.tar.gz | http://archive.cloudera.com/cdh5/cdh/5/zookeeper-3.4.5-cdh5.15.1.tar.gz |
Hadoop | hadoop-2.6.0-cdh5.15.1.tar.gz | http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.15.1.tar.gz |
Host Plan
Host | IP | Services | Spec |
---|---|---|---|
hadoop001 | 172.26.239.216 | QuorumPeerMain, JournalNode, NameNode, DataNode, DFSZKFailoverController, ResourceManager, NodeManager, JobHistoryServer | 2 cores, 4 GB RAM, 2 Mbps bandwidth, 40 GB system disk |
hadoop002 | 172.26.239.214 | QuorumPeerMain, JournalNode, NameNode, DataNode, DFSZKFailoverController, ResourceManager, NodeManager | 2 cores, 4 GB RAM, 2 Mbps bandwidth, 40 GB system disk |
hadoop003 | 172.26.239.215 | QuorumPeerMain, JournalNode, DataNode, NodeManager | 2 cores, 4 GB RAM, 2 Mbps bandwidth, 40 GB system disk |
At pay-as-you-go rates, the 3 servers cost ¥1.831/hour in total.
On cloud hosts there is no need to set up the network environment or time synchronization; Alibaba Cloud handles both.
Since this is a temporary environment, you can open all ports (1/65535) in the security group, with authorization object 0.0.0.0/0.
Directory Plan
Name | Path | Notes |
---|---|---|
JAVA_HOME | /usr/java/jdk1.8.0_151 | create /usr/java manually |
$ZOOKEEPER_HOME | /opt/app/zookeeper-3.4.5-cdh5.15.1 | |
data | $ZOOKEEPER_HOME/data | create manually |
$HADOOP_HOME | /opt/app/hadoop-2.6.0-cdh5.15.1 | |
data | $HADOOP_HOME/data | |
log | $HADOOP_HOME/logs | |
hadoop.tmp.dir | $HADOOP_HOME/tmp | create manually, mode 777, owner hadoop:hadoop |
software | /opt/app/software | where the downloaded packages are stored |
Environment Preparation
Firewall (all 3 hosts, as root)
Check the firewall status (as root)
[root@hadoop001 ~]# firewall-cmd --state
not running
[root@hadoop001 ~]#
Stop the firewall
If the firewall is running, stop it:
[root@hadoop001 ~]# systemctl stop firewalld
Disable the firewall at boot
[root@hadoop001 ~]# systemctl disable firewalld
SELinux (all 3 hosts, as root)
Check the SELinux status
Use either the getenforce or the sestatus command:
[root@hadoop001 ~]# getenforce
Disabled
[root@hadoop001 ~]#
Disable SELinux
Temporarily (setenforce 0):
[root@hadoop001 ~]# setenforce 0 ### switch SELinux to permissive mode; this disables enforcement, but reverts after a reboot
[root@hadoop001 ~]# setenforce 1 ### switch SELinux back to enforcing mode; likewise lost on reboot
To make the change permanent, edit /etc/selinux/config:
[root@hadoop001 ~]# vim /etc/selinux/config
SELINUX=disabled
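Instead of editing the file by hand, the same change is often made with sed. A minimal sketch, run here against a temporary copy so nothing on the system is touched:

```shell
# Work on a throwaway copy of the config for illustration
cfg=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$cfg"

# Replace the SELINUX= line in place, exactly as you would on /etc/selinux/config
sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
grep '^SELINUX=' "$cfg"
```

On the real hosts the target would be /etc/selinux/config, and the change takes effect after a reboot.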
Create the hadoop User (all 3 hosts, as root)
Create the hadoop user as root
[root@hadoop001 ~]# useradd hadoop
Set the hadoop user's password to hadoop
[root@hadoop001 ~]# echo hadoop | passwd --stdin hadoop
The output looks like this:
[root@hadoop001 ~]# echo hadoop | passwd --stdin hadoop
Changing password for user hadoop.
passwd: all authentication tokens updated successfully.
[root@hadoop001 ~]#
Install lrzsz (all 3 hosts)
[root@hadoop001 ~]# yum -y install lrzsz
Create Directories (all 3 hosts, as root)
Create the directory as root
[root@hadoop001 ~]# mkdir -p /opt/app/software
Change the owner of /opt/app
[root@hadoop001 ~]# chown -R hadoop:hadoop /opt/app
Switch to the hadoop user (the '-' loads hadoop's environment and changes to its home directory)
[root@hadoop001 ~]# su - hadoop
[hadoop@hadoop001 ~]$
Configure the hosts File
Open /etc/hosts as root
[root@hadoop001 ~]# vim /etc/hosts
Add the following entries to the hosts file
172.26.239.216 hadoop001
172.26.239.214 hadoop002
172.26.239.215 hadoop003
Use scp to sync the file to the other two hosts
[root@hadoop001 ~]# scp /etc/hosts root@172.26.239.215:/etc/
[root@hadoop001 ~]# scp /etc/hosts root@172.26.239.214:/etc/
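The entries and the scp commands above can also be driven from a pair of parallel arrays, so the host list lives in one place. A sketch (the IPs are those from the host plan; the scp loop is commented out because it needs the live cluster):

```shell
# Host list for the cluster, kept in parallel arrays
ips=(172.26.239.216 172.26.239.214 172.26.239.215)
names=(hadoop001 hadoop002 hadoop003)

# Emit one /etc/hosts entry per host
for i in "${!ips[@]}"; do
  printf '%s %s\n' "${ips[$i]}" "${names[$i]}"
done

# On the real hosts, append the entries to /etc/hosts and push the file out, e.g.:
# for ip in "${ips[@]:1}"; do scp /etc/hosts "root@$ip:/etc/"; done
```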
SSH Mutual Trust
ssh-keygen creates a public key (Public Key) and a private key (Private Key): ssh-keygen -t [rsa|dsa], rsa by default
[hadoop@hadoop001 ~]$ ssh-keygen
ssh-copy-id copies the local host's public key into the remote host's authorized_keys file.
It also sets suitable permissions on the remote user's home directory, ~/.ssh, and ~/.ssh/authorized_keys.
[hadoop@hadoop001 ~]$ ssh-copy-id hadoop@hadoop001
[hadoop@hadoop001 ~]$ ssh-copy-id hadoop@hadoop002
[hadoop@hadoop001 ~]$ ssh-copy-id hadoop@hadoop003
Output:
[hadoop@hadoop001 ~]$ ssh-copy-id hadoop@hadoop002
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
The authenticity of host 'hadoop002 (172.26.239.214)' can't be established.
ECDSA key fingerprint is SHA256:WuDCjCFcqjYk/C4Wgop9M6rIbkmnE4gn6mEHMVnBcWk.
ECDSA key fingerprint is MD5:f5:1e:b4:52:47:19:d6:ce:2b:31:a0:b4:48:ee:d2:f2.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@hadoop002's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'hadoop@hadoop002'"
and check to make sure that only the key(s) you wanted were added.
[hadoop@hadoop001 ~]$
Log in from the client to each server and confirm that no password is prompted
[hadoop@hadoop001 ~]$ ssh hadoop@hadoop001 ls
[hadoop@hadoop001 ~]$ ssh hadoop@hadoop002 ls
[hadoop@hadoop001 ~]$ ssh hadoop@hadoop003 ls
Upload the Packages
As the hadoop user, upload the downloaded packages to /opt/app/software
Change to the software directory
[hadoop@hadoop001 ~]$ cd /opt/app/software
Upload the packages with the rz command
[hadoop@hadoop001 software]$ rz
Use scp to sync the packages to the other two hosts
[hadoop@hadoop001 ~]$ scp -r /opt/app/software/* hadoop@hadoop002:/opt/app/software/
[hadoop@hadoop001 ~]$ scp -r /opt/app/software/* hadoop@hadoop003:/opt/app/software/
Install the JDK
Create /usr/java as root
[root@hadoop001 ~]# mkdir /usr/java
Change the directory owner
[root@hadoop001 ~]# chown -R hadoop:hadoop /usr/java
Extract the JDK as the hadoop user
[hadoop@hadoop001 software]$ tar -zxvf /opt/app/software/jdk-8u151-linux-x64.tar.gz -C /usr/java/
Edit the environment variables as root
[root@hadoop001 ~]# vim /etc/profile
###ADD JDK environment
export JAVA_HOME=/usr/java/jdk1.8.0_151
export PATH=$JAVA_HOME/bin:$PATH
Apply the changes with source
[root@hadoop001 ~]# source /etc/profile
Verify the JDK
[hadoop@hadoop001 ~]$ java -version
java version "1.8.0_151"
Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)
[hadoop@hadoop001 ~]$
Sync to the other two hosts
[root@hadoop001 ~]# scp /etc/profile root@hadoop002:/etc/profile
[root@hadoop001 ~]# scp /etc/profile root@hadoop003:/etc/profile
Cluster Installation and Deployment
Install ZooKeeper
Extract ZooKeeper (all 3 hosts)
[hadoop@hadoop001 software]$ tar -zxvf /opt/app/software/zookeeper-3.4.5-cdh5.15.1.tar.gz -C /opt/app
Edit the configuration
Create /opt/app/zookeeper-3.4.5-cdh5.15.1/conf/zoo.cfg from the sample
[hadoop@hadoop001 ~]$ cd /opt/app/zookeeper-3.4.5-cdh5.15.1/conf
[hadoop@hadoop001 conf]$ cp zoo_sample.cfg zoo.cfg
Edit zoo.cfg
[hadoop@hadoop001 conf]$ vim zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
## This directory holds ZooKeeper's data and the myid file; if it does not exist it must be created manually.
dataDir=/opt/app/zookeeper-3.4.5-cdh5.15.1/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
## The number after 'server.' must match the value in that host's myid file, and each entry must map to the right hostname.
server.1=hadoop001:2888:3888
server.2=hadoop002:2888:3888
server.3=hadoop003:2888:3888
## Production clusters need at least 3 ZooKeeper servers; larger clusters can grow to a larger odd number.
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
Key parameters
## This directory holds ZooKeeper's data and the myid file; if it does not exist it must be created manually.
dataDir=/opt/app/zookeeper-3.4.5-cdh5.15.1/data
## The number after 'server.' must match the value in that host's myid file, and each entry must map to the right hostname.
## Production clusters need at least 3 ZooKeeper servers; larger clusters can grow to a larger odd number.
server.1=hadoop001:2888:3888
server.2=hadoop002:2888:3888
server.3=hadoop003:2888:3888
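Note that initLimit and syncLimit are counted in ticks, not milliseconds, so the effective timeouts follow from tickTime. A quick check of what this zoo.cfg actually allows:

```shell
# Values from the zoo.cfg above
tickTime=2000   # ms per tick
initLimit=10    # ticks a follower may take for its initial sync with the leader
syncLimit=5     # ticks a follower may lag behind before being dropped

echo "initial sync timeout:    $((tickTime * initLimit)) ms"
echo "sync/heartbeat timeout:  $((tickTime * syncLimit)) ms"
```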
Sync the configuration
[hadoop@hadoop001 conf]$ scp zoo.cfg hadoop002:/opt/app/zookeeper-3.4.5-cdh5.15.1/conf
[hadoop@hadoop001 conf]$ scp zoo.cfg hadoop003:/opt/app/zookeeper-3.4.5-cdh5.15.1/conf
Create the ZooKeeper data directory (all 3 hosts)
[hadoop@hadoop001 conf]$ mkdir /opt/app/zookeeper-3.4.5-cdh5.15.1/data
Write /opt/app/zookeeper-3.4.5-cdh5.15.1/data/myid on each host
[hadoop@hadoop001 conf]$ cd /opt/app/zookeeper-3.4.5-cdh5.15.1/data
[hadoop@hadoop001 data]$ echo 1 > /opt/app/zookeeper-3.4.5-cdh5.15.1/data/myid
[hadoop@hadoop001 data]$ ssh hadoop002 " echo 2 > /opt/app/zookeeper-3.4.5-cdh5.15.1/data/myid "
[hadoop@hadoop001 data]$ ssh hadoop003 " echo 3 > /opt/app/zookeeper-3.4.5-cdh5.15.1/data/myid "
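The three echo commands follow one rule: each host's myid is its 1-based position in the server.N list in zoo.cfg. A sketch of that mapping (printing instead of writing files or using ssh):

```shell
# Order must match server.1/server.2/server.3 in zoo.cfg
servers=(hadoop001 hadoop002 hadoop003)

for i in "${!servers[@]}"; do
  myid=$((i + 1))
  echo "${servers[$i]} -> myid=$myid"
  # on the real cluster: ssh "${servers[$i]}" "echo $myid > \$ZOOKEEPER_HOME/data/myid"
done
```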
Install Hadoop
Extract Hadoop (all 3 hosts)
[hadoop@hadoop001 ~]$ tar -zxvf /opt/app/software/hadoop-2.6.0-cdh5.15.1.tar.gz -C /opt/app
Configure the environment variables
[root@hadoop001 ~]# vim /etc/profile
###ADD JDK environment
export JAVA_HOME=/usr/java/jdk1.8.0_151
export PATH=$JAVA_HOME/bin:$PATH
###ADD ZOOKEEPER environment
export ZOOKEEPER_HOME=/opt/app/zookeeper-3.4.5-cdh5.15.1
export PATH=$ZOOKEEPER_HOME/bin:$ZOOKEEPER_HOME/sbin:$PATH
###ADD HADOOP environment
export HADOOP_HOME=/opt/app/hadoop-2.6.0-cdh5.15.1
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
###ADD CLASSPATH
CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$HADOOP_HOME/lib:$CLASSPATH
Sync /etc/profile to the other hosts
[root@hadoop001 ~]# scp /etc/profile root@hadoop002:/etc/profile
[root@hadoop001 ~]# scp /etc/profile root@hadoop003:/etc/profile
Edit the Configuration Files
hadoop-env.sh
[hadoop@hadoop001 hadoop]$ vim hadoop-env.sh
...
export JAVA_HOME="/usr/java/jdk1.8.0_151"
...
core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<!-- YARN needs fs.defaultFS to find the NameNode URI -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hdp</value>
</property>
<!--============================== Trash =========================================== -->
<property>
<!-- How often (minutes) the CheckPointer running on the NameNode creates a checkpoint from the Current folder; default 0 means it follows fs.trash.interval -->
<name>fs.trash.checkpoint.interval</name>
<value>0</value>
</property>
<property>
<!-- Minutes after which checkpoints under .Trash are deleted; the server-side setting takes priority over the client's; default 0 means never delete -->
<name>fs.trash.interval</name>
<value>1440</value>
</property>
<!-- Hadoop temp directory. hadoop.tmp.dir is the base setting that many other paths depend on; if hdfs-site.xml does not set the namenode/datanode storage locations, they default to subdirectories of this path -->
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/app/hadoop-2.6.0-cdh5.15.1/tmp</value>
</property>
<!-- ZooKeeper quorum -->
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
</property>
<!-- ZooKeeper session timeout, in milliseconds -->
<property>
<name>ha.zookeeper.session-timeout.ms</name>
<value>2000</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
<property>
<name>io.compression.codecs</name>
<value>org.apache.hadoop.io.compress.GzipCodec,
org.apache.hadoop.io.compress.DefaultCodec,
org.apache.hadoop.io.compress.BZip2Codec,
org.apache.hadoop.io.compress.SnappyCodec
</value>
</property>
</configuration>
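fs.trash.interval is measured in minutes, so the value above keeps deleted files recoverable for a full day before the trash checkpoint is purged:

```shell
trash_interval=1440   # minutes, from core-site.xml
echo "trash retention: $((trash_interval / 60)) hours"
```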
hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<!-- HDFS superuser group -->
<property>
<name>dfs.permissions.superusergroup</name>
<value>hadoop</value>
</property>
<!-- Enable WebHDFS -->
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/opt/app/hadoop-2.6.0-cdh5.15.1/data/dfs/name</value>
<description>Local directory where the NameNode keeps the name table (fsimage); change as needed</description>
</property>
<property>
<name>dfs.namenode.edits.dir</name>
<value>${dfs.namenode.name.dir}</value>
<description>Local directory where the NameNode keeps the transaction files (edits); change as needed</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/opt/app/hadoop-2.6.0-cdh5.15.1/data/dfs/data</value>
<description>Local directory where DataNodes keep blocks; change as needed</description>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<!-- Block size 256 MB (default 128 MB) -->
<property>
<name>dfs.blocksize</name>
<value>268435456</value>
</property>
<!--======================================================================= -->
<!-- HDFS high-availability configuration -->
<!-- The HDFS nameservice is hdp; it must match the value in core-site.xml -->
<property>
<name>dfs.nameservices</name>
<value>hdp</value>
</property>
<property>
<!-- NameNode IDs; this version supports at most two NameNodes -->
<name>dfs.ha.namenodes.hdp</name>
<value>nn1,nn2</value>
</property>
<!-- HDFS HA: dfs.namenode.rpc-address.[nameservice ID] - RPC address -->
<property>
<name>dfs.namenode.rpc-address.hdp.nn1</name>
<value>hadoop001:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.hdp.nn2</name>
<value>hadoop002:8020</value>
</property>
<!-- HDFS HA: dfs.namenode.http-address.[nameservice ID] - HTTP address -->
<property>
<name>dfs.namenode.http-address.hdp.nn1</name>
<value>hadoop001:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.hdp.nn2</name>
<value>hadoop002:50070</value>
</property>
<!--================== NameNode editlog sync ========================================== -->
<!-- Ensures the edit log survives a NameNode failure -->
<property>
<name>dfs.journalnode.http-address</name>
<value>0.0.0.0:8480</value>
</property>
<property>
<name>dfs.journalnode.rpc-address</name>
<value>0.0.0.0:8485</value>
</property>
<property>
<!-- JournalNode servers; the QuorumJournalManager stores the editlog on them -->
<!-- Format: qjournal://<host1:port1>;<host2:port2>;<host3:port3>/<journalId>; the port matches dfs.journalnode.rpc-address -->
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop001:8485;hadoop002:8485;hadoop003:8485/hdp</value>
</property>
<property>
<!-- Where JournalNodes store their data -->
<name>dfs.journalnode.edits.dir</name>
<value>/opt/app/hadoop-2.6.0-cdh5.15.1/data/dfs/jn</value>
</property>
<!--================== Client failover ================================================ -->
<property>
<!-- How DataNodes and clients pick out the active NameNode -->
<!-- Implementation class for automatic failover -->
<name>dfs.client.failover.proxy.provider.hdp</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!--================== NameNode fencing =============================================== -->
<!-- After a failover, prevents the stopped NameNode from coming back up and creating two active services -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<property>
<!-- Milliseconds after which fencing is considered to have failed -->
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
<!--================== NameNode automatic failover via ZKFC and ZooKeeper =============== -->
<!-- Enable ZooKeeper-based automatic failover -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- List of DataNodes permitted to connect to the NameNode -->
<property>
<name>dfs.hosts</name>
<value>/opt/app/hadoop-2.6.0-cdh5.15.1/etc/hadoop/slaves</value>
</property>
</configuration>
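dfs.blocksize is given in bytes; the value in the file is exactly 256 MB:

```shell
# 256 MB expressed in bytes, as required by dfs.blocksize
blocksize=$((256 * 1024 * 1024))
echo "$blocksize"
```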
yarn-env
[hadoop@hadoop001 hadoop]$ vim yarn-env.sh
...
export YARN_LOG_DIR="/opt/app/hadoop-2.6.0-cdh5.15.1/logs"
...
mapred-site.xml
[hadoop@hadoop001 hadoop]$ cd /opt/app/hadoop-2.6.0-cdh5.15.1/etc/hadoop
[hadoop@hadoop001 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[hadoop@hadoop001 hadoop]$ vim mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<!-- MapReduce application framework -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<!-- JobHistory Server ============================================================== -->
<!-- MapReduce JobHistory Server address; default port 10020 -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop001:10020</value>
</property>
<!-- MapReduce JobHistory Server web UI address; default port 19888 -->
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop001:19888</value>
</property>
<!-- Compress map-side output with Snappy -->
<property>
<name>mapreduce.map.output.compress</name>
<value>true</value>
</property>
<property>
<name>mapreduce.map.output.compress.codec</name>
<value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
</configuration>
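The point of the two compression properties above is that map output travels over the network during the shuffle, so trading some CPU for smaller payloads usually pays off. A rough local illustration of the effect, using gzip as a stand-in for Snappy (which is not a standalone CLI tool on most systems):

```shell
# Illustration only: compressing intermediate data shrinks shuffle traffic.
# Highly repetitive data compresses extremely well; real map output varies.
data=$(head -c 100000 /dev/zero | tr '\0' 'a')
raw=$(printf '%s' "$data" | wc -c)
compressed=$(printf '%s' "$data" | gzip -c | wc -c)
echo "raw=$raw bytes, gzip=$compressed bytes"
```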
yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<!-- NodeManager configuration ================================================= -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.localizer.address</name>
<value>0.0.0.0:23344</value>
<description>Address where the localizer IPC is.</description>
</property>
<property>
<name>yarn.nodemanager.webapp.address</name>
<value>0.0.0.0:23999</value>
<description>NM Webapp address.</description>
</property>
<!-- HA configuration =============================================================== -->
<!-- Resource Manager Configs -->
<property>
<name>yarn.resourcemanager.connect.retry-interval.ms</name>
<value>2000</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- Enable embedded automatic failover; in an HA setup it works with the ZKRMStateStore to handle fencing -->
<property>
<name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
<value>true</value>
</property>
<!-- Cluster name, so HA leader election is scoped to the right cluster -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yarn-cluster</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<!-- If set, the RM id must be given a different value on each ResourceManager node; optional, so left commented out
<property>
<name>yarn.resourcemanager.ha.id</name>
<value>rm2</value>
</property>
-->
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
<value>5000</value>
</property>
<!-- ZKRMStateStore configuration -->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
</property>
<property>
<name>yarn.resourcemanager.zk.state-store.address</name>
<value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
</property>
<!-- RPC address clients use to reach the RM (applications manager interface) -->
<property>
<name>yarn.resourcemanager.address.rm1</name>
<value>hadoop001:23140</value>
</property>
<property>
<name>yarn.resourcemanager.address.rm2</name>
<value>hadoop002:23140</value>
</property>
<!-- RPC address the AM uses to reach the RM (scheduler interface) -->
<property>
<name>yarn.resourcemanager.scheduler.address.rm1</name>
<value>hadoop001:23130</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.rm2</name>
<value>hadoop002:23130</value>
</property>
<!-- RM admin interface -->
<property>
<name>yarn.resourcemanager.admin.address.rm1</name>
<value>hadoop001:23141</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.rm2</name>
<value>hadoop002:23141</value>
</property>
<!-- RPC port NodeManagers use to reach the RM -->
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm1</name>
<value>hadoop001:23125</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm2</name>
<value>hadoop002:23125</value>
</property>
<!-- RM web application address -->
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>hadoop001:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>hadoop002:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.https.address.rm1</name>
<value>hadoop001:23189</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.https.address.rm2</name>
<value>hadoop002:23189</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log.server.url</name>
<value>http://hadoop001:19888/jobhistory/logs</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
<description>Minimum memory a single container can request; default 1024 MB</description>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>2048</value>
<description>Maximum memory a single container can request; default 8192 MB</description>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>2</value>
</property>
</configuration>
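With the memory settings above, each NodeManager offers 2048 MB, and the scheduler hands out containers between 1024 MB and 2048 MB, so a node can run at most two minimum-size containers at once:

```shell
nm_mem=2048      # yarn.nodemanager.resource.memory-mb
min_alloc=1024   # yarn.scheduler.minimum-allocation-mb
max_alloc=2048   # yarn.scheduler.maximum-allocation-mb

echo "max concurrent minimum-size containers per NM: $((nm_mem / min_alloc))"
echo "largest single container: $max_alloc MB"
```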
slaves
[hadoop@hadoop001 hadoop]$ vim slaves
hadoop001
hadoop002
hadoop003
Sync the configuration files (including the edited yarn-env.sh)
[hadoop@hadoop001 hadoop]$ scp core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml hadoop-env.sh yarn-env.sh slaves hadoop002:/opt/app/hadoop-2.6.0-cdh5.15.1/etc/hadoop
[hadoop@hadoop001 hadoop]$ scp core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml hadoop-env.sh yarn-env.sh slaves hadoop003:/opt/app/hadoop-2.6.0-cdh5.15.1/etc/hadoop
Create the temp directory (all 3 hosts)
[hadoop@hadoop001 hadoop-2.6.0-cdh5.15.1]$ mkdir /opt/app/hadoop-2.6.0-cdh5.15.1/tmp
[hadoop@hadoop001 hadoop-2.6.0-cdh5.15.1]$ chmod 777 /opt/app/hadoop-2.6.0-cdh5.15.1/tmp
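All of the local directories named in the configs (name, data, journal, logs, tmp) can be created in one pass. A sketch that uses a temporary base directory for illustration; on the cluster the base would be /opt/app/hadoop-2.6.0-cdh5.15.1:

```shell
# Hypothetical base; swap in $HADOOP_HOME on the real hosts
base=$(mktemp -d)

mkdir -p "$base"/data/dfs/{name,data,jn} "$base"/logs "$base"/tmp
chmod 777 "$base/tmp"   # hadoop.tmp.dir needs to be world-writable per the plan

find "$base" -mindepth 1 -type d | sort
```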
Start the Cluster Services
ZooKeeper
Start the ZooKeeper service (all 3 hosts)
[hadoop@hadoop001 ~]$ $ZOOKEEPER_HOME/bin/zkServer.sh start
JMX enabled by default
Using config: /opt/app/zookeeper-3.4.5-cdh5.15.1/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@hadoop001 ~]$
Check the ZooKeeper status (all 3 hosts)
[hadoop@hadoop001 ~]$ $ZOOKEEPER_HOME/bin/zkServer.sh status
JMX enabled by default
Using config: /opt/app/zookeeper-3.4.5-cdh5.15.1/bin/../conf/zoo.cfg
Mode: follower
[hadoop@hadoop001 ~]$
Check the ZooKeeper process with jps (all 3 hosts)
[hadoop@hadoop001 ~]$ jps
20483 QuorumPeerMain
20516 Jps
[hadoop@hadoop001 ~]$
Verify the ZooKeeper service
[hadoop@hadoop001 ~]$ $ZOOKEEPER_HOME/bin/zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 1] quit
Quitting...
[hadoop@hadoop001 ~]$
Start HDFS
Format the HDFS failover state znode in ZooKeeper
[hadoop@hadoop001 ~]$ hdfs zkfc -formatZK
Check the new znode in ZooKeeper
[hadoop@hadoop001 ~]$ $ZOOKEEPER_HOME/bin/zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, hadoop-ha]
[zk: localhost:2181(CONNECTED) 1] quit
Quitting...
[hadoop@hadoop001 ~]$
Start the JournalNode service (all 3 hosts)
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-journalnode-hadoop001.out
[hadoop@hadoop001 ~]$
Check the JournalNode process with jps (all 3 hosts)
[hadoop@hadoop001 ~]$ jps
21107 JournalNode
21156 Jps
20893 QuorumPeerMain
[hadoop@hadoop001 ~]$
Format and start the first NameNode (hadoop001)
[hadoop@hadoop001 ~]$ hdfs namenode -format ## format this node's NameNode metadata
Initialize the shared edits on the JournalNodes (an HA-only step)
[hadoop@hadoop001 ~]$ hdfs namenode -initializeSharedEdits
18/11/27 01:51:28 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
Re-format filesystem in QJM to [172.26.239.216:8485, 172.26.239.214:8485, 172.26.239.215:8485] ? (Y or N) Y
18/11/27 01:51:39 INFO namenode.FileJournalManager: Recovering unfinalized segments in /opt/app/hadoop-2.6.0-cdh5.15.1/data/dfs/name/current
18/11/27 01:51:39 INFO client.QuorumJournalManager: Starting recovery process for unclosed journal segments...
18/11/27 01:51:39 INFO client.QuorumJournalManager: Successfully started new epoch 1
18/11/27 01:51:39 INFO util.ExitUtil: Exiting with status 0
18/11/27 01:51:39 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop001/172.26.239.216
************************************************************/
[hadoop@hadoop001 ~]$
Start the NameNode service on this node
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-namenode-hadoop001.out
[hadoop@hadoop001 ~]$
Check the NameNode process with jps
[hadoop@hadoop001 ~]$ jps
21107 JournalNode
21350 Jps
21276 NameNode
20893 QuorumPeerMain
[hadoop@hadoop001 ~]
Format and start the second NameNode (hadoop002)
[hadoop@hadoop002 ~]$ hdfs namenode -bootstrapStandby # hadoop001 is already formatted; this copies its metadata to hadoop002
Start the NameNode service on this node (hadoop002)
[hadoop@hadoop002 ~]$ $HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-namenode-hadoop002.out
[hadoop@hadoop002 ~]$
Check the NameNode process with jps (hadoop002)
[hadoop@hadoop002 ~]$ jps
20690 QuorumPeerMain
20788 JournalNode
21017 Jps
20923 NameNode
[hadoop@hadoop002 ~]$
Start the DataNode services (from hadoop001)
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/hadoop-daemons.sh start datanode
Check the DataNode process with jps (all 3 hosts)
[hadoop@hadoop001 ~]$ jps
21857 Jps
21107 JournalNode
21766 DataNode
21276 NameNode
20893 QuorumPeerMain
[hadoop@hadoop001 ~]$
Start the ZooKeeperFailoverController (hadoop001 and hadoop002)
hadoop001
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/hadoop-daemon.sh start zkfc # run on every NameNode host
starting zkfc, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-zkfc-hadoop001.out
[hadoop@hadoop001 ~]$
hadoop002
[hadoop@hadoop002 ~]$ $HADOOP_HOME/sbin/hadoop-daemon.sh start zkfc # run on every NameNode host
starting zkfc, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-zkfc-hadoop002.out
[hadoop@hadoop002 ~]$
Check the DFSZKFailoverController process
[hadoop@hadoop001 ~]$ jps
21107 JournalNode
21766 DataNode
21276 NameNode
20893 QuorumPeerMain
21950 DFSZKFailoverController
22030 Jps
[hadoop@hadoop001 ~]$
Check the Web UI
Open http://39.98.44.126:50070 (hadoop001)
One NameNode is active and the other standby.
Open http://39.98.37.133:50070 (hadoop002)
Start YARN
Start YARN on hadoop001
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-resourcemanager-hadoop001.out
hadoop002: starting nodemanager, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-nodemanager-hadoop002.out
hadoop003: starting nodemanager, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-nodemanager-hadoop003.out
hadoop001: starting nodemanager, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-nodemanager-hadoop001.out
[hadoop@hadoop001 ~]$
Check the YARN processes with jps (all 3 hosts)
[hadoop@hadoop001 ~]$ jps
22624 Jps
21107 JournalNode
22212 ResourceManager
21766 DataNode
22310 NodeManager
21276 NameNode
20893 QuorumPeerMain
21950 DFSZKFailoverController
[hadoop@hadoop001 ~]
Start the standby ResourceManager on hadoop002
[hadoop@hadoop002 ~]$ $HADOOP_HOME/sbin/yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/yarn-hadoop-resourcemanager-hadoop002.out
[hadoop@hadoop002 ~]$
Check the ResourceManager process with jps
[hadoop@hadoop002 ~]$ jps
20690 QuorumPeerMain
20788 JournalNode
21908 Jps
21399 DFSZKFailoverController
20923 NameNode
21675 NodeManager
21117 DataNode
21853 ResourceManager
[hadoop@hadoop002 ~]$
Check the Web UI
Open http://39.98.44.126:8088 (hadoop001)
One ResourceManager is active and the other standby.
Open http://39.98.37.133:8088/cluster/cluster (hadoop002)
Start the JobHistory Server
Start the jobhistory service on hadoop001
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /opt/app/hadoop-2.6.0-cdh5.15.1/logs/mapred-hadoop-historyserver-hadoop001.out
[hadoop@hadoop001 ~]$
Check the JobHistoryServer process with jps
[hadoop@hadoop001 ~]$ jps
22785 Jps
21107 JournalNode
22212 ResourceManager
21766 DataNode
22310 NodeManager
22680 JobHistoryServer
21276 NameNode
20893 QuorumPeerMain
21950 DFSZKFailoverController
[hadoop@hadoop001 ~]$
Check the Web UI
Open the JobHistory web UI to view job status
http://39.98.44.126:19888(hadoop001)
Stop the Cluster Services
Stop the YARN Services
Stop the historyserver on hadoop001
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh stop historyserver
stopping historyserver
[hadoop@hadoop001 ~]$
Stop the YARN daemons from hadoop001
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/stop-yarn.sh # stops the resourcemanager on hadoop001 and all nodemanagers
stopping yarn daemons
stopping resourcemanager
hadoop003: stopping nodemanager
hadoop001: stopping nodemanager
hadoop002: stopping nodemanager
no proxyserver to stop
[hadoop@hadoop001 ~]$
Stop the resourcemanager on hadoop002
[hadoop@hadoop002 ~]$ $HADOOP_HOME/sbin/yarn-daemon.sh stop resourcemanager
stopping resourcemanager
[hadoop@hadoop002 ~]$
Stop the HDFS Services
Stop the namenode, datanode, journalnode, and zkfc services
[hadoop@hadoop001 ~]$ $HADOOP_HOME/sbin/stop-dfs.sh
18/11/27 03:32:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [hadoop001 hadoop002]
hadoop001: stopping namenode
hadoop002: stopping namenode
hadoop003: stopping datanode
hadoop001: stopping datanode
hadoop002: stopping datanode
Stopping journal nodes [hadoop001 hadoop002 hadoop003]
hadoop002: stopping journalnode
hadoop003: stopping journalnode
hadoop001: stopping journalnode
18/11/27 03:33:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping ZK Failover Controllers on NN hosts [hadoop001 hadoop002]
hadoop001: stopping zkfc
hadoop002: stopping zkfc
[hadoop@hadoop001 ~]$
Stop ZooKeeper
ZooKeeper (all 3 hosts)
[hadoop@hadoop001 ~]$ $ZOOKEEPER_HOME/bin/zkServer.sh stop
JMX enabled by default
Using config: /opt/app/zookeeper-3.4.5-cdh5.15.1/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
[hadoop@hadoop001 ~]$
Check with jps (all 3 hosts)
[hadoop@hadoop001 ~]$ jps
23881 Jps
[hadoop@hadoop001 ~]$