I'm setting up a Flink cluster computing environment and taking notes along the way. Like Spark, Flink has three deployment modes: Local, Standalone Cluster, and YARN Cluster. This article covers building the environment for YARN Cluster mode, and finishes by running a simple demo that counts word occurrences in a file, to prove the YARN cluster actually works.


First, a brief introduction to YARN (adapted from material found online):
YARN is a resource scheduling framework: a general-purpose resource management system that provides unified resource management and scheduling for upper-layer applications. Compute frameworks such as Spark, Flink, and Storm can all be integrated with YARN. They then share cluster-wide resource scheduling, which raises overall cluster utilization; this is what "xxx on YARN" refers to. For that reason, most companies schedule their compute jobs on YARN (many now also run on Kubernetes) rather than building a separate resource allocation and management system for each compute framework.


1. Configure passwordless SSH between the 3 machines: run the key generation on each machine and copy the key to the other machines

[root@kafka3 ~]# ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
[root@kafka3 ~]# ssh-copy-id -p 22 -i ~/.ssh/id_rsa.pub "root@kafka1"
[root@kafka3 ~]# ssh-copy-id -p 22 -i ~/.ssh/id_rsa.pub "root@kafka2"
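
Since every node needs to reach every other node without a password (the HA fencing configured later relies on this), a small loop run on each of the 3 machines saves repeating the ssh-copy-id calls by hand. This is only a sketch assuming the hostnames used in this article:

for host in kafka1 kafka2 kafka3; do
  ssh-copy-id -p 22 -i ~/.ssh/id_rsa.pub "root@${host}"   # copying the key to the local node as well is harmless
done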

2. Configure the hosts file

Add hosts mappings for node-1, node-2, node-3:
[root@kafka1 opt]# cat /etc/hosts
10.0.83.71  kafka1 node-1
10.0.83.72  kafka2 node-2
10.0.83.73  kafka3 node-3

Copy the updated hosts file to the other 2 nodes:
[root@kafka1 opt]# scp /etc/hosts node-2:/etc/
[root@kafka1 opt]# scp /etc/hosts node-3:/etc/
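
A quick sanity check that the aliases resolve and that passwordless SSH works between nodes (hostnames as mapped above):

for h in node-1 node-2 node-3; do ping -c 1 "$h"; done
ssh node-2 hostname    # should log in without a password prompt and print the remote hostname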

3. Configure JAVA_HOME on each of the 3 machines

First locate the Java installation directory:

which java

ls -lrt /usr/bin/java

[root@kafka1 opt]# ls -lrt /etc/alternatives/java
lrwxrwxrwx. 1 root root 73 Dec  5 09:37 /etc/alternatives/java -> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.312.b07-2.el8_5.x86_64/jre/bin/java

Configure /etc/profile so the environment variables take effect.

vim /etc/profile and append the following at the end of the file:

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.312.b07-2.el8_5.x86_64
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
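
Then reload the profile and confirm the variables took effect on each node:

source /etc/profile
echo $JAVA_HOME
java -version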

4. Hadoop cluster deployment

The Hadoop cluster depends on a ZooKeeper cluster, so prepare one first; for reference see:

https://blog.51cto.com/mapengfei/4752656

Download Hadoop:
https://archive.apache.org/dist/hadoop/common/hadoop-3.3.1/

cd /opt
wget https://archive.apache.org/dist/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz

tar zxvf hadoop-3.3.1.tar.gz

Edit the HDFS configuration file hdfs-site.xml:

[root@kafka1 hadoop]# cat /opt/hadoop-3.3.1/etc/hadoop/hdfs-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
 <property>
    <name>dfs.nameservices</name>
    <value>vmcluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.vmcluster</name>
    <value>nn1,nn2,nn3</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.vmcluster.nn1</name>
    <value>node-1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.vmcluster.nn2</name>
    <value>node-2:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.vmcluster.nn3</name>
    <value>node-3:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.vmcluster.nn1</name>
    <value>node-1:9870</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.vmcluster.nn2</name>
    <value>node-2:9870</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.vmcluster.nn3</name>
    <value>node-3:9870</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://node-1:8485;node-2:8485;node-3:8485/vmcluster</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.vmcluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>100</value>
  </property>
  <property>
    <name>dfs.safemode.threshold.pct</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/hadoop-3.3.1/data/jn</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://${hadoop.tmp.dir}/dfs/nn</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file://${hadoop.tmp.dir}/dfs/dn</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>                                        
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <property>                                        
    <name>dfs.blocksize</name>
    <value>67108864</value>
  </property>

</configuration>

core-site.xml configuration:

[root@kafka1 hadoop]# cat /opt/hadoop-3.3.1/etc/hadoop/core-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
 <property>
    <name>fs.defaultFS</name>
    <value>hdfs://vmcluster</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>node-1:2181,node-2:2181,node-3:2181</value>
  </property>
  <property>
    <name>hadoop.http.staticuser.user</name>
    <value>root</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-3.3.1/data</value>
  </property>

</configuration>

yarn-site.xml configuration:

[root@kafka1 hadoop]# cat /opt/hadoop-3.3.1/etc/hadoop/yarn-site.xml 
<?xml version="1.0"?>
<configuration>
 <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarnCluster</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.connect.retry-interval.ms</name>
    <value>2000</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>node-1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>node-2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>node-1:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>node-2:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>node-1:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>node-2:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>node-1:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>node-2:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>node-1:2181,node-2:2181,node-3:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log.server.url</name>
    <value>http://node-3:19888/jobhistory/logs</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk.state-store.address</name>
    <value>node-1:2181,node-2:2181,node-3:2181</value>
  </property>
  <property>
    <name>yarn.application.classpath</name>
    <value>
      $HADOOP_CONF_DIR,
      $HADOOP_COMMON_HOME/share/hadoop/common/*,
      $HADOOP_COMMON_HOME/share/hadoop/common/lib/*,
      $HADOOP_HDFS_HOME/share/hadoop/hdfs/*,
      $HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,
      $HADOOP_YARN_HOME/share/hadoop/yarn/*,
      $HADOOP_YARN_HOME/share/hadoop/yarn/lib/*
    </value>
  </property>
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.job.queue.name</name>
    <value>root.myqueue</value>
  </property>

</configuration>

mapred-site.xml configuration:

[root@kafka1 hadoop]# cat /opt/hadoop-3.3.1/etc/hadoop/mapred-site.xml 
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>node-3:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>node-3:19888</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=/opt/hadoop-3.3.1</value>
  </property>
  <property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=/opt/hadoop-3.3.1</value>
  </property>
  <property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=/opt/hadoop-3.3.1</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
  </property>

</configuration>

Edit the Hadoop startup environment file hadoop-env.sh

and append the following at the end:

[root@kafka1 hadoop]# vim /opt/hadoop-3.3.1/etc/hadoop/hadoop-env.sh
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_JOURNALNODE_USER=root
export HDFS_ZKFC_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.312.b07-2.el8_5.x86_64

Copy the modified Hadoop distribution and configuration to the other 2 nodes:

[root@kafka1 opt]# scp -r hadoop-3.3.1 node-2:/opt/
[root@kafka1 opt]# scp -r hadoop-3.3.1 node-3:/opt/

5. Startup and initialization

[root@kafka1 opt]# /opt/hadoop-3.3.1/bin/hdfs --daemon start journalnode  # run this on every JournalNode node
WARNING: /opt/hadoop-3.3.1/logs does not exist. Creating.

After it finishes, check the JournalNode log in the logs directory under HADOOP_HOME to make sure it started cleanly.
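
For example, confirm the JournalNode process is up and tail its log (the exact log file name depends on the user and hostname; the one below assumes the daemon was started as root on kafka1):

jps | grep JournalNode
tail -n 50 /opt/hadoop-3.3.1/logs/hadoop-root-journalnode-kafka1.log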

[root@kafka1 opt]#  /opt/hadoop-3.3.1/bin/hdfs namenode -format # format HDFS; only needs to run on one of the NameNode machines

[root@kafka1 ~]# /opt/hadoop-3.3.1/bin/hdfs --daemon start namenode   # start the NameNode
After it finishes, check the NameNode log in the logs directory under HADOOP_HOME to make sure it started cleanly.

# Sync the formatted metadata from the first NameNode; run this on the remaining NameNode nodes:

[root@kafka2 ~]#  /opt/hadoop-3.3.1/bin/hdfs namenode -bootstrapStandby
[root@kafka3 ~]#  /opt/hadoop-3.3.1/bin/hdfs namenode -bootstrapStandby

[root@kafka1 opt]#  /opt/hadoop-3.3.1/bin/hdfs zkfc -formatZK
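
hdfs zkfc -formatZK creates a znode for the nameservice under /hadoop-ha in ZooKeeper. To confirm it succeeded, a quick look with the ZooKeeper CLI works; the zkCli.sh path below is an assumption and depends on where ZooKeeper is installed:

/opt/zookeeper/bin/zkCli.sh -server node-1:2181 ls /hadoop-ha   # should list [vmcluster]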

[root@kafka1 opt]#  /opt/hadoop-3.3.1/sbin/stop-dfs.sh

## If the commands above fail with: Error JAVA_HOME is not set and could not be found,
edit /opt/hadoop-3.3.1/etc/hadoop/hadoop-env.sh
and append a line: export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.312.b07-2.el8_5.x86_64
[root@kafka1 opt]#  /opt/hadoop-3.3.1/sbin/start-dfs.sh
[root@kafka1 opt]#  /opt/hadoop-3.3.1/sbin/start-yarn.sh

# Start the history server on the node configured for it (node-3)
[root@kafka3 ~]#  /opt/hadoop-3.3.1/bin/mapred --daemon start historyserver
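
Before opening the web UIs, it is worth confirming that the daemons are up and that HA roles were actually assigned. A quick check using the service IDs configured earlier (nn1/nn2/nn3 and rm1/rm2):

jps    # on each node, the HDFS/YARN daemons expected for that node should be listed
/opt/hadoop-3.3.1/bin/hdfs haadmin -getServiceState nn1    # one of nn1/nn2/nn3 should report active, the others standby
/opt/hadoop-3.3.1/bin/yarn rmadmin -getServiceState rm1    # rm1 or rm2 should report active
/opt/hadoop-3.3.1/bin/yarn node -list                      # lists the registered NodeManagers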

Hadoop web UI:
![image.png](https://note.youdao.com/yws/res/12833/WEBRESOURCEe4bd9d8926d388a951a2824a42f38796)

YARN web UI:

![image.png](https://note.youdao.com/yws/res/12835/WEBRESOURCE85df89a9b4822b9deaa63fe2dee3fe10)
----

## 6. Run the bundled MapReduce wordcount example to verify everything works
[root@kafka1 hadoop-3.3.1]# /opt/hadoop-3.3.1/bin/hdfs dfs -mkdir /input

# Create a word.txt on the server for testing
[root@kafka1 ~]# cat word.txt 
a
as
asd
asdfasdf
qwer

# Upload it to HDFS
[root@kafka1 ~]# /opt/hadoop-3.3.1/bin/hadoop fs -moveFromLocal word.txt /input

# Run the wordcount job
[root@kafka1 ~]# /opt/hadoop-3.3.1/bin/hadoop jar /opt/hadoop-3.3.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar wordcount /input/word.txt /output/
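
Once the job finishes, the counts are written to /output; the reducer output file is normally named part-r-00000:

[root@kafka1 ~]# /opt/hadoop-3.3.1/bin/hdfs dfs -ls /output
[root@kafka1 ~]# /opt/hadoop-3.3.1/bin/hdfs dfs -cat /output/part-r-00000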

The running job as seen in the YARN UI:

(screenshot)

The output files on HDFS:

(screenshot)

Note: a minimal install of CentOS 7 (the default selection in the installer) does not include the psmisc package, and without it Hadoop HA cannot fail over properly. Install it with yum install psmisc -y and then restart.

For reference, psmisc provides pstree, killall and fuser:

pstree: displays running processes as a tree;

killall: kills processes by the given name;

fuser: shows which processes are using a given file, file system, or socket.
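
The sshfence method configured in hdfs-site.xml logs in to the failed NameNode over SSH and relies on fuser to kill the process bound to its port, which is why the package has to be present. A quick way to check and install it on every node:

rpm -q psmisc || yum install -y psmisc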