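Installing and Configuring a Distributed Hadoop Cluster

Hadoop Deployment

Cluster deployment plan

Do not install the NameNode and the SecondaryNameNode on the same server.
The ResourceManager is also memory-hungry, so do not place it on the same machine as the NameNode or the SecondaryNameNode.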

        cpu101           cpu102             cpu103
HDFS    NameNode                            SecondaryNameNode
        DataNode         DataNode           DataNode
YARN                     ResourceManager
        NodeManager      NodeManager        NodeManager
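
The plan above can also be written down as a small lookup, which is convenient later when comparing against the output of jps on each node. This is only a sketch: the function name expected_daemons is made up for illustration, and the daemon names are the ones listed in the table.

```shell
#!/bin/bash
# Hypothetical helper: print the daemons that the deployment plan
# places on a given host (host names as used in this article).
expected_daemons() {
    case "$1" in
        cpu101) echo "NameNode DataNode NodeManager" ;;
        cpu102) echo "ResourceManager DataNode NodeManager" ;;
        cpu103) echo "SecondaryNameNode DataNode NodeManager" ;;
        *)      echo "unknown host: $1" >&2; return 1 ;;
    esac
}
```

For example, expected_daemons cpu102 prints the three daemons planned for cpu102.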

Download URL:

https://archive.apache.org/dist/hadoop/common/hadoop-3.1.3/

Copy hadoop-3.1.3.tar.gz into the /opt/software directory.

Extract the archive into /opt/module (run this from /opt/software):

tar -zxvf hadoop-3.1.3.tar.gz -C /opt/module/

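Check that the extraction succeeded, for example by listing /opt/module and confirming that the hadoop-3.1.3 directory is there.

Add Hadoop to the environment variables

Open /etc/profile.d/my_env.sh and append the lines below:

sudo vim /etc/profile.d/my_env.sh

#HADOOP_HOME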
export HADOOP_HOME=/opt/module/hadoop-3.1.3
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin

Reload the profile so that the variables take effect in the current shell:

source /etc/profile
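
To confirm that the variables took effect, something like the following can be run in the same shell. It is a minimal sketch: check_hadoop_env is a hypothetical helper name, and it only inspects HADOOP_HOME and PATH without starting anything.

```shell
#!/bin/bash
# Hypothetical helper: verify that HADOOP_HOME is set and that its
# bin and sbin directories are on PATH.
check_hadoop_env() {
    if [ -z "${HADOOP_HOME:-}" ]; then
        echo "HADOOP_HOME is not set"
        return 1
    fi
    local dir
    for dir in "$HADOOP_HOME/bin" "$HADOOP_HOME/sbin"; do
        # Wrap PATH in colons so every entry can be matched as ":entry:"
        case ":$PATH:" in
            *":$dir:"*) echo "ok: $dir is on PATH" ;;
            *)          echo "missing from PATH: $dir"; return 1 ;;
        esac
    done
}
```

If the check passes, running hadoop version should print the installed release.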

Distribute the environment variable file to the other nodes:

sudo ~/bin/xsync /etc/profile.d/my_env.sh
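
Note that xsync is not a standard command. It is a user-written distribution script kept in ~/bin, and its contents are not shown in this article. As a rough idea of what such a script usually does, the sketch below prints the rsync commands it would run for each host instead of executing them. The host list and the function name xsync_plan are assumptions made for illustration.

```shell
#!/bin/bash
# Assumed host list, taken from the deployment plan in this article.
HOSTS="cpu101 cpu102 cpu103"

# Hypothetical dry-run helper: for every path given, print one rsync
# command per host that would copy it to the same absolute location.
xsync_plan() {
    local path dir name host
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "skip: $path does not exist" >&2
            continue
        fi
        # Resolve the parent directory to a physical absolute path
        dir=$(cd -P "$(dirname "$path")" && pwd)
        name=$(basename "$path")
        for host in $HOSTS; do
            echo "rsync -av $dir/$name $host:$dir"
        done
    done
}
```

Calling xsync_plan /etc/profile.d/my_env.sh prints three rsync commands, one per host, which can be reviewed before running anything for real.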

Common port numbers

Port                                        Hadoop 2.x     Hadoop 3.x
NameNode internal communication port        8020 / 9000    8020 / 9000 / 9820
NameNode HTTP UI                            50070          9870
MapReduce job status (YARN web UI) port     8088           8088
History server web port                     19888          19888

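Configuring the cluster

Commonly used configuration files (located in $HADOOP_HOME/etc/hadoop)

                               Hadoop 2.x         Hadoop 3.x
Core configuration file        core-site.xml      core-site.xml
HDFS configuration file        hdfs-site.xml      hdfs-site.xml
YARN configuration file        yarn-site.xml      yarn-site.xml
MapReduce configuration file   mapred-site.xml    mapred-site.xml
Worker node list               slaves             workers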

Core configuration file core-site.xml

vim core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <!-- Address of the NameNode -->
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://cpu101:8020</value>
        </property>

        <!-- Base directory for Hadoop data -->
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/opt/module/hadoop-3.1.3/data</value>
        </property>

        <!-- Static user for the HDFS web UI: cpu -->
        <property>
                <name>hadoop.http.staticuser.user</name>
                <value>cpu</value>
        </property>

        <!-- Hosts from which the superuser cpu may act as a proxy -->
        <property>
                <name>hadoop.proxyuser.cpu.hosts</name>
                <value>*</value>
        </property>

        <!-- Groups whose members the superuser cpu may impersonate -->
        <property>
                <name>hadoop.proxyuser.cpu.groups</name>
                <value>*</value>
        </property>

        <!-- Users the superuser cpu may impersonate -->
        <property>
                <name>hadoop.proxyuser.cpu.users</name>
                <value>*</value>
        </property>

</configuration>

HDFS configuration file hdfs-site.xml

vim hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <!-- NameNode web UI address -->
        <property>
                <name>dfs.namenode.http-address</name>
                <value>cpu101:9870</value>
        </property>

        <!-- SecondaryNameNode web UI address -->
        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>cpu103:9868</value>
        </property>

        <!-- Directory where the NameNode stores HDFS metadata -->
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:///opt/module/hadoop-3.1.3/dfs/name</value>
        </property>

        <!-- Directory where DataNodes store HDFS blocks -->
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:///opt/module/hadoop-3.1.3/dfs/data</value>
        </property>

        <!-- Clients connect to DataNodes by hostname rather than IP -->
        <property>
                <name>dfs.client.use.datanode.hostname</name>
                <value>true</value>
        </property>

        <!-- Replication factor of 1, for test environments only -->
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
</configuration>

YARN configuration file yarn-site.xml

vim yarn-site.xml

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>

<!-- Site specific YARN configuration properties -->
        <!-- Use the MapReduce shuffle auxiliary service -->
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>

        <!-- Host that runs the ResourceManager -->
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>cpu102</value>
        </property>

        <!-- Environment variables that containers inherit from the NodeManager -->
        <property>
                <name>yarn.nodemanager.env-whitelist</name>
                <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
        </property>

        <!-- Enable log aggregation -->
        <property>
                <name>yarn.log-aggregation-enable</name>
                <value>true</value>
        </property>

        <!-- URL of the log server (the job history server) -->
        <property>
                <name>yarn.log.server.url</name>
                <value>http://cpu102:19888/jobhistory/logs</value>
        </property>

        <!-- Keep aggregated logs for 7 days -->
        <property>
                <name>yarn.log-aggregation.retain-seconds</name>
                <value>604800</value>
        </property>

        <!-- Whether a thread checks the physical memory each task uses and kills the task if it exceeds its allocation. Default: true -->
        <property>
                <name>yarn.nodemanager.pmem-check-enabled</name>
                <value>false</value>
        </property>

        <!-- Whether a thread checks the virtual memory each task uses and kills the task if it exceeds its allocation. Default: true -->
        <property>
                <name>yarn.nodemanager.vmem-check-enabled</name>
                <value>false</value>
        </property>

        <!--
        <property>
                <name>yarn.scheduler.maximum-allocation-mb</name>
                <value>9000</value>
                <description>Maximum memory a single task may request. Default: 8192 MB</description>
        </property>
        -->
        <!--
        <property>
                <name>yarn.scheduler.minimum-allocation-mb</name>
                <value>3072</value>
                <description>Minimum memory allocated to a single task</description>
        </property>
        -->

        <!--
        <property>
                <name>yarn.nodemanager.vmem-pmem-ratio</name>
                <value>3</value>
                <description>Ratio of virtual memory to physical memory</description>
        </property>
        -->

</configuration>

MapReduce configuration file mapred-site.xml

vim mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <!-- Run MapReduce jobs on YARN -->
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>

        <!-- Location of the MapReduce installation for the ApplicationMaster -->
        <property>
                <name>yarn.app.mapreduce.am.env</name>
                <value>HADOOP_MAPRED_HOME=/opt/module/hadoop-3.1.3</value>
        </property>

        <!-- Location of the MapReduce installation for map tasks -->
        <property>
                <name>mapreduce.map.env</name>
                <value>HADOOP_MAPRED_HOME=/opt/module/hadoop-3.1.3</value>
        </property>

        <!-- Location of the MapReduce installation for reduce tasks -->
        <property>
                <name>mapreduce.reduce.env</name>
                <value>HADOOP_MAPRED_HOME=/opt/module/hadoop-3.1.3</value>
        </property>

        <!-- Job history server address -->
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>cpu102:10020</value>
        </property>

        <!-- Job history server web UI address -->
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>cpu102:19888</value>
        </property>
</configuration>

Configure workers

vim workers

cpu101
cpu102
cpu103

Entries in this file must not have trailing spaces, and the file must not contain blank lines.
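
Trailing spaces and blank lines are easy to miss in an editor, so it can help to check the file mechanically before distributing it. The following is a small sketch using grep; check_workers is a hypothetical helper name, not something shipped with Hadoop.

```shell
#!/bin/bash
# Hypothetical helper: report lines in a workers file that end in
# whitespace or are empty. Returns 0 only if the file is clean.
check_workers() {
    local file="$1" bad=0
    # grep -n prints the offending lines with their line numbers
    if grep -n '[[:space:]]$' "$file"; then
        echo "error: trailing whitespace on the lines above"
        bad=1
    fi
    if grep -n '^$' "$file"; then
        echo "error: blank lines at the positions above"
        bad=1
    fi
    [ "$bad" -eq 0 ] && echo "ok: $file looks clean"
    return "$bad"
}
```

Run it from the configuration directory as check_workers workers.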

Distribute Hadoop to the other nodes:

xsync /opt/module/hadoop-3.1.3/

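Starting the cluster

If the cluster is being started for the first time, format the NameNode on cpu101. If the cluster has been started before, first stop all NameNode and DataNode processes from the previous run, then delete the data and logs directories on every node before formatting. Formatting generates a new cluster ID, and DataNodes that still hold data from the old one will fail to register with the new NameNode. With the hdfs-site.xml shown earlier, the NameNode and DataNode data live under dfs/name and dfs/data, so those directories need to be cleared as well.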

hdfs namenode -format
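
If the NameNode ever has to be reformatted, the cleanup described above (stop the daemons, delete old data and logs, then format) could be scripted roughly as follows. This is a dry-run sketch that only prints the commands. The host names and directories are taken from the configuration in this article (hadoop.tmp.dir, the dfs directories from hdfs-site.xml, and the logs directory under the install path), and the function name reformat_plan is hypothetical. Running the printed commands destroys everything stored in HDFS.

```shell
#!/bin/bash
# Values assumed from the configuration files in this article.
HADOOP_DIR=/opt/module/hadoop-3.1.3
HOSTS="cpu101 cpu102 cpu103"

# Hypothetical dry-run helper: print, in order, the commands needed
# to stop the cluster, wipe old state on every node, and reformat.
reformat_plan() {
    local host
    echo "ssh cpu102 $HADOOP_DIR/sbin/stop-yarn.sh"
    echo "ssh cpu101 $HADOOP_DIR/sbin/stop-dfs.sh"
    for host in $HOSTS; do
        echo "ssh $host rm -rf $HADOOP_DIR/data $HADOOP_DIR/dfs $HADOOP_DIR/logs"
    done
    echo "ssh cpu101 $HADOOP_DIR/bin/hdfs namenode -format"
}
```

Printing the plan first makes it possible to review each step before executing anything by hand.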

Cluster start/stop script

vim myhadoop.sh

#!/bin/bash

# Require exactly one action argument: start or stop
if [ $# -lt 1 ]
then
        echo "No Args Input..."
        exit 1
fi

case "$1" in
"start")
        echo " =================== Starting the Hadoop cluster ==================="

        echo " --------------- Starting HDFS ---------------"
        ssh cpu101 "/opt/module/hadoop-3.1.3/sbin/start-dfs.sh"
        echo " --------------- Starting YARN ---------------"
        ssh cpu102 "/opt/module/hadoop-3.1.3/sbin/start-yarn.sh"
        echo " --------------- Starting the history server ---------------"
        ssh cpu102 "/opt/module/hadoop-3.1.3/bin/mapred --daemon start historyserver"
;;
"stop")
        echo " =================== Stopping the Hadoop cluster ==================="

        # Stop in the reverse order of startup
        echo " --------------- Stopping the history server ---------------"
        ssh cpu102 "/opt/module/hadoop-3.1.3/bin/mapred --daemon stop historyserver"
        echo " --------------- Stopping YARN ---------------"
        ssh cpu102 "/opt/module/hadoop-3.1.3/sbin/stop-yarn.sh"
        echo " --------------- Stopping HDFS ---------------"
        ssh cpu101 "/opt/module/hadoop-3.1.3/sbin/stop-dfs.sh"
;;
*)
        echo "Input Args Error..."
        exit 1
;;
esac

Make the script executable:

chmod 777 myhadoop.sh

Start the cluster (this assumes myhadoop.sh is in a directory on PATH, such as ~/bin):

myhadoop.sh start

Stop the cluster:

myhadoop.sh stop

Web UIs

HDFS NameNode web UI:

http://cpu101:9870/

Job history web UI:

http://cpu102:19888/jobhistory

SecondaryNameNode web UI:

http://cpu103:9868/status.html

YARN All Applications web UI:

http://cpu102:8088
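
Instead of opening each page in a browser, the four addresses above can be probed from a terminal. The sketch below is one way to do that with curl, assuming the host names resolve from the machine where it runs. The function names ui_urls and check_uis are hypothetical.

```shell
#!/bin/bash
# The four web UI addresses listed in this article.
ui_urls() {
    cat <<'EOF'
http://cpu101:9870/
http://cpu102:8088
http://cpu102:19888/jobhistory
http://cpu103:9868/status.html
EOF
}

# Hypothetical helper: print the HTTP status code returned by each UI.
check_uis() {
    local url code
    ui_urls | while read -r url; do
        # -s: silent, -o /dev/null: discard the body,
        # -w: print only the status code, --max-time: give up after 5 s
        code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")
        echo "$code $url"
    done
}
```

Calling check_uis after the cluster is up prints one line per UI, with the status code first.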

Testing MapReduce

Create an input directory in HDFS:

hadoop fs -mkdir /input

Upload a file:

hadoop fs -put word.txt /input

Delete the output directory in HDFS if it already exists (the job fails if the output path is already present):

hadoop fs -rm -r /output

Run the wordcount example (from the /opt/module/hadoop-3.1.3 directory, since the jar path is relative):

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output
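
When the job finishes, the counts are written to part files under /output (with a single reducer this is typically /output/part-r-00000), which can be printed with hadoop fs -cat. For a small input it is easy to cross-check the result against a local pipeline. The sketch below assumes whitespace-separated words, which is how the example job tokenizes its input. The name local_wordcount is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count whitespace-separated words on stdin and
# print "word<TAB>count" lines sorted by word.
local_wordcount() {
    tr -s '[:space:]' '\n' |    # put every word on its own line
        sed '/^$/d' |           # drop any empty line left at the start
        LC_ALL=C sort |         # byte-order sort so equal words are adjacent
        uniq -c |               # prefix each distinct word with its count
        awk '{ printf "%s\t%s\n", $2, $1 }'
}
```

For example, local_wordcount < word.txt should list the same words and counts as the job output for the same file.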
