Hadoop集群搭建

操作系统:CentOS Stream release 8

Hadoop版本:3.3.6

集群角色规划

Hadoop集群组成:

集群名称集群角色
HDFS集群NameNode、DataNode、SecondaryNameNode
YARN集群ResourceManager、NodeManager

集群划分:

节点角色
node1.hadoop.studyNameNode、DataNode、ResourceManager、NodeManager
node2.hadoop.studySecondaryNameNode、DataNode、NodeManager
node3.hadoop.studyDataNode、NodeManager

环境准备

创建3台虚拟机,操作系统为CentOS8:

分别修改每台虚拟机的/etc/hostname文件,分别对应node1.hadoop.study、node2.hadoop.study、node3.hadoop.study:

修改/etc/hosts文件,将以下内容追加到文件末尾:

192.168.80.131 node1.hadoop.study
192.168.80.132 node2.hadoop.study
192.168.80.133 node3.hadoop.study

 设置ssh免密登录,这样node1可以直接连接到node1(自己也要)、node2和node3:

ssh-keygen 

ssh-copy-id root@node1.hadoop.study

ssh-copy-id root@node2.hadoop.study

ssh-copy-id root@node3.hadoop.study

安装 oracle jdk8:

oracle jdk8下载地址Java Downloads | Oracle,同rpm的方式安装,所以下载这个

安装

rpm -ivh jdk-8u401-linux-x64.rpm

验证

 关闭防火墙:

systemctl stop firewalld.service

systemctl disable firewalld.service

安装ntpdate,集群节点之间时间同步

rpm -ivh http://mirrors.wlnmp.com/centos/wlnmp-release-centos.noarch.rpm

yum install -y  wntp

ntpdate time1.cloud.tencent.com

上传编译好的hadoop压缩包到node1,并解压

解压命令

tar -zxvf hadoop-3.3.6.tar.gz

 修改etc/hadoop/core-site.xml

文件内容:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
<!-- name node的地址-->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://node1.hadoop.study</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/root/data/hadoop-3.3.6</value>    
  </property>
  <property>
    <name>hadoop.http.staticuser.user</name>
    <value>root</value>    
  </property>
</configuration>

修改etc/hadoop/hdfs-site.xml

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>node2.hadoop.study:9868</value>
  </property>
</configuration>

修改etc/hadoop/yarn-site.xml

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>

<!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>node1.hadoop.study</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>128</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>1024</value>
  </property>
  
</configuration>

修改etc/hadoop/mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>

  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>	
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
</configuration>

修改etc/hadoop/hadoop-env.sh

在文件末尾追加

export JAVA_HOME=/usr/lib/jvm/jdk-1.8-oracle-x64

export HADOOP_HOME=/root/hadoop-3.3.6

export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root

修改etc/hadoop/workers

先删除原来的内容,再添加以下内容

node1.hadoop.study
node2.hadoop.study
node3.hadoop.study

将整个hadoop目录发送到node2和node3

scp -r hadoop-3.3.6 node2.hadoop.study:/root

scp -r hadoop-3.3.6 node3.hadoop.study:/root

 首次启动 HDFS 时,必须对其进行格式化

在NameNode节点上执行,也就是node1节点:

cd hadoop-3.3.6/bin/
./hdfs namenode -format

启动hdfs集群和yarn集群 

在node1节点上操作

cd /root/hadoop-3.3.6/sbin

./start-all.sh

 

验证:

访问web ui

需要修改window的hosts文件,或者直接用NameNode的ip和ResourceManager的ip进行访问

PS: 如果无法访问,可能是代理的问题

http://node1.hadoop.study:9870/

http://node1.hadoop.study:8088/

NameNodehttp://nn_host:port/Default HTTP port is 9870.
ResourceManagerhttp://rm_host:port/Default HTTP port is 8088.

 

将hadoop目录下的bin和sbin添加的path中,方便使用

在.bash_profile中添加如下内容

source .bash_profile 使其生效

  • 10
    点赞
  • 10
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值