1. Preparation
Install three CentOS 7 virtual machines
Install three CentOS 7 systems in the virtual machines; this part is omitted here.
Enable communication between the three virtual machines
Result: (screenshot omitted)
Set up passwordless SSH login between the three virtual machines
Run the command:
ssh-keygen -t dsa
Run ssh-keygen -t dsa on the other two virtual machines as well, and append the contents of each machine's id_dsa.pub to the authorized_keys file on the first virtual machine. Then copy the first machine's authorized_keys file to /root/.ssh/ on the other two virtual machines.
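As a runnable sketch of the key mechanics (using a scratch directory instead of the real /root/.ssh, and RSA rather than DSA, since recent OpenSSH releases disable DSA keys by default):

```shell
# Work in a scratch directory so this is safe to try anywhere;
# on the real machines these files live in /root/.ssh
KEYDIR=$(mktemp -d)

# Generate a key pair with no passphrase
ssh-keygen -q -t rsa -N "" -f "$KEYDIR/id_rsa"

# Appending a machine's public key to authorized_keys is what grants it
# passwordless login; on the cluster, all three id_*.pub files are appended
# to the first machine's authorized_keys, which is then copied back out
cat "$KEYDIR/id_rsa.pub" >> "$KEYDIR/authorized_keys"
chmod 600 "$KEYDIR/authorized_keys"
```

On the real machines, ssh-copy-id root@leader does the append (and fixes permissions) in one step.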
Result: (screenshot omitted)
Change the hostnames of the three virtual machines
Run:
vi /etc/hostname
Taking the namenode host as an example, change its hostname to leader.
On each of the three virtual machines, map the IP addresses to the hostnames:
vi /etc/hosts
(Tip: to look up a machine's IP address, run:
ifconfig
)
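For example (these 192.168.56.x addresses are placeholders; use whatever ifconfig reports on your machines), every machine's /etc/hosts would gain lines like:

```
192.168.56.101 leader
192.168.56.102 member1
192.168.56.103 member2
```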
Transferring files between the host and the VMs via a shared folder
A minimal CentOS install usually has no graphical desktop, so VMware Tools drag-and-drop is not a convenient way to move files between the physical host and the virtual machines. A better approach is VMware's shared-folder feature:
Right-click the tab of the virtual machine and choose Settings.
In the dialog that opens, go to Options -> Shared Folders, change the setting on the right from Disabled to Always enabled, then click Add, click Browse, and pick a folder on the physical host to share.
Then, inside CentOS, run:
vmware-hgfsclient
to list the shared folders.
Next, create a directory in the virtual machine and mount the share on it:
mkdir myshare
vmhgfs-fuse .host:/ myshare -o nonempty   # omitting -o nonempty may cause an error
Enter the directory you created and you will find the shared folder inside it, with the shared content.
The share is now working.
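If you want the share mounted automatically at every boot, an /etc/fstab entry is a common alternative to running vmhgfs-fuse by hand (the mount point /root/myshare is the directory created above; adjust to taste):

```
.host:/ /root/myshare fuse.vmhgfs-fuse defaults,allow_other 0 0
```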
Installing the Hadoop cluster
Installing the Java environment
First download a JDK and put it in the shared folder, then extract it to wherever you want to install it; mine goes under /usr/local/hadoop/java. From inside the shared folder, run:
mkdir -p /usr/local/hadoop/java
tar -zxf jdk1.8.0_161.tar.gz -C /usr/local/hadoop/java
Then edit the environment variables:
sudo vi /etc/profile
Append the following lines at the end of the file:
export JAVA_HOME=/usr/local/hadoop/java/jdk1.8.0_161
export JRE_HOME=/usr/local/hadoop/java/jdk1.8.0_161/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
Then reload it:
source /etc/profile
You can check that the installation succeeded by printing the Java version:
java -version
If the version information appears, everything is in order.
Installing Hadoop 3.3.0 on the three virtual machines
(1) Extract the archive and create directories (same steps on all three machines)
First enter the shared folder created earlier, then extract Hadoop to the chosen directory; my install location is /usr/local/hadoop.
mkdir -p /usr/local/hadoop
tar -zxf hadoop-3.3.0.tar.gz -C /usr/local/hadoop
Create a few directories:
cd /usr/local/hadoop
mkdir tmp
mkdir var
mkdir dfs
mkdir dfs/name
mkdir dfs/data
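The five mkdir calls above can be collapsed into one command, since mkdir -p creates parent directories as needed; sketched here under a scratch prefix so it can be tried anywhere:

```shell
# On the real machines BASE would simply be /usr/local/hadoop
BASE=$(mktemp -d)
mkdir -p "$BASE/tmp" "$BASE/var" "$BASE/dfs/name" "$BASE/dfs/data"
```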
(2) Edit the Hadoop configuration files
Go into /usr/local/hadoop/hadoop-3.3.0/etc/hadoop/:
cd /usr/local/hadoop/hadoop-3.3.0/etc/hadoop/
Edit core-site.xml (replace the addresses with your own; if you follow this tutorial exactly, you can skip that):
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/tmp</value>
<description>Abase for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://leader:9000</value>
</property>
</configuration>
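One note: fs.default.name is the deprecated Hadoop 1 name of this property. It still works in 3.3.0 (with a deprecation warning in the logs), but the current name is fs.defaultFS, so the same setting can equivalently be written as:

```xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://leader:9000</value>
</property>
```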
Edit hadoop-env.sh:
Add one line (adjust the path to your Java installation):
export JAVA_HOME=/usr/local/hadoop/java/jdk1.8.0_161
Edit hdfs-site.xml (replace the addresses with your own; if you follow this tutorial exactly, you can skip that):
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/usr/local/hadoop/dfs/name</value>
<description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>
</property>
<property>
<name>dfs.data.dir</name>
<value>/usr/local/hadoop/dfs/data</value>
<description>Comma-separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.permissions</name>
<value>true</value>
<description>Enable permission checking in HDFS.</description>
</property>
</configuration>
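Likewise, dfs.name.dir and dfs.data.dir are deprecated Hadoop 1 names; 3.3.0 accepts them but logs a warning. Their current equivalents are:

```xml
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop/dfs/data</value>
</property>
```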
Edit mapred-site.xml (replace the addresses with your own; if you follow this tutorial exactly, you can skip that):
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>leader:49001</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/usr/local/hadoop/var</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Edit the workers file:
On the first virtual machine (leader):
member1
member2
On the second virtual machine (member1):
leader
member2
On the third virtual machine (member2):
leader
member1
Blog address: https://blog.csdn.net/qq_44846166/article/details/111169529
Edit yarn-site.xml (replace the addresses with your own; if you follow this tutorial exactly, you can skip that):
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>leader</value>
</property>
<property>
<description>The address of the applications manager interface in the RM.</description>
<name>yarn.resourcemanager.address</name>
<value>${yarn.resourcemanager.hostname}:8032</value>
</property>
<property>
<description>The address of the scheduler interface.</description>
<name>yarn.resourcemanager.scheduler.address</name>
<value>${yarn.resourcemanager.hostname}:8030</value>
</property>
<property>
<description>The http address of the RM web application.</description>
<name>yarn.resourcemanager.webapp.address</name>
<value>${yarn.resourcemanager.hostname}:8088</value>
</property>
<property>
<description>The https address of the RM web application.</description>
<name>yarn.resourcemanager.webapp.https.address</name>
<value>${yarn.resourcemanager.hostname}:8090</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>${yarn.resourcemanager.hostname}:8031</value>
</property>
<property>
<description>The address of the RM admin interface.</description>
<name>yarn.resourcemanager.admin.address</name>
<value>${yarn.resourcemanager.hostname}:8033</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>1024</value>
<description>Maximum memory allocation per container, in MB; the default is 8192 MB.</description>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
</configuration>
(3) Start Hadoop:
Format the NameNode:
cd /usr/local/hadoop/hadoop-3.3.0/bin
./hadoop namenode -format
Start everything:
cd /usr/local/hadoop/hadoop-3.3.0/sbin
./start-all.sh
If this fails:
Add the following settings at the very beginning of both start-dfs.sh and stop-dfs.sh:
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=root
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
Note: they must go at the very beginning of the files. Then add the following at the beginning of both start-yarn.sh and stop-yarn.sh:
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=root
YARN_NODEMANAGER_USER=root
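Instead of editing the four start/stop scripts, the same user settings can be made once in etc/hadoop/hadoop-env.sh, which every launcher script sources:

```shell
# Run the HDFS and YARN daemons as root
# (same effect as editing start-dfs.sh/stop-dfs.sh/start-yarn.sh/stop-yarn.sh)
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
```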
Testing the installation
In a browser, open the leader host's IP address followed by :8088; you should see the YARN ResourceManager web UI.
Problems encountered
(1) No network access
Solution reference: CentOS-7,网络ping不通详解_pocher的博客-CSDN博客_centos7 ping不通
(2) Hadoop runs and port 8088 is reachable, but port 9870 is not
Solution references: 启动Hadoop start-dfs.sh Permission denied
and the corresponding answer on Stack Overflow
First check whether the firewall is off; the following commands stop and disable it:
systemctl stop firewalld.service
systemctl disable firewalld.service
Then run ./start-dfs.sh again:
cd /usr/local/hadoop/hadoop-3.3.0/sbin
./start-dfs.sh
When I ran this, I got an error (the screenshot has not survived).
The cause was that no public/private key pair had been generated for passwordless login.
Run:
ssh-keygen -t rsa
cd /root/.ssh
cp id_rsa.pub authorized_keys
After that, ./start-dfs.sh runs without the error.
Now we can check the listening ports:
Run:
netstat -tlunp
If startup succeeded, the relevant ports (9870 among them) will show as listening.
(To install the netstat command:
yum install net-tools
)
Then open the leader host's IP address plus port 9870 in a browser and you will see the HDFS web UI:
References:
Hadoop-3.3.0 安装_zhanghaoninhao的博客-CSDN博客
centos7虚拟机与主机共享文件夹 - christine-ting - 博客园 (cnblogs.com)
Hadoop集群搭建教程(详细)_fanxin_i的博客-CSDN博客_hadoop集群搭建完整教程
用三台虚拟机搭建Hadoop全分布集群 - 奇域巫师 - 博客园 (cnblogs.com)
centos7安装配置Hadoop集群_一只修炼成精的猴子的博客-CSDN博客_centos7安装hadoop