Detailed Steps to Set Up a Hadoop Environment on CentOS 7
1. Prerequisites:
Required files: 3 virtual machines, jdk-8u161-linux-x64.tar.gz, hadoop-2.7.4.tar.gz
1.1 CentOS image download address: https://mirrors.aliyun.com/centos/7/isos/x86_64/
1.2 JDK download address: https://www.oracle.com/cn/java/technologies/javase/javase-jdk8-downloads.html
1.3 Hadoop download address: https://archive.apache.org/dist/hadoop/common/
Here I am using CentOS-7-x86_64-Minimal-2009.iso, the minimal CentOS image.
I will skip the VM installation steps; once the VMs are installed, configure their networking.
2. Base environment:
2.1 Network configuration (there are many ways to do this; only one is covered here):
First host: master 192.168.11.31
Second host: slave1 192.168.11.32
Third host: slave2 192.168.11.33
The network configuration is identical on all three hosts except for the last octet of the IP address.
[root@master ~]# vi /etc/sysconfig/network-scripts/ifcfg-ens32
Add the IP address, gateway, and DNS, and change ONBOOT to yes, which makes the interface connect at boot.
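For reference, a minimal static configuration for the master node might look like the following (a sketch: the interface name ens32 matches the file name above, but the GATEWAY and DNS1 values are assumptions you must adapt to your network; change IPADDR per host):
TYPE=Ethernet
BOOTPROTO=static
NAME=ens32
DEVICE=ens32
ONBOOT=yes
IPADDR=192.168.11.31
NETMASK=255.255.255.0
GATEWAY=192.168.11.2
DNS1=192.168.11.2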
Then restart the network service:
[root@master ~]# systemctl restart network
Once this succeeds, the three hosts should be able to reach each other.
You can then connect to the VMs remotely; either SecureCRT or Xshell works as the client.
2.2 Basic configuration:
Note: the basic configuration must be performed on all three hosts!!
2.2.1 Set the hostnames:
All three hosts must be renamed, as follows:
First host: master
Second host: slave1
Third host: slave2
The command to change the hostname is:
[root@master ~]# hostnamectl set-hostname master
Run it on each of the three hosts with the matching name; afterwards, run bash to refresh the shell and the new hostname will appear.
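For example, on the second and third hosts respectively (run locally on each machine; the prompt still shows the old hostname until you refresh):
[root@localhost ~]# hostnamectl set-hostname slave1
[root@localhost ~]# hostnamectl set-hostname slave2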
2.2.2 Configure the hosts file:
Editing the hosts file means recording each host's IP address and hostname in it.
The command is:
[root@master ~]# vim /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.11.31 master
192.168.11.32 slave1
192.168.11.33 slave2
Append the last three IP/hostname lines shown above, then save and quit with :wq.
To test, just ping the hostnames directly.
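For example, from master (repeat for slave2, and from the slaves back to master):
[root@master ~]# ping -c 3 slave1
[root@master ~]# ping -c 3 slave2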
2.2.3 Disable the firewall:
The firewall must be stopped on all three hosts, and disabled from starting at boot.
The commands are:
[root@master ~]# systemctl stop firewalld
[root@master ~]# systemctl disable firewalld
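To confirm the firewall is stopped and will not start at boot:
[root@master ~]# systemctl is-active firewalld
inactive
[root@master ~]# systemctl is-enabled firewalld
disabled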
2.2.4 Disable SELinux:
This, too, must be done on all three hosts. setenforce 0 relaxes SELinux immediately; editing the config file makes the change permanent across reboots.
The commands are:
[root@master ~]# setenforce 0
[root@master ~]# vim /etc/sysconfig/selinux
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
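You can verify the current mode with getenforce; it reports Permissive right after setenforce 0, and Disabled after a reboot:
[root@master ~]# getenforce
Permissive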
Once this is done, the base configuration of all three VMs is complete!
Because I installed the minimal CentOS 7, which ships without a Java environment, we add one next. This step must also be performed on all three nodes.
3. Java environment:
Use your remote-connection tool to upload jdk-8u161-linux-x64.tar.gz to the VM, create a java directory under /usr/, and extract the archive into it.
The commands are:
[root@master ~]# mkdir /usr/java/
[root@master ~]# tar -xzvf jdk-8u161-linux-x64.tar.gz -C /usr/java/
After extracting, edit /etc/profile and append the Java environment variables:
[root@master ~]# vim /etc/profile
# /etc/profile
JAVA_HOME=/usr/java/jdk1.8.0_161
JRE_HOME=$JAVA_HOME/jre
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
export JAVA_HOME JRE_HOME CLASSPATH PATH
!!! Be careful with the path variable here, the JAVA_HOME= line: it must point to wherever your extracted JDK lives. Mine is /usr/java/jdk1.8.0_161.
When done, reload the profile and check the Java version to see whether the configuration succeeded:
[root@master ~]# source /etc/profile
[root@master ~]# java -version
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
If you see this output, the Java environment is configured!!
That was the master node; slave1 and slave2 need exactly the same setup.
We can simply send the JDK directory (and the profile file) to slave1 and slave2 with scp:
[root@master ~]# scp -r /usr/java/jdk1.8.0_161/ slave1:/usr/java/
[root@master ~]# scp -r /usr/java/jdk1.8.0_161/ slave2:/usr/java/
[root@master ~]# scp /etc/profile slave1:/etc/profile
[root@master ~]# scp /etc/profile slave2:/etc/profile
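After copying, load the new profile on each slave and verify there as well:
[root@slave1 ~]# source /etc/profile
[root@slave1 ~]# java -version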
Note: all three VMs must show this successful result before you continue!!
4. Configure passwordless SSH login:
Go into the ~/.ssh directory (if it does not exist yet, the ssh-keygen step below will create it):
[root@master ~]# cd ~/.ssh
On every machine, run ssh-keygen -t rsa, pressing Enter at every prompt:
[root@master .ssh]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:xa55ggm96lYAKZCjzwxsGFWFaRLn7olBaDRv0GfSwm8 root@master
The key's randomart image is:
+---[RSA 2048]----+
|o*=+++. |
|*.BB++ . |
|==.*B o |
|=+...E o |
|.=. +.. S . |
| ++ o.+ o |
| . o.+ + . |
| .. o |
| oo |
+----[SHA256]-----+
[root@master .ssh]#
This generates two files: a private key (id_rsa) and a public key (id_rsa.pub).
[root@master .ssh]# ls
authorized_keys id_rsa id_rsa.pub known_hosts
On master, run: cp id_rsa.pub authorized_keys
This enables passwordless login to the local machine.
Set the permissions on authorized_keys to 644:
[root@master .ssh]# chmod 644 authorized_keys
Restart the ssh service:
[root@master .ssh]# systemctl restart sshd
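At this point you can verify the local passwordless login (the very first connection asks you to confirm the host fingerprint; type yes):
[root@master .ssh]# ssh master
[root@master ~]# exit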
Set up passwordless login from master to the other nodes.
Distribute authorized_keys from master to each node (you will be prompted for that node's password; just enter it):
[root@master .ssh]# scp /root/.ssh/authorized_keys slave1:/root/.ssh
root@slave1's password:
authorized_keys 100% 393 419.7KB/s 00:00
[root@master .ssh]# scp /root/.ssh/authorized_keys slave2:/root/.ssh
root@slave2's password:
authorized_keys 100% 393 419.7KB/s 00:00
Test connecting from master to the other nodes:
[root@master .ssh]# ssh slave1
Last failed login: Tue Jan 5 04:49:47 CST 2021 from master on ssh:notty
There was 1 failed login attempt since the last successful login.
Last login: Tue Jan 5 04:40:17 2021 from 192.168.11.1
[root@slave1 ~]#
The test succeeded!!!
5. Install and configure Hadoop
5.1 Download hadoop-2.7.4.tar.gz
Download address: https://archive.apache.org/dist/hadoop/common/hadoop-2.7.4/
5.2 Upload to the master node and extract:
There are many ways to upload the archive, and I will not detail them here; whatever method got the JDK onto the machine works for Hadoop too. Extract the Hadoop archive into the /opt/hadoop/ directory:
[root@master ~]# mkdir /opt/hadoop/
[root@master ~]# tar -xzvf hadoop-2.7.4.tar.gz -C /opt/hadoop/
5.3 Edit the configuration files:
Go into the /opt/hadoop/hadoop-2.7.4/etc/hadoop/ directory and set JAVA_HOME in the two files hadoop-env.sh and yarn-env.sh:
export JAVA_HOME=/usr/java/jdk1.8.0_161
Save with :wq, then edit the remaining configuration files.
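If you prefer a non-interactive edit, appending the export line to each file also works, because the later definition wins when the script is sourced (a sketch, assuming your shell is in the etc/hadoop directory):
[root@master hadoop]# echo 'export JAVA_HOME=/usr/java/jdk1.8.0_161' >> hadoop-env.sh
[root@master hadoop]# echo 'export JAVA_HOME=/usr/java/jdk1.8.0_161' >> yarn-env.sh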
core-site.xml
[root@master hadoop]# vim core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/tmp</value>
  </property>
</configuration>
hdfs-site.xml
[root@master hadoop]# vim hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/opt/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/opt/hadoop/dfs/data</value>
  </property>
</configuration>
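The storage directories referenced above can be created up front if you like (formatting the NameNode and starting the DataNodes will otherwise create them on demand):
[root@master hadoop]# mkdir -p /opt/hadoop/tmp /opt/hadoop/dfs/name /opt/hadoop/dfs/data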
mapred-site.xml (Hadoop reads mapred-site.xml, not the .template file, so copy the template first):
[root@master hadoop]# cp mapred-site.xml.template mapred-site.xml
[root@master hadoop]# vim mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
</configuration>
yarn-site.xml
[root@master hadoop]# vim yarn-site.xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>master:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
  </property>
</configuration>
5.4 Edit the slaves file:
Clear the slaves file, then add the hostnames of the other two nodes:
[root@master hadoop]# vim slaves
slave1
slave2
5.5 Distribute Hadoop to the other nodes
scp the entire /opt/hadoop/ directory (containing hadoop-2.7.4/) to the slave1 and slave2 nodes:
[root@master ~]# scp -r /opt/hadoop/ slave1:/opt/hadoop/
[root@master ~]# scp -r /opt/hadoop/ slave2:/opt/hadoop/
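A quick sanity check that the copy landed where expected (passwordless SSH makes this a one-liner):
[root@master ~]# ssh slave1 ls /opt/hadoop
hadoop-2.7.4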
5.6 Start Hadoop!!!
Start Hadoop on the master server and the daemons on the slave nodes start automatically. Enter the /opt/hadoop/hadoop-2.7.4/ directory:
[root@master ~]# cd /opt/hadoop/hadoop-2.7.4/
(1) Format the NameNode: bin/hdfs namenode -format
(2) Start everything: sbin/start-all.sh
(3) Stop everything: sbin/stop-all.sh
(4) Run jps to see the related processes
(1) Format the NameNode
[root@master hadoop-2.7.4]# bin/hdfs namenode -format
21/01/05 05:28:07 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master/192.168.11.31
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.7.4
....
........
............
Re-format filesystem in Storage Directory /opt/hadoop/dfs/name ? (Y or N) Y
.......
............
................
21/01/05 05:29:31 INFO util.ExitUtil: Exiting with status 0
21/01/05 05:29:31 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.11.31
************************************************************/
(2) Start everything:
[root@master hadoop-2.7.4]# sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
.....
........
..........
(3) Check with jps
If the following processes appear, the Hadoop deployment is complete!
[root@master ~]# jps
11670 ResourceManager
11335 NameNode
11995 Jps
11519 SecondaryNameNode
[root@slave1 ~]# jps
11269 DataNode
11509 Jps
11368 NodeManager
[root@slave2 ~]# jps
11298 DataNode
11397 NodeManager
11529 Jps
At this point, running jps on each of the three nodes should produce the output shown above; if not, the installation failed. You can also check whether the Hadoop web pages load; if they do, the deployment is complete.
(1) Open http://192.168.11.31:8088/ in a browser (the YARN ResourceManager UI)
(2) Open http://192.168.11.31:50070/ in a browser (the HDFS NameNode UI)
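As a final smoke test, you can write a file into HDFS and list it back (the /test path here is an arbitrary example):
[root@master hadoop-2.7.4]# bin/hdfs dfs -mkdir -p /test
[root@master hadoop-2.7.4]# bin/hdfs dfs -put etc/hadoop/core-site.xml /test/
[root@master hadoop-2.7.4]# bin/hdfs dfs -ls /test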