Environment versions
CentOS version: 7
Hadoop version: 3.1.2
Java version: 1.8
I. Install the JDK
1. Download the JDK 1.8 archive from Oracle's site:
https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
2. Upload and extract the archive
Create a java directory under /usr/local to hold the JDK:
[root@localhost ~]# mkdir /usr/local/java
Upload the archive to /usr/local/java using a tool such as WinSCP (Windows) or FinalShell (macOS).
Extract the archive:
[root@localhost java]# tar -zxvf jdk-8u201-linux-x64.tar.gz
3. Configure the JDK environment variables
Edit the system profile:
[root@localhost java]# vim /etc/profile
Append the following lines at the end of the file:
export JAVA_HOME=/usr/local/java/jdk1.8.0_201
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
Reload the profile so the new variables take effect:
[root@localhost ~]# source /etc/profile
Run java -version to verify the JDK is configured correctly:
[root@localhost java]# java -version
java version "1.8.0_201"
Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)
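The version check above can also be scripted. A minimal sketch (the parsing logic is an illustration, not part of the original guide; it works on a sample string here so it runs anywhere):

```shell
#!/usr/bin/env bash
# Extract the quoted version string from `java -version`-style output.
# A sample string is used here; on a real host, replace with:
#   out="$(java -version 2>&1)"
out='java version "1.8.0_201"
Java(TM) SE Runtime Environment (build 1.8.0_201-b09)'

# The version is the quoted token on the first line.
version="$(printf '%s\n' "$out" | head -n1 | sed 's/.*"\(.*\)"/\1/')"
echo "Detected Java version: $version"

case "$version" in
  1.8.*) echo "OK: JDK 1.8 detected" ;;
  *)     echo "WARNING: expected JDK 1.8, got $version" ;;
esac
```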
II. Configure SSH
1. Check the SSH installation status
[root@localhost java]# rpm -qa|grep ssh
openssh-clients-7.4p1-11.el7.x86_64
openssh-7.4p1-11.el7.x86_64
openssh-server-7.4p1-11.el7.x86_64
libssh2-1.4.3-10.el7_2.1.x86_64
2. Other SSH operations (optional)
Install SSH:
[root@localhost ~]# yum install openssh-server
Start the SSH service:
[root@localhost ~]# service sshd start
Stop the SSH service:
[root@localhost ~]# service sshd stop
Restart the SSH service:
[root@localhost ~]# service sshd restart
(On CentOS 7 these service commands are forwarded to systemd; systemctl start/stop/restart sshd is the native equivalent.)
3. Configure passwordless SSH login
Generate the key pair:
[root@localhost ~]# ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
Append the id_rsa.pub public key to the authorized_keys file:
[root@localhost ~]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Verify that it works:
[root@localhost ~]# ssh localhost
Last login: Wed Apr 3 16:31:52 2019 from ::1
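If `ssh localhost` still prompts for a password after the steps above, the usual culprit is file permissions: sshd ignores authorized_keys unless ~/.ssh is mode 700 and the key file is mode 600. A hedged sketch, demonstrated on a throwaway directory rather than the real ~/.ssh:

```shell
#!/usr/bin/env bash
# Demonstrate the permissions sshd requires, on a temporary directory.
# For the real setup, apply the same chmods to ~/.ssh and ~/.ssh/authorized_keys.
demo="$(mktemp -d)/.ssh"
mkdir -p "$demo"
touch "$demo/authorized_keys"

chmod 700 "$demo"                   # directory: owner-only access
chmod 600 "$demo/authorized_keys"   # key file: owner read/write only

stat -c '%a %n' "$demo" "$demo/authorized_keys"
```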
III. Install and configure Hadoop
1. Download Hadoop
(1) Download the archive
[root@localhost ~]# mkdir /usr/local/hadoop
Method 1:
https://hadoop.apache.org/releases.html
Choose version 3.1.2, follow the binary download link to fetch the archive, then upload the file to the server.
Method 2:
Download directly into /usr/local/hadoop:
[root@localhost ~]# cd /usr/local/hadoop
[root@localhost hadoop]# wget http://mirrors.shu.edu.cn/apache/hadoop/common/hadoop-3.1.2/hadoop-3.1.2.tar.gz
(2) Extract the archive
[root@localhost hadoop]# tar zxvf hadoop-3.1.2.tar.gz
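Before extracting, it is worth verifying the download against the SHA-512 checksum that Apache publishes alongside each release (this step is an addition to the original guide). The sketch below demonstrates the mechanics on a stand-in file so it runs anywhere; for the real archive, fetch the published hadoop-3.1.2.tar.gz.sha512 instead of generating one locally:

```shell
#!/usr/bin/env bash
# Sketch of checksum verification before extracting a downloaded tarball.
# Demonstrated on a stand-in file; in reality, download the matching
# hadoop-3.1.2.tar.gz.sha512 from an Apache mirror rather than creating it.
set -e
cd "$(mktemp -d)"
echo "pretend this is hadoop-3.1.2.tar.gz" > hadoop-3.1.2.tar.gz

sha512sum hadoop-3.1.2.tar.gz > hadoop-3.1.2.tar.gz.sha512  # stand-in for the published file
sha512sum -c hadoop-3.1.2.tar.gz.sha512                     # prints "hadoop-3.1.2.tar.gz: OK" on success
```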
2. Configuration files
(1) Configure the hosts file
Check the server's IP address:
[root@localhost ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
link/ether 00:16:3e:68:c7:bf brd ff:ff:ff:ff:ff:ff
inet 211.68.36.67/24 brd 211.68.36.255 scope global dynamic eth0
valid_lft 59020sec preferred_lft 59020sec
inet6 fe80::42e2:ee0e:8cda:488a/64 scope link
valid_lft forever preferred_lft forever
Check the current hostname:
[root@localhost ~]# hostname
localhost.localdomain
Edit the hosts file:
[root@localhost ~]# vim /etc/hosts
Add the IP-to-hostname mapping to the hosts file:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
211.68.36.67 localhost.localdomain
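The hosts edit above can be scripted idempotently, so rerunning the setup does not append duplicate lines. A sketch, demonstrated on a temporary file (for the real file, set hosts=/etc/hosts and substitute your own IP and hostname):

```shell
#!/usr/bin/env bash
# Idempotent hosts-entry append, demonstrated on a temporary copy.
hosts="$(mktemp)"
entry="211.68.36.67 localhost.localdomain"

add_entry() {
  # -x: match the whole line; -F: literal string, so dots are not wildcards
  grep -qxF "$entry" "$hosts" || echo "$entry" >> "$hosts"
}

add_entry
add_entry   # second call is a no-op
grep -cF "211.68.36.67" "$hosts"   # the entry appears exactly once
```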
(2) Configure hadoop-env.sh
[root@localhost hadoop]# vim hadoop-3.1.2/etc/hadoop/hadoop-env.sh
Set JAVA_HOME at the indicated place in the file:
export JAVA_HOME=/usr/local/java/jdk1.8.0_201
(3) Configure core-site.xml
Edit the file:
[root@localhost hadoop]# vim hadoop-3.1.2/etc/hadoop/core-site.xml
Add the following to core-site.xml:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/tmp</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost.localdomain:9000/</value>
<description>NameNode URI</description>
</property>
</configuration>
(4) Configure hdfs-site.xml
Edit the file:
[root@localhost hadoop]# vim hadoop-3.1.2/etc/hadoop/hdfs-site.xml
Add the following to hdfs-site.xml (dfs.namenode.name.dir and dfs.datanode.data.dir are the current property names; the older dfs.name.dir and dfs.data.dir are deprecated):
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop/hdfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
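The name and data directories referenced in hdfs-site.xml (and the hadoop.tmp.dir from core-site.xml) can be created up front so their locations and ownership are under your control. A sketch, demonstrated under a temporary base directory (for the real setup, use base=/usr/local/hadoop):

```shell
#!/usr/bin/env bash
# Pre-create the NameNode/DataNode storage directories referenced in
# hdfs-site.xml, plus the hadoop.tmp.dir from core-site.xml.
# Demonstrated under a temporary base so the sketch is safe to run anywhere.
base="$(mktemp -d)"
mkdir -p "$base/hdfs/name" "$base/hdfs/data" "$base/tmp"
ls "$base/hdfs"
```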
(5) Configure mapred-site.xml
Edit the file:
[root@localhost hadoop]# vim hadoop-3.1.2/etc/hadoop/mapred-site.xml
Add the following to mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
(6) Configure yarn-site.xml
Edit the file:
[root@localhost hadoop]# vim hadoop-3.1.2/etc/hadoop/yarn-site.xml
Add the following to yarn-site.xml:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>localhost.localdomain:8088</value>
</property>
</configuration>
IV. Start the services
1. Format the NameNode
[root@localhost hadoop]# cd /usr/local/hadoop/hadoop-3.1.2
[root@localhost hadoop]# bin/hdfs namenode -format
2. Start Hadoop
[root@localhost hadoop]# cd /usr/local/hadoop/hadoop-3.1.2
[root@localhost hadoop]# sbin/start-all.sh
3. Check the Hadoop processes
[root@localhost hadoop]# jps
18467 ResourceManager
15845 NameNode
18662 NodeManager
16024 DataNode
16328 SecondaryNameNode
31069 Jps
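Checking that all five daemons are up can be scripted by parsing jps-style output. A sketch (it uses the sample output from above so it runs anywhere; on a real host, replace the variable with the live `jps` output):

```shell
#!/usr/bin/env bash
# Check that the expected Hadoop daemons appear in `jps` output.
# Sample output is used here; on a real host, replace with: out="$(jps)"
out='18467 ResourceManager
15845 NameNode
18662 NodeManager
16024 DataNode
16328 SecondaryNameNode
31069 Jps'

for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
  # -w matches whole words, so "NameNode" does not match inside "SecondaryNameNode"
  if printf '%s\n' "$out" | grep -qw "$daemon"; then
    echo "running: $daemon"
  else
    echo "MISSING: $daemon"
  fi
done
```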
V. Possible problems
1. When starting Hadoop with sbin/start-all.sh, you may see errors like the following:
Starting namenodes on [master]
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
Starting datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
Starting secondary namenodes [slave1]
ERROR: Attempting to operate on hdfs secondarynamenode as root
Solution:
In the Hadoop sbin directory (/usr/local/hadoop/hadoop-3.1.2/sbin),
add the following lines at the top of both start-dfs.sh and stop-dfs.sh:
#!/usr/bin/env bash
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
Add the following at the top of start-yarn.sh and stop-yarn.sh:
#!/usr/bin/env bash
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
After making these changes, rerun sbin/start-all.sh and the startup should succeed.
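An alternative that avoids editing the sbin scripts is to export the same daemon-user variables in etc/hadoop/hadoop-env.sh, which Hadoop 3 sources on startup. A sketch, demonstrated on a temporary file (for the real setup, append these lines to hadoop-3.1.2/etc/hadoop/hadoop-env.sh instead):

```shell
#!/usr/bin/env bash
# Alternative fix: export the daemon user variables in hadoop-env.sh
# rather than patching start-dfs.sh / start-yarn.sh.
# Demonstrated on a temporary file so the sketch is safe to run anywhere.
env_file="$(mktemp)"
cat >> "$env_file" <<'EOF'
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
EOF
grep -c '^export' "$env_file"   # confirms all five exports were written
```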