1. Install rsync first:
yum install rsync -y
2. Stop the firewall:
systemctl stop firewalld
Prevent the firewall from starting at boot:
systemctl disable firewalld
3. Install and configure the JDK
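The JDK step has no commands in this guide; a minimal sketch, assuming the Oracle JDK 8u161 tarball (the release that matches the JAVA_HOME used in step 7.1) has already been downloaded to the current directory. The tarball file name is an assumption; adjust it to the build you actually have.

```shell
# Sketch: unpack an already-downloaded JDK 8u161 tarball under /root/jdk,
# matching the JAVA_HOME used later (the file name is an assumption).
tarball=jdk-8u161-linux-x64.tar.gz
dest=/root/jdk
if [ -f "$tarball" ]; then
  mkdir -p "$dest"
  tar -zxf "$tarball" -C "$dest"   # yields /root/jdk/jdk1.8.0_161
else
  echo "download $tarball first, then re-run"
fi
```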
4. Download Hadoop
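A sketch of the download itself, using the Apache release archive. The URL pattern is the standard archive layout; the `wget` is left commented out so the snippet is safe to run as-is.

```shell
# Sketch: build the Apache archive URL for Hadoop 3.2.1 (the version used
# throughout this guide); uncomment the wget line to actually download.
ver=3.2.1
url="https://archive.apache.org/dist/hadoop/common/hadoop-${ver}/hadoop-${ver}.tar.gz"
echo "fetch: $url"
# wget "$url"
```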
5. Extract it under /root/hadoop (pick any directory you prefer; the paths below assume /root/hadoop):
mkdir -p /root/hadoop
tar -zxvf hadoop-3.2.1.tar.gz -C /root/hadoop/
6. Configure the Hadoop environment variables
6.1 Open the profile:
vim /etc/profile
6.2 Append at the end:
export HADOOP_HOME=/root/hadoop/hadoop-3.2.1
export PATH=$PATH:$HADOOP_HOME/bin
6.3 Reload /etc/profile so the changes take effect:
source /etc/profile
7. Configure Hadoop
7.1 Configure hadoop-env.sh
Open:
vi /root/hadoop/hadoop-3.2.1/etc/hadoop/hadoop-env.sh
Find the line "# The java implementation to use." and set the line below it to:
export JAVA_HOME=/root/jdk/jdk1.8.0_161
7.2 Edit the configuration file etc/hadoop/core-site.xml
Open:
vi /root/hadoop/hadoop-3.2.1/etc/hadoop/core-site.xml
Add the following (9000 is the NameNode RPC port; do not use 9870 here, which is the NameNode web UI port):
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.10.240:9000</value>
</property>
</configuration>
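The same file can be written non-interactively with a heredoc instead of vi. A sketch using a /tmp demo path (the real file lives under etc/hadoop/), with the host IP taken from this guide:

```shell
# Sketch: generate core-site.xml via a heredoc; the /tmp path is for
# demonstration only, point conf at the real etc/hadoop/core-site.xml.
conf=/tmp/core-site.xml
cat > "$conf" <<'EOF'
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.10.240:9000</value>
</property>
</configuration>
EOF
grep fs.defaultFS "$conf"
```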
7.3 Edit the configuration file etc/hadoop/hdfs-site.xml
Open:
vi /root/hadoop/hadoop-3.2.1/etc/hadoop/hdfs-site.xml
Add inside the <configuration> element:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
7.4 Test the Hadoop environment
cd /root/hadoop/hadoop-3.2.1
bin/hadoop version
Output like the following means Hadoop can run (this sample came from a 3.2.0 build; a 3.2.1 install reports its own version):
Hadoop 3.2.0
Source code repository https://github.com/apache/hadoop.git -r e97acb3bd8f3befd27418996fa5d4b50bf2e17bf
Compiled by sunilg on 2019-01-08T06:08Z
Compiled with protoc 2.5.0
From source with checksum d3f0795ed0d9dc378e2c785d3668f39
This command was run using /software/hadoop-3.2.0/share/hadoop/common/hadoop-common-3.2.0.jar
8. Check whether passwordless SSH login works
[root@localhost hadoop-3.2.1]# ssh localhost
Note: output like the following means passwordless login is NOT set up yet:
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:MJxZUIDNbbnlfxCU+l2usvsIsbc6/NTJ06j/TO4g8G0.
ECDSA key fingerprint is MD5:d1:8f:94:dd:80:e2:cf:6b:a7:45:74:e3:6b:2f:f2:0a.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
root@localhost's password:
Last login: Fri Jan 25 14:30:29 2019 from 192.168.114.1
8.1 Set up passwordless SSH login
[root@localhost ~]# ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
[root@localhost ~]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[root@localhost ~]# chmod 0600 ~/.ssh/authorized_keys
8.2 Check passwordless login again
[root@localhost hadoop-3.2.1]# ssh localhost
Note: output like the following means passwordless login now works:
Last login: Fri Jan 25 15:04:51 2019 from localhost
9. HDFS must be formatted before the first start:
cd /root/hadoop/hadoop-3.2.1
./bin/hdfs namenode -format
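The format command prints many INFO lines; success is indicated by a "successfully formatted" message near the end. A sketch that checks a saved log for that marker (the log line below is simulated so the snippet runs standalone):

```shell
# Sketch: verify a namenode-format log contains the success marker.
# The log content here is simulated; on a real node capture it with
#   ./bin/hdfs namenode -format 2>&1 | tee /tmp/format.log
log=/tmp/format.log
echo 'INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.' > "$log"
if grep -q "successfully formatted" "$log"; then
  echo "namenode format OK"
fi
```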
10. Under $HADOOP_HOME/sbin:
10.1 Edit sbin/start-dfs.sh and sbin/stop-dfs.sh
Add the following at the top of each file (below the shebang line):
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=root
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
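The edit in 10.1 can also be scripted. A sketch that inserts the four variables directly below the shebang, demonstrated against a stand-in file in /tmp (point `f` at the real $HADOOP_HOME/sbin/start-dfs.sh in practice):

```shell
# Sketch: insert the HDFS user variables after the shebang of start-dfs.sh.
# A /tmp stand-in file is created here for demonstration; use the real
# sbin script in practice.
f=/tmp/start-dfs.sh
printf '#!/usr/bin/env bash\necho original script body\n' > "$f"
tmp=$(mktemp)
{
  head -n 1 "$f"        # keep the shebang as the first line
  cat <<'EOF'
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=root
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
EOF
  tail -n +2 "$f"       # rest of the original script
} > "$tmp"
mv "$tmp" "$f"
head -n 2 "$f"          # shebang, then the first inserted variable
```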
10.2 Add the following at the top of start-yarn.sh and stop-yarn.sh:
#!/usr/bin/env bash
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
11. Configure YARN
11.1 Configure mapred-site.xml (Hadoop 3.x ships this file directly; the `cp mapred-site.xml.template mapred-site.xml` step from the 2.x docs is no longer needed)
cd /root/hadoop/hadoop-3.2.1/etc/hadoop/
vi mapred-site.xml
<configuration>
<!-- tell the MapReduce framework to use YARN -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
11.2 Configure yarn-site.xml
vi yarn-site.xml
<configuration>
<!-- reducers fetch map output via mapreduce_shuffle -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
11.3 Start the Hadoop services
Start (HDFS first, then YARN):
cd /root/hadoop/hadoop-3.2.1
./sbin/start-dfs.sh
./sbin/start-yarn.sh
Stop:
./sbin/stop-yarn.sh
./sbin/stop-dfs.sh
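Once the services are up, `jps` should list all five daemons. A sketch that checks a jps listing for them; the listing below is canned sample output so the snippet runs standalone (substitute `listing="$(jps)"` on a real node):

```shell
# Sketch: verify the expected Hadoop daemons appear in jps output.
# `listing` is canned output for demonstration; use listing="$(jps)" live.
listing='12001 NameNode
12002 DataNode
12003 SecondaryNameNode
12004 ResourceManager
12005 NodeManager
12006 Jps'
for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
  if echo "$listing" | grep -q "[0-9] $d$"; then
    echo "$d up"
  else
    echo "$d MISSING"
  fi
done
```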
Access the web UIs in a browser: http://192.168.10.240:9870 (NameNode) and http://192.168.10.240:8088 (ResourceManager).
Official Hadoop installation guide: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html