Operating system: Red Hat Linux 7, 64-bit
I. Passwordless SSH login
1.1 Check whether passwordless (key-based) login already works
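A minimal sketch of such a check, assuming an OpenSSH client is installed. The function name can_login_without_password and the host name node1 in the usage note are hypothetical. With BatchMode=yes, ssh fails instead of prompting for a password, so the exit status shows whether key-based login works.

```shell
#!/bin/sh
# Return 0 if key-based login to the given host works without a password prompt.
# BatchMode=yes disables password prompts; ConnectTimeout bounds the wait in seconds.
can_login_without_password() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}
```

For example, `can_login_without_password node1 && echo "passwordless login works"`. If the check fails, a key pair is typically created with ssh-keygen and installed on the target host with ssh-copy-id.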
1.2 CentOS does not enable passwordless SSH login by default. Uncomment the following two lines in /etc/ssh/sshd_config. This must be done on every server.
#RSAAuthentication yes
#PubkeyAuthentication yes
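The edit can also be scripted. The following is a sketch only: the function name enable_pubkey_auth is hypothetical, and it assumes the two lines appear in the file exactly in the commented form shown above.

```shell
#!/bin/sh
# Uncomment the RSAAuthentication and PubkeyAuthentication lines in an sshd_config file.
# The result goes through a temporary file, so no in-place sed option is needed.
enable_pubkey_auth() {
    conf="$1"
    tmp="$(mktemp)" || return 1
    sed -E 's/^#[[:space:]]*((RSAAuthentication|PubkeyAuthentication)[[:space:]]+yes)/\1/' "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"   # cat keeps the original file's owner and mode
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

It would be run as root on each server, for example `enable_pubkey_auth /etc/ssh/sshd_config`, followed by a restart of the sshd service so that the change is picked up.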
Last login: Thu Oct 20 15:47:22 2016 from 192.168.0.100
II. Install the JDK
1. Edit /etc/profile and append the following lines at the end of the file:
export JAVA_HOME=/usr/java/jdk1.8.0_131
export PATH=$JAVA_HOME/bin:$PATH
# Alternative, once HADOOP_HOME is set:
# export PATH=.:$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH
2. Apply the environment variables by running the following command in a terminal:
#source /etc/profile
3. After that, run echo $PATH to check the result.
4. Verify the installation:
# java -version
******************************************
III. Install Hadoop
1. Download Hadoop 3.0.0
Download URL: http://mirrors.hust.edu.cn/apache/hadoop/common/stable2/hadoop-3.0.0.tar.gz
2. Extract and install
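1) Copy hadoop-3.0.0.tar.gz to the /usr/hadoop directory, then extract it:
# tar -xzvf hadoop-3.0.0.tar.gz
The extracted directory is /usr/hadoop/hadoop-3.0.0.
Hadoop is usable as soon as it is extracted. Run the following commands to check that it works; on success they print the Hadoop version information: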
cd /usr/hadoop/hadoop-3.0.0
./bin/hadoop version
2) Under /usr/hadoop/, create the tmp, hdfs/name, and hdfs/data directories with the following commands:
#mkdir /usr/hadoop/tmp
#mkdir /usr/hadoop/hdfs
#mkdir /usr/hadoop/hdfs/data
#mkdir /usr/hadoop/hdfs/name
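The four commands above can be collapsed into one call. A small sketch, with make_hadoop_dirs as a hypothetical helper name; mkdir -p creates missing parent directories and does not fail if a directory already exists, so it is safe to rerun.

```shell
#!/bin/sh
# Create the tmp, hdfs/name and hdfs/data directories under a base directory.
make_hadoop_dirs() {
    base="$1"
    mkdir -p "$base/tmp" "$base/hdfs/name" "$base/hdfs/data"
}
```

For the layout used here the call would be `make_hadoop_dirs /usr/hadoop`.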
3) Set the environment variables: # vi /etc/profile
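The lines to add are not listed in this step. As an assumption based on the extraction path above and on the $HADOOP_HOME reference in the JDK section, they would look roughly like the sketch below; including sbin in PATH is an extra convenience that is not taken from the original.

```shell
# Assumed additions to /etc/profile (not from the original text).
# HADOOP_HOME matches the directory Hadoop was extracted to.
export HADOOP_HOME=/usr/hadoop/hadoop-3.0.0
# bin holds the hadoop and hdfs commands, sbin holds the start and stop scripts.
export PATH="$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH"
```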
4) Apply the environment variables by running the following command in a terminal:
#source /etc/profile
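On CentOS, use # source ~/.bash_profile instead.
3. Hadoop configuration
Go to the $HADOOP_HOME/etc/hadoop directory and configure hadoop-env.sh and the other files. The configuration files involved are: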
hadoop-3.0.0/etc/hadoop/hadoop-env.sh
hadoop-3.0.0/etc/hadoop/yarn-env.sh
hadoop-3.0.0/etc/hadoop/core-site.xml
hadoop-3.0.0/etc/hadoop/hdfs-site.xml
hadoop-3.0.0/etc/hadoop/mapred-site.xml
hadoop-3.0.0/etc/hadoop/yarn-site.xml
1) Configure hadoop-env.sh
2) Configure yarn-env.sh
3) Configure core-site.xml
Add the following configuration:
4) Configure hdfs-site.xml
Add the following configuration:
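The configuration listing itself is not reproduced in this section. As a hedged sketch of what the hdfs-site.xml part might contain, the helper below (write_hdfs_site is a hypothetical name) points the NameNode and DataNode storage at the hdfs/name and hdfs/data directories created earlier. The property names dfs.namenode.name.dir and dfs.datanode.data.dir are standard HDFS settings; the values are an assumption inferred from the directory layout, not the author's actual file.

```shell
#!/bin/sh
# Write a minimal hdfs-site.xml into the given configuration directory.
# The storage paths are an assumption based on the directories created under /usr/hadoop.
write_hdfs_site() {
    conf_dir="$1"
    base="${2:-/usr/hadoop}"
    cat > "$conf_dir/hdfs-site.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://$base/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file://$base/hdfs/data</value>
  </property>
</configuration>
EOF
}
```

A call such as `write_hdfs_site "$HADOOP_HOME/etc/hadoop"` overwrites any existing hdfs-site.xml, so back the file up first if it already holds settings.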
5) Configure mapred-site.xml
Add the following configuration:
6) Configure yarn-site.xml
Add the following configuration:
4. Start Hadoop
1) Format the NameNode.
On success you will see "successfully formatted" and "Exiting with status 0"; "Exiting with status 1" indicates an error.
2017-05-27 12:08:50,684 INFO common.Storage: Storage directory /usr/hadoop/hdfs/name has been successfully formatted.
2017-05-27 12:08:50,712 INFO namenode.FSImageFormatProtobuf: Saving image file /usr/hadoop/hdfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
2017-05-27 12:08:50,816 INFO namenode.FSImageFormatProtobuf: Image file /usr/hadoop/hdfs/name/current/fsimage.ckpt_0000000000000000000 of size 332 bytes saved in 0 seconds.
2017-05-27 12:08:50,824 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
2017-05-27 12:08:50,826 INFO util.ExitUtil: Exiting with status 0
2017-05-27 12:08:50,829 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost/127.0.0.1
************************************************************/
[root@localhost hadoop-3.0.0]#
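The two success markers in the log above can be checked automatically. A sketch, with format_succeeded as a hypothetical helper that reads the format output on standard input:

```shell
#!/bin/sh
# Read NameNode format output on stdin and return 0 only if it reports success:
# both "successfully formatted" and "Exiting with status 0" must appear.
format_succeeded() {
    log="$(cat)"
    printf '%s\n' "$log" | grep -q 'successfully formatted' &&
        printf '%s\n' "$log" | grep -q 'Exiting with status 0'
}
```

Assuming the format command is `bin/hdfs namenode -format` (this section does not show it), the check could be used as `bin/hdfs namenode -format 2>&1 | format_succeeded && echo "format OK"`.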
2) Start the NameNode and DataNode daemons
If the script reports the following errors:
ERROR: Attempting to launch hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting launch.
Starting datanodes
ERROR: Attempting to launch hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting launch.
Starting secondary namenodes [localhost.localdomain]
ERROR: Attempting to launch hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting launch.
Solution
The errors are caused by missing user definitions, so edit both the start script and the stop script:
$ vim sbin/start-dfs.sh
$ vim sbin/stop-dfs.sh
Add the following at the top of each file:
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
As shown in the figure:
![Configuring the user names in the file](https://www.vastyun.com/wp-content/uploads/2017/02/User-300x94.png)
Configuring the user names in the file
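Instead of editing each script by hand in vim, the definitions can be inserted with a small helper. This is a sketch: add_user_definitions is a hypothetical name, and it assumes the first line of each script is the shebang, so the new lines are placed directly after it.

```shell
#!/bin/sh
# Insert the given VAR=value lines right after the first line (the shebang) of a script.
# Usage: add_user_definitions FILE VAR=value...
add_user_definitions() {
    script="$1"
    shift
    tmp="$(mktemp)" || return 1
    {
        head -n 1 "$script"       # keep the shebang as line 1
        printf '%s\n' "$@"        # one definition per line
        tail -n +2 "$script"      # rest of the original script
    } > "$tmp" && cat "$tmp" > "$script"   # cat keeps the script's executable bit
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

For the HDFS scripts the call would be, for example, `add_user_definitions sbin/start-dfs.sh HDFS_DATANODE_USER=root HADOOP_SECURE_DN_USER=hdfs HDFS_NAMENODE_USER=root HDFS_SECONDARYNAMENODE_USER=root`, repeated for sbin/stop-dfs.sh. Running it twice on the same file adds the lines twice.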
3) Start the ResourceManager and NodeManager daemons
If startup reports the following error:
Starting resourcemanager
ERROR: Attempting to launch yarn resourcemanager as root
ERROR: but there is no YARN_RESOURCEMANAGER_USER defined. Aborting launch.
This is caused by missing user definitions, so edit both the start script and the stop script:
$ vim sbin/start-yarn.sh
$ vim sbin/stop-yarn.sh
Add the following at the top of each file:
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
5. Verify the startup
1) Run the jps command; if the following processes are listed, Hadoop started correctly:
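The process list is not reproduced here. Based on the daemons started in the previous steps (NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager), a check could be sketched as follows; missing_daemons is a hypothetical helper, and the expected set is an assumption for a single-node setup.

```shell
#!/bin/sh
# Read jps output ("PID ClassName" per line) on stdin and print every expected
# daemon that is not listed. Returns 0 only if none is missing.
missing_daemons() {
    running="$(awk '{print $2}')"
    missing=0
    for name in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # -x matches the whole line, so SecondaryNameNode does not count as NameNode
        if ! printf '%s\n' "$running" | grep -qx "$name"; then
            echo "$name"
            missing=1
        fi
    done
    return "$missing"
}
```

Typical use would be `jps | missing_daemons && echo "all daemons running"`; any names printed are daemons that did not start.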
***********************************************************************************************************************************************************************
FAQ 1: Port 50070 cannot be accessed
# vi /etc/selinux/config
Change
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=enforcing
to
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
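The same change can be made without opening an editor. A sketch, with disable_selinux_in as a hypothetical helper name; it rewrites only the line that starts with SELINUX= and leaves the comment lines and other settings untouched.

```shell
#!/bin/sh
# Set SELINUX=disabled in an SELinux config file.
# Lines starting with "#" and other keys such as SELINUXTYPE= do not match the pattern.
disable_selinux_in() {
    conf="$1"
    tmp="$(mktemp)" || return 1
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$conf" > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Run as root, for example `disable_selinux_in /etc/selinux/config`. The setting in this file is read at boot, so it takes effect after a restart; `setenforce 0` switches the running system to permissive mode in the meantime.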
Set the default access ports
Add the following two properties to mapred-site.xml:
<property>
<name>mapred.job.tracker.http.address</name>
<value>0.0.0.0:50030</value>
</property>
<property>
<name>mapred.task.tracker.http.address</name>
<value>0.0.0.0:50060</value>
</property>
Add the following property to hdfs-site.xml:
<property>
<name>dfs.http.address</name>
<value>0.0.0.0:50070</value>
</property>
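Then stop all processes, delete the data under the name and data directories, reformat, and restart; after that the page is accessible.
Hope this helps.
(0.0.0.0 means all local addresses and can be replaced with a specific address. I used the same configuration on Ubuntu 16.04 and did not run into this problem; I hope someone can explain it clearly.)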