I. Environment preparation: the build output from "Compiling the Hadoop 2.8.1 Source Code on CentOS 6.5"
su - root
ll /opt/sourcecode/hadoop-2.8.1-src/hadoop-dist/target/hadoop-2.8.1.tar.gz
cd /opt/software
cp /opt/sourcecode/hadoop-2.8.1-src/hadoop-dist/target/hadoop-2.8.1.tar.gz /opt/software
II. Pseudo-distributed Hadoop installation
1. Create the hadoop user and grant it sudo privileges
su - root
useradd hadoop
vi /etc/sudoers
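The step above does not show which line to add to the sudoers configuration. A minimal sketch, assuming passwordless sudo is acceptable on a test machine; the NOPASSWD policy, the helper name write_sudo_rule, and the drop-in file path are assumptions, not part of the original setup. The helper writes the rule to whatever file you pass in, so it can be reviewed before it takes effect:

```shell
# Hypothetical helper: write a sudoers rule for one user into a given file.
# NOPASSWD is an assumption suited to a throwaway test machine.
write_sudo_rule() {
    user=$1
    out_file=$2
    printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$user" > "$out_file"
    # sudo expects sudoers files to be read-only for owner and group
    chmod 0440 "$out_file"
}
```

As root, this could be used as `write_sudo_rule hadoop /etc/sudoers.d/hadoop`, provided /etc/sudoers contains the `#includedir /etc/sudoers.d` line. Running `visudo -cf /etc/sudoers.d/hadoop` afterwards checks the syntax.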
2. Extract the installation package
cd /opt/software
tar -xzvf hadoop-2.8.1.tar.gz
3. Create a symbolic link and change the owner
cd /opt/software
ln -s /opt/software/hadoop-2.8.1 /opt/software/hadoop
chown -R hadoop:hadoop hadoop
chown -R hadoop:hadoop hadoop/*
chown -R hadoop:hadoop hadoop-2.8.1
chown -R hadoop:hadoop hadoop-2.8.1/*
4. Set the environment variables
vi /etc/profile
#add hadoop
export HADOOP_HOME=/opt/software/hadoop
export PATH=$HADOOP_HOME/bin:$PATH
source /etc/profile
which hadoop
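Editing /etc/profile by hand more than once can leave duplicate export lines behind. A small sketch of an idempotent alternative; append_once is a made-up helper name, and the lines it adds are the same two exports used in this step:

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present (-x whole line, -F fixed string).
append_once() {
    line=$1
    file=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

As root, `append_once 'export HADOOP_HOME=/opt/software/hadoop' /etc/profile` followed by `append_once 'export PATH=$HADOOP_HOME/bin:$PATH' /etc/profile` should give the same result as the manual edit, and running it again changes nothing.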
5. Check the Java and SSH environment
java -version
service sshd status
6. Switch user and look at the main directories of the installation
su - hadoop
ll /opt/software/hadoop/
bin: executables
etc: configuration files
sbin: shell scripts that start and stop HDFS and YARN
7. Configure SSH trust (passwordless login) for the hadoop user
su - hadoop
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
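Check the SSH trust setup: the first time you run each command you have to answer yes; from the second run onward, if the date is printed directly, SSH trust is configured correctly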
ssh localhost date
ssh hadoop002 date
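The manual check can also be scripted. A sketch, assuming the host keys were already accepted by the first manual run; check_ssh_trust is a made-up name. BatchMode=yes makes ssh fail instead of prompting for a password, so a failure here suggests key-based login is not working for that host:

```shell
# Hypothetical helper: verify key-based SSH login to each host without prompting.
# BatchMode=yes disables password and confirmation prompts.
check_ssh_trust() {
    failed=0
    for host in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" date >/dev/null 2>&1; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
            failed=1
        fi
    done
    return "$failed"
}
```

For this setup the call would be `check_ssh_trust localhost hadoop002`.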
8. Edit the configuration files
su - hadoop
vi /opt/software/hadoop/etc/hadoop/core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.119.131:9000</value>
</property>
vi /opt/software/hadoop/etc/hadoop/hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
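The snippets above show only the property elements; in the real files they belong inside the root configuration element. A sketch that writes complete versions of both files into a scratch directory, so they can be compared with the real ones before anything under /opt is changed. OUT_DIR is a variable made up for this example; the values are the ones used in this step:

```shell
# Sketch: generate complete site files in a scratch directory for review.
# OUT_DIR is hypothetical; by default a fresh temporary directory is used.
OUT_DIR=${OUT_DIR:-$(mktemp -d)}

cat > "$OUT_DIR/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.119.131:9000</value>
  </property>
</configuration>
EOF

cat > "$OUT_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

echo "wrote core-site.xml and hdfs-site.xml to $OUT_DIR"
```

A replication factor of 1 fits a pseudo-distributed setup because there is only one datanode to hold each block.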
9. Format and start
Configure the Java environment variable
echo $JAVA_HOME
/usr/java/jdk1.8.0_45
vi /opt/software/hadoop/etc/hadoop/hadoop-env.sh
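The edit to hadoop-env.sh is not shown above; the usual change is to set JAVA_HOME to an explicit path, since the daemons are launched over ssh and may not inherit it from the login shell. A sketch of doing that non-interactively; set_java_home is a made-up helper, and the JDK path is the one printed by echo $JAVA_HOME in this step:

```shell
# Hypothetical helper: point hadoop-env.sh at an explicit JDK path.
# Replaces an existing "export JAVA_HOME=..." line, or appends one if none exists.
set_java_home() {
    env_file=$1
    jdk_path=$2
    if grep -q '^export JAVA_HOME=' "$env_file"; then
        # Write to a temporary file first, then move it into place
        sed "s|^export JAVA_HOME=.*|export JAVA_HOME=$jdk_path|" "$env_file" > "$env_file.tmp" &&
            mv "$env_file.tmp" "$env_file"
    else
        printf 'export JAVA_HOME=%s\n' "$jdk_path" >> "$env_file"
    fi
}
```

Here the call would be `set_java_home /opt/software/hadoop/etc/hadoop/hadoop-env.sh /usr/java/jdk1.8.0_45`.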
Format the namenode
which hdfs
cd /opt/software/hadoop
bin/hdfs namenode -format
Start HDFS; enter yes when prompted while the secondary namenodes are starting
cd /opt/software/hadoop
sbin/start-dfs.sh
10. Check the startup logs
cd /opt/software/hadoop/logs
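Once inside the logs directory, a quick scan for problems can save some scrolling. A sketch, assuming the default log layout where each line carries its level (INFO, WARN, ERROR, FATAL) after the timestamp; scan_logs is a made-up name:

```shell
# Hypothetical helper: print ERROR/FATAL lines from every .log file in a directory.
# Returns 0 when nothing is found, 1 when at least one such line exists.
scan_logs() {
    log_dir=$1
    if grep -HnE ' (ERROR|FATAL) ' "$log_dir"/*.log 2>/dev/null; then
        return 1
    fi
    return 0
}
```

For example, `scan_logs /opt/software/hadoop/logs`. The daemon logs are typically named after the user, daemon, and hostname, such as hadoop-hadoop-namenode-hadoop002.log.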
11. Verify the services
jps
http://192.168.119.131:50070
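Reading the jps output by eye works, but the check can also be automated. A sketch that reads jps output on standard input and confirms that each expected daemon is listed; check_daemons is a made-up name, and the three daemon names are the ones HDFS starts in this setup:

```shell
# Hypothetical helper: read `jps` output on stdin and confirm each named
# daemon appears as an exact process name (second column).
check_daemons() {
    jps_out=$(cat)
    missing=0
    for name in "$@"; do
        if printf '%s\n' "$jps_out" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "running: $name"
        else
            echo "MISSING: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

Usage: `jps | check_daemons NameNode DataNode SecondaryNameNode`. Matching the second column exactly keeps SecondaryNameNode from being counted as NameNode.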
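12. Change the configuration so that every HDFS daemon starts on hadoop002
Original startup behavior:
started on hadoop002: namenode
started on localhost: datanode
started on 0.0.0.0: secondary namenode
Configure the datanode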
cd /opt/software/hadoop/etc/hadoop/
echo "hadoop002" > slaves
cat slaves
Configure the secondary namenode by adding the following to hdfs-site.xml
vi /opt/software/hadoop/etc/hadoop/hdfs-site.xml
<!-- add secondary namenode -->
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop002:50090</value>
</property>
<property>
<name>dfs.namenode.secondary.https-address</name>
<value>hadoop002:50090</value>
</property>
Stop and restart the HDFS daemons:
cd /opt/software/hadoop/sbin
./stop-dfs.sh
./start-dfs.sh
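Startup behavior after the change: the namenode, datanode, and secondary namenode should all be started on hadoop002
13. Configure MapReduce and YARN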
cd /opt/software/hadoop/etc/hadoop
cp mapred-site.xml.template mapred-site.xml
vi mapred-site.xml
<!-- add mapred -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
cd /opt/software/hadoop/etc/hadoop
vi yarn-site.xml
<!-- add yarn -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
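After editing several site files it is easy to lose track of what was actually saved. A sketch of a small reader that prints the value stored for a property name; get_prop is a made-up helper and assumes one tag per line, as in the snippets in this post:

```shell
# Hypothetical helper: print the <value> that follows a given <name> in a
# Hadoop *-site.xml file. Assumes <name> and <value> sit on separate lines.
get_prop() {
    site_file=$1
    prop_name=$2
    awk -v n="$prop_name" '
        index($0, "<name>" n "</name>") { want = 1; next }
        want && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$site_file"
}
```

For example, `get_prop mapred-site.xml mapreduce.framework.name` should print yarn once this step is done. Hadoop also ships `hdfs getconf -confKey <key>`, which reports the value the running configuration actually resolves.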
14. Start YARN
cd /opt/software/hadoop/sbin
./start-yarn.sh
Open the web UI
http://192.168.119.131:8088
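If no browser is at hand, the same check can be done from the shell. A sketch using standard curl options; check_http is a made-up name. The -L flag follows redirects, so the reported status code is that of the final page:

```shell
# Hypothetical helper: report whether a URL answers with HTTP 200.
# -s silent, -L follow redirects, -o discard body, -w print the status code.
check_http() {
    url=$1
    code=$(curl -s -L -o /dev/null --max-time 5 -w '%{http_code}' "$url")
    if [ "$code" = "200" ]; then
        echo "up: $url"
    else
        echo "DOWN ($code): $url"
        return 1
    fi
}
```

With the addresses used in this post: `check_http http://192.168.119.131:50070` for the HDFS page and `check_http http://192.168.119.131:8088` for the YARN page.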
15. Run a MapReduce job
cd /opt/software/hadoop/share/hadoop/mapreduce
hadoop jar hadoop-mapreduce-examples-2.8.1.jar pi 5 10
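In the command above, the two numbers passed to the pi example are the number of map tasks (5) and the number of samples per map (10). The job prints a line starting with "Estimated value of Pi is" when it finishes. A sketch for pulling out just the number; extract_pi is a made-up name:

```shell
# Hypothetical helper: read the pi example's output on stdin and print
# only the estimate from the "Estimated value of Pi is ..." line.
extract_pi() {
    sed -n 's/^Estimated value of Pi is //p'
}
```

Usage: `hadoop jar hadoop-mapreduce-examples-2.8.1.jar pi 5 10 | extract_pi`. With only 50 samples in total the estimate is coarse; larger arguments give a closer value at the cost of a longer run.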
This completes the pseudo-distributed installation of Hadoop 2.8.1 on CentOS 6.5.