Three virtual machines
• OS: RHEL 6.x x86-64
172.16.34.94  lsfvm01
172.16.34.95  lsfvm02
172.16.34.96  lsfvm03
Installation prerequisites
• Network connectivity between the three machines, correct hostname resolution, firewalls disabled
• Configure passwordless SSH access
• Install the JDK
Configuring passwordless SSH access
1. Log in to lsfvm01 and do the following:
[root@lsfvm01 ~]# cd /root/.ssh
[root@lsfvm01 .ssh]# ssh-keygen -t rsa
[root@lsfvm01 .ssh]# ls
id_rsa  id_rsa.pub
[root@lsfvm01 .ssh]# cat id_rsa.pub >> authorized_keys
[root@lsfvm01 .ssh]# ssh-copy-id -i id_rsa.pub root@lsfvm02
[root@lsfvm01 .ssh]# ssh-copy-id -i id_rsa.pub root@lsfvm03
[root@lsfvm01 .ssh]# chmod 700 /root/.ssh
[root@lsfvm01 .ssh]# chmod 600 /root/.ssh/authorized_keys
2. Log in to lsfvm02 and then lsfvm03 and repeat step 1.
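Before moving on, it is worth confirming that passwordless SSH works to every node. A minimal check, run on each node in turn (BatchMode makes ssh fail instead of prompting if a key is not accepted):
for host in lsfvm01 lsfvm02 lsfvm03; do ssh -o BatchMode=yes root@$host hostname; done
Each iteration should print the remote hostname with no password prompt.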
Installing the JDK
1. Install the JDK package:
[root@lsfvm01 ~]# rpm -ivh jdk-7u79-linux-x64.rpm
[root@lsfvm01 ~]# ls /usr/java/jdk1.7.0_79/
2. Set the Java environment variables.
Append the following to the end of /etc/profile:
[root@lsfvm01 ~]# vim /etc/profile
#for Java
export JAVA_HOME=/usr/java/jdk1.7.0_79
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=$CLASSPATH:.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
Make the additions take effect immediately:
[root@lsfvm01 ~]# source /etc/profile
3. Verify the installation:
[root@lsfvm01 ~]# java -version
java version "1.7.0_79"
4. Repeat the steps above on lsfvm02 and lsfvm03.
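Once passwordless SSH is in place, the JDK on all three nodes can be checked from lsfvm01 in one loop (a sketch; java -version prints to stderr, hence the redirect on the remote side):
for host in lsfvm01 lsfvm02 lsfvm03; do ssh root@$host 'java -version 2>&1 | head -1'; done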
Installing Hadoop (1)
1. Prepare the Hadoop installation directories:
[root@lsfvm01 .ssh]# mkdir /usr/hadoop
[root@lsfvm01 .ssh]# mkdir /usr/hadoop/tmp
[root@lsfvm01 .ssh]# mkdir /usr/hadoop/hdfs
[root@lsfvm01 .ssh]# mkdir /usr/hadoop/hdfs/data
[root@lsfvm01 .ssh]# mkdir /usr/hadoop/hdfs/name
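Equivalently, mkdir -p creates the whole tree in one command:
[root@lsfvm01 .ssh]# mkdir -p /usr/hadoop/tmp /usr/hadoop/hdfs/data /usr/hadoop/hdfs/name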
2. Copy the Hadoop tarball to /usr/hadoop and unpack it:
[root@lsfvm01 ~]# cp hadoop-2.7.1.tar.gz /usr/hadoop/
[root@lsfvm01 ~]# cd /usr/hadoop
[root@lsfvm01 hadoop]# tar zxvf hadoop-2.7.1.tar.gz
3. Set the environment variables:
[root@lsfvm01 hadoop]# vim /etc/profile
#for Hadoop
export HADOOP_HOME=/usr/hadoop/hadoop-2.7.1
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
Make them take effect immediately:
[root@lsfvm01 hadoop]# source /etc/profile
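With PATH updated, the hadoop command should now resolve from any directory; hadoop version is a quick sanity check and should report 2.7.1:
[root@lsfvm01 hadoop]# hadoop version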
Installing Hadoop (2)
4. Edit the Hadoop configuration. Seven files need changes:
[root@lsfvm01 hadoop]# cd $HADOOP_HOME/etc/hadoop
hadoop-env.sh  yarn-env.sh  core-site.xml  hdfs-site.xml  mapred-site.xml  yarn-site.xml  slaves
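Optionally, keeping a copy of the stock files makes a bad edit easy to roll back (a convenience step, not required; mapred-site.xml is omitted because it ships only as a template, see step 5):
[root@lsfvm01 hadoop]# for f in hadoop-env.sh yarn-env.sh core-site.xml hdfs-site.xml yarn-site.xml slaves; do cp $f $f.orig; done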
1)[root@lsfvm01 hadoop]# vim hadoop-env.sh
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/usr/java/jdk1.7.0_79
2) [root@lsfvm01 hadoop]# vim yarn-env.sh
#export JAVA_HOME=/home/y/libexec/jdk1.6.0/
export JAVA_HOME=/usr/java/jdk1.7.0_79
Installing Hadoop (3)
3) [root@lsfvm01 hadoop]# vim core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://172.16.34.94:9000</value>
<description>HDFS URI, in the form filesystem://namenode-address:port</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/hadoop/tmp</value>
<description>Local Hadoop temp directory on the namenode</description>
</property>
</configuration>
4) [root@lsfvm01 hadoop]# vim hdfs-site.xml
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/usr/hadoop/hdfs/name</value>
<description>Where the namenode stores HDFS namespace metadata</description>
</property>
<property>
<name>dfs.data.dir</name>
<value>/usr/hadoop/hdfs/data</value>
<description>Physical storage location of data blocks on each datanode</description>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
<description>Replication factor; defaults to 3 and should not exceed the number of datanodes</description>
</property>
</configuration>
5) The distribution ships only mapred-site.xml.template, so copy it into place first:
[root@lsfvm01 hadoop]# cp mapred-site.xml.template mapred-site.xml
[root@lsfvm01 hadoop]# vim mapred-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
6) [root@lsfvm01 hadoop]# vim yarn-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>172.16.34.94:8099</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>172.16.34.94:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>172.16.34.94:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>172.16.34.94:8031</value>
</property>
</configuration>
7) [root@lsfvm01 hadoop]# vim slaves
lsfvm01
lsfvm02
lsfvm03
Installing Hadoop (4)
5. Copy the Hadoop installation to the other nodes (lsfvm02 and lsfvm03):
1) Copy everything under $HADOOP_HOME on lsfvm01 to the same directory on each node.
2) Replicate the environment variable settings (Java and Hadoop).
Taking lsfvm02 as the example:
[root@lsfvm01 hadoop]# scp -r /usr/hadoop/ root@lsfvm02:/usr/
[root@lsfvm01 hadoop]# ssh lsfvm02
[root@lsfvm02 ~]# vim /etc/profile
#for Hadoop
export HADOOP_HOME=/usr/hadoop/hadoop-2.7.1
PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
CLASSPATH=$CLASSPATH:$HADOOP_HOME/share/hadoop/common/hadoop-common-2.7.1.jar:$HADOOP_HOME/share/hadoop/common/lib/commons-cli-1.2.jar:$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.1.jar
PATH=$PATH:$HOME/bin
export PATH
export CLASSPATH
[root@lsfvm02 ~]# source /etc/profile
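The same two steps apply to lsfvm03:
[root@lsfvm01 hadoop]# scp -r /usr/hadoop/ root@lsfvm03:/usr/
then log in to lsfvm03 and make the same /etc/profile additions.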
Starting Hadoop and checking its status
1) Start the cluster. The namenode must be formatted before the very first start:
[root@lsfvm01 hadoop-2.7.1]# hdfs namenode -format
[root@lsfvm01 hadoop-2.7.1]# start-dfs.sh
[root@lsfvm01 hadoop-2.7.1]# start-yarn.sh
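For when the cluster needs to be brought down later, the matching shutdown scripts are the stop-* counterparts:
[root@lsfvm01 hadoop-2.7.1]# stop-yarn.sh
[root@lsfvm01 hadoop-2.7.1]# stop-dfs.sh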
2) Check the status:
1) On the namenode, lsfvm01 (it also runs a DataNode and NodeManager because it is listed in slaves):
[root@lsfvm01 hadoop-2.7.1]# jps
17148 ResourceManager
18147 Jps
10920 DataNode
17247 NodeManager
15807 SecondaryNameNode
11793 NameNode
2) On a datanode (lsfvm02):
[root@lsfvm02 ~]# jps
24008 NodeManager
29349 Jps
23721 DataNode
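Beyond jps, hdfs dfsadmin -report run on the namenode summarizes cluster capacity and should list all three datanodes as live:
[root@lsfvm01 hadoop-2.7.1]# hdfs dfsadmin -report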
3) Check the web UIs on lsfvm01 (172.16.34.94):
http://172.16.34.94:50070 (HDFS namenode UI)
http://172.16.34.94:8099 (YARN ResourceManager UI, the yarn.resourcemanager.webapp.address set above)
Submitting a test job
1. Create a test text file:
[root@lsfvm01 hadoop]# vim /tmp/world.txt
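Any short text will do. For example, a hypothetical sample written with a heredoc instead of vim (chosen so its word counts line up with the output shown at the end of this section):
[root@lsfvm01 hadoop]# cat > /tmp/world.txt <<'EOF'
good good good
are are fine
I am
EOF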
2. Create a directory in HDFS and upload the test file into it:
[root@lsfvm01 hadoop]# hdfs dfs -mkdir /test
[root@lsfvm01 hadoop]# hdfs dfs -ls /
[root@lsfvm01 hadoop]# hdfs dfs -put /tmp/world.txt /test
[root@lsfvm01 hadoop]# hdfs dfs -ls /test
3. Submit the job:
[root@lsfvm01 hadoop]# hadoop jar /usr/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /test /out
4. View the results:
[root@lsfvm01 hadoop]# hdfs dfs -ls /out
Found 2 items
-rw-r--r-- 3 root supergroup 0 2016-11-11 06:40 /out/_SUCCESS
-rw-r--r-- 3 root supergroup 70 2016-11-11 06:40 /out/part-r-00000
[root@lsfvm01 hadoop]# hdfs dfs -cat /out/part-r-00000
I 1
am 1
are 2
fine 1
good 3
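Note that wordcount refuses to start if the output directory already exists; to re-run the example, remove /out first:
[root@lsfvm01 hadoop]# hdfs dfs -rm -r /out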