8. Hadoop
8.1 Installation
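Place the Hadoop archive hadoop-2.6.0.tar.gz in /home/hduser, extract it there, and rename the extracted directory to hadoop. Then configure the Hadoop environment variables. Run:
sudo gedit /etc/profile
Copy the following into profile: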
#hadoop
export HADOOP_HOME=/home/hduser/hadoop
export PATH=$HADOOP_HOME/bin:$PATH
Then run:
source /etc/profile
Note: the steps above must be carried out on both Ubuntu1 and Ubuntu2.
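The unpack-and-rename step and the environment check can be scripted so that both nodes are set up the same way. The following is a minimal sketch, not part of the original procedure: the function names install_hadoop and check_hadoop_env are made up for illustration, and the sketch assumes the archive unpacks into a directory named after the archive (hadoop-2.6.0 for hadoop-2.6.0.tar.gz).

```shell
# Unpack a Hadoop release tarball into <dest-dir> and rename the result to "hadoop".
# Usage: install_hadoop <archive> <dest-dir>
install_hadoop() {
    archive=$1
    dest=$2
    # Assumption: the extracted directory name equals the archive name minus ".tar.gz".
    extracted=$(basename "$archive" .tar.gz)
    tar -xzf "$archive" -C "$dest" &&
        mv "$dest/$extracted" "$dest/hadoop"
}

# Report whether HADOOP_HOME is set and its bin directory is on PATH.
# Usage: check_hadoop_env
check_hadoop_env() {
    [ -n "${HADOOP_HOME:-}" ] || { echo "HADOOP_HOME is not set" >&2; return 1; }
    case ":$PATH:" in
        *":$HADOOP_HOME/bin:"*) echo "ok: $HADOOP_HOME/bin is on PATH" ;;
        *) echo "$HADOOP_HOME/bin is not on PATH" >&2; return 1 ;;
    esac
}

# Example (run on each node):
#   install_hadoop /home/hduser/hadoop-2.6.0.tar.gz /home/hduser
#   source /etc/profile && check_hadoop_env
```

Running check_hadoop_env in a fresh terminal is a quick way to confirm that the /etc/profile change took effect.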
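8.2 Configuration
Seven configuration files are involved, all in the /home/hduser/hadoop/etc/hadoop folder; each can be edited with gedit.
(1) Enter the Hadoop configuration directory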
cd /home/hduser/hadoop/etc/hadoop/
(2) Configure hadoop-env.sh: set JAVA_HOME
gedit hadoop-env.sh
Add the following:
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386/
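(3) Configure yarn-env.sh: set JAVA_HOME
Add the following (JAVA_HOME must point to the JDK actually installed on the node, normally the same path as in hadoop-env.sh):
# some Java parameters
export JAVA_HOME=/opt/jdk1.6.0_45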
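Editing JAVA_HOME by hand in two files on two machines is easy to get wrong, so the change can also be made non-interactively. The sketch below is one possible approach and not the author's method; set_java_home is a hypothetical helper, and it assumes the JDK path contains no "|" or "&" characters, because it is substituted through sed.

```shell
# Set (or append) the active JAVA_HOME export in a Hadoop env script.
# Usage: set_java_home <env-file> <jdk-path>
set_java_home() {
    env_file=$1
    jdk_path=$2
    if grep -q '^export JAVA_HOME=' "$env_file"; then
        # Rewrite the existing export line; write back with cat to keep file permissions.
        sed "s|^export JAVA_HOME=.*|export JAVA_HOME=$jdk_path|" "$env_file" > "$env_file.tmp" &&
            cat "$env_file.tmp" > "$env_file" &&
            rm -f "$env_file.tmp"
    else
        # No active export yet (for example it is commented out): append one.
        printf 'export JAVA_HOME=%s\n' "$jdk_path" >> "$env_file"
    fi
}

# Example, using the paths from this section:
#   cd /home/hduser/hadoop/etc/hadoop
#   set_java_home hadoop-env.sh /usr/lib/jvm/java-7-openjdk-i386/
#   set_java_home yarn-env.sh   /usr/lib/jvm/java-7-openjdk-i386/
```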
(4) Configure the slaves file: add the slave nodes
(Delete the original localhost entry.)
Add the following:
Ubuntu1
Ubuntu2
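Because the slaves file is just one hostname per line, it can be written in a single command instead of being edited in gedit. A minimal sketch follows; write_slaves is a hypothetical helper name, and it overwrites the file, which also removes the default localhost entry.

```shell
# Overwrite a slaves file with one hostname per line.
# Usage: write_slaves <slaves-file> <host>...
write_slaves() {
    slaves_file=$1
    shift
    # Refuse to write an empty host list.
    [ "$#" -gt 0 ] || { echo "no hosts given" >&2; return 1; }
    printf '%s\n' "$@" > "$slaves_file"
}

# Example:
#   write_slaves /home/hduser/hadoop/etc/hadoop/slaves Ubuntu1 Ubuntu2
```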
(5) Configure core-site.xml: add the Hadoop core settings
(The HDFS port is 9000; the temporary directory is file:/home/hduser/hadoop/tmp.)
Add the following:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://Ubuntu1:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/hduser/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>hadoop.native.lib</name>
<value>true</value>
<description>Should native hadoop libraries, if present, be used.</description>
</property>
</configuration>
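After editing, it helps to read a value back from the file to confirm it was saved as intended. The sketch below is a naive, line-based lookup and not a real XML parser: get_prop is a hypothetical helper, and it assumes each name and value element sits on its own line, as in the listing above.

```shell
# Print the <value> of the property called <name> in a Hadoop *-site.xml file.
# Naive line-based lookup: assumes <name> and <value> are on separate lines.
# Usage: get_prop <xml-file> <name>
get_prop() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>[ \t]*/, "")      # drop everything up to the value
            sub(/[ \t]*<\/value>.*/, "")    # drop the closing tag and the rest
            print
            exit
        }
    ' "$1"
}

# Example:
#   get_prop /home/hduser/hadoop/etc/hadoop/core-site.xml fs.defaultFS
```

Once Hadoop is on the PATH, the hdfs getconf -confKey command gives the value that Hadoop itself resolves, which is the more authoritative check.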
(6) Configure hdfs-site.xml: add the HDFS settings
(namenode and datanode ports and directory locations)
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>Ubuntu1:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hduser/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hduser/hadoop/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
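The dfs.replication value of 2 matches the two datanodes listed in the slaves file; a replication factor larger than the number of datanodes leaves blocks under-replicated. The sketch below checks that relationship. check_replication is a hypothetical helper, and it simply treats every non-empty line of the slaves file as one datanode.

```shell
# Warn when the replication factor exceeds the number of hosts in the slaves file.
# Usage: check_replication <slaves-file> <replication>
check_replication() {
    # Count non-empty lines; each one is treated as one datanode host.
    nodes=$(grep -c '[^[:space:]]' "$1" || true)
    if [ "$2" -gt "$nodes" ]; then
        echo "dfs.replication=$2 exceeds $nodes datanode(s)" >&2
        return 1
    fi
    echo "dfs.replication=$2 fits $nodes datanode(s)"
}

# Example:
#   check_replication /home/hduser/hadoop/etc/hadoop/slaves 2
```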
(7) Configure mapred-site.xml: add the MapReduce settings
(use the YARN framework; set the jobhistory address and its web address)
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>Ubuntu1:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>Ubuntu1:19888</value>
</property>
</configuration>
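A Hadoop 2.x release ships only mapred-site.xml.template in the configuration directory, so mapred-site.xml usually has to be created from the template before it can be edited. A minimal sketch of that step follows; ensure_mapred_site is a hypothetical helper name.

```shell
# Create mapred-site.xml from the shipped template if it does not exist yet.
# Usage: ensure_mapred_site <conf-dir>
ensure_mapred_site() {
    conf_dir=$1
    if [ -f "$conf_dir/mapred-site.xml" ]; then
        # Never overwrite a file that was already edited.
        echo "mapred-site.xml already exists"
    elif [ -f "$conf_dir/mapred-site.xml.template" ]; then
        cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
        echo "created mapred-site.xml from template"
    else
        echo "no mapred-site.xml or template in $conf_dir" >&2
        return 1
    fi
}

# Example:
#   ensure_mapred_site /home/hduser/hadoop/etc/hadoop
```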
(8) Configure yarn-site.xml: enable the YARN services
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>Ubuntu1:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>Ubuntu1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>Ubuntu1:8035</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>Ubuntu1:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>Ubuntu1:8088</value>
</property>
</configuration>
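The configuration files assign many host:port pairs to the same machine, and two services given the same port will not both start. The sketch below lists any port that occurs more than once across the files it is given. duplicate_ports is a hypothetical helper; it only looks at value elements that end in a colon followed by digits, and it does not know about ports Hadoop uses by default.

```shell
# List ports that appear more than once among host:port <value> entries.
# Usage: duplicate_ports <xml-file>...
duplicate_ports() {
    # Keep only the digits after the last colon of each <value>...:port</value> line,
    # then print every port that occurs at least twice.
    sed -n 's|.*<value>[^<]*:\([0-9][0-9]*\)</value>.*|\1|p' "$@" | sort | uniq -d
}

# Example (prints nothing when all ports are distinct):
#   cd /home/hduser/hadoop/etc/hadoop
#   duplicate_ports core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml
```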
(9) Copy the configured hadoop/etc/hadoop folder from Ubuntu1 to the same location on Ubuntu2 (delete the original hadoop/etc/hadoop folder on Ubuntu2 first):
scp -r /home/hduser/hadoop/etc/hadoop/ hduser@Ubuntu2:/home/hduser/hadoop/etc/
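The delete-then-copy step can be wrapped in a small function so it is repeatable whenever the configuration changes. This is a sketch under assumptions: sync_conf, run, CONF_DIR and DRY_RUN are names invented here, and the sketch assumes hduser can already SSH from Ubuntu1 to Ubuntu2. With DRY_RUN=1 the commands are only printed.

```shell
# Directory that holds the Hadoop configuration on every node (from this section).
CONF_DIR=/home/hduser/hadoop/etc/hadoop

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Replace the Hadoop conf directory on a remote node with the local one.
# Usage: sync_conf <user@host>
sync_conf() {
    target=$1
    # Remove the old remote folder first, then copy the local folder into etc/.
    run ssh "$target" rm -rf "$CONF_DIR" &&
        run scp -r "$CONF_DIR" "$target:$(dirname "$CONF_DIR")/"
}

# Example:
#   DRY_RUN=1; sync_conf hduser@Ubuntu2    # show the commands only
#   DRY_RUN=0; sync_conf hduser@Ubuntu2    # actually copy
```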
8.3 Verification
The following steps verify that Hadoop is configured correctly:
(1) Format the namenode:
Ubuntu1:
cd hadoop
./bin/hdfs namenode -format
Ubuntu2:
cd hadoop
./bin/hdfs namenode -format
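Formatting wipes the namenode metadata, so it is worth checking whether the name directory was already formatted before running the command again. A formatted name directory contains a current/VERSION file, which the sketch below tests for. is_formatted is a hypothetical helper, and the path in the example is the dfs.namenode.name.dir value from hdfs-site.xml.

```shell
# Succeed if <name-dir> already holds a formatted namenode (current/VERSION exists).
# Usage: is_formatted <name-dir>
is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Example: format only when the directory has not been formatted yet.
#   cd /home/hduser/hadoop
#   is_formatted /home/hduser/hadoop/dfs/name || ./bin/hdfs namenode -format
```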
(2) Start HDFS:
cd hadoop
./sbin/start-dfs.sh
15/04/27 04:18:45 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [Ubuntu1]
Ubuntu1: starting namenode, logging to /home/hduser/hadoop/logs/hadoop-hduser-namenode-Ubuntu1.out
Ubuntu1: starting datanode, logging to /home/hduser/hadoop/logs/hadoop-hduser-datanode-Ubuntu1.out
Ubuntu2: starting datanode, logging to /home/hduser/hadoop/logs/hadoop-hduser-datanode-Ubuntu2.out
Starting secondary namenodes [Ubuntu1]
Ubuntu1: starting secondarynamenode, logging to /home/hduser/hadoop/logs/hadoop-hduser-secondarynamenode-Ubuntu1.out
15/04/27 04:19:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Check the Java processes with jps (Java Virtual Machine Process Status Tool):
hduser@Ubuntu1:~/hadoop$ jps
8008 NameNode
8443 Jps
8158 DataNode
8314 SecondaryNameNode
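In this run, jps at first showed that the NameNode process was not running correctly. Stopping the services, reformatting the namenode, and starting everything again fixed it:
hadoop namenode -format
start-all.sh
After that, the NameNode process was running.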
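Reading the jps listing by eye works for two nodes, but the same check can be automated. The sketch below reads jps output on standard input and reports which of the expected daemons are absent. missing_daemons is a hypothetical helper, and the daemon names passed to it are the ones shown in the listing above.

```shell
# Read jps output on stdin and report which expected daemons are absent.
# Returns 0 when every daemon is present, 1 otherwise.
# Usage: jps | missing_daemons NameNode DataNode SecondaryNameNode
missing_daemons() {
    jps_output=$(cat)
    missing=0
    for daemon in "$@"; do
        # jps prints "<pid> <name>"; require the name to be the whole second field,
        # so that NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }'; then
            echo "missing: $daemon"
            missing=1
        fi
    done
    return "$missing"
}

# Example on Ubuntu1 after start-dfs.sh:
#   jps | missing_daemons NameNode DataNode SecondaryNameNode
# Example on Ubuntu1 after start-yarn.sh:
#   jps | missing_daemons ResourceManager NodeManager
```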
(3) Stop HDFS:
cd hadoop
./sbin/stop-dfs.sh
Stopping namenodes on [Ubuntu1]
Ubuntu1: stopping namenode
Ubuntu1: stopping datanode
Ubuntu2: stopping datanode
Stopping secondary namenodes [Ubuntu1]
Ubuntu1: stopping secondarynamenode
Check the Java processes:
hduser@Ubuntu1:~/hadoop$ jps
8850 Jps
(4) Start YARN:
cd hadoop
./sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hduser/hadoop/logs/yarn-hduser-resourcemanager-Ubuntu1.out
Ubuntu2: starting nodemanager, logging to /home/hduser/hadoop/logs/yarn-hduser-nodemanager-Ubuntu2.out
Ubuntu1: starting nodemanager, logging to /home/hduser/hadoop/logs/yarn-hduser-nodemanager-Ubuntu1.out
Check the Java processes:
cd hadoop
jps
8911 ResourceManager
9247 Jps
9034 NodeManager
(5) Stop YARN:
cd hadoop
./sbin/stop-yarn.sh
stopping yarn daemons
stopping resourcemanager
Ubuntu1: stopping nodemanager
Ubuntu2: stopping nodemanager
no proxyserver to stop
Check the Java processes:
cd hadoop
jps
9542 Jps
(6) Check the cluster status:
First start the cluster, then request the report:
cd hadoop
./sbin/start-dfs.sh
./bin/hdfs dfsadmin -report
Configured Capacity: 39891361792 (37.15 GB)
Present Capacity: 28707627008 (26.74 GB)
DFS Remaining: 28707569664 (26.74 GB)
DFS Used: 57344 (56 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Live datanodes (2):
Name: 192.168.159.132:50010 (Ubuntu2)
Hostname: Ubuntu2
Decommission Status : Normal
Configured Capacity: 19945680896 (18.58 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 5575745536 (5.19 GB)
DFS Remaining: 14369906688 (13.38 GB)
DFS Used%: 0.00%
DFS Remaining%: 72.05%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon Apr 27 04:26:09 PDT 2015
Name: 192.168.159.131:50010 (Ubuntu1)
Hostname: Ubuntu1
Decommission Status : Normal
Configured Capacity: 19945680896 (18.58 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 5607989248 (5.22 GB)
DFS Remaining: 14337662976 (13.35 GB)
DFS Used%: 0.00%
DFS Remaining%: 71.88%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon Apr 27 04:26:08 PDT 2015
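The key figure in this report is the number of live datanodes, which should equal the number of hosts in the slaves file (two here). The sketch below extracts that count from the report text. live_datanodes and expect_datanodes are hypothetical helpers, and they assume the report contains a line beginning with "Live datanodes" followed by the count in parentheses, as in the output above.

```shell
# Read `hdfs dfsadmin -report` output on stdin and print the live datanode count.
# Usage: hdfs dfsadmin -report | live_datanodes
live_datanodes() {
    sed -n 's/^Live datanodes *(\([0-9][0-9]*\)).*/\1/p'
}

# Succeed only if the live count on stdin matches the expected number.
# Usage: hdfs dfsadmin -report | expect_datanodes 2
expect_datanodes() {
    count=$(live_datanodes)
    # A report without the "Live datanodes" line counts as zero.
    [ "${count:-0}" -eq "$1" ]
}

# Example:
#   cd /home/hduser/hadoop
#   ./bin/hdfs dfsadmin -report | expect_datanodes 2 && echo "both datanodes are live"
```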
(7) Browse HDFS in a web browser: http://Ubuntu1:50070/