I. Configuring a Hadoop cluster on 3 machines
192.168.80.1* hmaster
192.168.80.2* hslave1
192.168.80.3* hslave2
1. Set the hostname on each machine (takes effect after a reboot): vi /etc/sysconfig/network, and edit the hosts resolution file: vi /etc/hosts. Note that hmaster must not be mapped to 127.0.0.1, or the daemons will bind to the loopback address. For example, the configuration on 80.39 looks like this (a quick resolution check is sketched after the listing):
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=hmaster
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.80.1* hmaster
192.168.80.2* hslave1
192.168.80.3* hslave2
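A quick way to confirm that the names resolve once the hosts files are in place on every machine:
ping -c 1 hslave1
ping -c 1 hslave2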
2. Install the JDK and set its environment variables (details omitted here; a minimal sketch follows)
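For completeness, a minimal sketch of the JDK entries in /etc/profile; the install path /opt/jdk1.8.0 is an assumption, adjust it to the actual location:
export JAVA_HOME=/opt/jdk1.8.0
export PATH=$PATH:$JAVA_HOME/bin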
3. Download Hadoop and set its environment variables
a) Download from http://mirror.bit.edu.cn/apache/hadoop/common; version 2.6.4 is used here.
b) Copy the tarball to /opt and extract it: tar -zxvf hadoop-2.6.4.tar.gz, then configure the Hadoop environment variables in /etc/profile.
Run source /etc/profile to make them take effect; a minimal sketch of the additions follows.
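A minimal sketch of the /etc/profile additions, matching the install path from the previous step:
export HADOOP_HOME=/opt/hadoop-2.6.4
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin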
c) Edit hadoop-env.sh and set JAVA_HOME, for example:
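The JAVA_HOME line in hadoop-env.sh must point at a concrete JDK path (the path below is the same assumption as above):
export JAVA_HOME=/opt/jdk1.8.0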
d) Add the full set of Hadoop environment variables to the user's shell profile: vim ~/.bashrc
export HADOOP_PREFIX=/opt/hadoop-2.6.4
export HADOOP_HOME=$HADOOP_PREFIX
export HADOOP_COMMON_HOME=$HADOOP_PREFIX
export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
export HADOOP_HDFS_HOME=$HADOOP_PREFIX
export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
export HADOOP_YARN_HOME=$HADOOP_PREFIX
export PATH=$PATH:$HADOOP_PREFIX/sbin:$HADOOP_PREFIX/bin
e) Remote-copy the JDK, Hadoop, and their configuration to the other machines. For example, copy the hadoop directory to hslave2 with scp -r hadoop-2.6.4/ hslave2:/opt, and so on for the rest; a fuller sketch follows.
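A sketch of the full copy to both slaves, assuming the JDK lives at /opt/jdk1.8.0 (run as root so that /etc/profile can be overwritten):
for host in hslave1 hslave2; do
  scp -r /opt/jdk1.8.0 /opt/hadoop-2.6.4 $host:/opt
  scp /etc/profile $host:/etc/profile
done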
4. Edit the configuration files
a) On the master machine, edit core-site.xml: vim /opt/hadoop-2.6.4/etc/hadoop/core-site.xml, then copy it to the slave machines (a copy command is sketched after the listing):
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hmaster:9000/</value>
</property>
</configuration>
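A sketch of distributing the file to the slaves:
scp /opt/hadoop-2.6.4/etc/hadoop/core-site.xml hslave1:/opt/hadoop-2.6.4/etc/hadoop/
scp /opt/hadoop-2.6.4/etc/hadoop/core-site.xml hslave2:/opt/hadoop-2.6.4/etc/hadoop/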
b) On the master machine, edit hdfs-site.xml to configure the NameNode and DataNode:
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/datanode</value>
</property>
</configuration>
c) On hslave1 and hslave2, edit hdfs-site.xml to configure the DataNode:
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/datanode</value>
</property>
</configuration>
d) On hmaster, configure mapred-site.xml (if the file does not exist yet, copy it from mapred-site.xml.template):
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value> <!-- and not local (!) -->
</property>
</configuration>
e) On hmaster, configure yarn-site.xml:
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hmaster</value>
</property>
<property>
<name>yarn.nodemanager.hostname</name>
<value>hmaster</value> <!-- on each node this is the node's own hostname -->
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
f) On hslave1 and hslave2, configure yarn-site.xml (each slave needs the ResourceManager address plus its own hostname):
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hmaster</value> <!-- the NodeManagers need the ResourceManager address -->
</property>
<property>
<name>yarn.nodemanager.hostname</name>
<value>hslave1</value> <!-- hslave2 on the other machine -->
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
g) On hmaster, edit the slaves file (hmaster is listed too, so it also runs a DataNode and a NodeManager):
hmaster
hslave1
hslave2
5. Fix ownership and create the data directories
a) On hmaster (it needs both directories configured in its hdfs-site.xml: the NameNode dir and, since hmaster is also in slaves, the DataNode dir):
# chown -R hadoop /opt/hadoop-2.6.4/
# chgrp -R hadoop /opt/hadoop-2.6.4/
# mkdir /home/hadoop/namenode /home/hadoop/datanode
# chown hadoop /home/hadoop/namenode/ /home/hadoop/datanode/
# chgrp hadoop /home/hadoop/namenode/ /home/hadoop/datanode/
b) On hslave1 and hslave2 (only the DataNode directory is needed here):
# chown -R hadoop /opt/hadoop-2.6.4/
# chgrp -R hadoop /opt/hadoop-2.6.4/
# mkdir /home/hadoop/datanode
# chown hadoop /home/hadoop/datanode/
# chgrp hadoop /home/hadoop/datanode/
6. On hmaster, format HDFS: hdfs namenode -format (the older hadoop namenode -format still works but is deprecated)
7. Start Hadoop: run ./start-all.sh from the sbin directory (this assumes passwordless SSH from hmaster to all nodes), then check the daemons with jps
On hmaster, jps should list NameNode, SecondaryNameNode, ResourceManager, DataNode, and NodeManager (the last two run here because hmaster appears in slaves).
On each hslave, jps should list DataNode and NodeManager.
8. Test Hadoop: copy a local file to HDFS and count its words with the bundled wordcount example.
a) Leave safe mode if necessary: hdfs dfsadmin -safemode leave
b) Upload a file test.txt:
Create test.txt in the local folder /home/zhenzhen/ (a one-liner to generate a sample file is sketched after this list);
Create an input folder on HDFS: hdfs dfs -mkdir /input
Copy test.txt into the input folder: hdfs dfs -copyFromLocal /home/zhenzhen/test.txt /input/test.txt
View test.txt on HDFS: hdfs dfs -cat /input/test.txt | head
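A sample test file (the contents are arbitrary, this line is just an example):
echo "hello hadoop hello world" > /home/zhenzhen/test.txt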
c) Run the wordcount example, writing the results to /output1:
hadoop jar /opt/hadoop-2.6.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar wordcount /input/test.txt /output1
List the result files under /output1:
hdfs dfs -ls -R /output1
View the counts:
hdfs dfs -cat /output1/part-r-00000 | head
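Each line of part-r-00000 is a word and its count separated by a tab; for the sample file above the output would be:
hadoop	1
hello	2
world	1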
That's all for now; to be continued!