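Note: because the official site does not provide a 32-bit download, I recompiled Hadoop from source in a 64-bit CentOS virtual machine to obtain the hadoop-2.4.1 build used here.
HDFS pseudo-distributed setup
Edit the configuration file etc/hadoop/hadoop-env.sh: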
export JAVA_HOME=/usr/local/jdk1.7.0_45
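Before going further, it can help to confirm that the JDK path really contains a Java binary. The following is only a minimal sketch; the function name check_java_home is a hypothetical helper and not part of Hadoop.

```shell
# check_java_home DIR: succeed if DIR/bin/java exists and is executable.
# Hypothetical helper for illustration; not shipped with Hadoop.
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "ok: $1"
    else
        echo "missing: $1/bin/java" >&2
        return 1
    fi
}
```

With the path used above, the call would be: check_java_home /usr/local/jdk1.7.0_45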
Edit the configuration file etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://crxy0:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/tmp</value>
</property>
<property>
<name>fs.trash.interval</name>
<value>1440</value>
</property>
</configuration>
Edit the configuration file etc/hadoop/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
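A typo in a property name is easy to miss, so it may be worth reading the values back after editing. Note that fs.trash.interval is expressed in minutes, so 1440 keeps deleted files in the trash for 24 hours. The sketch below uses a hypothetical helper, get_prop, and assumes one tag per line as in the listings above; it is not a general XML parser.

```shell
# get_prop FILE NAME: print the <value> that follows <name>NAME</name>.
# Hypothetical helper; assumes <name> and <value> sit on separate,
# consecutive lines, as in the configuration files shown above.
get_prop() {
    awk -v name="$2" '
        index($0, "<name>" name "</name>") { found = 1; next }
        found && match($0, /<value>.*<\/value>/) {
            # Strip the 7-character <value> and 8-character </value> tags.
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}
```

For example, get_prop etc/hadoop/core-site.xml fs.defaultFS should print the NameNode URI, and get_prop etc/hadoop/hdfs-site.xml dfs.replication the replication factor. Hadoop's own "bin/hdfs getconf -confKey fs.defaultFS" gives the effective value as Hadoop sees it.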
Format the file system:
$ bin/hdfs namenode -format
Start the HDFS cluster:
$ sbin/start-dfs.sh
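In pseudo-distributed mode, start-dfs.sh should bring up the NameNode, DataNode and SecondaryNameNode processes, which the JDK's jps tool lists by name. The sketch below shows one way to spot a daemon that failed to start; missing_daemons is a hypothetical helper, not a Hadoop command.

```shell
# missing_daemons NAME...: read `jps` output ("PID Name" per line) on
# stdin and print each expected daemon name that is not running.
# Hypothetical helper for illustration.
missing_daemons() {
    running=$(awk '{ print $2 }')
    for name in "$@"; do
        printf '%s\n' "$running" | grep -qx "$name" || echo "$name"
    done
}
```

Usage: jps | missing_daemons NameNode DataNode SecondaryNameNode
Empty output means all three are running; otherwise check the logs directory for the daemon that is listed.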
Open the web interface in a browser:
NameNode - http://localhost:50070/
Exercise:
Create the directories:
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/root
Copy a file (the relative path input resolves to /user/root/input when running as root):
$ bin/hdfs dfs -put /etc/profile input
Stop the HDFS cluster:
$ sbin/stop-dfs.sh
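Installing YARN
Edit the configuration file etc/hadoop/mapred-site.xml (in Hadoop 2.4.1, create it first by copying etc/hadoop/mapred-site.xml.template):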
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Edit the configuration file etc/hadoop/yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
Start the YARN cluster:
$ sbin/start-yarn.sh
Open the web interface in a browser:
ResourceManager - http://localhost:8088/
Run the example job:
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar wordcount input output
View the result:
$ bin/hdfs dfs -cat output/*
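To sanity-check the job output, the same count can be approximated locally on the original file. This is only a rough sketch: local_wordcount is a hypothetical helper, and it assumes the example job splits on whitespace and emits word, tab, count in byte order, so small differences on unusual input are possible.

```shell
# local_wordcount FILE: count whitespace-separated tokens in FILE and
# print "word<TAB>count", sorted by word in byte order.
# Hypothetical helper; an approximation of the wordcount example job.
local_wordcount() {
    tr -s '[:space:]' '\n' < "$1" \
        | grep -v '^$' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{ print $2 "\t" $1 }'
}
```

Usage: local_wordcount /etc/profile
The lines can then be compared by eye with the output of the cat command above.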
Stop the YARN cluster:
$ sbin/stop-yarn.sh