1. Preparation
- Install Linux
- Add a hadoop user and configure its permissions
- Configure passwordless SSH login
- Install the JDK
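The passwordless SSH step usually comes down to generating a key pair and adding the public key to the hadoop user's authorized_keys file. The following is a minimal sketch of the second half, not a step taken from these notes: the helper name authorize_key is hypothetical, and it assumes a key pair already exists (for example one created with ssh-keygen -t rsa).

```shell
# Hypothetical helper: add a public key to an authorized_keys file exactly once.
# Usage: authorize_key PUBLIC_KEY_FILE AUTHORIZED_KEYS_FILE
authorize_key() (
    pubkey_file="$1"
    auth_file="$2"
    key=$(cat "$pubkey_file")

    # Make sure the target directory and file exist.
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"

    # Append only if the exact key line is not already present
    # (-x: whole-line match, -F: fixed string, not a regex).
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi

    # sshd ignores key files with loose permissions, so tighten them.
    chmod 700 "$(dirname "$auth_file")"
    chmod 600 "$auth_file"
)
```

A typical call would be authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys, after which ssh localhost should log in without asking for a password. Running the helper twice leaves a single copy of the key in the file.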
2. Installing Hadoop 2.6.0
2.1 Download the package
Using the mirror http://mirror.bit.edu.cn/apache/hadoop/common/, download the package into the $HOME directory.
$ wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
2.2 Install Hadoop
- Install Hadoop under /usr/local.
$ sudo tar -zxf ~/hadoop-2.6.0.tar.gz -C /usr/local/
$ cd /usr/local/
$ sudo mv ./hadoop-2.6.0/ ./hadoop
$ sudo chown -R hadoop:hadoop ./hadoop/
- Check that Hadoop works
$ /usr/local/hadoop/bin/hadoop version
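3. Standalone mode
Standalone (non-distributed) mode is the default, which makes debugging convenient. Hadoop ships with a rich set of examples, including wordcount, terasort, join, and grep. Running the examples jar without a program name prints the full list; the commands below run wordcount and grep.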
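$ cd /usr/local/hadoop
# Assumption: ./input does not exist yet; the Hadoop configuration files serve as sample input
$ mkdir -p ./input
$ cp ./etc/hadoop/*.xml ./input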
$ ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount ./input ./output
$ cat ./output/part-r-00000
$ rm -rf output
$ ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep ./input ./output 'dfs[a-z.]+'
...
File Input Format Counters
Bytes Read=123
File Output Format Counters
Bytes Written=23
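As a cross-check on the example output, both jobs can be approximated locally with standard Unix tools. The sketch below is not part of Hadoop: the function names local_wordcount and local_grep_count are hypothetical, and it assumes the input directory holds only plain-text files. It mirrors what the examples compute (wordcount counts whitespace-separated tokens, grep counts each string that matches the regular expression), though the order of ties and regex dialect details may differ from Hadoop's results.

```shell
# Hypothetical helpers, not part of Hadoop: local approximations of the
# wordcount and grep examples for sanity-checking small inputs.

# Print "word<TAB>count" for every whitespace-separated token in DIR's files.
local_wordcount() (
    dir="$1"
    cat "$dir"/* \
        | tr -s '[:space:]' '\n' \
        | sed '/^$/d' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{ print $2 "\t" $1 }'
)

# Print "count<TAB>match" for every string in DIR's files that matches the
# extended regular expression PATTERN, most frequent first.
local_grep_count() (
    dir="$1"
    pattern="$2"
    grep -ohE -- "$pattern" "$dir"/* \
        | LC_ALL=C sort \
        | uniq -c \
        | sort -k1,1nr \
        | awk '{ print $1 "\t" $2 }'
)
```

For instance, local_grep_count ./input 'dfs[a-z.]+' can be compared against cat ./output/part-r-00000 after the grep example has run, and local_wordcount ./input against the wordcount result.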
1. Hadoop does not overwrite existing result files by default, so running the examples above a second time fails with an error. Delete ./output first (rm -rf output).
2. On CentOS, the setup ran into a problem at this step (not yet resolved).
[hadoop@pseudo hadoop]$ ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep ./input ./output 'dfs[a-z.]+'
16/08/25 23:42:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/08/25 23</