今天将Hadoop 权威指南天气数据示例代码在hadoop集群上跑通,记录一下。
之前在百度/Google上怎么也没有找到怎么样将自己的Map-Reduce方法跑在集群上的每一步都具体描述,经过一番痛苦的无头苍蝇式的摸索,成功了,心情不错...
1准备天气预报数据(权威指南上的数据的简化版 5-9为year,15-19为temperature)
aaaaa1990aaaaaa0039a
bbbbb1991bbbbbb0040a
ccccc1992cccccc0040c
ddddd1993dddddd0043d
eeeee1994eeeeee0041e
aaaaa1990aaaaaa0031a
bbbbb1991bbbbbb0020a
ccccc1992cccccc0030c
ddddd1993dddddd0033d
eeeee1994eeeeee0031e
aaaaa1990aaaaaa0041a
bbbbb1991bbbbbb0040a
ccccc1992cccccc0040c
ddddd1993dddddd0043d
eeeee1994eeeeee0041e
aaaaa1990aaaaaa0044a
bbbbb1991bbbbbb0045a
ccccc1992cccccc0041c
ddddd1993dddddd0023d
eeeee1994eeeeee0041e
2 编写Map-Reduce函数和调度函数(Job)
简单点:如下
package hadoop.test;
import java.io.第;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.I