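MapReduce Examples
Example 1: Summation
1.1 Requirements
For each phone number, compute the total number of uplink packets, the total number of downlink packets, the total uplink traffic, and the total downlink traffic. Analysis: use the phone number as the key and the four fields (uplink packet count, downlink packet count, uplink traffic, downlink traffic) as the value. This key/value pair is the output of the map phase and the input of the reduce phase.
The data format is as follows: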
1.2 Approach
1. Map output:
key: phone number (msisdn)
value: the original line
2. Reduce output:
key: phone number (msisdn)
value: the accumulated sums of the four fields upPackNum, downPackNum, upPayLoad, downPayLoad
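The map and reduce steps above can be sketched without Hadoop, in plain Java, to make the data flow concrete. This is only an illustration under stated assumptions: the class name FlowSumSketch and its methods are hypothetical, and the record layout (tab-separated, phone number in column 0 followed by the four counters) is assumed for the example, since the real column positions depend on the actual data file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free sketch of the map -> group -> reduce flow.
class FlowSumSketch {
    // ASSUMPTION: msisdn \t upPackNum \t downPackNum \t upPayLoad \t downPayLoad
    static final int PHONE_IDX = 0;
    static final int FIRST_COUNTER_IDX = 1;

    // Map step plus grouping: emit (msisdn, original line) for every input
    // line and collect the lines that share a key, as the shuffle would.
    static Map<String, List<String>> mapPhase(List<String> lines) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : lines) {
            String msisdn = line.split("\t")[PHONE_IDX];
            grouped.computeIfAbsent(msisdn, k -> new ArrayList<>()).add(line);
        }
        return grouped;
    }

    // Reduce step: sum the four counters over all lines of one phone number.
    static long[] reducePhase(List<String> linesForOnePhone) {
        long[] totals = new long[4];
        for (String line : linesForOnePhone) {
            String[] fields = line.split("\t");
            for (int i = 0; i < 4; i++) {
                totals[i] += Long.parseLong(fields[FIRST_COUNTER_IDX + i]);
            }
        }
        return totals;
    }

    // Runs both steps and returns msisdn -> four totals.
    static Map<String, long[]> sumByPhone(List<String> lines) {
        Map<String, long[]> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : mapPhase(lines).entrySet()) {
            result.put(e.getKey(), reducePhase(e.getValue()));
        }
        return result;
    }
}
```

In a real job the grouping is done by the framework's shuffle, and the reducer only sees one key together with its list of values; the sketch folds the grouping into mapPhase purely for readability.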
1.3 Code
JavaBean class
import org.apache.hadoop.io.WritableComparable;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
/**
 * JavaBean representing one traffic (flow) record.
 */
public class Flow implements WritableComparable<Flow> {
    private String phoneNum;        // phone number
    private Long upPackNum;         // number of uplink packets
    private Long downPackNum;       // number of downlink packets
    private Long upPayLoad;         // uplink traffic
    private Long downPayLoad;       // downlink traffic
    private Long totalUpPackNum;    // sum of uplink packet counts
    private Long totalDownPackNum;  // sum of downlink packet counts
    private Long totalUpPayLoad;    // sum of uplink traffic
    private Long totalDownPayLoad;  // sum of downlink traffic
    public Flow() {
    }
    public Flow(Long totalUpPackNum, Long totalDownPackNum, Long totalUpPayLoad, Long totalDownPayLoad) {
        this.totalUpPackNum = totalUpPackNum;
        this.totalDownPackNum = totalDownPackNum;
        this.totalUpPayLoad = totalUpPayLoad;
        this.totalDownPayLoad = totalDownPayLoad;
    }
    public String getPhoneNum() {
        return phoneNum;
    }
    // ... remaining getters and setters omitted
    @Override
    public String toString() {
        return totalUpPackNum +
                "\t" + totalDownPackNum +
                "\t" + totalUpPayLoad +
                "\t"