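This post walks through using Hadoop MapReduce to aggregate traffic per phone number. For each phone number we compute the total upstream traffic, the total downstream traffic, and their combined total, then partition the results by phone-number prefix so that each group is written to its own output file.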
Project Background
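We need to total the upstream, downstream, and combined traffic for each phone number, and partition the results by phone-number prefix. The data source is a log file, access.log, in which each line has the following format:
field1 field2 (phone number) field3 ... third-to-last field (upstream traffic) second-to-last field (downstream traffic) fieldN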
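To make the index arithmetic concrete, here is a minimal sketch in plain Java (no Hadoop needed). The sample line and the names LogLineDemo, extractPhone, and extractTraffic are made up for illustration; only the field positions come from the format described above, and a single space is assumed as the separator.

```java
// Sketch only: the sample record is hypothetical and merely follows the described layout.
class LogLineDemo {

    /** The phone number is the second field (index 1). */
    static String extractPhone(String line) {
        return line.split(" ")[1];
    }

    /** Returns {up, down} read from the third-to-last and second-to-last fields. */
    static long[] extractTraffic(String line) {
        String[] fields = line.split(" ");
        long up = Long.parseLong(fields[fields.length - 3]);
        long down = Long.parseLong(fields[fields.length - 2]);
        return new long[] {up, down};
    }

    public static void main(String[] args) {
        // Hypothetical record: id, phone, host, upstream, downstream, status
        String line = "1 13800000000 example.com 120 340 200";
        long[] traffic = extractTraffic(line);
        System.out.println(extractPhone(line) + " up=" + traffic[0] + " down=" + traffic[1]);
    }
}
```

Counting from the end of the array keeps the extraction correct even when the number of fields in the middle of a line varies.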
Development Environment
- Hadoop 3.1.3
- Java 8
- Maven
Project Structure
PhoneTraffic
├── src
│   ├── main
│   │   ├── java
│   │   │   └── com
│   │   │       └── example
│   │   │           └── phonetraffic
│   │   │               ├── Access.java
│   │   │               ├── AccessDriver.java
│   │   │               ├── AccessMapper.java
│   │   │               ├── AccessPartitioner.java
│   │   │               └── AccessReducer.java
│   │   └── resources
│   └── test
│       └── java
├── pom.xml
└── access.log
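Development Steps
1. Define the custom Access class
The Access class has four properties: phone number, upstream traffic, downstream traffic, and total traffic: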
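package com.example.phonetraffic;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

public class Access implements Writable {

    private String phone;
    private long up;
    private long down;
    private long sum;

    // Hadoop creates instances reflectively, so the implicit no-arg constructor must stay available.

    // Getters and setters
    public String getPhone() { return phone; }
    public void setPhone(String phone) { this.phone = phone; }
    public long getUp() { return up; }
    public void setUp(long up) { this.up = up; }
    public long getDown() { return down; }
    public void setDown(long down) { this.down = down; }
    public long getSum() { return sum; }
    public void setSum(long sum) { this.sum = sum; }

    // Serialize the fields; the order here must match readFields.
    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(phone);
        out.writeLong(up);
        out.writeLong(down);
        out.writeLong(sum);
    }

    // Deserialize the fields in exactly the order they were written.
    @Override
    public void readFields(DataInput in) throws IOException {
        this.phone = in.readUTF();
        this.up = in.readLong();
        this.down = in.readLong();
        this.sum = in.readLong();
    }

    // Tab-separated form used when the record is written to the output file.
    @Override
    public String toString() {
        return phone + "\t" + up + "\t" + down + "\t" + sum;
    }
}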
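The write and readFields methods must handle the fields in the same order, or a record will come back with its values scrambled. The following sketch shows a round trip through a byte buffer using only java.io. TrafficRecord is a hypothetical stand-in that mirrors the shape of Access so the example runs without Hadoop on the classpath; it is not part of the original project.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for Access: same write/readFields shape, no Hadoop dependency.
class TrafficRecord {
    String phone;
    long up;
    long down;
    long sum;

    void write(DataOutput out) throws IOException {
        out.writeUTF(phone);
        out.writeLong(up);
        out.writeLong(down);
        out.writeLong(sum);
    }

    void readFields(DataInput in) throws IOException {
        phone = in.readUTF();
        up = in.readLong();
        down = in.readLong();
        sum = in.readLong();
    }

    /** Serializes the record to bytes and reads it back into a fresh instance. */
    static TrafficRecord roundTrip(TrafficRecord original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                original.write(out);
            }
            TrafficRecord copy = new TrafficRecord();
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                copy.readFields(in);
            }
            return copy;
        } catch (IOException e) {
            // In-memory streams should not fail; rethrow unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TrafficRecord record = new TrafficRecord();
        record.phone = "13800000000";
        record.up = 120;
        record.down = 340;
        record.sum = 460;
        TrafficRecord copy = roundTrip(record);
        System.out.println(copy.phone + "\t" + copy.up + "\t" + copy.down + "\t" + copy.sum);
    }
}
```

A round-trip check like this is a cheap unit test to keep next to any custom Writable, since an ordering mistake otherwise only shows up as wrong numbers in the job output.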
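2. Write the Mapper class
The AccessMapper class extracts the phone number, upstream traffic, and downstream traffic from each log line and emits them as a key-value pair, with the phone number as the key and an Access object as the value: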
package com.example.phonetraffic;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class AccessMapper extends Mapper<LongWritable, Text, Text, Access> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Split the log line into its space-separated fields.
        String[] fields = value.toString().split(" ");

        // The phone number is the second field; the traffic counters are indexed from the end.
        String phone = fields[1];
        long up = Long.parseLong(fields[fields.length - 3]);
        long down = Long.parseLong(fields[fields.length - 2]);

        Access access = new Access();
        access.setPhone(phone);
        access.setUp(up);
        access.setDown(down);
        access.setSum(up + down);

        // Emit (phone, access) so that all records for one phone number reach the same reducer.
        context.write(new Text(phone), access);
    }
}
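As written, the mapper throws an exception on any line with too few fields or a non-numeric counter, which fails the task. If the log may contain such lines, one option is to validate before parsing. The sketch below is a plain-Java illustration of that idea; SafeLineParser and parseOrNull are hypothetical names, and the minimum of five fields is an assumption drawn from the layout (one leading field, the phone number, the two counters, and one trailing field).

```java
// Sketch only: parseOrNull is a hypothetical helper, not part of the original project.
class SafeLineParser {

    /** Returns {up, down}, or null when the line is too short or a counter is not numeric. */
    static long[] parseOrNull(String line) {
        String[] fields = line.split(" ");
        // With fewer than five fields the phone number and the counters would overlap.
        if (fields.length < 5) {
            return null;
        }
        try {
            long up = Long.parseLong(fields[fields.length - 3]);
            long down = Long.parseLong(fields[fields.length - 2]);
            return new long[] {up, down};
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // A well-formed (made-up) record parses; the two malformed ones are rejected.
        System.out.println(parseOrNull("1 13800000000 example.com 120 340 200") != null);
        System.out.println(parseOrNull("broken line") == null);
        System.out.println(parseOrNull("1 13800000000 example.com abc 340 200") == null);
    }
}
```

Inside map, a null result would simply mean returning early without calling context.write. Whether to skip bad lines silently or fail the job is a design choice that depends on how much the input can be trusted.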
The remaining code is covered in the next post.