Big Data: A Practical MapReduce Example (Map Output Value as an Object)


Prerequisites: some Java knowledge, and the hadoop-client package added to the project (either import the jars manually or use Maven).
Example 1:
Contents of the log file:

User ID	User name	Network info	IP	URL	**	**	Request traffic	Response traffic	Status code
1363157985066 	13726230503	00-FD-07-A4-72-B8:CMCC	120.196.100.82	i02.c.aliimg.com		24	27	2481	24681	200
1363157995052 	13826544101	5C-0E-8B-C7-F1-E0:CMCC	120.197.40.4			4	0	264	0	200
1363157991076 	13926435656	20-10-7A-28-CC-0A:CMCC	120.196.100.99			2	4	132	1512	200
1363154400022 	13926251106	5C-0E-8B-8B-B1-50:CMCC	120.197.40.4			4	0	240	0	200
1363157993044 	18211575961	94-71-AC-CD-E6-18:CMCC-EASY	120.196.100.99	iface.qiyi.com	视频网站	15	12	1527	2106	200
1363157995074 	84138413	5C-0E-8B-8C-E8-20:7DaysInn	120.197.40.4	122.72.52.12		20	16	4116	1432	200
1363157993055 	13560439658	C4-17-FE-BA-DE-D9:CMCC	120.196.100.99			18	15	1116	954	200
1363157995033 	15920133257	5C-0E-8B-C7-BA-20:CMCC	120.197.40.4	sug.so.360.cn	信息安全	20	20	3156	2936	200
1363157983019 	13719199419	68-A1-B7-03-07-B1:CMCC-EASY	120.196.100.82			4	0	240	0	200
1363157984041 	13660577991	5C-0E-8B-92-5C-20:CMCC-EASY	120.197.40.4	s19.cnzz.com	站点统计	24	9	6960	690	200
1363157973098 	15013685858	5C-0E-8B-C7-F7-90:CMCC	120.197.40.4	rank.ie.sogou.com	搜索引擎	28	27	3659	3538	200
1363157986029 	15989002119	E8-99-C4-4E-93-E0:CMCC-EASY	120.196.100.99	www.umeng.com	站点统计	3	3	1938	180	200
1363157992093 	13560439658	C4-17-FE-BA-DE-D9:CMCC	120.196.100.99			15	9	918	4938	200
1363157986041 	13480253104	5C-0E-8B-C7-FC-80:CMCC-EASY	120.197.40.4			3	3	180	180	200
1363157984040 	13602846565	5C-0E-8B-8B-B6-00:CMCC	120.197.40.4	2052.flash2-http.qq.com	综合门户	15	12	1938	2910	200
1363157995093 	13922314466	00-FD-07-A2-EC-BA:CMCC	120.196.100.82	img.qfc.cn		12	12	3008	3720	200
1363157982040 	13502468823	5C-0A-5B-6A-0B-D4:CMCC-EASY	120.196.100.99	y0.ifengimg.com	综合门户	57	102	7335	110349	200
1363157986072 	18320173382	84-25-DB-4F-10-1A:CMCC-EASY	120.196.100.99	input.shouji.sogou.com	搜索引擎	21	18	9531	2412	200
1363157990043 	13925057413	00-1F-64-E1-E6-9A:CMCC	120.196.100.55	t3.baidu.com	搜索引擎	69	63	11058	48243	200
1363157988072 	13760778710	00-FD-07-A4-7B-08:CMCC	120.196.100.82			2	2	120	120	200
1363157985066 	13726238888	00-FD-07-A4-72-B8:CMCC	120.196.100.82	i02.c.aliimg.com		24	27	2481	24681	200
1363157993055 	13560436666	C4-17-FE-BA-DE-D9:CMCC	120.196.100.99			18	15	1116	954	200

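Requirement: for each user name (the phone number in the second column), compute the total request traffic, the total response traffic, and their sum.
Analysis: the map output uses the user name as the key and an object as the value.
Step 1: create the User class. (Note: when the map output value is a custom object, the class must implement Hadoop's Writable interface so that the framework can serialize it when moving data from the map task to the reduce task. Writable's compact binary format also keeps the amount of data transferred small.)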

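import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

// Implements Writable, Hadoop's serialization interface for data passed between tasks
public class User implements Writable {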
    private int num;
    private int a;
    private int b;
    private String phone;

    public User() {
    }

    public User(int a, int b, String phone) {
        this.num = a + b;
        this.a = a;
        this.b = b;
        this.phone = phone;
    }

    public int getNum() {
        return num;
    }

    public void setNum(int num) {
        this.num = num;
    }

    @Override
    public String toString() {
        return
                " " + num +
                " " + a +
                " " + b ;

    }

    public int getA() {
        return a;
    }

    public void setA(int a) {
        this.a = a;
    }

    public int getB() {
        return b;
    }

    public void setB(int b) {
        this.b = b;
    }

    public String getPhone() {
        return phone;
    }

    public void setPhone(String phone) {
        this.phone = phone;
    }





    // Serialization: write() and readFields() must handle the fields in the same order
    @Override
    public void write(DataOutput dataOutput) throws IOException {
        dataOutput.writeInt(a);
        dataOutput.writeInt(b);
        dataOutput.writeInt(num);
        dataOutput.writeUTF(phone);
    }

    @Override
    public void readFields(DataInput dataInput) throws IOException {
        this.a = dataInput.readInt();
        this.b = dataInput.readInt();
        this.num = dataInput.readInt();
        this.phone = dataInput.readUTF();
    }
}
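The rule that write and readFields must process the fields in the same order can be checked without a cluster. The following is a minimal sketch, assuming a hypothetical stand-in class named FlowRecord that mirrors User's fields but depends only on the JDK (java.io.DataOutput and java.io.DataInput are the same interfaces Writable uses). The roundTrip helper is also an assumption made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for User: same fields, no Hadoop dependency.
class FlowRecord {
    int num;
    int a;
    int b;
    String phone;

    FlowRecord() {
    }

    FlowRecord(int a, int b, String phone) {
        this.num = a + b;
        this.a = a;
        this.b = b;
        this.phone = phone;
    }

    // Fields are written in the order a, b, num, phone.
    void write(DataOutput out) throws IOException {
        out.writeInt(a);
        out.writeInt(b);
        out.writeInt(num);
        out.writeUTF(phone);
    }

    // Fields are read back in exactly the order they were written.
    void readFields(DataInput in) throws IOException {
        this.a = in.readInt();
        this.b = in.readInt();
        this.num = in.readInt();
        this.phone = in.readUTF();
    }

    // Serializes a record to a byte array and reads it into a fresh instance.
    static FlowRecord roundTrip(FlowRecord original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            FlowRecord copy = new FlowRecord();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        FlowRecord copy = roundTrip(new FlowRecord(2481, 24681, "13726230503"));
        // prints "13726230503 27162 2481 24681"
        System.out.println(copy.phone + " " + copy.num + " " + copy.a + " " + copy.b);
    }
}
```

If the two methods disagree on the order of a and b, the round trip still succeeds without any exception, but the two values come back swapped, which is why this kind of mistake is easy to miss.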

Step 2: write the map logic

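import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class Logmapper extends Mapper<LongWritable, Text, Text, User> {

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // Fields are tab-separated: the phone number is the second field, and
        // request/response traffic are the third- and second-to-last fields
        String[] fields = value.toString().split("\t");
        String phone = fields[1].trim();
        int request = Integer.parseInt(fields[fields.length - 3].trim());
        int response = Integer.parseInt(fields[fields.length - 2].trim());
        // a = response traffic, b = request traffic (the column order of the result listing)
        context.write(new Text(phone), new User(response, request, phone));
    }
}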
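The field extraction the map step needs can be tried on a single log line outside Hadoop. The following is a minimal sketch, assuming the log fields are separated by tab characters (as they appear in the listing above) and that the two traffic columns are always the third- and second-to-last fields. FlowLineParser is a hypothetical helper name, not part of the job.

```java
// Hypothetical helper, not part of the job: pulls out the fields the map step needs.
class FlowLineParser {
    final String phone;
    final int request;
    final int response;

    private FlowLineParser(String phone, int request, int response) {
        this.phone = phone;
        this.request = request;
        this.response = response;
    }

    // Splits one log line on tabs. Empty URL/category columns stay in the
    // array as empty strings, so indexing from the end is stable.
    static FlowLineParser parse(String line) {
        String[] fields = line.split("\t");
        String phone = fields[1].trim();
        int request = Integer.parseInt(fields[fields.length - 3].trim());
        int response = Integer.parseInt(fields[fields.length - 2].trim());
        return new FlowLineParser(phone, request, response);
    }

    public static void main(String[] args) {
        String line = "1363157995052 \t13826544101\t5C-0E-8B-C7-F1-E0:CMCC\t120.197.40.4\t\t\t4\t0\t264\t0\t200";
        FlowLineParser p = parse(line);
        // prints "13826544101 264 0"
        System.out.println(p.phone + " " + p.request + " " + p.response);
    }
}
```

Indexing from the end of the array rather than the start is a design choice: some lines have no URL or category, and counting from the end does not depend on how those empty columns are handled.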

Step 3: write the reduce logic

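import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class Logreduceer extends Reducer<Text, User,Text,User> {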

    @Override
    protected void reduce(Text key, Iterable<User> values, Context context) throws IOException, InterruptedException {
        int a = 0;
        int b = 0;


        // values holds every User object the map step emitted for this key
        for (User value : values) {
            a += value.getA();
            b += value.getB();
        }

        context.write(key,new User(a,b,key.toString()));
    }
}
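What the shuffle and reduce steps do together (group the records by key, then sum each group) can be imitated in plain Java. The following is a minimal sketch, assuming a hypothetical class FlowSumSketch and rows already reduced to {phone, request traffic, response traffic}. A TreeMap stands in for the framework's grouping and key sorting; it is not how Hadoop itself is implemented.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: imitates "group by key, then sum" with a sorted map.
class FlowSumSketch {
    // Each row is {phone, request traffic, response traffic}.
    // Returns phone -> {request sum, response sum, total}, sorted by phone.
    static Map<String, long[]> sumByPhone(String[][] rows) {
        Map<String, long[]> totals = new TreeMap<>();
        for (String[] row : rows) {
            long[] sums = totals.computeIfAbsent(row[0], k -> new long[3]);
            long request = Long.parseLong(row[1]);
            long response = Long.parseLong(row[2]);
            sums[0] += request;
            sums[1] += response;
            sums[2] += request + response;
        }
        return totals;
    }

    public static void main(String[] args) {
        // Three rows taken from the log above; one phone number appears twice.
        String[][] rows = {
            {"13560439658", "1116", "954"},
            {"13560439658", "918", "4938"},
            {"13480253104", "180", "180"},
        };
        for (Map.Entry<String, long[]> e : sumByPhone(rows).entrySet()) {
            long[] s = e.getValue();
            // Printed as: phone, total, response sum, request sum
            System.out.println(e.getKey() + "\t" + s[2] + " " + s[1] + " " + s[0]);
        }
    }
}
```

For the phone number that appears twice, the request traffic adds up to 1116 + 918 = 2034 and the response traffic to 954 + 4938 = 5892, giving a total of 7926.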

Step 4: in the main method, configure the Job object (main is the entry point that runs the job)

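import java.io.File;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Submit01 {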
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {

        Configuration configuration = new Configuration();

        Job job = Job.getInstance(configuration);

        // The class used to locate the jar that contains this job
        job.setJarByClass(Submit01.class);

        job.setMapperClass(Logmapper.class);
        job.setReducerClass(Logreduceer.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(User.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(User.class);

        File file = new File("f:\\output1");
        if (file.exists()){
            FileUtil.fullyDelete(file);
        }

        FileInputFormat.addInputPath(job, new Path("F:\\test\\flow.log"));
        FileOutputFormat.setOutputPath(job, new Path("f:\\output1"));

        job.waitForCompletion(true);
        System.out.println(job.isSuccessful()?0:1);
    }
}

Result (columns: phone number, total traffic, response traffic, request traffic):

13480253104	 360 180 180
13502468823	 117684 110349 7335
13560436666	 2070 954 1116
13560439658	 7926 5892 2034
13602846565	 4848 2910 1938
13660577991	 7650 690 6960
13719199419	 240 0 240
13726230503	 27162 24681 2481
13726238888	 27162 24681 2481
13760778710	 240 120 120
13826544101	 264 0 264
13922314466	 6728 3720 3008
13925057413	 59301 48243 11058
13926251106	 240 0 240
13926435656	 1644 1512 132
15013685858	 7197 3538 3659
15920133257	 6092 2936 3156
15989002119	 2118 180 1938
18211575961	 3633 2106 1527
18320173382	 11943 2412 9531
84138413	 5548 1432 4116
