MapReduce Data Filtering

603946254

Category: hadoop  Tags: mapreduce

Article link: https://blog.csdn.net/qq603946254/article/details/75635108


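Requirement:

Write a MapReduce program that finds which table is accessed most frequently during a peak period (for example 9:00 to 10:00), which user accesses that table most often during that period, and the total time cost of that user's accesses to that table.

Test data:

TableName (table name), Time (access time), User (user), TimeSpan (time cost)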

*t003 6:00 u002 180
*t003 7:00 u002 180
*t003 7:08 u002 180
*t003 7:25 u002 180
*t002 8:00 u002 180
*t001 8:00 u001 240

*t001 9:00 u002 300
*t001 9:11 u001 240
*t003 9:26 u001 180
*t001 9:39 u001 300
*t001 10:00 u001 200
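
Before any counting, each record has to be parsed and checked against the peak window. Here is a minimal plain-Java sketch of that step, with no Hadoop classes involved. The class name `PeakFilterSketch` and its methods are hypothetical, and it assumes the window is half-open (9:00 up to but not including 10:00), since the requirement does not say whether 10:00 counts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PeakFilterSketch {

    // Splits "*t001 9:11 u001 240" into [table, time, user, timeSpan].
    // Returns null for blank lines or lines that do not match that shape.
    static String[] parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.charAt(0) != '*') {
            return null;
        }
        String[] fields = trimmed.substring(1).split("\\s+");
        return fields.length == 4 ? fields : null;
    }

    // Assumption: the window is [startHour:00, endHour:00), and time is "H:MM".
    static boolean inPeak(String time, int startHour, int endHour) {
        int hour = Integer.parseInt(time.substring(0, time.indexOf(':')));
        return hour >= startHour && hour < endHour;
    }

    // Keeps only the records whose time falls inside the window.
    static List<String> peakRecords(List<String> lines, int startHour, int endHour) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            String[] fields = parse(line);
            if (fields != null && inPeak(fields[1], startHour, endHour)) {
                kept.add(String.join(" ", fields));
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "*t001 8:00 u001 240",
            "",
            "*t001 9:00 u002 300",
            "*t001 9:11 u001 240",
            "*t003 9:26 u001 180",
            "*t001 9:39 u001 300",
            "*t001 10:00 u001 200");
        for (String record : peakRecords(lines, 9, 10)) {
            System.out.println(record);
        }
    }
}
```

Under the half-open assumption, the four records timed 9:00, 9:11, 9:26 and 9:39 are kept and the 10:00 record is dropped. If 10:00 should count as part of the peak, the comparison in `inPeak` would need to change.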

Code

Method 1:

package com.table.main;

import java.io.IOException;
import java.util.HashMap;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TableUsed {

    public static class MRMapper extends Mapper<LongWritable, Text, Text, Text> {
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String[] split = value.toString().substring(1).spli
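
The listing breaks off inside `map` in the source, so the author's actual logic is not available. As a rough illustration only, here is one way the map and reduce steps could be arranged, simulated in plain Java without Hadoop classes. Everything in it is an assumption and not the original code: the class `TableUsedSketch`, the whitespace delimiter, the 9:00 to 9:59 window, and the tie-breaking rule (first seen wins).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TableUsedSketch {

    // Map step (hypothetical): emit {table, user, timeSpan} for a peak-hour
    // record, or null for blank, malformed, or off-peak lines.
    static String[] map(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return null;
        }
        String[] f = trimmed.substring(1).split("\\s+"); // drop the leading '*'
        if (f.length != 4) {
            return null;
        }
        int hour = Integer.parseInt(f[1].split(":")[0]);
        if (hour != 9) { // assumption: peak window is 9:00-9:59
            return null;
        }
        return new String[] {f[0], f[2], f[3]};
    }

    // Returns the key with the largest value; on ties the first inserted key wins.
    static String argMax(Map<String, Integer> counts) {
        String best = null;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (best == null || e.getValue() > counts.get(best)) {
                best = e.getKey();
            }
        }
        return best;
    }

    // Reduce step (hypothetical): aggregate the mapped records and return
    // "table user totalTime", or an empty string if nothing fell in the window.
    static String solve(List<String> lines) {
        Map<String, Integer> tableCounts = new LinkedHashMap<>();
        Map<String, Integer> pairCounts = new LinkedHashMap<>(); // key: table \t user
        Map<String, Integer> pairTime = new LinkedHashMap<>();
        for (String line : lines) {
            String[] kv = map(line);
            if (kv == null) {
                continue;
            }
            String pair = kv[0] + "\t" + kv[1];
            tableCounts.merge(kv[0], 1, Integer::sum);
            pairCounts.merge(pair, 1, Integer::sum);
            pairTime.merge(pair, Integer.parseInt(kv[2]), Integer::sum);
        }
        String table = argMax(tableCounts);
        if (table == null) {
            return "";
        }
        // Only users of the winning table compete for "most frequent user".
        Map<String, Integer> forTable = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : pairCounts.entrySet()) {
            if (e.getKey().startsWith(table + "\t")) {
                forTable.put(e.getKey(), e.getValue());
            }
        }
        String pair = argMax(forTable);
        String user = pair.substring(pair.indexOf('\t') + 1);
        return table + " " + user + " " + pairTime.get(pair);
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "*t001 8:00 u001 240",
            "*t001 9:00 u002 300",
            "*t001 9:11 u001 240",
            "*t003 9:26 u001 180",
            "*t001 9:39 u001 300",
            "*t001 10:00 u001 200");
        System.out.println(solve(sample)); // prints "t001 u001 540"
    }
}
```

In a real job the three maps would not live in one process. The counts would be built up in reducers keyed by table, possibly across two chained jobs. This sketch only shows what those stages need to compute.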
