Finding Mutual Friends with MapReduce: An Analysis

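I. Problem

In each line, the name before the colon is a user and the names after it are all of that user's friends (friendship in this data is one-directional).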

A:B,C,D,F,E,O
B:A,C,E,K
C:F,A,D,I
D:A,E,F,L
......

Find every pair of users that has mutual friends, and list who those mutual friends are. For example:

A-B:C,E
A-C:D,F
......
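
Before turning to MapReduce, it can help to pin down the expected answer with a direct in-memory computation. The following is a minimal sketch in plain Java (no Hadoop), assuming the input fits in memory; the class name `MutualFriendsBruteForce` and its methods are hypothetical and exist only for this illustration. It intersects the friend sets of every pair of users that appear before a colon.

```java
import java.util.*;

final class MutualFriendsBruteForce {
    /** Parses lines such as "A:B,C,D" into user -> set of friends. */
    static Map<String, Set<String>> parse(List<String> lines) {
        Map<String, Set<String>> friends = new TreeMap<>();
        for (String line : lines) {
            String[] kv = line.split(":");
            if (kv.length != 2) {
                continue; // skip blank or malformed lines
            }
            friends.put(kv[0], new TreeSet<>(Arrays.asList(kv[1].split(","))));
        }
        return friends;
    }

    /** Intersects the friend sets of every pair of users, keyed as "U1-U2" in sorted order. */
    static Map<String, Set<String>> mutualFriends(Map<String, Set<String>> friends) {
        List<String> users = new ArrayList<>(friends.keySet()); // TreeMap keys are already sorted
        Map<String, Set<String>> result = new TreeMap<>();
        for (int i = 0; i < users.size(); i++) {
            for (int j = i + 1; j < users.size(); j++) {
                Set<String> common = new TreeSet<>(friends.get(users.get(i)));
                common.retainAll(friends.get(users.get(j)));
                if (!common.isEmpty()) {
                    result.put(users.get(i) + "-" + users.get(j), common);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("A:B,C,D,F,E,O", "B:A,C,E,K", "C:F,A,D,I", "D:A,E,F,L");
        mutualFriends(parse(lines))
                .forEach((pair, common) -> System.out.println(pair + ":" + String.join(",", common)));
    }
}
```

On the four sample lines this prints six pairs, including `A-B:C,E` and `A-C:D,F`. It is quadratic in the number of users, which is the cost the MapReduce decomposition is meant to spread across machines.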

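II. Analysis

Working backwards from the result (bottom-up)

Assume that some MapReduce job produces the answer above when it finishes. Its reduce output must then have the form: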

key:A-B
value:C,E...
......
// The ':' can be ignored; "..." means there may be more data in the same format

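Because reduce merges the Iterable of values that share a key, the reduce output (one key with its merged values) can be decomposed into the reduce input (one key-value pair per record):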

A-B:C
A-B:E
A-C:D
A-C:F
......
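// Each record comes from pairing up all users who have, say, C as a friend. Those users arrive in no particular order, so they must be sorted lexicographically before pairing; otherwise records such as A-B:C and B-A:E could be produced, and the two users' mutual friends would not be grouped under one key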
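
The sorting requirement can be sketched in plain Java as follows. This is an illustration only, not the Hadoop code itself; the class name `PairKeys` and the method `pairKeys` are hypothetical. Sorting first makes the key canonical, so the same two users always map to the same key regardless of arrival order.

```java
import java.util.*;

final class PairKeys {
    /** Builds a "U1-U2" key for every pair among the users who list the same friend. */
    static List<String> pairKeys(String[] users) {
        String[] sorted = users.clone();
        Arrays.sort(sorted); // lexicographic order makes each key canonical
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < sorted.length; i++) {
            for (int j = i + 1; j < sorted.length; j++) {
                keys.add(sorted[i] + "-" + sorted[j]);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // The same two users arriving in different orders yield the same key.
        System.out.println(pairKeys(new String[] {"B", "A"})); // [A-B]
        System.out.println(pairKeys(new String[] {"A", "B"})); // [A-B]
    }
}
```

Without the sort, the first call would produce `B-A`, and the shuffle would route it to a different reduce group than `A-B`.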

The reduce input is the map output, and map works by splitting one record into many, so the map output can be merged back into the map input:

C:A,B...
E:A,B...
D:A,C...
F:A,C...
......

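Read logically, C is listed as a friend by both A and B (friendship is one-directional, so C may not list A or B as friends). Conversely, A has friends C, E, D, F; B has friends C, E; and so on. This already reveals the relationship in the original data, where the names after the colon are all of the user's friends.

So one more MapReduce job is added in front: the input of the later map is the output of the earlier reduce, and by the merging property of reduce we can work backwards to the earlier reduce's input: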

C:A
C:B
E:A
E:B
D:A
F:A
......

The reduce input is the map output, so by the splitting property of map we obtain the map input (that is, the raw data):

A:C,E,D,F,...
B:C,E,...
......

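This completes the derivation from the result data back to the raw data: applying the corresponding two MapReduce jobs to the raw data yields the answer the problem asks for.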
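
The forward direction of this derivation can be checked without a cluster. The sketch below simulates the two chained jobs in plain Java, assuming everything fits in memory; `MiniMapReduce`, `run`, `invert`, and `pair` are hypothetical names for this illustration and are not Hadoop APIs. The intermediate records use a tab between key and value, mirroring the default text output that the second mapper later splits on.

```java
import java.util.*;
import java.util.function.BiFunction;
import java.util.function.Function;

final class MiniMapReduce {
    /** Maps every record, groups the emitted pairs by key (the shuffle), then reduces each group. */
    static Map<String, String> run(List<String> records,
                                   Function<String, List<Map.Entry<String, String>>> mapper,
                                   BiFunction<String, List<String>, String> reducer) {
        Map<String, List<String>> shuffled = new TreeMap<>();
        for (String record : records) {
            for (Map.Entry<String, String> kv : mapper.apply(record)) {
                shuffled.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
        }
        Map<String, String> out = new TreeMap<>();
        shuffled.forEach((key, values) -> out.put(key, reducer.apply(key, values)));
        return out;
    }

    /** Job 1 map: "A:C,E" becomes (C, A) and (E, A). */
    static List<Map.Entry<String, String>> invert(String line) {
        List<Map.Entry<String, String>> out = new ArrayList<>();
        String[] kv = line.split(":");
        if (kv.length != 2) {
            return out; // skip malformed lines
        }
        for (String friend : kv[1].split(",")) {
            out.add(new AbstractMap.SimpleEntry<>(friend, kv[0]));
        }
        return out;
    }

    /** Job 2 map: "C\tB,A" becomes (A-B, C); users are sorted so each pair has one key. */
    static List<Map.Entry<String, String>> pair(String line) {
        List<Map.Entry<String, String>> out = new ArrayList<>();
        String[] kv = line.split("\t");
        String[] users = kv[1].split(",");
        Arrays.sort(users);
        for (int i = 0; i < users.length; i++) {
            for (int j = i + 1; j < users.length; j++) {
                out.add(new AbstractMap.SimpleEntry<>(users[i] + "-" + users[j], kv[0]));
            }
        }
        return out;
    }

    /** Chains the two jobs: the output of job 1 is the input of job 2. */
    static Map<String, String> mutualFriends(List<String> lines) {
        BiFunction<String, List<String>, String> join = (key, values) -> String.join(",", values);
        Map<String, String> job1 = run(lines, MiniMapReduce::invert, join);
        List<String> intermediate = new ArrayList<>();
        job1.forEach((friend, users) -> intermediate.add(friend + "\t" + users));
        return run(intermediate, MiniMapReduce::pair, join);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("A:B,C,D,F,E,O", "B:A,C,E,K", "C:F,A,D,I", "D:A,E,F,L");
        mutualFriends(lines).forEach((k, v) -> System.out.println(k + ":" + v));
    }
}
```

Both jobs share the same reducer, which only joins values with commas; all of the problem-specific logic lives in the two map functions.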

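III. Summary

1. From the above, two properties of MapReduce programs can be summarized:

1.1. A map program always splits one record into pieces and assembles new records from them; the inverse operation regroups those pieces into the original record.
1.2. A reduce program always combines the values and emits them together with the key; the inverse operation separates key and value, splits the value, and pairs each piece with the key to restore the original records.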
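
Property 1.2 and its inverse can be sketched as a pair of plain Java functions. This is an illustration under the record format used in this article (`key:v1,v2`); `ReduceRoundTrip`, `merge`, and `unmerge` are hypothetical names.

```java
import java.util.*;

final class ReduceRoundTrip {
    /** Reduce direction: (key, [v1, v2]) becomes "key:v1,v2". */
    static String merge(String key, List<String> values) {
        return key + ":" + String.join(",", values);
    }

    /** Inverse direction: "key:v1,v2" becomes ["key:v1", "key:v2"]. */
    static List<String> unmerge(String record) {
        String[] kv = record.split(":");
        List<String> out = new ArrayList<>();
        for (String value : kv[1].split(",")) {
            out.add(kv[0] + ":" + value);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(unmerge("A-B:C,E"));                    // [A-B:C, A-B:E]
        System.out.println(merge("A-B", Arrays.asList("C", "E"))); // A-B:C,E
    }
}
```

Applying `unmerge` to a reduce output is exactly the step used in the analysis to recover the reduce input from `A-B:C,E`.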

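IV. Notes:

1. A map may rely on the InputFormat to preprocess the data, in which case the map does no splitting of its own and simply passes records through.

2. The reduce key may be discarded. If it is the values that would be discarded, meaning no values are combined and no shuffle is needed, it is better to skip the reduce phase altogether (for example by setting the number of reduce tasks to 0) and avoid the heavy resource cost of shuffle and reduce.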

V. Code:

1. IsFriendMapper.java:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

/**
 * description: mapper of the first job; turns "user:friends" into (friend, user) pairs
 * author: bob yy
 * since: 1.8
 **/
public class IsFriendMapper extends Mapper<LongWritable, Text, Text, Text> {
    private final Text outKey = new Text();
    private final Text outValue = new Text();

    /**
     * outKey: a friend of the user; outValue: the user
     *
     * @param key     byte offset of the line in the input file
     * @param value   one line of input, e.g. "A:B,C,D"
     * @param context mapper context used to emit output
     * @throws IOException          IOException
     * @throws InterruptedException InterruptedException
     */
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String[] keyValue = value.toString().split(":");
        // skip blank or malformed lines instead of failing on keyValue[1]
        if (keyValue.length != 2) {
            return;
        }
        outValue.set(keyValue[0]);
        for (String user : keyValue[1].split(",")) {
            outKey.set(user);
            context.write(outKey, outValue);
        }
    }
}

2. IsFriendReducer.java:

import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

/**
 * description: reducer of the first job; merges all users who list the same friend
 * author: bob yy
 * since: 1.8
 **/
public class IsFriendReducer extends Reducer<Text, Text, Text, Text> {
    private final Text outValue = new Text();

    /**
     * output key: a friend; output value: comma-separated users who list that friend
     *
     * @param key     the friend
     * @param values  users who list the friend
     * @param context reducer context used to emit output
     * @throws IOException          IOException
     * @throws InterruptedException InterruptedException
     */
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        StringBuilder sb = new StringBuilder();
        boolean isFirst = true;
        for (Text value : values) {
            if (!isFirst) {
                sb.append(',');
            } else {
                isFirst = false;
            }
            sb.append(value.toString());
        }
        outValue.set(sb.toString());
        context.write(key, outValue);
    }
}

3. MutualFriendMapper.java:

import java.io.IOException;
import java.util.Arrays;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

/**
 * description: mapper of the second job; emits (user pair, mutual friend) records
 * author: bob yy
 * since: 1.8
 **/
public class MutualFriendMapper extends Mapper<LongWritable, Text, Text, Text> {
    private final Text outValue = new Text();
    private final Text outKey = new Text();

    /**
     * outKey: a pair of users, e.g. "A-B"; outValue: a friend the two users share
     *
     * @param key     byte offset of the line in the input file
     * @param value   a friend and the comma-separated users who list it, separated by a tab
     * @param context mapper context used to emit output
     * @throws IOException          IOException
     * @throws InterruptedException InterruptedException
     */
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // the first job writes "friend<TAB>users", the default text output layout
        String[] split = value.toString().split("\t");
        if (split.length != 2) {
            return; // skip blank or malformed lines
        }
        outValue.set(split[0]);
        String[] users = split[1].split(",");
        // sort so that each pair always produces the same key (A-B, never B-A)
        Arrays.sort(users);
        // enumerate every pair of users with nested loops
        for (int i = 0; i < users.length; i++) {
            for (int j = i + 1; j < users.length; j++) {
                outKey.set(users[i] + "-" + users[j]);
                context.write(outKey, outValue);
            }
        }
    }
}

4. MutualFriendReducer.java:

import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

/**
 * description: reducer of the second job; merges the mutual friends of each user pair
 * author: bob yy
 * since: 1.8
 **/
public class MutualFriendReducer extends Reducer<Text, Text, Text, Text> {
    private final Text outValue = new Text();

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        // join all mutual friends of this pair with commas
        StringBuilder sb = new StringBuilder();
        boolean isFirst = true;
        for (Text value : values) {
            if (!isFirst) {
                sb.append(',');
            } else {
                isFirst = false;
            }
            sb.append(value.toString());
        }
        outValue.set(sb.toString());
        context.write(key, outValue);
    }
}

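5. MutualFriendDriver.java:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob;
import org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/**
 * description: a driver program of mutual friend; chains the two jobs with JobControl
 * author: bob yy
 * since: 1.8
 **/
public class MutualFriendDriver {
    public static void main(String[] args) throws IOException, InterruptedException {
        Path inputPath = new Path("J:\\data\\friends\\input");
        Path outputPath1 = new Path("J:\\data\\friends\\output1");
        Path outputPath2 = new Path("J:\\data\\friends\\output2");
        // The original called SimpleFileSystem.getLocalFileSystem(), a helper whose source is not shown;
        // FileSystem.getLocal is assumed here to be the stock Hadoop equivalent.
        FileSystem fs = FileSystem.getLocal(new Configuration());
        // remove stale output directories, otherwise the jobs refuse to start
        if (fs.exists(outputPath1)) {
            fs.delete(outputPath1, true);
        }
        if (fs.exists(outputPath2)) {
            fs.delete(outputPath2, true);
        }

        Job job1 = Job.getInstance();
        Job job2 = Job.getInstance();

        /********************************/
        // set job1: invert "user:friends" into "friend -> users"
        job1.setMapperClass(IsFriendMapper.class);
        job1.setReducerClass(IsFriendReducer.class);
        job1.setOutputKeyClass(Text.class);
        job1.setOutputValueClass(Text.class);
        FileInputFormat.setInputPaths(job1, inputPath);
        FileOutputFormat.setOutputPath(job1, outputPath1);
        /********************************/
        // set job2: read job1's output and emit "user pair -> mutual friends"
        job2.setMapperClass(MutualFriendMapper.class);
        job2.setReducerClass(MutualFriendReducer.class);
        job2.setOutputKeyClass(Text.class);
        job2.setOutputValueClass(Text.class);
        FileInputFormat.setInputPaths(job2, outputPath1);
        FileOutputFormat.setOutputPath(job2, outputPath2);
        /********************************/
        job1.setJobName("is friend");
        job2.setJobName("mutual friend");
        // get job control
        JobControl jobControl = new JobControl("mutual friend");
        ControlledJob controlledJob1 = new ControlledJob(job1.getConfiguration());
        ControlledJob controlledJob2 = new ControlledJob(job2.getConfiguration());
        // job2 may only start after job1 has succeeded
        controlledJob2.addDependingJob(controlledJob1);
        jobControl.addJob(controlledJob1);
        jobControl.addJob(controlledJob2);

        // run the job control loop in a daemon thread
        Thread jobControlThread = new Thread(jobControl);
        jobControlThread.setDaemon(true);
        jobControlThread.start();

        // wait for completion; sleep between polls instead of spinning on the CPU
        while (!jobControl.allFinished()) {
            Thread.sleep(500);
        }
        System.out.println("successful: " + jobControl.getSuccessfulJobList());
        System.out.println("failed: " + jobControl.getFailedJobList());
        jobControl.stop();
    }
}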

The generality of the analysis above has not been proven; it is offered for reference only.
