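Goal: if user A and user C are both friends with user B, but A and C are not friends with each other, recommend C to A and A to C, and list the friends that A and C have in common.
For example:
Given the following friendships: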
1 2,3,4,5,6,7,8
2 1,3,4,5,7
3 1,2
4 1,2,6
5 1,2
6 1,4
7 1,2
8 1
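In each line, the element before the space is a user ID, and the elements after the space are that user's comma-separated friend IDs.
[Figure: the corresponding friendship graph]
The expected output is shown below. Each entry has the form recommendedUser(count:[mutual friends]), so 6(2:[4, 1]) on the line for user 2 means "recommend 6 to 2; they share 2 mutual friends, 4 and 1":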
1
2 6(2:[4, 1]),8(1:[1]),
3 4(2:[1, 2]),5(2:[2, 1]),6(1:[1]),7(2:[1, 2]),8(1:[1]),
4 3(2:[2, 1]),5(2:[1, 2]),7(2:[1, 2]),8(1:[1]),
5 3(2:[2, 1]),4(2:[1, 2]),6(1:[1]),7(2:[1, 2]),8(1:[1]),
6 2(2:[1, 4]),3(1:[1]),5(1:[1]),7(1:[1]),8(1:[1]),
7 3(2:[1, 2]),4(2:[2, 1]),5(2:[2, 1]),6(1:[1]),8(1:[1]),
8 2(1:[1]),3(1:[1]),4(1:[1]),5(1:[1]),6(1:[1]),7(1:[1]),
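That is:
For user 1: it is already friends with 2, 3, 4, 5, 6, 7, and 8, so nobody is recommended.
For user 2: recommend 6, because 2 and 6 can meet through 4 or 1; recommend 8, because 2 and 8 can meet through 1.
For user 3: recommend 4, because 3 and 4 can meet through 1 or 2; recommend 5, because 3 and 5 can meet through 2 or 1; recommend 6, because 3 and 6 can meet through 1; recommend 7, because 3 and 7 can meet through 1 or 2; recommend 8, because 3 and 8 can meet through 1.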
...
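The expected output can be cross-checked without MapReduce. The following is a minimal single-machine sketch, not part of the original job: the class name BruteForceRecommender and its methods are hypothetical, and it assumes the whole graph fits in memory. For every pair of users who are not friends, it intersects their friend sets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class BruteForceRecommender {
    // Parses lines of the form "user friend1,friend2,..." into an adjacency map.
    static Map<Long, Set<Long>> parse(List<String> lines) {
        Map<Long, Set<Long>> graph = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split(" ");
            Set<Long> friends = new TreeSet<>();
            for (String f : parts[1].split(",")) {
                friends.add(Long.valueOf(f));
            }
            graph.put(Long.valueOf(parts[0]), friends);
        }
        return graph;
    }

    // For each user, maps every non-friend candidate to the sorted set of mutual friends.
    static Map<Long, Map<Long, Set<Long>>> recommend(Map<Long, Set<Long>> graph) {
        Map<Long, Map<Long, Set<Long>>> result = new TreeMap<>();
        for (Long user : graph.keySet()) {
            Map<Long, Set<Long>> candidates = new TreeMap<>();
            for (Long other : graph.keySet()) {
                // Skip the user itself and anyone who is already a direct friend.
                if (other.equals(user) || graph.get(user).contains(other)) {
                    continue;
                }
                Set<Long> mutual = new TreeSet<>(graph.get(user));
                mutual.retainAll(graph.get(other));
                if (!mutual.isEmpty()) {
                    candidates.put(other, mutual);
                }
            }
            result.put(user, candidates);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>(Arrays.asList(
            "1 2,3,4,5,6,7,8", "2 1,3,4,5,7", "3 1,2", "4 1,2,6",
            "5 1,2", "6 1,4", "7 1,2", "8 1"));
        recommend(parse(lines)).forEach((user, c) -> System.out.println(user + " " + c));
    }
}
```

For user 2 this prints `2 {6=[1, 4], 8=[1]}`, the same recommendations as the expected output above; only the order of the mutual friends differs, because the sketch keeps them sorted.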
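Approach:
For each input line, for example 4 1,2,6:
Map step:
Emit the direct-friend key-value pairs (4,[1,-1]) (4,[2,-1]) (4,[6,-1]).
Emit the indirect-friend (candidate) key-value pairs (1,[2,4]) (2,[1,4]) (1,[6,4]) (6,[1,4]) (2,[6,4]) (6,[2,4]). Here (1,[2,4]) reads as "recommend 2 to 1, because they can meet through 4"; the others are analogous.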
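The map step can be sketched in plain Java, without Hadoop, to make the emitted pairs concrete. This is an illustration only: the class MapStepSketch and the long[] record layout {key, user1, user2} are assumptions made for this sketch, not the job's actual types.

```java
import java.util.ArrayList;
import java.util.List;

class MapStepSketch {
    // Returns the emitted records as {key, user1, user2}.
    // user2 == -1 marks a direct friendship; otherwise user2 is the mutual friend.
    static List<long[]> emit(String line) {
        String[] parts = line.trim().split(" ");
        long user = Long.parseLong(parts[0]);
        String[] friends = parts[1].split(",");
        List<long[]> out = new ArrayList<>();
        // Direct friendships: one record per friend, keyed by the line's user.
        for (String f : friends) {
            out.add(new long[] {user, Long.parseLong(f), -1L});
        }
        // Candidate friendships: every pair of friends can meet through this user.
        for (int i = 0; i < friends.length; i++) {
            for (int j = i + 1; j < friends.length; j++) {
                long a = Long.parseLong(friends[i]);
                long b = Long.parseLong(friends[j]);
                out.add(new long[] {a, b, user}); // recommend b to a
                out.add(new long[] {b, a, user}); // recommend a to b
            }
        }
        return out;
    }

    // Formats a record in the (key,[user1,user2]) notation used in the text.
    static String format(long[] r) {
        return "(" + r[0] + ",[" + r[1] + "," + r[2] + "])";
    }

    public static void main(String[] args) {
        for (long[] r : emit("4 1,2,6")) {
            System.out.println(format(r));
        }
    }
}
```

For the line 4 1,2,6 this yields nine records: the three direct pairs followed by the six candidate pairs listed above. A user with n friends produces n + n(n-1) records, so the map output grows quadratically with the length of the friend list.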
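Reduce step:
All direct-friend and indirect-friend key-value pairs for the same user arrive at the same reducer.
For example, for user 4:
key=4
The following key-value pairs arrive at the same reduce call: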
t2= FriendPair [user1=7, user2=1]
t2= FriendPair [user1=3, user2=2]
t2= FriendPair [user1=2, user2=-1]
t2= FriendPair [user1=6, user2=-1]
t2= FriendPair [user1=1, user2=2]
t2= FriendPair [user1=8, user2=1]
t2= FriendPair [user1=6, user2=1]
t2= FriendPair [user1=5, user2=1]
t2= FriendPair [user1=3, user2=1]
t2= FriendPair [user1=1, user2=6]
t2= FriendPair [user1=2, user2=1]
t2= FriendPair [user1=1, user2=-1]
t2= FriendPair [user1=7, user2=2]
t2= FriendPair [user1=5, user2=2]
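For user 4, maintain a Map<Long,List<Long>> that stores each user recommended to 4 together with the list of friends the two have in common.
For 4's direct friends (records whose user2 is -1), no recommendation should be made, so simply put <user1,null> into the Map. A null entry takes precedence: it replaces any list accumulated so far, and later indirect records for that user are ignored.
For 4's indirect friends, accumulate the mutual friends of all records that share the same recommended ID, for example: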
t2= FriendPair [user1=3, user2=2]
t2= FriendPair [user1=3, user2=1]
Here the mutual friends for recommending user 3 to user 4, namely users 2 and 1, are accumulated, and <3,[2,1]> is put into the Map.
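The merge rule can be tried out on the fourteen records for key=4 without running Hadoop. The sketch below is a plain-Java approximation under stated assumptions: the class ReduceStepSketch and the long[][] input of {user1, user2} pairs are hypothetical, and it uses a TreeMap so that the output order is deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ReduceStepSketch {
    // records: {user1, user2} pairs for one key; user2 == -1 marks a direct friend.
    static Map<Long, List<Long>> merge(long[][] records) {
        Map<Long, List<Long>> mutual = new TreeMap<>();
        for (long[] r : records) {
            long candidate = r[0];
            long via = r[1];
            if (via == -1L) {
                // Already a direct friend: a null value blocks the recommendation.
                mutual.put(candidate, null);
            } else if (!mutual.containsKey(candidate)) {
                List<Long> list = new ArrayList<>();
                list.add(via);
                mutual.put(candidate, list);
            } else if (mutual.get(candidate) != null) {
                mutual.get(candidate).add(via);
            }
            // A candidate that is already mapped to null stays blocked.
        }
        return mutual;
    }

    // Formats the non-null entries as candidate(count:[mutual friends]),
    static String buildOutput(Map<Long, List<Long>> mutual) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Long, List<Long>> e : mutual.entrySet()) {
            if (e.getValue() != null) {
                sb.append(e.getKey()).append('(').append(e.getValue().size())
                  .append(':').append(e.getValue()).append("),");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The records for key=4, in the order listed above.
        long[][] records = {
            {7, 1}, {3, 2}, {2, -1}, {6, -1}, {1, 2}, {8, 1}, {6, 1},
            {5, 1}, {3, 1}, {1, 6}, {2, 1}, {1, -1}, {7, 2}, {5, 2}};
        System.out.println("4 " + buildOutput(merge(records)));
    }
}
```

With the records in the order listed above, this prints `4 3(2:[2, 1]),5(2:[1, 2]),7(2:[1, 2]),8(1:[1]),`. Users 1, 2, and 6 are dropped because each has a record with user2 = -1, regardless of whether that record arrives before or after the indirect ones.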
Implementation:
1. Custom friend-pair type
package FriendRecommendation;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;
public class FriendPair implements Writable, WritableComparable<FriendPair> {
private LongWritable user1 = new LongWritable();
private LongWritable user2 = new LongWritable();
public FriendPair(){}
public LongWritable getUser1() {
return user1;
}
public void setUser1(LongWritable user1) {
this.user1 = user1;
}
public LongWritable getUser2() {
return user2;
}
public void setUser2(LongWritable user2) {
this.user2 = user2;
}
public FriendPair(Long user1,Long user2)
{
/*if(user1 > user2)
{
this.user1.set(user1);
this.user2.set(user2);
}
else
{
this.user1.set(user2);
this.user2.set(user1);
}*/
this.user1.set(user1);
this.user2.set(user2);
}
@Override
public int compareTo(FriendPair pair) {
int compareValue = this.user1.compareTo(pair.user1);
if (compareValue == 0) {// if user1 is equal, compare user2
compareValue = this.user2.compareTo(pair.user2);
}
//return compareValue; // to sort ascending
return -1*compareValue; // to sort descending
}
@Override
public void readFields(DataInput in) throws IOException {
user1.readFields(in);
user2.readFields(in);
}
@Override
public void write(DataOutput out) throws IOException {
user1.write(out);
user2.write(out);
}
@Override
public String toString() {
return "FriendPair [user1=" + user1.get() + ", user2=" + user2.get() + "]";
}
}
2. Mapper, reducer, and driver
package FriendRecommendation;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
public class Main extends Configured implements Tool {
public static class FriendRecommendationMapper extends Mapper<LongWritable, Text, LongWritable, FriendPair>
{
LongWritable outputKey = new LongWritable();
@Override
protected void map(LongWritable key, Text value,Context context)
throws IOException, InterruptedException {
//1 2,3,4,5,6,7,8
System.out.println("map key=" + key);
System.out.println("map value=" + value);
Long userID = Long.valueOf( value.toString().split(" ")[0]);
String[] friends_str = value.toString().split(" ")[1].split(",");
// Emit every direct friendship (user2 = -1 marks "already friends")
for(String friend : friends_str)
{
FriendPair directFriend = new FriendPair(Long.valueOf(friend), -1L);
outputKey.set(userID);
System.out.println("directFriend:" + userID + ","+ directFriend);
context.write(outputKey, directFriend);
}
// Emit every candidate friendship: each pair of this user's friends can meet through this user
for(int i=0;i<friends_str.length;i++)
{
for(int j=i+1;j<friends_str.length;j++)
{
FriendPair possibleFriend1 = new FriendPair(Long.valueOf(friends_str[j]), userID);
outputKey.set(Long.valueOf(friends_str[i]));
System.out.println("possibleFriend1:" + outputKey.get() + ","+ possibleFriend1);
context.write(outputKey, possibleFriend1);
FriendPair possibleFriend2 = new FriendPair(Long.valueOf(friends_str[i]), userID);
outputKey.set(Long.valueOf(friends_str[j]));
System.out.println("possibleFriend2:" + outputKey.get() + ","+ possibleFriend2);
context.write(outputKey, possibleFriend2);
}
}
}
}
public static class FriendRecommendationReducer extends Reducer<LongWritable, FriendPair, Text, Text>
{
@Override
protected void reduce(
LongWritable key,
Iterable<FriendPair> values,
Context context)
throws IOException, InterruptedException {
System.out.println("reduce key = " + key);
Map<Long,List<Long>> mutualFriends = new HashMap<Long,List<Long>>();
Iterator<FriendPair> iterator = values.iterator();
while(iterator.hasNext())
{
FriendPair t2 = iterator.next();
System.out.println("t2= " + t2);
Long toUser = t2.getUser1().get();
Long mutualFriend = t2.getUser2().get();
boolean alreadyFriend = (mutualFriend == -1);
if(mutualFriends.containsKey(toUser))
{
if(alreadyFriend)
{
mutualFriends.put(toUser, null);
}
else if(mutualFriends.get(toUser) != null)
{
mutualFriends.get(toUser).add(mutualFriend);
}
}
else
{
if(alreadyFriend)
{
mutualFriends.put(toUser, null);
}
else
{
List<Long> list = new ArrayList<Long>();
list.add(mutualFriend);
mutualFriends.put(toUser,list);
}
}
}
String reducerOutput = buildOutput(mutualFriends);
Text outputKey = new Text();
Text outputValue = new Text();
outputKey.set("" + key);
outputValue.set(reducerOutput);
context.write(outputKey, outputValue);
}
}
public static String buildOutput(Map<Long,List<Long>> map)
{
String output = "";
for(Map.Entry<Long, List<Long>> entry : map.entrySet())
{
Long K = entry.getKey();
List<Long> V = entry.getValue();
if(V!=null)
output += K + "(" + V.size() + ":" + V + "),";
}
return output;
}
public static void main(String[] args) throws Exception {
args = new String[2];
args[0] = "input/friends2.txt";
args[1] = "output/friends2";
int jobStatus = submitJob(args);
System.exit(jobStatus);
}
public static int submitJob(String[] args) throws Exception {
int jobStatus = ToolRunner.run(new Main(), args);
return jobStatus;
}
@Override
public int run(String[] args) throws Exception {
Job job = new Job(getConf());
job.setJobName("CommonFriendsDriver");
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
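// The mapper emits (LongWritable, FriendPair) but the reducer emits (Text, Text),
// so declare the map output types and the final output types separately.
job.setMapOutputKeyClass(LongWritable.class);
job.setMapOutputValueClass(FriendPair.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);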
job.setMapperClass(FriendRecommendationMapper.class);
job.setReducerClass(FriendRecommendationReducer.class);
// args[0] = input directory
// args[1] = output directory
FileInputFormat.setInputPaths(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
boolean status = job.waitForCompletion(true);
return status ? 0 : 1;
}
}
Result:
1
2 6(2:[4, 1]),8(1:[1]),
3 4(2:[1, 2]),5(2:[2, 1]),6(1:[1]),7(2:[1, 2]),8(1:[1]),
4 3(2:[2, 1]),5(2:[1, 2]),7(2:[1, 2]),8(1:[1]),
5 3(2:[2, 1]),4(2:[1, 2]),6(1:[1]),7(2:[1, 2]),8(1:[1]),
6 2(2:[1, 4]),3(1:[1]),5(1:[1]),7(1:[1]),8(1:[1]),
7 3(2:[1, 2]),4(2:[2, 1]),5(2:[2, 1]),6(1:[1]),8(1:[1]),
8 2(1:[1]),3(1:[1]),4(1:[1]),5(1:[1]),6(1:[1]),7(1:[1]),
A detailed walkthrough of the execution on another simple input: