Google Suggestion (MapReduce)

Use the MapReduce framework to build a key-value index for Google Suggestion, where the key is a prefix of a query and the value is the top 10 most-searched queries starting with that prefix.

You don't need to go through all queries and count the searches yourself. Assume you are given a list of queries and their search counts, which is the output of another MapReduce problem: Word Count.

The key of the map function is the document id, which you can ignore. The value of the map function is a Document instance with two member variables, content (the query) and count. For example, "hello 100" means the query "hello" has been searched 100 times. The output of the map function depends on your algorithm; it is not checked, so you can emit any key-value pairs you want.

The key and value of the reduce function depend on what you emit from the map function. The output of the reduce function is key-value pairs where the key is a prefix and the values are the top 10 queries for that prefix with their counts. Wrap each query and its count in a Pair, as the reduce signature in the code requires.

Example

Example 1

Input:
[("apple",100), ("app",1200), ("app store",1200)]

Output: 
"a": [("app", 1200), ("app store", 1200), ("apple", 100)]
"ap": [("app", 1200), ("app store", 1200), ("apple", 100)]
"app": [("app", 1200), ("app store", 1200), ("apple", 100)]
"app ": [("app store", 1200)]
"app s": [("app store", 1200)]
"app st": [("app store", 1200)]
"app sto": [("app store", 1200)]
"app stor": [("app store", 1200)]
"app store": [("app store", 1200)]
"appl": [("apple", 100)]
"apple": [("apple", 100)]
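
To make the expected output concrete, here is a minimal in-memory sketch that reproduces the example without any MapReduce framework. It is only an illustration under assumptions: the class `SuggestionIndexSketch`, the `Query` record type, and `buildIndex` are hypothetical names, not part of the judge's API, and the tie-break (equal counts ordered by query ascending) is inferred from the example output rather than stated in the problem.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SuggestionIndexSketch {
    /** Hypothetical (query, count) record; stands in for the judge's Document. */
    static final class Query {
        final String text;
        final int count;

        Query(String text, int count) {
            this.text = text;
            this.count = count;
        }

        @Override
        public String toString() {
            return "(\"" + text + "\", " + count + ")";
        }
    }

    /** Builds prefix -> top-k queries, highest count first, ties by query ascending. */
    static Map<String, List<Query>> buildIndex(List<Query> queries, int k) {
        Map<String, List<Query>> index = new TreeMap<>();
        // "Map" phase: every non-empty prefix of a query points at that query.
        for (Query q : queries) {
            for (int end = 1; end <= q.text.length(); end++) {
                index.computeIfAbsent(q.text.substring(0, end), p -> new ArrayList<>()).add(q);
            }
        }
        // "Reduce" phase: sort each group and drop everything past the first k.
        for (List<Query> group : index.values()) {
            group.sort((a, b) -> a.count != b.count
                    ? Integer.compare(b.count, a.count)
                    : a.text.compareTo(b.text));
            if (group.size() > k) {
                group.subList(k, group.size()).clear();
            }
        }
        return index;
    }

    public static void main(String[] args) {
        List<Query> input = Arrays.asList(
                new Query("apple", 100),
                new Query("app", 1200),
                new Query("app store", 1200));
        // Prints one line per prefix, in sorted prefix order.
        for (Map.Entry<String, List<Query>> e : buildIndex(input, 10).entrySet()) {
            System.out.println("\"" + e.getKey() + "\": " + e.getValue());
        }
    }
}
```

Sorting each group is simpler than a heap but costs O(n log n) per prefix; it is fine for checking small examples by hand.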

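Approach: in the map step, emit each prefix as the key with the (query, count) pair as the value; in the reduce step, use a min-heap to keep the ten largest pairs per prefix.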
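
The min-heap idea can be seen in isolation on plain integers. The following is a sketch, not the submitted solution: `TopKSketch` and `topK` are hypothetical names chosen for illustration. The heap never holds more than k elements, and its head is always the weakest value kept so far, so each new value only needs to be compared against the head.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class TopKSketch {
    /** Returns the k largest values in descending order using a size-bounded min-heap. */
    static List<Integer> topK(Iterable<Integer> values, int k) {
        List<Integer> result = new ArrayList<>();
        if (k <= 0) {
            return result;
        }
        // Natural ordering makes PriorityQueue a min-heap: peek() is the smallest kept value.
        PriorityQueue<Integer> heap = new PriorityQueue<>(k);
        for (int v : values) {
            if (heap.size() < k) {
                heap.offer(v);
            } else if (v > heap.peek()) {
                // v beats the weakest kept value, so it takes that slot.
                heap.poll();
                heap.offer(v);
            }
        }
        // Polling yields ascending order; reverse to get largest first.
        while (!heap.isEmpty()) {
            result.add(heap.poll());
        }
        Collections.reverse(result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(topK(Arrays.asList(5, 1, 9, 3, 7, 9, 2), 3)); // prints [9, 9, 7]
    }
}
```

With n values per prefix this takes O(n log k) time and O(k) extra space, which is why the reducer uses a heap rather than sorting every value it receives.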

/**
 * Definition of OutputCollector:
 * class OutputCollector<K, V> {
 *     public void collect(K key, V value);
 *         // Adds a key/value pair to the output buffer
 * }
 * Definition of Document:
 * class Document {
 *     public int count;
 *     public String content;
 * }
 *
 * Definition of Pair:
 * class Pair {
 *     private String content;
 *     private int count;
 *
 *     Pair(String content, int count) {
 *         this.content = content;
 *         this.count = count;
 *     }
 *     public String getContent() {
 *         return this.content;
 *     }
 *     public int getCount() {
 *         return this.count;
 *     }
 * }
 */
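import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class GoogleSuggestion {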

    public static class Map {
        public void map(Document value,
                        OutputCollector<String, Pair> output) {
            // Emit every non-empty prefix of the query as a key, with the
            // full query and its count as the value.
            String str = value.content;
            for (int i = 1; i <= str.length(); i++) {
                output.collect(str.substring(0, i), new Pair(str, value.count));
            }
        }
    }

    public static class Reduce {
        
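        // Orders pairs from weakest to strongest: lower count first; on equal
        // counts, the lexicographically larger query first, so that the
        // smaller query ranks higher in the final output.
        private class PairComparator implements Comparator<Pair> {
            @Override
            public int compare(Pair a, Pair b) {
                if (a.getCount() != b.getCount()) {
                    // Integer.compare avoids the overflow risk of subtraction.
                    return Integer.compare(a.getCount(), b.getCount());
                }
                return b.getContent().compareTo(a.getContent());
            }
        }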
        
        public void setup() {}

        public void reduce(String key, Iterator<Pair> values,
                           OutputCollector<String, Pair> output) {
            // Min-heap of at most 10 pairs; the head is the weakest candidate.
            PairComparator pairCmp = new PairComparator();
            PriorityQueue<Pair> pq = new PriorityQueue<Pair>(10, pairCmp);
            while (values.hasNext()) {
                Pair cur = values.next();
                if (pq.size() < 10) {
                    pq.offer(cur);
                } else if (pairCmp.compare(cur, pq.peek()) > 0) {
                    // cur outranks the weakest kept pair, so replace it.
                    pq.poll();
                    pq.offer(cur);
                }
            }

            // Polling yields the weakest pair first; inserting at the front
            // leaves the list in descending order.
            List<Pair> list = new ArrayList<Pair>();
            while (!pq.isEmpty()) {
                list.add(0, pq.poll());
            }

            for (Pair pair : list) {
                output.collect(key, pair);
            }
        }
    }
}

 
