MapReduce Simple Programming Explained (4): Finding the Most Populous US State in 2012, and the State with the Largest Under-18 Population

今天莲莲掉头发了吗

Tags: hashmap

Article link: https://blog.csdn.net/qq_40797864/article/details/106358760

Finding the most populous US state in 2012

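import java.io.IOException;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// map() is called once for every input line.
// The output value type is DoubleWritable because the population column is parsed as a double.
public class WordCountMapper extends Mapper<LongWritable, Text, Text, DoubleWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Columns: state/region, ages, year, population
        String[] ws = value.toString().split(",");
        if (ws.length < 4) {
            return; // skip malformed lines
        }

        // Keep only 2012 rows. The header line (state/region,ages,year,population)
        // has "year" in the year column, so the year test alone already rejects it.
        // The national "USA" row is excluded, as in the under-18 variant, so that
        // it cannot win over the individual states.
        if (ws[2].equals("2012") && !ws[0].equals("state") && !ws[0].equals("USA")) {
            // Emit (state, population)
            context.write(new Text(ws[0]), new DoubleWritable(Double.parseDouble(ws[3])));
        }
    }
}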

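import java.io.IOException;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountReducer extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
    // A field, not a local variable in reduce(): a local map would be recreated
    // for every key and could never hold all states at once.
    private final Map<String, Double> map = new HashMap<String, Double>();

    @Override
    protected void reduce(Text key, Iterable<DoubleWritable> values, Context context)
            throws IOException, InterruptedException {
        // Sum every 2012 row the mapper emitted for this state (all age groups).
        // If the ages column has a "total" row next to "under18", this counts
        // minors twice; filter on ws[1] in the mapper in that case.
        double sum = 0;
        for (DoubleWritable value : values) {
            sum += value.get();
        }

        // put() always stores the value, replacing any existing one for the key;
        // putIfAbsent() stores it only when the key has no value yet.
        // The key is copied with toString() because Hadoop reuses the Text object.
        map.put(key.toString(), sum); // e.g. ("AK", 918469.0), ("AL", 5935017.0), ("AR", 3660299.0), ...

        // context.write(key, new DoubleWritable(sum)) here would output every
        // state's total rather than the maximum, so the comparison happens in cleanup().
    }

    // cleanup() runs once after the last reduce() call of this reduce task,
    // so the result is only global when the job uses a single reducer.
    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // map.entrySet() is the set of all key-value pairs; each Map.Entry holds one pair.
        List<Map.Entry<String, Double>> list =
                new LinkedList<Map.Entry<String, Double>>(map.entrySet());

        // Sort by population, largest first. Double.compare avoids the truncation
        // that casting a double difference to int would cause.
        Collections.sort(list, new Comparator<Map.Entry<String, Double>>() {
            @Override
            public int compare(Map.Entry<String, Double> arg0, Map.Entry<String, Double> arg1) {
                return Double.compare(arg1.getValue(), arg0.getValue());
            }
        });

        // Write the first entry only; raise the bound to output the top N.
        for (int i = 0; i < 1 && i < list.size(); i++) {
            context.write(new Text("Most populous state in 2012: " + list.get(i).getKey() + "\nPopulation: "),
                    new DoubleWritable(list.get(i).getValue()));
        }
    }
}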
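The comment in reduce() contrasts put() with putIfAbsent(). Here is a minimal plain-Java sketch of that difference, outside Hadoop. The class name `PutVsPutIfAbsent` and the numeric values are made up for illustration and are not part of the original job.

```java
import java.util.HashMap;
import java.util.Map;

class PutVsPutIfAbsent {
    // Builds a small map to show how the two methods treat an existing key.
    static Map<String, Double> demo() {
        Map<String, Double> totals = new HashMap<String, Double>();
        totals.put("AK", 1.0);          // key absent: AK -> 1.0
        totals.put("AK", 2.0);          // key present: put() overwrites, AK -> 2.0
        totals.putIfAbsent("AK", 3.0);  // key present: putIfAbsent() keeps 2.0
        totals.putIfAbsent("AL", 4.0);  // key absent: AL -> 4.0
        return totals;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the reducer each state reaches reduce() only once per reduce task, so either method would store the same value there.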

Code breakdown:

protected void cleanup(Context context) throws IOException, InterruptedException {
    List<Map.Entry<String, Double>> list =
            new LinkedList<Map.Entry<String, Double>>(map.entrySet());

    // Descending order by value; Double.compare is safe for any pair of doubles
    Collections.sort(list, new Comparator<Map.Entry<String, Double>>() {
        @Override
        public int compare(Map.Entry<String, Double> arg0, Map.Entry<String, Double> arg1) {
            return Double.compare(arg1.getValue(), arg0.getValue());
        }
    });

The key comparison:
How to sort a List with Collections.sort(list, new Comparator)
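As a rough illustration of this sort outside Hadoop, the sketch below copies a map's entries into a list, sorts them by value in descending order, and returns the key of the largest one. The class name `TopEntryDemo`, the helper `keyOfMax`, and the sample keys and values are hypothetical. The values are deliberately less than 1 apart: a comparator that casts the double difference to int would truncate every difference to 0 and treat all entries as equal, whereas Double.compare orders them correctly.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TopEntryDemo {
    // Returns the key with the largest value, or null for an empty map.
    static String keyOfMax(Map<String, Double> totals) {
        List<Map.Entry<String, Double>> list =
                new ArrayList<Map.Entry<String, Double>>(totals.entrySet());
        Collections.sort(list, new Comparator<Map.Entry<String, Double>>() {
            @Override
            public int compare(Map.Entry<String, Double> a, Map.Entry<String, Double> b) {
                // Descending order: compare b against a.
                return Double.compare(b.getValue(), a.getValue());
            }
        });
        return list.isEmpty() ? null : list.get(0).getKey();
    }

    // Made-up sample data; the values differ by less than 1 on purpose.
    static String demo() {
        Map<String, Double> totals = new HashMap<String, Double>();
        totals.put("X", 1.25);
        totals.put("Y", 1.75);
        totals.put("Z", 1.5);
        return keyOfMax(totals);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```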

Further reading:
The two ways to use Collections.sort
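Collections.sort has two overloads: one that relies on the elements' natural ordering (they must implement Comparable), and one that takes an explicit Comparator. A small sketch of both follows; the class name `SortTwoWays` and the sample list are only illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SortTwoWays {
    // Usage 1: natural ordering. String implements Comparable<String>.
    static List<String> naturalOrder() {
        List<String> names = new ArrayList<String>(Arrays.asList("TX", "AK", "NY"));
        Collections.sort(names);
        return names;
    }

    // Usage 2: explicit Comparator, here reversing the natural ordering.
    static List<String> reverseOrder() {
        List<String> names = new ArrayList<String>(Arrays.asList("TX", "AK", "NY"));
        Collections.sort(names, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return b.compareTo(a);
            }
        });
        return names;
    }

    public static void main(String[] args) {
        System.out.println(naturalOrder()); // [AK, NY, TX]
        System.out.println(reverseOrder()); // [TX, NY, AK]
    }
}
```

The reducer needs the second form because Map.Entry has no natural ordering.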

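Map.entrySet() in Java explained, with usage (four ways to iterate over a map)
What Map.Entry is for:
Map.Entry makes it easier to read a map's key-value pairs. The usual way to output the keys and values of a Map is to get the key set with keySet() and then iterate over it, looking up each value by its key. values() returns all the values in the map without their keys, so the pairing is lost. An Entry gives you both the key and the value at once.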
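The three access styles described above can be sketched side by side. This is only an illustration: the class name `EntryIterationDemo`, its helper methods, and the sample keys and values are made up, and a LinkedHashMap is used so that the iteration order is predictable.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class EntryIterationDemo {
    // Made-up sample data; LinkedHashMap keeps insertion order.
    static Map<String, Double> sample() {
        Map<String, Double> m = new LinkedHashMap<String, Double>();
        m.put("A", 1.0);
        m.put("B", 2.0);
        return m;
    }

    // keySet(): iterate over the keys, then look each value up with get().
    static String viaKeySet(Map<String, Double> m) {
        StringBuilder sb = new StringBuilder();
        for (String key : m.keySet()) {
            sb.append(key).append('=').append(m.get(key)).append(';');
        }
        return sb.toString();
    }

    // entrySet(): each Map.Entry carries the key and the value together.
    static String viaEntrySet(Map<String, Double> m) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Double> e : m.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        return sb.toString();
    }

    // values(): the values only, with no link back to their keys.
    static double sumOfValues(Map<String, Double> m) {
        double sum = 0;
        for (double v : m.values()) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Double> m = sample();
        System.out.println(viaKeySet(m));   // A=1.0;B=2.0;
        System.out.println(viaEntrySet(m)); // A=1.0;B=2.0;
        System.out.println(sumOfValues(m)); // 3.0
    }
}
```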

// Write only the first (largest) entry of the sorted list
for (int i = 0; i < 1; i++) {
    context.write(new Text("Most populous state in 2012: " + list.get(i).getKey() + "\nPopulation: "),
                  new DoubleWritable(list.get(i).getValue()));
}

*The cleanup function and sorting

[Figure: converting the map to a List and reading values with list.get(i).getKey()]
For the under-18 population, only this condition in map() changes:

if (ws[2].equals("2012") && !ws[0].equals("state")  && ws[1].equals("under18") && !ws[0].equals("USA") )
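To check the condition without running a job, it can be pulled into a plain method and tried on a few lines. This is a sketch under assumptions: the class name `RowFilterDemo`, the method name, and the length guard are additions of mine, and the population figure 100 in the sample lines is a placeholder, not real data.

```java
class RowFilterDemo {
    // Mirrors the under-18 condition: year 2012, ages "under18",
    // with the header line and the national USA row rejected.
    static boolean isUnder18StateRow2012(String line) {
        String[] ws = line.split(",");
        return ws.length >= 4               // guard added for malformed lines
                && ws[2].equals("2012")
                && !ws[0].equals("state")
                && ws[1].equals("under18")
                && !ws[0].equals("USA");
    }

    public static void main(String[] args) {
        // Placeholder population values; only the first three columns matter here.
        String[] lines = {
            "state/region,ages,year,population",
            "AL,under18,2012,100",
            "AL,total,2012,100",
            "AL,under18,2011,100",
            "USA,under18,2012,100"
        };
        for (String line : lines) {
            System.out.println(line + " -> " + isUnder18StateRow2012(line));
        }
    }
}
```

Only the second sample line passes the filter. The header is rejected by the year test, since its third field is "year" rather than "2012".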
