Hadoop WordCount

Besides this I also went through a few other demos, of course; I saved wordcount for last, planning to implement it once on my own.

In the end I still have some blind spots around the generics of the classes being extended; I'll fill those in later.
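For reference, the generics in question are the four type parameters of Mapper and Reducer: <KEYIN, VALUEIN, KEYOUT, VALUEOUT>. A minimal sketch of how they line up between the two stages (MyMap/MyReduce are illustrative names, not part of the demo):

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>: with the default TextInputFormat,
// KEYIN is the byte offset of the line and VALUEIN is the line itself.
class MyMap extends Mapper<Object, Text, Text, IntWritable> { }

// The reducer's KEYIN/VALUEIN must match the mapper's KEYOUT/VALUEOUT;
// its own KEYOUT/VALUEOUT are what ends up in the output files.
class MyReduce extends Reducer<Text, IntWritable, Text, IntWritable> { }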


Code:


package wordcount;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  public static class Map extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable one = new IntWritable(1); // the count emitted per token
    private final Text word = new Text(); // reused across tokens to avoid reallocating

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      // Split the line on whitespace and emit (token, 1) for each token.
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable res = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      // Sum all the counts emitted for this word; IntWritable.get() unwraps the int.
      int sum = 0;
      for (IntWritable value : values) {
        sum += value.get();
      }
      res.set(sum);
      context.write(key, res);
    }
  }

  public static void main(String[] args)
      throws IOException, ClassNotFoundException, InterruptedException {
    Configuration conf = new Configuration();
    conf.set("mapred.job.tracker", "127.0.0.1:9000"); // old MRv1 jobtracker property
    String[] ioArgs = new String[] {
        "hdfs://localhost:9000/count_in", "hdfs://localhost:9000/count_out" };
    // GenericOptionsParser consumes the standard Hadoop options (-D, -fs, ...)
    // and returns whatever is left over, here the input and output paths.
    String[] otherArgs = new GenericOptionsParser(conf, ioArgs).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2); // exit code 2 conventionally signals incorrect usage
    }

    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(Map.class);
    // Reduce doubles as the combiner: summing is associative and commutative,
    // so pre-aggregating on the map side does not change the result.
    job.setCombinerClass(Reduce.class);
    job.setReducerClass(Reduce.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
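Since the input and output paths are hardcoded in main, the job can be launched without arguments. Something like the following should work (the jar name is just an example), assuming HDFS is up at localhost:9000 and the output directory does not already exist:

hadoop fs -mkdir /count_in
hadoop fs -put input.txt /count_in
hadoop jar wordcount.jar wordcount.WordCount
hadoop fs -cat /count_out/part-r-00000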

Screenshots:

Input text: (image not preserved)

Result: (image not preserved)


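For reference, the default TextOutputFormat writes one "word<TAB>count" line per key, sorted by key. An illustrative example (not the original screenshot): for an input file containing "hello hadoop hello world", part-r-00000 would read:

hadoop	1
hello	2
world	1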