WordCount Word-Frequency Counting Explained (Out-of-Order Version)
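WordCount consists of three parts: WordCountMain, WordCountMapper, and WordcountReducer.
WordCountMain: coordinates the map logic and the reduce logic.
WordCountMapper: splits each input line into words and implements the map logic that converts <k1,v1> into <k2,v2>.
WordcountReducer: implements the reduce logic that converts <k2,v2> into <k3,v3>.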
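The key-value flow described above can be sketched without a cluster. The following is a minimal plain-Java simulation, not the Hadoop implementation: the class `WordCountSketch` and its methods `map`, `reduce`, and `run` are hypothetical names introduced only for illustration. It assumes that <k1,v1> is a line offset paired with the line text (as is the case with `TextInputFormat`), that <k2,v2> is a word paired with the count 1, and that <k3,v3> is a word paired with its total.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free sketch of the WordCount data flow (requires Java 9+).
public class WordCountSketch {

    // Map step: <k1,v1> = <line offset, line text>  ->  many <k2,v2> = <word, 1>.
    static List<Map.Entry<String, Integer>> map(long offset, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {          // skip the empty token produced by a blank line
                out.add(Map.entry(word, 1));
            }
        }
        return out;
    }

    // Reduce step: <k2, [v2, v2, ...]>  ->  <k3,v3> = <word, total count>.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Runs map, then groups values by key (standing in for the shuffle), then reduce.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>(); // TreeMap keeps keys sorted
        long offset = 0;
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(offset, line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                       .add(kv.getValue());
            }
            // Assumption: single-byte characters and a one-byte line terminator.
            offset += line.length() + 1;
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {hadoop=1, hello=2, world=1}
        System.out.println(run(List.of("hello world", "hello hadoop")));
    }
}
```

In this sketch the map step ignores the offset key, which is typical for word counting. The grouping inside `run` plays the role that the framework's shuffle and sort would play between the real mapper and reducer.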
WordCountMain
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
// Ties the mapper and the reducer together into a single job
public class WordCountMain extends Configured implements Tool {