Chained Job Programming
[MAP+ / REDUCE MAP*]
(one or more Mappers, then a single Reducer, then zero or more Mappers)
Functionality:
Word count
Requirements:
1. Filter out sensitive words
2. Filter out words that occur fewer than 5 times
Chain structure:
MapMapper1 (mapping) + MapMapper2 (filter sensitive words) + Reducer (aggregation) + ReduceMapper1 (filter words with a count below 5)
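Before looking at the Hadoop code, it may help to see the data flow of this chain on its own. The following is a minimal sketch in plain Java with no Hadoop dependency. The class name ChainSketch, its method names, the sample input, and the sensitive word "secret" are all assumptions made for illustration; only the four stages and the threshold of 5 come from the requirements above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the chain; not part of the Hadoop job itself.
public class ChainSketch {

    // Stage 1 (role of MapMapper1): split each input line into words.
    static List<String> tokenize(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            for (String w : line.trim().split("\\s+")) {
                if (!w.isEmpty()) {
                    words.add(w);
                }
            }
        }
        return words;
    }

    // Stage 2 (role of MapMapper2): drop words found in the sensitive set.
    static List<String> dropSensitive(List<String> words, Set<String> sensitive) {
        List<String> kept = new ArrayList<>();
        for (String w : words) {
            if (!sensitive.contains(w)) {
                kept.add(w);
            }
        }
        return kept;
    }

    // Stage 3 (role of Reducer): sum the occurrences of each word.
    static Map<String, Integer> count(List<String> words) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    // Stage 4 (role of ReduceMapper1): drop words whose count is below minCount.
    static Map<String, Integer> dropRare(Map<String, Integer> counts, int minCount) {
        Map<String, Integer> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= minCount) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    // Runs the four stages in the same order as the chain structure.
    public static Map<String, Integer> run(List<String> lines, Set<String> sensitive, int minCount) {
        return dropRare(count(dropSensitive(tokenize(lines), sensitive)), minCount);
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "hadoop hadoop hadoop spark",
                "hadoop hadoop secret secret secret secret secret");
        // "secret" is removed in stage 2; "spark" (1 occurrence) is removed in stage 4.
        System.out.println(run(lines, Set.of("secret"), 5)); // prints {hadoop=5}
    }
}
```

Note that the sensitive-word filter runs before aggregation, so sensitive words never reach the reduce stage, while the count filter can only run after aggregation because it needs the final totals.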
WCApp.class
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.chain.ChainMapper;
import org.apache.hadoop.mapreduce.lib.chain.ChainReducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
/*