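MapReduce Distributed Computing System
1. HDFS: distributed storage system
2. MapReduce: distributed computing system
3. YARN: Hadoop's resource scheduling system
4. Common: the underlying support component for the three components above, providing basic utility packages, an RPC framework, and so on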
Map processing
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable>
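The four type parameters of Mapper<LongWritable, Text, Text, IntWritable> are, in order, the input key, input value, output key and output value. With Hadoop's default text input format the input key is the byte offset of a line and the input value is the line itself; for word count the output is a <word, 1> pair. The sketch below mimics that contract in plain Java, without Hadoop, so it can be run standalone. MapPhaseSketch and its map method are hypothetical names used only for illustration, and long/String/Integer stand in for LongWritable/Text/IntWritable.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for the map phase of word count (no Hadoop required).
// The class and method names are illustrative assumptions, not Hadoop API.
class MapPhaseSketch {

    // Mirrors map(LongWritable key, Text value, Context context):
    // offset plays the role of the input key, line the input value.
    static List<Map.Entry<String, Integer>> map(long offset, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.split(" ")) {
            // Comparable in spirit to context.write(new Text(word), new IntWritable(1)).
            out.add(new SimpleEntry<>(word, 1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints one key=value pair per word of the sample line.
        for (Map.Entry<String, Integer> pair : map(0L, "hello world love work")) {
            System.out.println(pair);
        }
    }
}
```

The offset argument is unused here, just as the byte-offset key is usually ignored in a word-count mapper; it is kept only to show the shape of the input pair.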
Reduce processing
public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable>
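For a Reducer with input types <Text, IntWritable>, the framework groups all map outputs by key before calling reduce, so each call receives one word together with every count emitted for it. A minimal plain-Java sketch of that grouping and summing follows, assuming the hypothetical names ReducePhaseSketch, shuffle and reduce, with String/Integer standing in for Text/IntWritable. It only illustrates the data flow and is not how Hadoop implements the shuffle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java stand-in for the shuffle and reduce phases of word count.
// The class and method names are illustrative assumptions, not Hadoop API.
class ReducePhaseSketch {

    // Groups words so that each distinct word maps to the list of 1s emitted for it.
    // A TreeMap keeps the keys sorted, loosely imitating the sorted key order a reducer sees.
    static Map<String, List<Integer>> shuffle(List<String> words) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String word : words) {
            grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    // Mirrors reduce(Text key, Iterable<IntWritable> values, Context context):
    // sums every count that was emitted for one key.
    static int reduce(String key, Iterable<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("hello", "world", "hello");
        // Prints each distinct word and its total count, separated by a tab.
        for (Map.Entry<String, List<Integer>> e : shuffle(words).entrySet()) {
            System.out.println(e.getKey() + "\t" + reduce(e.getKey(), e.getValue()));
        }
    }
}
```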
Job configuration
public class WordCountDriver
// Register the Mapper class used by the job
job.setMapperClass(WordCountMapper.class);
// Register the Reducer class used by the job
job.setReducerClass(WordCountReducer.class);
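import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
// Break each input line apart into individual words
@Override
protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
// Example input line: hello world love work
String line = value.toString();
// Split the line on single spaces
String[] words = line.split(" ");
// Emit one pair per word, e.g. <hello, 1>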
for(String w:words