Project scenario:
Running a Hadoop MapReduce job from IDEA (IntelliJ IDEA).
Problem description:
The processed data never shows up in the job's output file. The Reducer class itself is correct, yet debugging shows the Reducer is never invoked:
IntWritable key2 = new IntWritable();
Text value2 = new Text();

@Override
protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
    /* Input record formats:
     * emp:  7499,ALLEN,SALESMAN,7698,1981/2/20,1600,300,30
     * dept: 20,RESEARCH,DALLAS
     */
    System.out.println("offset: " + key + ", value: " + value.toString());
    // 1. Split the line into fields
    String[] splits = value.toString().split(",");
    // 2. Tell emp records and dept records apart by field count
    if (splits.length >= 8) { // emp record
        String empName = splits[1]; // employee name
        String empDept = splits[7]; // deptno
        key2.set(Integer.parseInt(empDept));
        value2.set(empName);
    } else { // dept record
        String deptNo = splits[0];
        // Tag dept records so the reducer can tell them apart
        String deptName = "*" + splits[1]; // the '*' flag marks this value as coming from the dept table
        key2.set(Integer.parseInt(deptNo));
        value2.set(deptName);
    }
    // BUG: the method ends here without ever calling context.write
}
Cause analysis:
The map method never calls context.write, so no key/value pairs are handed to the reduce phase. With an empty map output there is nothing to reduce, which is why the Reducer's reduce method never runs.
Solution:
Add the missing context.write call at the end of the map method:
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Class name is illustrative; keep your own class and driver wiring.
public class EqualJoinMapper extends Mapper<LongWritable, Text, IntWritable, Text> {

    IntWritable key2 = new IntWritable();
    Text value2 = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        /* Input record formats:
         * emp:  7499,ALLEN,SALESMAN,7698,1981/2/20,1600,300,30
         * dept: 20,RESEARCH,DALLAS
         */
        System.out.println("offset: " + key + ", value: " + value.toString());
        // 1. Split the line into fields
        String[] splits = value.toString().split(",");
        // 2. Tell emp records and dept records apart by field count
        if (splits.length >= 8) { // emp record
            String empName = splits[1]; // employee name
            String empDept = splits[7]; // deptno
            key2.set(Integer.parseInt(empDept));
            value2.set(empName);
        } else { // dept record
            String deptNo = splits[0];
            // Tag dept records so the reducer can tell them apart
            String deptName = "*" + splits[1]; // '*' marks a dept-table value
            key2.set(Integer.parseInt(deptNo));
            value2.set(deptName);
        }
        // 3. Emit the pair through the context (mapper output)
        context.write(key2, value2);
    }
}
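
With the write in place, each deptno key now reaches the reduce phase with one '*'-tagged department name plus the employee names in that department. The original post does not show the Reducer, but a minimal reduce-side join sketch could look like the following (the class name EqualJoinReducer and the output format are assumptions, not the author's code):

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical reducer for this join; class name and output format are assumptions.
public class EqualJoinReducer extends Reducer<IntWritable, Text, Text, Text> {

    @Override
    protected void reduce(IntWritable key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        String deptName = "";
        List<String> empNames = new ArrayList<>();
        for (Text v : values) {
            String s = v.toString();
            if (s.startsWith("*")) {
                deptName = s.substring(1); // the '*'-tagged value is the department name
            } else {
                empNames.add(s); // everything else is an employee name
            }
        }
        // Emit one (employee, department) pair per employee under this deptno
        for (String emp : empNames) {
            context.write(new Text(emp), new Text(deptName));
        }
    }
}

The '*' flag is what lets the reducer tell the single dept-table value apart from the emp-table values, since Hadoop makes no ordering guarantee among the values grouped under one key.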