在指南第三版中直接用MRUnit来做单元测试。
MRUnit是由Couldera公司开发的专门针对 Hadoop中编写MapReduce单元测试的框架,基本原理是JUnit4和 EasyMock。MR就是Map和Reduce的缩写。MRUnit框架非常精简,其核心的单元测试依赖于JUnit。而且MRUnit实现了一套 Mock对象来控制OutputCollector的操作,从而可以拦截OutputCollector的输出,和我们的期望结果进行比较,达到自动断言 的目的。
有了MRUnit,对MR程序做重构的时候,只要明确输入和输出,就可以写出单元测试,并且在放到群集校验前进行试验,从而节省时间和资源,也 能更快的定位到问题。而进行重构的话,只要写得足够详细的单元测试都是绿色的话,那么基本就可以保证在群集运行的结果也是正常的。
MRUnit包含四种 Driver:MapDriver,ReduceDriver,MapReduceDriver,PipelineMapReduceDriver。可以 根据自己的需要选择合适的Driver。
示例一:Writing MRUnit test cases
下面的示例将用MRUnit去单元测试一个SMS CDR(呼叫明细记录)分析的Map Reduce程序。
记录示例如下:
CDRID; CDRType; Phone1; Phone2; SMS Status Code
655209;1;796764372490213;804422938115889;6
353415;0;356857119806206;287572231184798;4
835699;1;252280313968413;889717902341635;0
此MapReduce程序分析这些记录,找到所有记录中CDRType为1的,并记录相对应的SMS Status Code,例如
这个Mapper输出为:
6, 1
0, 1
Reducer任务将此输出作为输入,并输出CDR记录中包含的status code具体次数。
下面分别给出代码:
package com.tht.MRUnitTest;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
public class SMSCDRMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
private Text status = new Text();
private final static IntWritable addOne = new IntWritable(1);
/**
* Returns the SMS status code and its count
*/
protected void map(LongWritable key, Text value, Context context)
throws java.io.IOException, InterruptedException {
//655209;1;796764372490213;804422938115889;6 is the Sample record format
String[] line = value.toString().split(";");
// If record is of SMS CDR
if (Integer.parseInt(line[1]) == 1) {
status.set(line[4]);
context.write(status, addOne);
}
}
}
package com.tht.MRUnitTest;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
public class SMSCDRReducer extends
Reducer<Text, IntWritable, Text, IntWritable> {
protected void reduce(Text key, Iterable<IntWritable> values,
Context context) throws java.io.IOException, InterruptedException {
int sum = 0;
for (IntWritable value : values) {
sum += value.get();
}
context.write(key, new IntWritable(sum));
}
}
package com.tht.MRUnitTest;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.junit.Before;
import org.junit.Test;
public class SMSCDRMapperReducerTest {
MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;
ReduceDriver<Text, IntWritable, Text, IntWritable> reduceDriver;
MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable> mapReduceDriver;
@Before
public void setUp() {
SMSCDRMapper mapper = new SMSCDRMapper();
SMSCDRReducer reducer = new SMSCDRReducer();
mapDriver = MapDriver.newMapDriver(mapper);
;
reduceDriver = ReduceDriver.newReduceDriver(reducer);
mapReduceDriver = MapReduceDriver.newMapReduceDriver(mapper, reducer);
}
@Test
public void testMapper() throws IOException {
mapDriver.withInput(new LongWritable(), new Text(
"655209;1;796764372490213;804422938115889;6"));
mapDriver.withOutput(new Text("6"), new IntWritable(1));
mapDriver.runTest();
}
@Test
public void testReducer() throws IOException {
List<IntWritable> values = new ArrayList<IntWritable>();
values.add(new IntWritable(1));
values.add(new IntWritable(1));
reduceDriver.withInput(new Text("6"), values);
reduceDriver.withOutput(new Text("6"), new IntWritable(2));
reduceDriver.runTest();
}
}
示例二:测试计数器
自创建的计数器的一个常见用途是跟踪输入格式错误的记录。例如,当输入CDR记录不是SMS类型,Mapper可以忽略该记录并计数增加。
修改后的mapper:
public class SMSCDRMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
private Text status = new Text();
private final static IntWritable addOne = new IntWritable(1);
static enum CDRCounter {
NonSMSCDR;
};
/**
* Returns the SMS status code and its count
*/
protected void map(LongWritable key, Text value, Context context) throws java.io.IOException, InterruptedException {
String[] line = value.toString().split(";");
// If record is of SMS CDR
if (Integer.parseInt(line[1]) == 1) {
status.set(line[4]);
context.write(status, addOne);
} else {// CDR record is not of type SMS so increment the counter
context.getCounter(CDRCounter.NonSMSCDR).increment(1);
}
}
}
修改后的testMapper()方法:
public void testMapper() {
mapDriver.withInput(new LongWritable(), new Text(
"655209;0;796764372490213;804422938115889;6"));
//mapDriver.withOutput(new Text("6"), new IntWritable(1));
mapDriver.runTest();
assertEquals("Expected 1 counter increment", 1, mapDriver.getCounters()
.findCounter(CDRCounter.NonSMSCDR).getValue());
}