Hadoop: the definitive guide 第三版 拾遗 第五章 之MRUnit

在指南第三版中直接用MRUnit来做单元测试。

MRUnit是由Couldera公司开发的专门针对 Hadoop中编写MapReduce单元测试的框架,基本原理是JUnit4和 EasyMock。MR就是Map和Reduce的缩写。MRUnit框架非常精简,其核心的单元测试依赖于JUnit。而且MRUnit实现了一套 Mock对象来控制OutputCollector的操作,从而可以拦截OutputCollector的输出,和我们的期望结果进行比较,达到自动断言 的目的。

有了MRUnit,对MR程序做重构的时候,只要明确输入和输出,就可以写出单元测试,并且在放到群集校验前进行试验,从而节省时间和资源,也 能更快的定位到问题。而进行重构的话,只要写得足够详细的单元测试都是绿色的话,那么基本就可以保证在群集运行的结果也是正常的。

MRUnit包含四种 Driver:MapDriver,ReduceDriver,MapReduceDriver,PipelineMapReduceDriver。可以 根据自己的需要选择合适的Driver。

示例一:Writing MRUnit test cases

下面的示例将用MRUnit去单元测试一个SMS CDR(呼叫明细记录)分析的Map Reduce程序。
记录示例如下:

    CDRID; CDRType; Phone1; Phone2; SMS Status Code
    655209;1;796764372490213;804422938115889;6
    353415;0;356857119806206;287572231184798;4
    835699;1;252280313968413;889717902341635;0

此MapReduce程序分析这些记录,找到所有记录中CDRType为1的,并记录相对应的SMS Status Code,例如
这个Mapper输出为:
6, 1
0, 1
Reducer任务将此输出作为输入,并输出CDR记录中包含的status code具体次数。

下面分别给出代码:

package com.tht.MRUnitTest;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SMSCDRMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
	 
	  private Text status = new Text();
	  private final static IntWritable addOne = new IntWritable(1);
	 
	  /**
	   * Returns the SMS status code and its count
	   */
	  protected void map(LongWritable key, Text value, Context context)
	      throws java.io.IOException, InterruptedException {
	 
	    //655209;1;796764372490213;804422938115889;6 is the Sample record format
	    String[] line = value.toString().split(";");
	    // If record is of SMS CDR
	    if (Integer.parseInt(line[1]) == 1) {
	      status.set(line[4]);
	      context.write(status, addOne);
	    }
	  }
}

package com.tht.MRUnitTest;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SMSCDRReducer extends
		Reducer<Text, IntWritable, Text, IntWritable> {

	protected void reduce(Text key, Iterable<IntWritable> values,
			Context context) throws java.io.IOException, InterruptedException {
		int sum = 0;
		for (IntWritable value : values) {
			sum += value.get();
		}
		context.write(key, new IntWritable(sum));
	}
}

package com.tht.MRUnitTest;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.junit.Before;
import org.junit.Test;

public class SMSCDRMapperReducerTest {

	MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;
	ReduceDriver<Text, IntWritable, Text, IntWritable> reduceDriver;
	MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable> mapReduceDriver;

	@Before
	public void setUp() {
		SMSCDRMapper mapper = new SMSCDRMapper();
		SMSCDRReducer reducer = new SMSCDRReducer();
		mapDriver = MapDriver.newMapDriver(mapper);
		;
		reduceDriver = ReduceDriver.newReduceDriver(reducer);
		mapReduceDriver = MapReduceDriver.newMapReduceDriver(mapper, reducer);
	}

	@Test
	public void testMapper() throws IOException {
		mapDriver.withInput(new LongWritable(), new Text(
				"655209;1;796764372490213;804422938115889;6"));
		mapDriver.withOutput(new Text("6"), new IntWritable(1));
		mapDriver.runTest();
	}

	@Test
	public void testReducer() throws IOException {
		List<IntWritable> values = new ArrayList<IntWritable>();
		values.add(new IntWritable(1));
		values.add(new IntWritable(1));
		reduceDriver.withInput(new Text("6"), values);
		reduceDriver.withOutput(new Text("6"), new IntWritable(2));
		reduceDriver.runTest();
	}
}

示例二:测试计数器

自创建的计数器的一个常见用途是跟踪输入格式错误的记录。例如,当输入CDR记录不是SMS类型,Mapper可以忽略该记录并计数增加。

修改后的mapper:

public class SMSCDRMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
 
  private Text status = new Text();
  private final static IntWritable addOne = new IntWritable(1);
 
  static enum CDRCounter {
    NonSMSCDR;
  };
 
  /**
   * Returns the SMS status code and its count
   */
  protected void map(LongWritable key, Text value, Context context) throws java.io.IOException, InterruptedException {
 
    String[] line = value.toString().split(";");
    // If record is of SMS CDR
    if (Integer.parseInt(line[1]) == 1) {
      status.set(line[4]);
      context.write(status, addOne);
    } else {// CDR record is not of type SMS so increment the counter
      context.getCounter(CDRCounter.NonSMSCDR).increment(1);
    }
  }
}

修改后的testMapper()方法:
public void testMapper() {
    mapDriver.withInput(new LongWritable(), new Text(
        "655209;0;796764372490213;804422938115889;6"));
    //mapDriver.withOutput(new Text("6"), new IntWritable(1));
    mapDriver.runTest();
      assertEquals("Expected 1 counter increment", 1, mapDriver.getCounters()
              .findCounter(CDRCounter.NonSMSCDR).getValue());
  }




评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值