Mocks Aren't Stubs

Martin Fowler

The term ‘Mock Objects’ has become a popular one to describe special case objects that mimic real objects for testing. Most language environments now have frameworks that make it easy to create mock objects. What’s often not realized, however, is that mock objects are but one form of special case test object, one that enables a different style of testing. In this article I’ll explain how mock objects work, how they encourage testing based on behavior verification, and how the community around them uses them to develop a different style of testing.

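“Mock object” has become the popular term for objects that imitate real objects in tests. Most languages now have frameworks that make it quick to create mock objects. What is often overlooked, however, is that mock objects are only one kind of special-purpose test object, one that enables a different style of testing. In this article I explain how mock objects work, how they encourage behavior verification in tests, and how they are commonly used.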

I first came across the term “mock object” a few years ago in the Extreme Programming (XP) community. Since then I’ve run into mock objects more and more. Partly this is because many of the leading developers of mock objects have been colleagues of mine at ThoughtWorks at various times. Partly it’s because I see them more and more in the XP-influenced testing literature.

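I first came across mock objects a few years ago in the Extreme Programming community. Since then I have run into the concept more and more, largely because my colleagues at ThoughtWorks use it heavily.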

This difference is actually two separate differences. On the one hand there is a difference in how test results are verified: a distinction between state verification and behavior verification. On the other hand is a whole different philosophy to the way testing and design play together, which I term here as the classical and mockist styles of Test Driven Development.

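The difference has two aspects. First, what verification focuses on is different. Second, the way testing and design work together is different, which I describe here as the classical and mockist styles of Test Driven Development.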

Regular Test

I’ll begin by illustrating the two styles with a simple example. (The example is in Java, but the principles make sense with any object-oriented language.) We want to take an order object and fill it from a warehouse object. The order is very simple, with only one product and a quantity. The warehouse holds inventories of different products. When we ask an order to fill itself from a warehouse there are two possible responses. If there’s enough product in the warehouse to fill the order, the order becomes filled and the warehouse’s amount of the product is reduced by the appropriate amount. If there isn’t enough product in the warehouse then the order isn’t filled and nothing happens in the warehouse.

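A simple example of filling an order with products illustrates the two styles. We create an order object and fill it with products from the warehouse.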
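The tests that follow never show Order or the warehouse themselves. Here is a minimal sketch of what they might look like, assuming only the method names that the tests in this article call (fill, isFilled, add, getInventory, hasInventory, remove). The method bodies are guesses made for illustration, not the article's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Assumed collaborator interface; the method names are the ones the tests call.
interface Warehouse {
    void add(String product, int quantity);
    int getInventory(String product);
    boolean hasInventory(String product, int quantity);
    void remove(String product, int quantity);
}

// Assumed in-memory implementation of the warehouse.
class WarehouseImpl implements Warehouse {
    private final Map<String, Integer> stock = new HashMap<>();

    public void add(String product, int quantity) {
        stock.merge(product, quantity, Integer::sum);
    }

    public int getInventory(String product) {
        return stock.getOrDefault(product, 0);
    }

    public boolean hasInventory(String product, int quantity) {
        return getInventory(product) >= quantity;
    }

    public void remove(String product, int quantity) {
        stock.put(product, getInventory(product) - quantity);
    }
}

// The SUT: one product, one quantity, filled from a warehouse.
class Order {
    private final String product;
    private final int quantity;
    private boolean filled = false;

    Order(String product, int quantity) {
        this.product = product;
        this.quantity = quantity;
    }

    // Fills the order only if the warehouse has enough stock;
    // otherwise neither the order nor the warehouse changes.
    void fill(Warehouse warehouse) {
        if (warehouse.hasInventory(product, quantity)) {
            warehouse.remove(product, quantity);
            filled = true;
        }
    }

    boolean isFilled() {
        return filled;
    }
}
```

With these definitions, filling an order for 50 Talisker from a warehouse holding 50 leaves the order filled and the inventory at 0, while an order for 51 leaves both untouched.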

These two behaviors imply a couple of tests, which look like pretty conventional JUnit tests.

import junit.framework.TestCase;

public class OrderStateTester extends TestCase {
  private static String TALISKER = "Talisker";
  private static String HIGHLAND_PARK = "Highland Park";
  private Warehouse warehouse = new WarehouseImpl();

  // Initialize the test data
  protected void setUp() throws Exception {
    // Add 50 units of Talisker and 25 of Highland Park to the warehouse
    warehouse.add(TALISKER, 50);
    warehouse.add(HIGHLAND_PARK, 25);
  }

  // If the warehouse has enough stock, the order is filled and the stock is reduced
  public void testOrderIsFilledIfEnoughInWarehouse() {
    Order order = new Order(TALISKER, 50);
    order.fill(warehouse);
    assertTrue(order.isFilled());
    assertEquals(0, warehouse.getInventory(TALISKER));
  }

  // If the warehouse does not have enough stock, nothing changes
  public void testOrderDoesNotRemoveIfNotEnough() {
    Order order = new Order(TALISKER, 51);
    order.fill(warehouse);
    assertFalse(order.isFilled());
    assertEquals(50, warehouse.getInventory(TALISKER));
  }
}

xUnit tests follow a typical four phase sequence: setup, exercise, verify, teardown. In this case the setup phase is done partly in the setUp method (setting up the warehouse) and partly in the test method (setting up the order). The call to order.fill is the exercise phase. This is where the object is prodded to do the thing that we want to test. The assert statements are then the verification stage, checking to see if the exercised method carried out its task correctly. In this case there’s no explicit teardown phase, the garbage collector does this for us implicitly.

This passage describes the four phases that a unit test typically goes through: setup, exercise, verify, teardown.

So for this test I need the SUT (Order) and one collaborator (warehouse). I need the warehouse for two reasons: … (I referred to the SUT as the “primary object” and collaborators as “secondary objects”.)

This passage introduces two important concepts of the article: the SUT (the object under test, here Order) and the collaborator (here warehouse).

This style of testing uses state verification: which means that we determine whether the exercised method worked correctly by examining the state of the SUT and its collaborators after the method was exercised. As we’ll see, mock objects enable a different approach to verification.

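This passage gives Martin Fowler's definition of state verification: whether a test passes is decided by checking the state of the SUT and its collaborators after the method has been exercised.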

Tests with Mock Objects

Now I’ll take the same behavior and use mock objects. For this code I’m using the jMock library for defining mocks. jMock is a Java mock object library. There are other mock object libraries out there, but this one is an up-to-date library written by the originators of the technique, so it makes a good one to start with.

jMock is used as the example library here; there are many other libraries as well.

import org.jmock.Mock;
import org.jmock.MockObjectTestCase;

public class OrderInteractionTester extends MockObjectTestCase {
  private static String TALISKER = "Talisker";

  public void testFillingRemovesInventoryIfInStock() {
    //setup - data
    Order order = new Order(TALISKER, 50);
    Mock warehouseMock = new Mock(Warehouse.class);
    
    //setup - expectations
    warehouseMock.expects(once()).method("hasInventory")
      .with(eq(TALISKER),eq(50))
      .will(returnValue(true));
    warehouseMock.expects(once()).method("remove")
      .with(eq(TALISKER), eq(50))
      .after("hasInventory");

    //exercise
    order.fill((Warehouse) warehouseMock.proxy());
    
    //verify
    warehouseMock.verify();
    assertTrue(order.isFilled());
  }
  
  public void testFillingDoesNotRemoveIfNotEnoughInStock() {
    Order order = new Order(TALISKER, 51);    
    Mock warehouse = mock(Warehouse.class);
      
    warehouse.expects(once()).method("hasInventory")
      .withAnyArguments()
      .will(returnValue(false));

    order.fill((Warehouse) warehouse.proxy());

    assertFalse(order.isFilled());
  }
}

Concentrate on testFillingRemovesInventoryIfInStock first, as I’ve taken a couple of shortcuts with the later test.

To begin with, the setup phase is very different. For a start it’s divided into two parts: data and expectations. The data part sets up the objects we are interested in working with, in that sense it’s similar to the traditional setup. The difference is in the objects that are created. The SUT is the same - an order. However the collaborator isn’t a warehouse object, instead it’s a mock warehouse - technically an instance of the class Mock.

The second part of the setup creates expectations on the mock object. The expectations indicate which methods should be called on the mocks when the SUT is exercised.

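Focus first on the test testFillingRemovesInventoryIfInStock. Its setup is divided into data and expectations. The data part initializes the objects, much as in the traditional test; the SUT is unchanged, but the collaborator warehouse is replaced by a mock warehouse. The second part sets the expectations on the mock object, which indicate which of its methods should be called when the SUT is exercised.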

Once all the expectations are in place I exercise the SUT. After the exercise I then do verification, which has two aspects. I run asserts against the SUT - much as before. However I also verify the mocks - checking that they were called according to their expectations.
The key difference here is how we verify that the order did the right thing in its interaction with the warehouse. With state verification we do this by asserts against the warehouse’s state. Mocks use behavior verification, where we instead check to see if the order made the correct calls on the warehouse. We do this check by telling the mock what to expect during setup and asking the mock to verify itself during verification. Only the order is checked using asserts, and if the method doesn’t change the state of the order there’s no asserts at all.

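Once all the expectations are in place, we exercise the SUT and then move on to verification. Verification has two parts: asserting against the state of the SUT, and verifying the mock warehouse against its expectations. The key point is how the interaction between the order and the warehouse is verified. Instead of checking the warehouse's state, the mock performs behavior verification: during the verification phase it checks whether the Order made the correct calls on the warehouse. Asserts are used only for the state of the order, and if the method does not change the state of the order there is no need for asserts at all.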
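To make the mechanics concrete without depending on jMock, here is a hand-rolled sketch of behavior verification. HandRolledWarehouseMock, its expect methods and verify, and the cut-down Warehouse and Order types are all hypothetical names written for this illustration; none of this is jMock's API, and a real library does far more.

```java
import java.util.ArrayList;
import java.util.List;

interface Warehouse {
    boolean hasInventory(String product, int quantity);
    void remove(String product, int quantity);
}

// Cut-down SUT, assumed to behave as the article describes.
class Order {
    private final String product;
    private final int quantity;
    private boolean filled = false;

    Order(String product, int quantity) {
        this.product = product;
        this.quantity = quantity;
    }

    void fill(Warehouse warehouse) {
        if (warehouse.hasInventory(product, quantity)) {
            warehouse.remove(product, quantity);
            filled = true;
        }
    }

    boolean isFilled() { return filled; }
}

// Hand-written mock: expectations are stored as strings describing calls, in order.
class HandRolledWarehouseMock implements Warehouse {
    private final List<String> expected = new ArrayList<>();
    private final List<String> actual = new ArrayList<>();
    private boolean hasInventoryAnswer = false;

    // setup - expectations
    void expectHasInventory(String product, int quantity, boolean answer) {
        expected.add("hasInventory(" + product + "," + quantity + ")");
        hasInventoryAnswer = answer;
    }

    void expectRemove(String product, int quantity) {
        expected.add("remove(" + product + "," + quantity + ")");
    }

    // exercise: the SUT calls these; the mock only logs and answers
    public boolean hasInventory(String product, int quantity) {
        actual.add("hasInventory(" + product + "," + quantity + ")");
        return hasInventoryAnswer;
    }

    public void remove(String product, int quantity) {
        actual.add("remove(" + product + "," + quantity + ")");
    }

    // verify: the calls received must equal the expected calls, in the same order
    void verify() {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }
}

class BehaviorVerificationDemo {
    public static void main(String[] args) {
        Order order = new Order("Talisker", 50);
        HandRolledWarehouseMock warehouse = new HandRolledWarehouseMock();
        warehouse.expectHasInventory("Talisker", 50, true);
        warehouse.expectRemove("Talisker", 50);

        order.fill(warehouse);          // exercise

        warehouse.verify();             // behavior verification on the mock
        if (!order.isFilled()) {        // state check on the SUT only
            throw new AssertionError("order should be filled");
        }
    }
}
```

Note that nothing here inspects a stock level: the warehouse double has no inventory at all, and the test passes or fails purely on which calls the order made.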

In the second test I do a couple of different things. Firstly I create the mock differently, using the mock method in MockObjectTestCase rather than the constructor. This is a convenience method in the jMock library that means that I don’t need to explicitly call verify later on, any mock created with the convenience method is automatically verified at the end of the test. I could have done this in the first test too, but I wanted to show the verification more explicitly to show how testing with mocks works.

The second different thing in the second test case is that I’ve relaxed the constraints on the expectation by using withAnyArguments. The reason for this is that the first test checks that the number is passed to the warehouse, so the second test need not repeat that element of the test. If the logic of the order needs to be changed later, then only one test will fail, easing the effort of migrating the tests. As it turns out I could have left withAnyArguments out entirely, as that is the default.

This passage describes another way of writing jMock tests; the main point is the convenience offered by methods such as withAnyArguments() and mock().

Using EasyMock

There are a number of mock object libraries out there. One that I come across a fair bit is EasyMock, both in its Java and .NET versions. EasyMock also enables behavior verification, but has a couple of differences in style with jMock which are worth discussing. Here are the familiar tests again.

This section introduces another mock framework, EasyMock, and goes on to describe the differences between jMock and EasyMock.

public class OrderEasyTester extends TestCase {
  private static String TALISKER = "Talisker";
  
  private MockControl warehouseControl;
  private Warehouse warehouseMock;
  
  public void setUp() {
    warehouseControl = MockControl.createControl(Warehouse.class);
    warehouseMock = (Warehouse) warehouseControl.getMock();    
  }

  public void testFillingRemovesInventoryIfInStock() {
    //setup - data
    Order order = new Order(TALISKER, 50);
    
    //setup - expectations
    warehouseMock.hasInventory(TALISKER, 50);
    warehouseControl.setReturnValue(true);
    warehouseMock.remove(TALISKER, 50);
    warehouseControl.replay();

    //exercise
    order.fill(warehouseMock);
    
    //verify
    warehouseControl.verify();
    assertTrue(order.isFilled());
  }

  public void testFillingDoesNotRemoveIfNotEnoughInStock() {
    Order order = new Order(TALISKER, 51);    

    warehouseMock.hasInventory(TALISKER, 51);
    warehouseControl.setReturnValue(false);
    warehouseControl.replay();

    order.fill((Warehouse) warehouseMock);

    assertFalse(order.isFilled());
    warehouseControl.verify();
  }
}

EasyMock uses a record/replay metaphor for setting expectations. For each object you wish to mock you create a control and mock object. The mock satisfies the interface of the secondary object, the control gives you additional features. To indicate an expectation you call the method, with the arguments you expect on the mock. You follow this with a call to the control if you want a return value. Once you’ve finished setting expectations you call replay on the control - at which point the mock finishes the recording and is ready to respond to the primary object. Once done you call verify on the control.
It seems that while people are often fazed at first sight by the record/replay metaphor, they quickly get used to it. It has an advantage over the constraints of jMock in that you are making actual method calls to the mock rather than specifying method names in strings. This means you get to use code-completion in your IDE and any refactoring of method names will automatically update the tests. The downside is that you can’t have the looser constraints.
The developers of jMock are working on a new version which will use other techniques to allow you to use actual method calls.

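  1. EasyMock sets test expectations through record/replay; see the test example for the exact syntax.
  2. Because expectations are set by calling the mock's own methods, developers can use code completion and refactoring in the IDE.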
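The record/replay idea can be sketched with a JDK dynamic proxy. This is an illustration of the metaphor only: RecordReplayControl and its methods are hypothetical names, and nothing here reflects how EasyMock is actually implemented. The sketch handles only boolean, int, void and object return types.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

interface Warehouse {
    boolean hasInventory(String product, int quantity);
    void remove(String product, int quantity);
}

// Hypothetical control: calls made before replay() are recorded as
// expectations; calls made after replay() are checked against them in order.
class RecordReplayControl implements InvocationHandler {
    private final List<String> expectedCalls = new ArrayList<>();
    private final List<Object> returnValues = new ArrayList<>();
    private boolean replaying = false;
    private int next = 0;   // index of the next expected call during replay

    @SuppressWarnings("unchecked")
    <T> T createMock(Class<T> type) {
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, this);
    }

    // Attaches a return value to the most recently recorded call.
    void setReturnValue(Object value) {
        returnValues.set(returnValues.size() - 1, value);
    }

    void replay() {
        replaying = true;
    }

    // Every recorded call must have been received during replay.
    void verify() {
        if (next != expectedCalls.size()) {
            throw new AssertionError("calls never made: "
                    + expectedCalls.subList(next, expectedCalls.size()));
        }
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) {
        String call = method.getName() + Arrays.toString(args);
        if (!replaying) {
            // Record phase: remember the call, answer with a harmless default.
            expectedCalls.add(call);
            returnValues.add(null);
            return defaultValue(method.getReturnType());
        }
        // Replay phase: the call must match the next recorded expectation.
        if (next >= expectedCalls.size() || !expectedCalls.get(next).equals(call)) {
            throw new AssertionError("unexpected call: " + call);
        }
        Object value = returnValues.get(next++);
        return value != null ? value : defaultValue(method.getReturnType());
    }

    // A proxy may not return null for a primitive type; only the
    // primitives used in this sketch are covered.
    private static Object defaultValue(Class<?> type) {
        if (type == boolean.class) return false;
        if (type == int.class) return 0;
        return null;
    }
}

class RecordReplayDemo {
    public static void main(String[] args) {
        RecordReplayControl control = new RecordReplayControl();
        Warehouse mock = control.createMock(Warehouse.class);

        mock.hasInventory("Talisker", 50);   // record an expectation
        control.setReturnValue(true);        // ...and its return value
        mock.remove("Talisker", 50);         // record a second expectation
        control.replay();

        // These two calls stand in for order.fill(mock).
        if (mock.hasInventory("Talisker", 50)) {
            mock.remove("Talisker", 50);
        }
        control.verify();
    }
}
```

Because the expectations are written as ordinary method calls on the Warehouse interface, renaming a method in an IDE would update them along with the production code, which is the advantage the text describes.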

The Difference Between Mocks and Stubs

When they were first introduced, many people easily confused mock objects with the common testing notion of using stubs. Since then it seems people have better understood the differences (and I hope the earlier version of this paper helped). However to fully understand the way people use mocks it is important to understand mocks and other kinds of test doubles. (“doubles”? Don’t worry if this is a new term to you, wait a few paragraphs and all will be clear.)

Mocks and stubs are easily confused. To understand them better, a few important concepts need to be introduced first.

When you’re doing testing like this, you’re focusing on one element of the software at a time - hence the common term unit testing. The problem is that to make a single unit work, you often need other units - hence the need for some kind of warehouse in our example.

In the two styles of testing I’ve shown above, the first case uses a real warehouse object and the second case uses a mock warehouse, which of course isn’t a real warehouse object. Using mocks is one way to not use a real warehouse in the test, but there are other forms of unreal objects used in testing like this.

The vocabulary for talking about this soon gets messy - all sorts of words are used: stub, mock, fake, dummy. For this article I’m going to follow the vocabulary of Gerard Meszaros’s book. It’s not what everyone uses, but I think it’s a good vocabulary and since it’s my essay I get to pick which words to use.

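When testing, you should focus on one unit at a time. To make a single unit run, other parts need to cooperate, such as the warehouse object in the two examples above. In the mock version it is not a real warehouse, and testing has several other concepts of the same kind, such as stub, mock, fake and dummy (the vocabulary comes from Gerard Meszaros's book). These are not universally used terms, but they are easy to understand.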

Meszaros uses the term Test Double as the generic term for any kind of pretend object used in place of a real object for testing purposes. The name comes from the notion of a Stunt Double in movies. (One of his aims was to avoid using any name that was already widely used.) Meszaros then defined five particular kinds of double:

  • Dummy objects are passed around but never actually used. Usually they are just used to fill parameter lists.
  • Fake objects actually have working implementations, but usually take some shortcut which makes them not suitable for production (an in memory database is a good example).
  • Stubs provide canned answers to calls made during the test, usually not responding at all to anything outside what’s programmed in for the test.
  • Spies are stubs that also record some information based on how they were called. One form of this might be an email service that records how many messages it was sent.
  • Mocks are what we are talking about here: objects pre-programmed with expectations which form a specification of the calls they are expected to receive.

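Meszaros uses Test Double to refer to any stand-in object used in a test in place of a real one. The term comes from the stunt double of film-making, and Meszaros coined it to avoid clashing with names already in use. He divides doubles into five kinds.

  • Dummy objects only fill parameter lists so that the code compiles; they are never actually used.
  • Fake objects have a working implementation, but usually a simplified one that is not suitable for production (an in-memory database is a good example).
  • Stubs respond only in the fixed way programmed in for the test.
  • Spies extend stubs by recording information about how they were called, for example standing in for an email server while recording how many messages it sent.
  • Mocks carry pre-programmed test expectations and verify those expectations after the test.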
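The five kinds of double can be put side by side for a single interface. The sketch below applies them to a cut-down Warehouse; every class name is hypothetical and chosen only to mirror the list above, and the mock is deliberately minimal (it counts remove calls rather than specifying arguments or ordering).

```java
import java.util.HashMap;
import java.util.Map;

interface Warehouse {
    boolean hasInventory(String product, int quantity);
    void remove(String product, int quantity);
}

// Dummy: only fills a parameter slot; any real use is a test error.
class DummyWarehouse implements Warehouse {
    public boolean hasInventory(String product, int quantity) {
        throw new UnsupportedOperationException("dummy must not be used");
    }
    public void remove(String product, int quantity) {
        throw new UnsupportedOperationException("dummy must not be used");
    }
}

// Fake: a working implementation with a shortcut (stock lives in memory only).
class FakeWarehouse implements Warehouse {
    private final Map<String, Integer> stock = new HashMap<>();
    void add(String product, int quantity) {
        stock.merge(product, quantity, Integer::sum);
    }
    public boolean hasInventory(String product, int quantity) {
        return stock.getOrDefault(product, 0) >= quantity;
    }
    public void remove(String product, int quantity) {
        stock.merge(product, -quantity, Integer::sum);
    }
}

// Stub: gives a canned answer and does nothing else.
class StubWarehouse implements Warehouse {
    private final boolean answer;
    StubWarehouse(boolean answer) { this.answer = answer; }
    public boolean hasInventory(String product, int quantity) { return answer; }
    public void remove(String product, int quantity) { }
}

// Spy: a stub that also records how it was called, for the test to inspect.
class SpyWarehouse extends StubWarehouse {
    int removeCalls = 0;
    SpyWarehouse(boolean answer) { super(answer); }
    @Override public void remove(String product, int quantity) { removeCalls++; }
}

// Mock: told in advance what to expect, and able to verify itself.
class MockWarehouse extends StubWarehouse {
    private final int expectedRemoveCalls;
    private int actualRemoveCalls = 0;
    MockWarehouse(boolean answer, int expectedRemoveCalls) {
        super(answer);
        this.expectedRemoveCalls = expectedRemoveCalls;
    }
    @Override public void remove(String product, int quantity) { actualRemoveCalls++; }
    void verify() {
        if (actualRemoveCalls != expectedRemoveCalls) {
            throw new AssertionError("expected " + expectedRemoveCalls
                    + " remove call(s), got " + actualRemoveCalls);
        }
    }
}
```

The spy and the mock record the same information; the difference is where the expectation lives. With the spy the test asserts on removeCalls afterwards, while the mock is given the expectation during setup and checks it itself in verify().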

Of these kinds of doubles, only mocks insist upon behavior verification. The other doubles can, and usually do, use state verification. Mocks actually do behave like other doubles during the exercise phase, as they need to make the SUT believe it’s talking with its real collaborators - but mocks differ in the setup and the verification phases.

To explore test doubles a bit more, we need to extend our example. Many people only use a test double if the real object is awkward to work with. A more common case for a test double would be if we said that we wanted to send an email message if we failed to fill an order. The problem is that we don’t want to send actual email messages out to customers during testing. So instead we create a test double of our email system, one that we can control and manipulate.

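Unlike the other doubles, which usually use state verification, only mocks insist on behavior verification. During the exercise phase a mock behaves like any other collaborator; the differences are in the setup and verification phases.

To illustrate this further, the example above needs to be extended. Usually people use a test double only when the real object is awkward to work with. Take a common scenario: when filling the order fails, we need to send an email. The problem is that during testing we do not want to actually send one, so instead we create a test double of the email system that we can control.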

Here we can begin to see the difference between mocks and stubs. If we were writing a test for this mailing behavior, we might write a simple stub like this.

import java.util.ArrayList;
import java.util.List;

public interface MailService {
  public void send (Message msg);
}
public class MailServiceStub implements MailService {
  private List<Message> messages = new ArrayList<Message>();
  public void send (Message msg) {
    messages.add(msg);
  }
  public int numberSent() {
    return messages.size();
  }
}     

The code above is a stub of the email service. We can then use state verification on the stub, as follows.

  public void testOrderSendsMailIfUnfilled() {
    Order order = new Order(TALISKER, 51);
    MailServiceStub mailer = new MailServiceStub();
    order.setMailer(mailer);
    order.fill(warehouse);
    assertEquals(1, mailer.numberSent());
  }

Of course this is a very simple test - only that a message has been sent. We’ve not tested it was sent to the right person, or with the right contents, but it will do to illustrate the point.

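Of course, this is only a very simple example. It checks only that one message was sent, not the contents or the recipient, but it is enough to illustrate the point.

The test written with mocks looks quite different.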

  public void testOrderSendsMailIfUnfilled() {
    Order order = new Order(TALISKER, 51);
    Mock warehouse = mock(Warehouse.class);
    Mock mailer = mock(MailService.class);
    order.setMailer((MailService) mailer.proxy());

    mailer.expects(once()).method("send");
    warehouse.expects(once()).method("hasInventory")
      .withAnyArguments()
      .will(returnValue(false));

    order.fill((Warehouse) warehouse.proxy());
  }
}

In both cases I’m using a test double instead of the real mail service. There is a difference in that the stub uses state verification while the mock uses behavior verification.
In order to use state verification on the stub, I need to make some extra methods on the stub to help with verification. As a result the stub implements MailService but adds extra test methods.
Mock objects always use behavior verification, a stub can go either way. Meszaros refers to stubs that use behavior verification as a Test Spy. The difference is in how exactly the double runs and verifies and I’ll leave that for you to explore on your own.

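In both examples we use a test double in place of the real mail service. The difference is that the stub uses state verification while the mock uses behavior verification. To use state verification on the stub, I need to add the extra numberSent() method to help with verification. Mocks always use behavior verification. Meszaros notes that stubs can also use behavior verification, in which case he calls them Test Spies; the details are left for the reader to explore.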

Classical and Mockist Testing

Now I’m at the point where I can explore the second dichotomy: that between classical and mockist TDD. The big issue here is when to use a mock (or other double).

The classical TDD style is to use real objects if possible and a double if it’s awkward to use the real thing. So a classical TDDer would use a real warehouse and a double for the mail service. The kind of double doesn’t really matter that much.

A mockist TDD practitioner, however, will always use a mock for any object with interesting behavior. In this case for both the warehouse and the mail service.

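Now it is time to explore when to use a mock or another test double. The classical TDD style is to use real objects wherever possible and to create a double only when the real object is awkward to create. A classical TDDer would therefore use a real warehouse object and a double for the mail service (whichever kind of double it is). A mockist TDDer would use mocks for both the warehouse and the mail service.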

An important offshoot of the mockist style is that of Behavior Driven Development (BDD). BDD was originally developed by my colleague Daniel Terhorst-North as a technique to better help people learn Test Driven Development by focusing on how TDD operates as a design technique. This led to renaming tests as behaviors to better explore where TDD helps with thinking about what an object needs to do. BDD takes a mockist approach, but it expands on this, both with its naming styles, and with its desire to integrate analysis within its technique. I won’t go into this more here, as the only relevance to this article is that BDD is another variation on TDD that tends to use mockist testing. I’ll leave it to you to follow the link for more information.

You sometimes see “Detroit” style used for “classical” and “London” for “mockist”. This alludes to the fact that XP was originally developed with the C3 project in Detroit and the mockist style was developed by early XP adopters in London. I should also mention that many mockist TDDers dislike that term, and indeed any terminology that implies a different style between classical and mockist testing. They don’t consider that there is a useful distinction to be made between the two styles.

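An important offshoot of mockist TDD is Behavior Driven Development (BDD). It was originally developed by a colleague of mine to help people understand how TDD works as a design technique. It renames tests as behaviors to bring out what an object is responsible for. The details of BDD are left for the reader to explore. Note that along the way you may see “Detroit” used for the classical style and “London” for the mockist style, because XP (Extreme Programming) was born in the C3 project in Detroit, while the mockist style originated among early XP adopters in London. Many mockist TDDers dislike this terminology, but some terms are still needed to distinguish classical TDD from mockist TDD.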

Choosing Between the Differences

In this article I’ve explained a pair of differences: state or behavior verification / classic or mockist TDD. What are the arguments to bear in mind when making the choices between them? I’ll begin with the state versus behavior verification choice.
The first thing to consider is the context. Are we thinking about an easy collaboration, such as order and warehouse, or an awkward one, such as order and mail service?
If it’s an easy collaboration then the choice is simple. If I’m a classic TDDer I don’t use a mock, stub or any kind of double. I use a real object and state verification. If I’m a mockist TDDer I use a mock and behavior verification. No decisions at all.
If it’s an awkward collaboration, then there’s no decision if I’m a mockist - I just use mocks and behavior verification. If I’m a classicist then I do have a choice, but it’s not a big deal which one to use. Usually classicists will decide on a case by case basis, using the easiest route for each situation.

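This article has explained a pair of differences: state or behavior verification, and classical or mockist TDD. How do we choose between them? We can start with state versus behavior verification. The first step is to be clear about the context: it may be an easy collaboration, such as order and warehouse, or an awkward one, such as order and mail service. The easy case is simple: a classical TDDer uses a real object and state verification, and a mockist TDDer uses a mock and behavior verification. In the awkward case a mockist has only one option, mocks and behavior verification, while a classical TDDer can pick whichever route is easiest in the situation (state or behavior verification, a mock or another kind of double), as long as it gets the job done.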

So as we see, state versus behavior verification is mostly not a big decision. The real issue is between classic and mockist TDD. As it turns out the characteristics of state and behavior verification do affect that discussion, and that’s where I’ll focus most of my energy.

But before I do, let me throw in an edge case. Occasionally you do run into things that are really hard to use state verification on, even if they aren’t awkward collaborations. A great example of this is a cache. The whole point of a cache is that you can’t tell from its state whether the cache hit or missed - this is a case where behavior verification would be the wise choice for even a hard core classical TDDer. I’m sure there are other exceptions in both directions.

As we delve into the classic/mockist choice, there’s lots of factors to consider, so I’ve broken them out into rough groups.

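As we have seen, state versus behavior verification is mostly not the big decision; the real issue is between classical and mockist TDD, although the characteristics of the two kinds of verification do affect that discussion. Before going on, consider an edge case. Sometimes state verification is hard to use even when the collaborator is not awkward to create; a cache is an example. From state alone it is hard to tell whether the cache hit or missed, so behavior verification is the wise choice here. The following sections set out the differences between the two styles in several groups.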
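The cache case can be shown in a few lines. In the sketch below, Cache and CountingLoader are hypothetical classes invented for the illustration. The value returned by get is the same on a hit and on a miss, so the only way to tell them apart is to observe whether the cache called its loader.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical read-through cache: asks the loader only on a miss.
class Cache<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private final Function<K, V> loader;

    Cache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }
}

// Test double for the loader: answers like the real thing would,
// but also counts how many times the cache called it.
class CountingLoader implements Function<String, String> {
    int calls = 0;

    @Override
    public String apply(String key) {
        calls++;
        return key.toUpperCase();
    }
}

class CacheDemo {
    public static void main(String[] args) {
        CountingLoader loader = new CountingLoader();
        Cache<String, String> cache = new Cache<>(loader);

        String first = cache.get("talisker");    // miss: the loader is called
        String second = cache.get("talisker");   // hit: the loader is not called

        // State alone cannot distinguish the two lookups...
        if (!first.equals(second)) throw new AssertionError("values differ");
        // ...but the interaction with the collaborator can.
        if (loader.calls != 1) throw new AssertionError("expected one load");
    }
}
```

This uses a counting double in the spy style; a mock with an expectation of exactly one call to the loader would check the same interaction.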

Driving TDD

Mock objects came out of the XP community, and one of the principal features of XP is its emphasis on Test Driven Development - where a system design is evolved through iteration driven by writing tests.

Thus it’s no surprise that the mockists particularly talk about the effect of mockist testing on a design. In particular they advocate a style called need-driven development. With this style you begin developing a user story by writing your first test for the outside of your system, making some interface object your SUT. By thinking through the expectations upon the collaborators, you explore the interaction between the SUT and its neighbors - effectively designing the outbound interface of the SUT.

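Mocks came out of the Extreme Programming community, and one characteristic of XP is TDD (Test Driven Development), in which the system design evolves through iterations driven by writing tests. As a result, mocks are often used to describe the behavior of objects during design. Mockists also advocate a style called need-driven development: we begin a user story by writing a test for the outside of the system, and by working out the expectations we expose the interaction between the SUT and its collaborators. This effectively designs the outbound interface of the SUT.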

Once you have your first test running, the expectations on the mocks provide a specification for the next step and a starting point for the tests. You turn each expectation into a test on a collaborator and repeat the process working your way into the system one SUT at a time. This style is also referred to as outside-in, which is a very descriptive name for it. It works well with layered systems. You first start by programming the UI using mock layers underneath. Then you write tests for the lower layer, gradually stepping through the system one layer at a time. This is a very structured and controlled approach, one that many people believe is helpful to guide newcomers to OO and TDD.

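Once the first test is running, the expectations set in it tell you clearly what to develop and test next. You move forward by turning each expectation into a test on a collaborator. This style is also called outside-in, and it is very useful in layered systems. You can start writing the UI layer by mocking the layers underneath, then write tests for the lower layer, and work downward step by step. This approach is considered helpful for guiding newcomers to OO and TDD.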

Classic TDD doesn’t provide quite the same guidance. You can do a similar stepping approach, using stubbed methods instead of mocks. To do this, whenever you need something from a collaborator you just hard-code exactly the response the test requires to make the SUT work. Then once you’re green with that you replace the hard coded response with a proper code.

But classic TDD can do other things too. A common style is middle-out. In this style you take a feature and decide what you need in the domain for this feature to work. You get the domain objects to do what you need and once they are working you layer the UI on top. Doing this you might never need to fake anything. A lot of people like this because it focuses attention on the domain model first, which helps keep domain logic from leaking into the UI.

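Classic TDD does not offer quite the same guidance. You can follow a similar process with stubs: whenever you need something from a collaborator, you hard-code the response so that the test on the SUT passes. Once the test is green, you replace the hard-coded response with proper code. Classic TDD also includes other approaches. A common one is called middle-out: take a feature and decide what it needs in the domain in order to work. Once the domain objects are working, layer the UI on top. Done this way, you might never need to create a test double. Many people like this approach because it focuses attention on the domain objects first and keeps domain logic from leaking into the UI.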

Fixture Setup

With classic TDD, you have to create not just the SUT but also all the collaborators that the SUT needs in response to the test. While the example only had a couple of objects, real tests often involve a large amount of secondary objects. Usually these objects are created and torn down with each run of the tests.

Mockist tests, however, only need to create the SUT and mocks for its immediate neighbors. This can avoid some of the involved work in building up complex fixtures (At least in theory. I’ve come across tales of pretty complex mock setups, but that may be due to not using the tools well.)

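In classic TDD, for a test to run you need to create the SUT together with all the collaborators it needs. The example above has only a few objects, but real tests involve many more, and they are created and torn down on every test run. Mockist tests only require the SUT and mocks for its immediate neighbors, which avoids building a lot of complex fixture setup. (That is the theory at least; complex mock setups also exist, possibly from not using the tools well.)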

In practice, classic testers tend to reuse complex fixtures as much as possible. In the simplest way you do this by putting fixture setup code into the xUnit setup method. More complicated fixtures need to be used by several test classes, so in this case you create special fixture generation classes. I usually call these Object Mothers, based on a naming convention used on an early ThoughtWorks XP project. Using mothers is essential in larger classic testing, but the mothers are additional code that need to be maintained and any changes to the mothers can have significant ripple effects through the tests. There also may be a performance cost in setting up the fixture - although I haven’t heard this to be a serious problem when done properly. Most fixture objects are cheap to create, those that aren’t are usually doubled.

As a result I’ve heard both styles accuse the other of being too much work. Mockists say that creating the fixtures is a lot of effort, but classicists say that this is reused but you have to create mocks with every test.

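In practice, classic testers tend to reuse complex fixtures. The simplest way is to move the fixture setup code into the xUnit setup method. More complicated fixtures may be needed by several test classes, in which case special fixture generation classes are created. I usually call these Object Mothers. They are essential in larger classic test suites, but they add code that must be maintained and carry the risk that changes ripple through the tests. Setting up the fixture may also carry a performance cost.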
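An Object Mother is little more than a class of factory methods for ready-made fixtures. The sketch below is a guess at what one might look like for the warehouse example; WarehouseMother, its method names, and the simplified Warehouse class are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified real collaborator, assumed for this sketch.
class Warehouse {
    private final Map<String, Integer> stock = new HashMap<>();

    void add(String product, int quantity) {
        stock.merge(product, quantity, Integer::sum);
    }

    int getInventory(String product) {
        return stock.getOrDefault(product, 0);
    }
}

// Hypothetical Object Mother: one shared place that knows how to build
// the fixtures several test classes need.
class WarehouseMother {
    static final String TALISKER = "Talisker";
    static final String HIGHLAND_PARK = "Highland Park";

    // The fixture used by the state-based tests earlier in the article.
    static Warehouse stockedWarehouse() {
        Warehouse warehouse = new Warehouse();
        warehouse.add(TALISKER, 50);
        warehouse.add(HIGHLAND_PARK, 25);
        return warehouse;
    }

    static Warehouse emptyWarehouse() {
        return new Warehouse();
    }
}
```

A test class would then call WarehouseMother.stockedWarehouse() in its setup method instead of repeating the add calls. The ripple risk mentioned in the text is visible here too: changing the quantities in stockedWarehouse() changes the starting state of every test that uses it.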

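In the end, supporters of each style accuse the other of being too much work: the mockist style requires creating mocks in every test, while the classic style requires a lot of effort to manage fixtures.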

Test Isolation

If you introduce a bug to a system with mockist testing, it will usually cause only tests whose SUT contains the bug to fail. With the classic approach, however, any tests of client objects can also fail, which leads to failures where the buggy object is used as a collaborator in another object’s test. As a result a failure in a highly used object causes a ripple of failing tests all across the system.

Mockist testers consider this to be a major issue; it results in a lot of debugging in order to find the root of the error and fix it. However classicists don’t express this as a source of problems. Usually the culprit is relatively easy to spot by looking at which tests fail and the developers can tell that other failures are derived from the root fault. Furthermore if you are testing regularly (as you should) then you know the breakage was caused by what you last edited, so it’s not difficult to find the fault.

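If you introduce a bug into a system tested in the mockist style, usually only the tests whose SUT contains the bug fail. In a classically tested system, the tests in which that object is a collaborator fail too, which ends up as a series of failing tests across the system. Mockist testers see this as a major issue, since it leads to a lot of debugging to find the root of a single problem. Classic testers are not worried by it: usually the culprit is easy to spot by looking at which tests fail, and if you run the tests regularly the fault is easy to find.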

One factor that may be significant here is the granularity of the tests. Since classic tests exercise multiple real objects, you often find a single test as the primary test for a cluster of objects, rather than just one. If that cluster spans many objects, then it can be much harder to find the real source of a bug. What’s happening here is that the tests are too coarse grained.

It’s quite likely that mockist tests are less likely to suffer from this problem, because the convention is to mock out all objects beyond the primary, which makes it clear that finer grained tests are needed for collaborators. That said, it’s also true that using overly coarse grained tests isn’t necessarily a failure of classic testing as a technique, rather a failure to do classic testing properly. A good rule of thumb is to ensure that you separate fine-grained tests for every class. While clusters are sometimes reasonable, they should be limited to only very few objects - no more than half a dozen. In addition, if you find yourself with a debugging problem due to overly coarse-grained tests, you should debug in a test driven way, creating finer grained tests as you go.

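Another significant factor is granularity. Because classic tests exercise several real objects, a single test often serves as the primary test for a cluster of objects, and if the cluster is large it becomes hard to find the source of a problem; the tests are too coarse-grained. Mockist tests are less likely to suffer from this, because the convention is to mock out every object other than the primary one, which keeps the tests fine-grained. However, overly coarse-grained tests do not mean classic testing has failed as a technique, only that it has not been done properly. A good rule is to make sure every class has fine-grained tests of its own. Clusters of a few objects are reasonable, but they should not grow large. If debugging becomes difficult, debug in a test-driven way and create finer-grained tests as you go.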

In essence classic xUnit tests are not just unit tests, but also mini-integration tests. As a result many people like the fact that client tests may catch errors that the main tests for an object may have missed, particularly probing areas where classes interact. Mockist tests lose that quality. In addition you also run the risk that expectations on mockist tests can be incorrect, resulting in unit tests that run green but mask inherent errors.

It’s at this point that I should stress that whichever style of test you use, you must combine it with coarser grained acceptance tests that operate across the system as a whole. I’ve often come across projects which were late in using acceptance tests and regretted it.

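Classic tests are not just unit tests but also mini-integration tests. People find that client tests often catch problems that the unit tests of an object missed, especially where objects interact. Mockist tests do not have this quality. In addition, with mocks there is a risk that the expectations are set incorrectly, so the tests pass but still hide errors. Finally, whichever style you use, you must combine it with coarser-grained acceptance tests that operate across the system as a whole.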

Coupling Tests to Implementations

When you write a mockist test, you are testing the outbound calls of the SUT to ensure it talks properly to its suppliers. A classic test only cares about the final state - not how that state was derived. Mockist tests are thus more coupled to the implementation of a method. Changing the nature of calls to collaborators usually causes a mockist test to break.

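When you write a mockist test, you are testing the outbound calls of the SUT to check that it interacts correctly with its collaborators. A classic test only verifies the final state. Mockist tests are therefore more coupled to the implementation of a method, and changing the calls made to collaborators breaks them more easily.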
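A small sketch makes the coupling visible. Below, two versions of the fill logic leave exactly the same final state but make different calls on the warehouse. All names (LoggingWarehouse, OrderLogic, fillV1, fillV2) are hypothetical, and the warehouse is reduced to a single product to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

interface Warehouse {
    int getInventory(String product);
    boolean hasInventory(String product, int quantity);
    void remove(String product, int quantity);
}

// Holds real state and also logs every call, so the same double can
// serve a state check and a behavior check.
class LoggingWarehouse implements Warehouse {
    final List<String> calls = new ArrayList<>();
    private int stock = 50;   // stock of the single product in this sketch

    public int getInventory(String product) {
        calls.add("getInventory");
        return stock;
    }

    public boolean hasInventory(String product, int quantity) {
        calls.add("hasInventory");
        return stock >= quantity;
    }

    public void remove(String product, int quantity) {
        calls.add("remove");
        stock -= quantity;
    }

    // Unlogged accessor used only by the state check.
    int remaining() {
        return stock;
    }
}

class OrderLogic {
    // Original implementation: asks the warehouse whether it has enough.
    static boolean fillV1(Warehouse w, String product, int quantity) {
        if (!w.hasInventory(product, quantity)) return false;
        w.remove(product, quantity);
        return true;
    }

    // Refactored implementation: same outcome, different outbound calls.
    static boolean fillV2(Warehouse w, String product, int quantity) {
        if (w.getInventory(product) < quantity) return false;
        w.remove(product, quantity);
        return true;
    }
}
```

A state check (the order is filled and remaining() is 0) passes for both versions. A behavior check that expects the call sequence hasInventory, remove passes for fillV1 and fails for fillV2, even though nothing observable from outside has changed.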

This coupling leads to a couple of concerns. The most important one is the effect on Test Driven Development. With mockist testing, writing the test makes you think about the implementation of the behavior - indeed mockist testers see this as an advantage. Classicists, however, think that it’s important to only think about what happens from the external interface and to leave all consideration of implementation until after you’re done writing the test.

Coupling to the implementation also interferes with refactoring, since implementation changes are much more likely to break tests than with classic testing.

This can be worsened by the nature of mock toolkits. Often mock tools specify very specific method calls and parameter matches, even when they aren’t relevant to this particular test. One of the aims of the jMock toolkit is to be more flexible in its specification of the expectations to allow expectations to be looser in areas where it doesn’t matter, at the cost of using strings that can make refactoring more tricky.

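This coupling raises a few concerns. The most important one for Test Driven Development is that writing a mockist test makes you think about how the behavior will be implemented, which mockist testers actually regard as an advantage. Classicists hold that the implementation should not be considered until the test has been written. Coupling to the implementation also interferes with refactoring, since implementation changes are more likely to break tests. Mock toolkits can make this worse, because they often specify very precise method calls and parameter matches. jMock tries to loosen expectations where they do not matter, but it does so by naming methods with strings (see the jMock test example), which makes refactoring harder.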

Design Style

One of the most fascinating aspects of these testing styles to me is how they affect design decisions. As I’ve talked with both types of tester I’ve become aware of a few differences between the designs that the styles encourage, but I’m sure I’m barely scratching the surface.

I’ve already mentioned a difference in tackling layers. Mockist testing supports an outside-in approach while developers who prefer a domain model out style tend to prefer classic testing.

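How these testing styles affect system design in test-driven development is a fascinating topic. Talking with both kinds of tester has made me aware of differences between the designs they encourage, though only at a fairly shallow level so far. I have already mentioned the difference in how they tackle software layers.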

On a smaller level I noticed that mockist testers tend to ease away from methods that return values, in favor of methods that act upon a collecting object. Take the example of the behavior of gathering information from a group of objects to create a report string. A common way to do this is to have the reporting method call string returning methods on the various objects and assemble the resulting string in a temporary variable. A mockist tester would be more likely to pass a string buffer into the various objects and get them to add the various strings to the buffer - treating the string buffer as a collecting parameter.

Mockist testers do talk more about avoiding ‘train wrecks’ - method chains of style of getThis().getThat().getTheOther(). Avoiding method chains is also known as following the Law of Demeter. While method chains are a smell, the opposite problem of middle men objects bloated with forwarding methods is also a smell. (I’ve always felt I’d be more comfortable with the Law of Demeter if it were called the Suggestion of Demeter.)

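I noticed that mockist testers prefer methods that act on a collecting object over methods that return values. Take gathering information from a group of objects as an example: the common approach is to call a reporting method on each object and combine the returned results. A mockist developer would rather pass a collecting object among those objects and have them add their information to it. Mockists also talk more about eliminating method chains, which is what the Law of Demeter calls for; method chains are a code smell. This relates to one of the harder OO principles to apply well, "Tell Don't Ask": tell an object to carry out its responsibility rather than pulling its data out and acting on it in client code. For this reason mockist developers claim that their style helps get rid of the getters that pervade so much code.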
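The two ways of building the report string might look roughly like this. It is only a sketch: `ReportSource`, `LineItem`, `Reports` and the method names are hypothetical, and `StringBuilder` stands in for the string buffer mentioned in the text. `byAsking` queries each object for a string and assembles the result itself, while `byTelling` hands the buffer to each object as a collecting parameter.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical interface offering both styles side by side for comparison.
interface ReportSource {
    String summary();                          // query style: returns a value
    void appendSummaryTo(StringBuilder out);   // collecting-parameter style
}

class LineItem implements ReportSource {
    private final String name;
    private final int quantity;

    LineItem(String name, int quantity) {
        this.name = name;
        this.quantity = quantity;
    }

    public String summary() {
        return name + " x" + quantity;
    }

    public void appendSummaryTo(StringBuilder out) {
        out.append(name).append(" x").append(quantity).append('\n');
    }
}

class Reports {
    // Ask each object for its string and assemble the report in a local variable.
    static String byAsking(List<? extends ReportSource> sources) {
        String report = "";
        for (ReportSource source : sources) {
            report += source.summary() + "\n";
        }
        return report;
    }

    // Tell each object to add itself to the buffer passed in.
    static String byTelling(List<? extends ReportSource> sources) {
        StringBuilder out = new StringBuilder();
        for (ReportSource source : sources) {
            source.appendSummaryTo(out);
        }
        return out.toString();
    }
}

class ReportDemo {
    public static void main(String[] args) {
        List<LineItem> items = Arrays.asList(
                new LineItem("Talisker", 20), new LineItem("Highland Park", 5));
        System.out.print(Reports.byAsking(items));
        System.out.print(Reports.byTelling(items));
    }
}
```

Both methods produce the same text; the difference is who holds the knowledge of how a line is formatted, which is the design pressure the paragraph describes.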

An acknowledged issue with state-based verification is that it can lead to creating query methods only to support verification. It’s never comfortable to add methods to the API of an object purely for testing; using behavior verification avoids that problem. The counter-argument to this is that such modifications are usually minor in practice.

Mockists favor role interfaces and assert that using this style of testing encourages more role interfaces, since each collaboration is mocked separately and is thus more likely to be turned into a role interface. So in my example above using a string buffer for generating a report, a mockist would be more likely to invent a particular role that makes sense in that domain, which may be implemented by a string buffer.

It’s important to remember that this difference in design style is a key motivator for most mockists. TDD’s origins were a desire to get strong automatic regression testing that supported evolutionary design. Along the way its practitioners discovered that writing tests first made a significant improvement to the design process. Mockists have a strong idea of what kind of design is a good design and have developed mock libraries primarily to help people develop this design style.

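A well-known issue with state verification is that it can lead to query methods created solely to support verification; having to add to an API purely for testing is unpleasant. The counter-argument is that such changes are usually minor in practice. Mockists prefer role interfaces, and this testing style tends to produce more of them. In the example above, a mockist would invent a role for receiving the objects' information, one that a string buffer could implement. These design differences are a key motivation for mockist development. TDD began as a way to get automated regression testing that supports evolutionary design, and mockists see their style as strongly encouraging good system design.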
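A role interface for the report example could be sketched as follows. Everything here is an assumption made for illustration: the role name `ReportSink`, its single method `addLine`, and the `Shipment` class do not come from the article. The point is only that the collaborator is described by the role it plays in the domain, that a string buffer is one possible implementation, and that a test can supply a recording implementation instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical role: something that accepts lines of a report.
interface ReportSink {
    void addLine(String text);
}

// Production implementation backed by a string buffer.
class BufferSink implements ReportSink {
    private final StringBuilder buffer = new StringBuilder();

    public void addLine(String text) {
        buffer.append(text).append('\n');
    }

    String contents() {
        return buffer.toString();
    }
}

// Test double for behavior verification: it records what it was told.
class RecordingSink implements ReportSink {
    final List<String> lines = new ArrayList<>();

    public void addLine(String text) {
        lines.add(text);
    }
}

// A domain object that depends only on the role, not on StringBuilder.
class Shipment {
    private final String product;
    private final int quantity;

    Shipment(String product, int quantity) {
        this.product = product;
        this.quantity = quantity;
    }

    void reportTo(ReportSink sink) {
        sink.addLine(product + " x" + quantity);
    }
}

class RoleInterfaceDemo {
    public static void main(String[] args) {
        BufferSink sink = new BufferSink();
        new Shipment("Talisker", 20).reportTo(sink);
        System.out.print(sink.contents());
    }
}
```

Note that `Shipment` needs no getters for this test to work, which is the flip side of the query-method concern raised for state verification.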

So should I be a classicist or a mockist?

I find this a difficult question to answer with confidence. Personally I’ve always been an old-fashioned classic TDDer and thus far I don’t see any reason to change. I don’t see any compelling benefits for mockist TDD, and am concerned about the consequences of coupling tests to implementation.

This has particularly struck me when I’ve observed a mockist programmer. I really like the fact that while writing the test you focus on the result of the behavior, not how it’s done. A mockist is constantly thinking about how the SUT is going to be implemented in order to write the expectations. This feels really unnatural to me.

I also suffer from the disadvantage of not trying mockist TDD on anything more than toys. As I’ve learned from Test Driven Development itself, it’s often hard to judge a technique without trying it seriously. I do know many good developers who are very happy and convinced mockists. So although I’m still a convinced classicist, I’d rather present both arguments as fairly as I can so you can make your own mind up.

So if mockist testing sounds appealing to you, I’d suggest giving it a try. It’s particularly worth trying if you are having problems in some of the areas that mockist TDD is intended to improve. I see two main areas here. One is if you’re spending a lot of time debugging when tests fail because they aren’t breaking cleanly and telling you where the problem is. (You could also improve this by using classic TDD on finer-grained clusters.) The second area is if your objects don’t contain enough behavior, mockist testing may encourage the development team to create more behavior rich objects.

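In short: I am torn, and if you want to try it, go ahead, especially if you are having problems with test granularity or if your objects do not carry enough behavior.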
