Item 32: Prefer Smaller, Cohesive Assemblies
This item should really be titled "Build Assemblies That Are the Right Size and Contain a Small Number of Public Types." But that's too wordy, so I titled it based on the most common mistake I see: developers putting everything but the kitchen sink in one assembly. That makes it hard to reuse components and harder to update parts of a system. Many smaller assemblies make it easier to use your classes as binary components.
The title also highlights the importance of cohesion. Cohesion is the degree to which the responsibilities of a single component form a meaningful unit. Cohesive components can be described in a single simple sentence. You can see this in many of the .NET FCL assemblies. Two examples: the System.Collections assembly provides data structures for storing sets of related objects, and the System.Windows.Forms assembly provides classes that model Windows controls. Web forms and Windows Forms are in different assemblies because they are not related. You should be able to describe your own assemblies in the same fashion, using one simple sentence. No cheating: "The MyApplication assembly provides everything you need." Yes, that's a single sentence. But it's also lazy, and you probably don't need all of that functionality in My2ndApplication, though you'd probably like to reuse some of it. That "some of it" should be packaged in its own assembly.
You should not create assemblies with only one public class. You do need to find the middle ground. If you go too far and create too many assemblies, you lose some benefits of encapsulation: You lose the benefits of internal types by not packaging related public classes in the same assembly (see Item 33). The JIT compiler can perform more efficient inlining inside an assembly than across assembly boundaries. This means that packaging related types in the same assembly is to your advantage. Your goal is to create the best-sized package for the functionality you are delivering in your component. This goal is easier to achieve with cohesive components: Each component should have one responsibility.
In some sense, an assembly is the binary equivalent of a class. We use classes to encapsulate algorithms and data storage. Only the public interfaces are part of the official contract, so only the public interfaces are visible to users. In the same sense, assemblies provide a binary package for a related set of classes. Only public and protected classes are visible outside an assembly. Utility classes can be internal to the assembly. Yes, they are more visible than private nested classes, but you have a mechanism to share common implementation inside that assembly without exposing that implementation to all users of your classes. Partitioning your application into multiple assemblies encapsulates each set of related types in its own package.
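As a minimal sketch of that idea, the following shows a public class that relies on an internal utility class. The assembly name Common.Settings.dll and the types UserSettings and PathNormalizer are hypothetical names chosen for illustration; in a real project all of this would be compiled into a single class library, and the Main method is here only so the sketch can run on its own.

```csharp
using System;

// Hypothetical assembly: Common.Settings.dll

// Public: part of the assembly's official contract, visible to every caller.
public class UserSettings
{
    private readonly string path;

    public UserSettings(string path)
    {
        // The shared helper is used here but never exposed to callers.
        this.path = PathNormalizer.Normalize(path);
    }

    public string Path
    {
        get { return path; }
    }
}

// Internal: any type in this assembly can use it, but code in other
// assemblies cannot see it at all.
internal static class PathNormalizer
{
    internal static string Normalize(string path)
    {
        if (path == null)
            throw new ArgumentNullException("path");
        return path.Trim().Replace('\\', '/');
    }
}

public static class Program
{
    public static void Main()
    {
        UserSettings settings = new UserSettings("  config\\user.xml ");
        Console.WriteLine(settings.Path);
    }
}
```

Because PathNormalizer is internal, it can change or disappear in a later version of the assembly without breaking any code that references the assembly from outside.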
Using multiple assemblies also makes a number of different deployment options easier. Consider a three-tiered application, in which part of the application runs as a smart client and part of the application runs on the server. You supply some validation rules on the client so that users get feedback as they enter or edit data. You replicate those rules on the server and combine them with other rules to provide more robust validation. The complete set of business rules is implemented at the server, and only a subset is maintained at each client.
Sure, you could reuse the source code and create different assemblies for the client and server-side business rules, but that would complicate your delivery mechanism. That leaves you with two builds and two installations to perform when you update the rules. Instead, separate the client-side validation from the more robust server-side validation by placing them in different assemblies. You are reusing binary objects, packaged in assemblies, rather than reusing object code or source code by compiling those objects into the multiple assemblies.
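The split described above might look roughly like this. The assembly names Orders.Validation.Client.dll and Orders.Validation.Server.dll, the two rule classes, and the rules themselves are all hypothetical; the sketch assumes the server assembly references the client assembly, so the same client binary is deployed to both tiers and the server-only rules ship separately. Everything is shown in one file only so that it can be compiled and run as written.

```csharp
using System;

// Hypothetical assembly: Orders.Validation.Client.dll
// Deployed to both the client and the server.
public static class ClientOrderRules
{
    // A cheap check that gives the user immediate feedback while editing.
    public static bool IsValid(int quantity)
    {
        return quantity > 0;
    }
}

// Hypothetical assembly: Orders.Validation.Server.dll
// Deployed only to the server; references the client assembly.
public static class ServerOrderRules
{
    // Reuses the client rule as a binary component, then adds a rule
    // that needs data only the server has.
    public static bool IsValid(int quantity, int unitsInStock)
    {
        return ClientOrderRules.IsValid(quantity) && quantity <= unitsInStock;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(ClientOrderRules.IsValid(5));
        Console.WriteLine(ServerOrderRules.IsValid(5, 3));
    }
}
```

With this layout, a change to the client rules means rebuilding and redeploying one assembly, and the server picks up exactly the same binary that the client uses.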
An assembly should contain an organized library of related functionality. That's an easy platitude, but it's much harder to implement in practice. The reality is that you might not know beforehand which classes will be distributed to both the server and client portions of a distributed application. Even more likely, the set of server- and client-side functionality will be somewhat fluid; you'll move features between the two locations. By keeping the assemblies small, you make it easier to redeploy on both client and server. The assembly is a binary building block for your application. That makes it easier to plug a new component into place in a working application. If you are going to err, err on the side of too many small assemblies rather than too few large ones.
I often use Legos as an analogy for assemblies and binary components. You can pull out one Lego and replace it easily; it's a small block. In the same way, you should be able to pull out one assembly and replace it with another assembly that has the same interfaces. The rest of the application should continue as if nothing happened. Follow the Lego analogy a little further. If all your parameters and return values are interfaces, any assembly can be replaced by another that implements the same interfaces (see Item 19).
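Here is one way the Lego idea could be sketched in code. ISettingsStore, InMemorySettingsStore, UpperCaseSettingsStore, and ThemePicker are hypothetical names, and the comments mark which assembly each type is assumed to live in; the sketch is a single file only so that it compiles and runs as written.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical contracts assembly: holds only the interface.
public interface ISettingsStore
{
    string Get(string key, string fallback);
    void Set(string key, string value);
}

// Hypothetical implementation assembly A.
public class InMemorySettingsStore : ISettingsStore
{
    private readonly Dictionary<string, string> values =
        new Dictionary<string, string>();

    public string Get(string key, string fallback)
    {
        string value;
        return values.TryGetValue(key, out value) ? value : fallback;
    }

    public void Set(string key, string value)
    {
        values[key] = value;
    }
}

// Hypothetical implementation assembly B: a drop-in replacement that
// stores every value in upper case.
public class UpperCaseSettingsStore : ISettingsStore
{
    private readonly InMemorySettingsStore inner = new InMemorySettingsStore();

    public string Get(string key, string fallback)
    {
        return inner.Get(key, fallback);
    }

    public void Set(string key, string value)
    {
        inner.Set(key, value.ToUpperInvariant());
    }
}

// Application code: depends only on the interface, so either
// implementation assembly can be swapped in without changing it.
public class ThemePicker
{
    private readonly ISettingsStore store;

    public ThemePicker(ISettingsStore store)
    {
        this.store = store;
    }

    public string CurrentTheme
    {
        get { return store.Get("theme", "default"); }
    }
}

public static class Program
{
    public static void Main()
    {
        ISettingsStore store = new InMemorySettingsStore();
        store.Set("theme", "dark");
        Console.WriteLine(new ThemePicker(store).CurrentTheme);
    }
}
```

ThemePicker never names a concrete store type, which is what lets one implementation assembly be pulled out and another snapped into its place.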
Smaller assemblies also let you amortize the cost of application startup. The larger an assembly is, the more work the CPU does to load the assembly and convert the necessary IL into machine instructions. Only the routines called at startup are JITed, but the entire assembly gets loaded and the CLR creates stubs for every method in the assembly.
Time to take a break and make sure we don't go to extremes. This item is about making sure that you don't create single monolithic programs, but that you build systems of binary, reusable components. You can take this advice too far. Some costs are associated with a large program built on too many small assemblies. You will incur a performance penalty when program flow crosses assembly boundaries. The CLR loader has a little more work to do to load many assemblies and turn IL into machine instructions, particularly resolving function addresses.
Extra security checks also are done across assembly boundaries. All code from the same assembly has the same level of trust (not necessarily the same access rights, but the same trust level). The CLR performs some security checks whenever code flow crosses an assembly boundary. The fewer times your program flow crosses assembly boundaries, the more efficient it will be.
None of these performance concerns should dissuade you from breaking up assemblies that are too large. The performance penalties are minor. C# and .NET were designed with components in mind, and the greater flexibility is usually worth the price.
So how do you decide how much code or how many classes go in one assembly? More important, how do you decide which code goes in an assembly? It depends greatly on the specific application, so there is not one answer. Here's my recommendation: Start by looking at all your public classes. Combine public classes with common base classes into assemblies. Then add the utility classes necessary to provide all the functionality associated with the public classes in that same assembly. Package related public interfaces into their own assemblies. As a final step, look for classes that are used horizontally across your application. Those are candidates for a broad-based utility assembly that contains your application's utility library.
The end result is that you create a component with a single related set of public classes and the utility classes necessary to support it. You create an assembly that is small enough to get the benefits of easy updates and easier reuse, while still minimizing the costs associated with multiple assemblies. Well-designed, cohesive components can be described in one simple sentence. For example, "Common.Storage.dll manages the offline data cache and all user settings" describes a component with low cohesion. Instead, make two components: "Common.Data.dll manages the offline data cache. Common.Settings.dll manages user settings." When you've split those up, you might need a third component: "Common.EncryptedStorage.dll manages file system IO for encrypted local storage." You can update any of those three components independently.
Small is a relative term. Mscorlib.dll is roughly 2MB; System.Web.RegularExpressions.dll is merely 56KB. But both satisfy the core design goal of a small, reusable assembly: They contain a related set of classes and interfaces. The difference in absolute size has to do with the difference in functionality: mscorlib.dll contains all the low-level classes you need in every application. System.Web.RegularExpressions.dll is very specific; it contains only those classes needed to support regular expressions in Web controls. You will create both kinds of components: small, focused assemblies for one specific feature and larger, broad-based assemblies that contain common functionality. In either case, make them as small as what's reasonable, but not smaller.