4.4 NON-TECHNICAL OBSTACLES
Why then is reuse not more common?
Most of the serious impediments to reuse are technical; removing them will be the subject of the following sections of this chapter (and of much of the rest of this book). But of course there are also some organizational, economic and political obstacles.
The NIH syndrome
An often quoted psychological obstacle to reuse is the famous Not Invented Here (“NIH”) syndrome. Software developers, it is said, are individualists, who prefer to redo everything by themselves rather than rely on someone else’s work.
This contention (commonly heard in managerial circles) is not borne out by experience. Software developers do not like useless work more than anyone else. When a good, well-publicized and easily accessible reusable solution is available, it gets reused.
Consider the typical case of lexical and syntactic analysis. Using parser generators such as the Lex-Yacc combination, it is much easier to produce a parser for a command language or a simple programming language than if you must program it from scratch. The result is clear: where such tools are available, competent software developers routinely reuse them. Writing your own tailor-made parser still makes sense in some cases, since the tools mentioned have their limitations. But the developers’ reaction is usually to go by default to one of these tools; it is when you want to use a solution not based on the reusable mechanisms that you have to argue for it. This may in fact cause a new syndrome, the reverse of NIH, which we may call HIN (Habit Inhibiting Novelty): a useful but limited reusable solution, so entrenched that it narrows the developers’ outlook and stifles innovation, becomes counter-productive. Try to convince some Unix developers to use a parser generator other than Yacc, and you may encounter HIN first-hand.
Something which may externally look like NIH does exist, but often it is simply the developers’ understandably cautious reaction to new and unknown components. They may fear that bugs or other problems will be more difficult to correct than with a solution over which they have full control. Often such fears are justified by unfortunate earlier attempts at reusing components, especially if they followed from a management mandate to reuse at all costs, not accompanied by proper quality checks. If the new components are of good quality and provide a real service, fears will soon disappear.
What this means for the producer of reusable components is that quality is even more important here than for more ordinary forms of software. If the cost of a non-reusable, one-of-a-kind solution is N, the cost R of a solution relying on reusable components is never zero: there is a learning cost, at least the first time; developers may have to bend their software to accommodate the components; and they must write some interfacing software, however small, to call them. So even if the reusability savings

\[ r = \frac{N - R}{N} \]

and other benefits of reuse are potentially great, you must also convince the candidate reusers that the reusable solution’s quality is good enough to justify relinquishing control.
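As a small worked illustration of the savings ratio, the sketch below assumes it is defined as r = (N - R) / N, with N the cost of the one-of-a-kind solution and R the cost of the solution built on reusable components. The function name and the figures are hypothetical, chosen only to make the arithmetic concrete.

```python
def reuse_savings(n: float, r: float) -> float:
    """Return the savings ratio (N - R) / N.

    n: cost of the one-of-a-kind solution (N in the text).
    r: cost of the solution relying on reusable components (R in the text).
    """
    if n <= 0:
        raise ValueError("N must be positive")
    return (n - r) / n


# Hypothetical figures: a 100-day custom build versus 30 days spent learning,
# adapting to and interfacing with a reusable component.
savings = reuse_savings(100, 30)
```

Because R is never zero, the ratio stays below 1; it drops to zero or below when learning and interfacing cost as much as building from scratch.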
This explains why it is a mistake to target a company’s reusability policy to the potential reusers (the consumers, that is to say the application developers). Instead you should put the heat on the producers, including people in charge of acquiring external components, to ensure the quality and usefulness of their offering. Preaching reuse to application developers, as some companies do by way of reusability policy, is futile: because application developers are ultimately judged by how effectively they produce their applications, they should and will reuse not because you tell them to but because you have done a good enough job with the reusable components (developed or acquired) that it will be profitable for their applications to rely on these components.
The economics of procurement
A potential obstacle to reuse comes from the procurement policy of many large corporations and government organizations, which tends to impede reusability efforts by focusing on short-term costs. US regulations, for example, make it hard for a government agency to pay a contractor for work that was not explicitly commissioned (normally as part of a Request For Proposals). Such rules come from a legitimate concern to protect taxpayers or shareholders, but can also discourage software builders from applying the crucial effort of generalization to transform good software into reusable components.
On closer examination this obstacle does not look so insurmountable. As the concern for reusability spreads, there is nothing to prevent the commissioning agency from including in the RFP itself the requirement that the solution must be general-purpose and reusable, and the description of how candidate solutions will be evaluated against these criteria. Then the software developers can devote the proper attention to the generalization task and be paid for it.
Software companies and their strategies
Even if customers play their part in removing obstacles to reuse, a potential problem remains on the side of the contractors themselves. For a software company, there is a constant temptation to provide solutions that are purposely not reusable, for fear of not getting the next job from the customer — because if the result of the current job is too widely applicable the customer may not need a next job!
I once heard a remarkably candid exposé of this view after giving a talk on reuse and object technology. A high-level executive from a major software house came to tell me that, although intellectually he admired the ideas, he would never implement them in his own company, because that would be killing the goose that laid the golden egg: more than 90% of the company’s business derived from renting manpower — providing analysts and programmers on assignment to customers — and the management’s objective was to bring the figure to 100%. With such an outlook on software engineering, one is not likely to greet with enthusiasm the prospect of widely available libraries of reusable components.
The comment was notable for its frankness, but it triggered the obvious retort: if it is at all possible to build reusable components to replace some of the expensive services of a software house’s consultants, sooner or later someone will build them. At that time a company that has refused to take this route, and is left with nothing to sell but its consultants’ services, may feel sorry for having kept its head buried in the sand.
It is hard not to think here of the many engineering disciplines that used to be heavily labor-intensive but became industrialized, that is to say tool-based — with painful economic consequences for companies and countries that did not understand early enough what was happening. To a certain extent, object technology is bringing a similar change to the software trade. The choice between people and tools need not, however, be an exclusive one. The engineering part of software engineering is not identical to that of mass-production industries; humans will likely continue to play the key role in the software construction process. The aim of reuse is not to replace humans by tools (which is often, in spite of all claims, what has happened in other disciplines) but to change the distribution of what we entrust to humans and to tools. So the news is not all bad for a software company that has made its name through its consultants. In particular:
• In many cases developers using sophisticated reusable components may still benefit from the help of experts, who can advise them on how best to use the components. This leaves a meaningful role for software houses and their consultants.
• As will be discussed below, reusability is inseparable from extendibility: good reusable components will still be open for adaptation to specific cases. Consultants from a company that developed a library are in an ideal position to perform such tuning for individual customers. So selling components and selling services are not necessarily exclusive activities; a components business can serve as a basis for a service business.
• More generally, a good reusable library can play a strategic role in the policy of a successful software company, even if the company sells specific solutions rather than the library itself, and uses the library for internal purposes only. If the library covers the most common needs and provides an extendible basis for the more advanced cases, it can enable the company to gain a competitive edge in certain application areas by developing tailored solutions to customers’ needs, faster and at lower cost than competitors who cannot rely on such a ready-made basis.
Another argument used to justify skepticism about reuse is the difficulty of the component management task: progress in the production of reusable software, it is said, would result in developers being swamped by so many components as to make their life worse than if the components were not available.
Cast in a more positive style, this comment should be understood as a warning to developers of reusable software that the best reusable components in the world are useless if nobody knows they exist, or if it takes too much time and effort to obtain them. The practical success of reusability techniques requires the development of adequate databases of components, which interested developers may search by appropriate keywords to find out quickly whether some existing component satisfies a particular need. Network services must also be available, allowing electronic ordering and immediate downloading of selected components.
These goals do raise technical and organizational problems. But we must keep things in proportion. Indexing, retrieving and delivering reusable components are engineering issues, to which we can apply known tools, in particular database technology; there is no reason why software components should be more difficult to manage than customer records, flight information or library books.
Reusability discussions used to delve forever into the grave question “how in the world are we going to make the components available to developers?”. After the advances in networking of the past few years, such debates no longer appear so momentous. With the World-Wide Web, in particular, have appeared powerful search tools (AltaVista, Yahoo…) which have made it far easier to locate useful information, either on the Internet or on a company’s Intranet. Even more advanced solutions (produced, one may expect, with the help of object technology) will undoubtedly follow. All this makes it increasingly clear that the really hard part of progress in reusability lies not in organizing reusable components, but in building the wretched things in the first place.
A note about component indexing
On the matter of indexing and retrieving components, a question presents itself, at the borderline between technical and organizational issues: how should we associate indexing information, such as keywords, with software components?
The Self-Documentation principle suggests that, as much as possible, information about a module — indexing information as well as other forms of module documentation — should appear in the module itself rather than externally. This leads to an important requirement on the notation that will be developed in part C of this book to write software components, called classes. Regardless of the exact form of these classes, we must equip ourselves with a mechanism to attach indexing information to each component.
The syntax is straightforward. At the beginning of a module text, you will be invited to write an indexing clause of the form
    index_word1: value, value, value …
    index_word2: value, value, value …
    … Normal module definition (see part C) …
Each index_word is an identifier; each value is a constant (integer, real etc.), an identifier, or some other basic lexical element.
There is no particular constraint on index words and values, but an industry, a standards group, an organization or a project may wish to define their own conventions. Indexing and retrieval tools can then extract this information to help software developers find components satisfying certain criteria.
As we saw in the discussion of Self-Documentation, storing such information in the module itself — rather than in an outside document or database — decreases the likelihood of including wrong information, and in particular of forgetting to update the information when updating the module (or conversely). Indexing clauses, modest as they may seem, play a major role in helping developers keep their software organized and register its properties so that others can find out about it.
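To make the idea of an indexing and retrieval tool more concrete, here is a minimal sketch in Python. It assumes that a module text begins with lines of the form `index_word: value, value` and that the clause ends at the first line not of that form. The function names, the sample module texts and the index words are all hypothetical; nothing here reflects an actual tool.

```python
from __future__ import annotations


def parse_indexing(module_text: str) -> dict[str, list[str]]:
    """Collect the leading 'index_word: value, value' lines of a module text."""
    index: dict[str, list[str]] = {}
    for line in module_text.splitlines():
        line = line.strip()
        if not line:
            continue
        word, sep, values = line.partition(":")
        # Stop at the first line that is not of the form 'identifier: values'.
        if not sep or not word.strip().isidentifier():
            break
        index[word.strip()] = [v.strip() for v in values.split(",") if v.strip()]
    return index


def find_components(library: dict[str, str], word: str, value: str) -> list[str]:
    """Names of the modules whose indexing clause lists value under word."""
    return sorted(
        name for name, text in library.items()
        if value in parse_indexing(text).get(word, [])
    )


# Hypothetical module texts; the index words and values are made up.
library = {
    "HASH_TABLE": "description: table, hashing\nsize: medium\nclass HASH_TABLE ...",
    "LINKED_LIST": "description: list, sequential\nclass LINKED_LIST ...",
}
matches = find_components(library, "description", "table")
```

Since the tool reads the indexing information from the module text itself, updating the module and updating its index entry are a single operation.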
Formats for reusable component distribution
Another question straddling the technical-organizational line is the form under which we should distribute reusable components: source or binary? This is a touchy issue, so we will limit ourselves to examining a few of the arguments on both sides.
For a professional, for-profit software developer, it often seems desirable to provide buyers of reusable components with an interface description (the short form discussed in a later chapter) and the binary code for their platform of choice, but not the source form. This protects the developer’s investment and trade secrets.
Binary is indeed the preferred form of distribution for commercial application programs, operating systems and other tools, including compilers, interpreters and development environments for object-oriented languages. In spite of recurring attacks on the very idea, emanating in particular from an advocacy group called the League for Programming Freedom, this mode of commercial software distribution is unlikely to recede much in the near future. But the present discussion is not about ordinary tools or application programs: it is about libraries of reusable software components. In that case one can also find some arguments in favor of source distribution.
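[Note] The League for Programming Freedom is a grassroots organization of professors, students, businesspeople, programmers, users and software companies, dedicated to regaining the freedom to write programs. The League does not oppose the legal system that Congress intended, namely copyright on individual programs. Its goal is to reverse the changes (regarding software copyright) brought about by court decisions made in response to special interests.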
For the component producer, an advantage of source distribution is that it eases porting efforts. You stay away from the tedious and unrewarding task of adapting software to the many incompatible platforms that exist in today’s computer world, relying instead on the developers of object-oriented compilers and environments to do the job for you. (For the consumer this is of course a counter-argument, as installation from source will require more work and may cause unforeseen errors.)
Some compilers for object-oriented languages may let you retain some of the portability benefit without committing to full source availability: if the compiler uses C as intermediate generated code, as is often the case today, you can usually substitute portable C code for binary code. It is then not difficult to devise a tool that obscures the C form, making it almost as difficult to reverse-engineer as a binary form.
Also note that at various stages in the history of software, dating back to UNCOL (UNiversal Computer Oriented Language) in the late fifties, people have been defining low-level instruction formats that could be interpreted on any platform, and hence could provide a portable target for compilers. The ACE consortium of hardware and software companies was formed in 1988 for that purpose. Together with the Java language has come the notion of Java bytecode, for which interpreters are being developed on a number of platforms. But for the component producer such efforts at first represent more work, not less: until you have the double guarantee that the new format is available on every platform of interest and that it executes target code as fast as platform-specific solutions, you cannot forsake the old technology, and must simply add the new target code format to those you already support. So a solution that is advertised as an end-all to all portability problems actually creates, in the short term, more portability problems.
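[Note] In the 1960s the computing community entertained the idea of a Universal Computer Oriented Language (UNCOL), which would achieve uniform compilation through a virtual machine: every language would be compiled to UNCOL, and UNCOL would then be implemented on each local machine. The idea never caught on. Today compilers are generally still developed for a specific language targeting a specific machine and operating system.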
Perhaps more significant, as an argument for source code distribution, is the observation that attempts to protect invention and trade secrets by removing the source form of the implementation may be of limited benefit anyway. Much of the hard work in the construction of a good reusable library lies not in the implementation but in the design of the components’ interfaces; and that is the part that you are bound to release anyway. This is particularly clear in the world of data structures and algorithms, where most of the necessary techniques are available in the computing science literature. To design a successful library, you must embed these techniques in modules whose interface will make them useful to the developers of many different applications. This interface design is part of what you must release to the world.
Also note that, in the case of object-oriented modules, there are two forms of component reuse: as a client or, as studied in later chapters, through inheritance. The second form combines reuse with adaptation. Interface descriptions (short forms) are sufficient for client reuse, but not always for inheritance reuse.
Finally, the educational side: distributing the source of library modules is a good way to provide models of the producer’s best engineering, useful to encourage consumers to develop their own software in a consistent style. We saw earlier that the resulting standardization is one of the benefits of reusability. Some of it will remain even if client developers only have access to the interfaces; but nothing beats having the full text.
Be sure to note that even if source is available it should not serve as the primary documentation tool: for that role, we continue to use the module interface.
This discussion has touched on some delicate economic issues, which condition in part the advent of an industry of software components and, more generally, the progress of the software field. How do we provide developers with a fair reward for their efforts and an acceptable degree of protection for their inventions, without hampering the legitimate interests of users? Here are two opposite views:
• At one end of the spectrum you will find the positions of the League for Programming Freedom: all software should be free and available in source form.
• At the other end you have the idea of superdistribution, advocated by Brad Cox in several articles and a book. Superdistribution would allow users to duplicate software freely, charging them not for the purchase but instead for each use. Imagine a little counter attached to each software component, which rings up a few pennies every time you make use of the component, and sends you a bill at the end of the month. This seems to preclude distribution in source form, since it would be too easy to remove the counting instructions. Although JEIDA, a Japanese consortium of electronics companies, is said to be working on hardware and software mechanisms to support the concept, and although Cox has recently been emphasizing enforcement mechanisms built on regulations (like copyright) rather than technological devices, superdistribution still raises many technical, logistic, economic and psychological questions.
Any comprehensive approach to reusability must, along with the technical aspects, deal with the organizational and economical issues: making reusability part of the software development culture, finding the right cost structure and the right format for component distribution, providing the appropriate tools for indexing and retrieving components. Not surprisingly, these issues have been the focus of some of the main reusability initiatives from governments and large corporations, such as the STARS program of the US Department of Defense (Software Technology for Adaptable, Reliable Systems) and the “software factories” installed by some large Japanese companies.
Important as these questions are in the long term, they should not distract our attention from the main roadblocks, which are still technical. Success in reuse requires the right modular structures and the construction of quality libraries containing the tens of thousands of components that the industry needs.
The rest of this chapter concentrates on the first of these questions; it examines why common notions of module are not appropriate for large-scale reusability, and defines the requirements that a better solution — developed in the following chapters — must satisfy.
4.5 THE TECHNICAL PROBLEM
What should a reusable module look like?
Change and constancy
Software development, it was mentioned above, involves much repetition. To understand the technical difficulties of reusability we must understand the nature of that repetition.
Such an analysis reveals that although programmers do tend to do the same kinds of things time and time again, these are not exactly the same things. If they were, the solution would be easy, at least on paper; but in practice so many details may change as to defeat any simple-minded attempt at capturing the commonality.
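A telling analogy is provided by the works of the Norwegian painter Edvard Munch, the majority of which may be seen in the museum dedicated to him in Oslo, the birthplace of Simula. Munch was obsessed with a small number of profound, essential themes: love, anguish, jealousy, dance, death… He drew and painted them endlessly, using the same pattern each time, but continually changing the technical medium, the colors, the emphasis, the size, the light, the mood.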
Such is the software engineer’s plight: time and again composing a new variation that elaborates on the same basic themes.
Take the example mentioned at the beginning of this chapter: table searching. True, the general form of a table searching algorithm is going to look similar each time: start at some position in the table t; then begin exploring the table from that position, each time checking whether the element found at the current position is the one being sought, and, if not, moving to another position. The process terminates when it has either found the element or probed all the candidate positions unsuccessfully. Such a general pattern is applicable to many possible cases of data representation and algorithms for table searching, including arrays (sorted or not), linked lists (sorted or not), sequential files, binary trees, B-trees and hash tables of various kinds.
It is not difficult to turn this informal description into an incompletely refined routine:
    has (t: TABLE, x: ELEMENT): BOOLEAN is
            -- Is there an occurrence of x in t?
        local
            pos: POSITION
        do
            from pos := INITIAL_POSITION (x, t)
            until EXHAUSTED (pos, t) or else FOUND (pos, x, t)
            loop pos := NEXT (pos, x, t)
            end
            Result := not EXHAUSTED (pos, t)
        end
(A few clarifications on the notation: from … until … loop … end describes a loop, initialized in the from clause, executing the loop clause zero or more times, and terminating as soon as the condition in the until clause is satisfied. Result denotes the value to be returned by the function. If you are not familiar with the or else operator, just accept it as if it were a boolean or.)
Although the above text describes (through its lower-case elements) a general pattern of algorithmic behavior, it is not a directly executable routine since it contains (in upper case) some incompletely refined parts, corresponding to aspects of the table searching problem that depend on the implementation chosen: the type of table elements (ELEMENT), what position to examine first (INITIAL_POSITION), how to go from a candidate position to the next (NEXT), how to test for the presence of an element at a certain position (FOUND), how to determine that all interesting positions have been examined (EXHAUSTED).
Rather than a routine, then, the above text is a routine pattern, which you can only turn into an actual routine by supplying refinements for the upper-case parts.
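One way to see how supplying refinements turns the pattern into actual routines is the following Python sketch. It is only an illustration under stated assumptions: the `Refinement` record, its field names and the two sample refinements (`ARRAY`, `LINKED`) are hypothetical, and Python's short-circuit `or` stands in for `or else`.

```python
from typing import Any, Callable, NamedTuple


class Refinement(NamedTuple):
    """One choice for the upper-case parts of the has pattern."""
    initial_position: Callable[[Any, Any], Any]   # (x, t) -> pos
    exhausted: Callable[[Any, Any], bool]         # (pos, t) -> bool
    found: Callable[[Any, Any, Any], bool]        # (pos, x, t) -> bool
    next_position: Callable[[Any, Any, Any], Any] # (pos, x, t) -> pos


def has(t: Any, x: Any, r: Refinement) -> bool:
    """Is there an occurrence of x in t?"""
    pos = r.initial_position(x, t)
    # 'or' short-circuits, so found is never asked about an exhausted position.
    while not (r.exhausted(pos, t) or r.found(pos, x, t)):
        pos = r.next_position(pos, x, t)
    return not r.exhausted(pos, t)


# Refinement for an unsorted Python list: positions are integer indices.
ARRAY = Refinement(
    initial_position=lambda x, t: 0,
    exhausted=lambda pos, t: pos >= len(t),
    found=lambda pos, x, t: t[pos] == x,
    next_position=lambda pos, x, t: pos + 1,
)

# Refinement for a linked list of (item, rest) pairs ending in None:
# a position is the remaining chain itself.
LINKED = Refinement(
    initial_position=lambda x, t: t,
    exhausted=lambda pos, t: pos is None,
    found=lambda pos, x, t: pos[0] == x,
    next_position=lambda pos, x, t: pos[1],
)
```

With these definitions, `has([3, 1, 4], 4, ARRAY)` and `has((1, (2, None)), 2, LINKED)` run the same loop over two different representations. Note that the client must still pass the refinement explicitly, which is one of the limitations the rest of the chapter sets out to remove.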
The reuse-redo dilemma
All this variation highlights the problems raised by any attempt to come up with general-purpose modules in a given application area: how can we take advantage of the common pattern while accommodating the need for so much variation? This is not just an implementation problem: it is almost as hard to specify the module so that client modules can rely on it without knowing its implementation.
These observations point to the central problem of software reusability, which dooms simplistic approaches. Because of the versatility of software — its very softness — candidate reusable modules will not suffice if they are inflexible.
A frozen module forces you into the reuse or redo dilemma: reuse the module exactly as it is, or redo the job completely. This is often too limiting. In a typical situation, you discover a module that may provide you with a solution for some part of your current job, but not necessarily the exact solution. Your specific needs may require some adaptation of the module’s original behavior. So what you will want to do in such a case is to reuse and redo: reuse some, redo some — or, you hope, reuse a lot and redo a little. Without this ability to combine reuse and adaptation, reusability techniques cannot provide a solution that satisfies the realities of practical software development.
So it is not by accident that almost every discussion of reusability in this book also considers extendibility (leading to the definition of the term “modularity”, which covers both notions and provided the topic of the previous chapter). Whenever you start looking for answers to one of these quality requirements, you quickly encounter the other.
This duality between reuse and adaptation was also present in the earlier discussion of the Open-Closed principle, which pointed out that a successful software component must be usable as it stands (closed) while still adaptable (open).
The search for the right notion of module, which occupies the rest of this chapter and the next few, may be characterized as a constant attempt to reconcile reusability and extendibility, closure and openness, constancy and change, satisfying today’s needs and trying to guess what tomorrow holds in store.
4.6 FIVE REQUIREMENTS ON MODULE STRUCTURES
How do we find module structures that will yield directly reusable components while preserving the possibility of adaptation?
The table searching issue and the has routine pattern obtained for it on the previous page illustrate the stringent requirements that any solution will have to meet. We can use this example to analyze what it takes to go from a relatively vague recognition of commonality between software variants to an actual set of reusable modules. Such a study will reveal five general issues:
• Type Variation.
• Routine Grouping.
• Implementation Variation.
• Representation Independence.
• Factoring Out Common Behaviors.
Type Variation
The has routine pattern assumes a table containing objects of a type ELEMENT. A particular refinement might use a specific type, such as INTEGER or BANK_ACCOUNT, to apply the pattern to a table of integers or bank accounts.
But this is not satisfactory. A reusable searching module should be applicable to many different types of element, without requiring reusers to perform manual changes to the software text. In other words, we need a facility for describing type-parameterized modules, also known more concisely as generic modules. Genericity (the ability for modules to be generic) will turn out to be an important part of the object-oriented method; an overview of the idea appears later in this chapter.
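As a rough illustration of a type-parameterized module, here is a sketch using Python's `typing.Generic`. The class name `Table` and its two operations are hypothetical, and Python's type hints are verified by external checkers rather than at run time, so this only approximates what a statically typed notation would guarantee.

```python
from typing import Generic, List, TypeVar

G = TypeVar("G")  # formal generic parameter, standing for the element type


class Table(Generic[G]):
    """A minimal table of elements of type G (hypothetical name)."""

    def __init__(self) -> None:
        self._items: List[G] = []

    def put(self, x: G) -> None:
        """Insert x into the table."""
        self._items.append(x)

    def has(self, x: G) -> bool:
        """Is there an occurrence of x in the table?"""
        return any(item == x for item in self._items)


# The same module text serves integers and strings with no manual change.
numbers: Table[int] = Table()
numbers.put(42)
names: Table[str] = Table()
names.put("Oslo")
```

The element type appears only as the parameter `G`, so a reuser picks `Table[int]` or `Table[str]` instead of editing the module.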
Routine Grouping
Even if it had been completely refined and parameterized by types, the has routine pattern would not be quite satisfactory as a reusable component. How you search a table depends on how it was created, how elements are inserted, how they are deleted. So a searching routine is not enough by itself as a unit of reuse. A self-sufficient reusable module would need to include a set of routines, one for each of the operations cited — creation, insertion, deletion, searching.
This idea forms the basis for a form of module, the “package”, found in what may be called the encapsulation languages:
Implementation Variation
The has pattern is very general; there is in practice, as we have seen, a wide variety of applicable data structures and algorithms. Such variety indeed that we cannot expect a single module to take care of all possibilities; it would be enormous. We will need a family of modules to cover all the different implementations.
A general technique for producing and using reusable modules will have to support this notion of module family.
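The notion of a module family, each member grouping the routines for creation, insertion, deletion and searching, can be sketched as follows. This is a hedged illustration rather than the solution the book develops: the names `Table`, `ListTable` and `HashTable` are hypothetical, and the hashed variant simply delegates to Python's built-in `set`.

```python
from abc import ABC, abstractmethod


class Table(ABC):
    """Common interface of the family: insertion, deletion, search."""

    @abstractmethod
    def put(self, x) -> None: ...

    @abstractmethod
    def remove(self, x) -> None: ...

    @abstractmethod
    def has(self, x) -> bool: ...


class ListTable(Table):
    """Sequential implementation: elements kept in insertion order."""

    def __init__(self) -> None:
        self._items = []

    def put(self, x) -> None:
        self._items.append(x)

    def remove(self, x) -> None:
        # Removes one occurrence of x, if any.
        if x in self._items:
            self._items.remove(x)

    def has(self, x) -> bool:
        return x in self._items


class HashTable(Table):
    """Hashed implementation, delegating to Python's built-in set."""

    def __init__(self) -> None:
        self._items = set()

    def put(self, x) -> None:
        self._items.add(x)

    def remove(self, x) -> None:
        self._items.discard(x)

    def has(self, x) -> bool:
        return x in self._items
```

Each member keeps its search routine next to the insertion and deletion routines it depends on, and new members can join the family without any change to the existing ones.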
Representation Independence
A general form of reusable module should enable clients to specify an operation without knowing how it is implemented. This requirement is called Representation Independence. Assume that a client module C from a certain application system — an asset management program, a compiler, a geographical information system… — needs to determine whether a certain element x appears in a certain table t (of investments, of language keywords, of cities). Representation independence means here the ability for C to obtain this information through a call such as
present := has (t, x)
without knowing what kind of table t is at the time of the call. C’s author should only need to know that t is a table of elements of a certain type, and that x denotes an object of that type. Whether t is a binary search tree, a hash table or a linked list is irrelevant for him; he should be able to limit his concerns to asset management, compilation or geography. Selecting the appropriate search algorithm based on t’s implementation is the business of the table management module, and of no one else.
This requirement does not preclude letting clients choose a specific implementation when they create a data structure. But only one client will have to make this initial choice; after that, none of the clients that perform searches on t should ever have to ask what exact kind of table it is. In particular, the client C containing the above call may have received t from one of its own clients (as an argument to a routine call); then for C the name t is just an abstract handle on a data structure whose details it may not be able to access.
You may view Representation Independence as an extension of the rule of Information Hiding, essential for smooth development of large systems: implementation decisions will often change, and clients should be protected. But Representation Independence goes further. Taken to its full consequences, it means protecting a module’s clients against changes not only during the project lifecycle but also during execution — a much smaller time frame! In the example, we want has to adapt itself automatically to the run-time form of table t, even if that form has changed since the last call.
Satisfying Representation Independence will also help us towards a related principle encountered in the discussion of modularity: Single Choice, which directed us to stay away from multi-branch control structures that discriminate among many variants, as in
    if “t is an array managed by open hashing” then
        “Apply open hashing search algorithm”
    elseif “t is a binary search tree” then
        “Apply binary search tree traversal”
    elseif
        (etc.)
    end
It would be equally unpleasant to have such a decision structure in the module itself (we cannot reasonably expect a table management module to know about all present and future variants) as to replicate it in every client. The solution is to hide the multi-branch choice completely from software developers, and have it performed automatically by the underlying run-time system. This will be the role of dynamic binding, a key component of the object-oriented approach, to be studied in the discussion of inheritance.
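The contrast can be sketched as follows, again in Python and purely as an illustration; dynamic binding itself is studied later with inheritance. All names here (Table, HashedTable, SortedTable, SequentialTable, has_by_branching) are hypothetical, and the set and sorted list merely stand in for real hashing and tree structures. The branching routine must enumerate the variants it knows and fails on a variant added later, whereas the call t.has (x) is resolved at run time from the actual form of t, even when that form changes between calls.

```python
from abc import ABC, abstractmethod
from bisect import bisect_left


class Table(ABC):
    """Any table variant; `has` is selected at run time by dynamic binding."""

    @abstractmethod
    def has(self, x) -> bool:
        """Is there an occurrence of x in this table?"""


class HashedTable(Table):
    """Hypothetical variant standing in for a hash-managed table."""

    def __init__(self, items):
        self.items = set(items)

    def has(self, x) -> bool:
        return x in self.items


class SortedTable(Table):
    """Hypothetical variant standing in for an ordered, tree-like table."""

    def __init__(self, items):
        self.items = sorted(items)

    def has(self, x) -> bool:
        i = bisect_left(self.items, x)
        return i < len(self.items) and self.items[i] == x


class SequentialTable(Table):
    """A variant added later, unknown to has_by_branching below."""

    def __init__(self, items):
        self.items = list(items)

    def has(self, x) -> bool:
        return any(item == x for item in self.items)


def has_by_branching(t, x) -> bool:
    """The multi-branch form to avoid: it must list every variant it knows."""
    if isinstance(t, HashedTable):
        return x in t.items
    elif isinstance(t, SortedTable):
        i = bisect_left(t.items, x)
        return i < len(t.items) and t.items[i] == x
    raise TypeError(f"unknown table variant: {type(t).__name__}")


# The same name t may denote different run-time forms between two calls.
t = HashedTable(["Paris", "Zurich"])
first = t.has("Zurich")
t = SequentialTable(["Paris", "Zurich"])
second = t.has("Zurich")
```

Both calls to t.has succeed without any change to the calling code, while has_by_branching would need a new branch (here it raises TypeError) for every variant added after it was written.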
Factoring Out Common Behaviors
If Representation Independence reflects the client’s view of reusability (the ability to ignore internal implementation details and variants), the last requirement, Factoring Out Common Behaviors, reflects the view of the supplier and, more generally, the view of developers of reusable classes. Their goal will be to take advantage of any commonality that may exist within a family or sub-family of implementations.
The variety of implementations available in certain problem areas will usually demand, as noted, a solution based on a family of modules. Often the family is so large that it is natural to look for sub-families. In the table searching case a first attempt at classification might yield three broad sub-families:
• Tables managed by some form of hash-coding scheme.
• Tables organized as trees of some kind.
• Tables managed sequentially.
Each of these categories covers many variants, but it is usually possible to find significant commonality between these variants. Consider for example the family of sequential implementations — those in which items are kept and searched in the order of their original insertion.
Possible representations for a sequential table include an array, a linked list and a file. But regardless of these differences, clients should be able, for any sequentially managed table, to examine the elements in sequence by moving a (fictitious) cursor indicating the position of the currently examined element. In this approach we may rewrite the searching routine for sequential tables as:
    has (t: SEQUENTIAL_TABLE; x: ELEMENT): BOOLEAN is
            -- Is there an occurrence of x in t?
        do
            from start until
                after or else found (x)
            loop
                forth
            end
            Result := not after
        end
This form relies on four routines which any sequential table implementation will be able to provide:
• start, a command to move the cursor to the first element if any.
• forth, a command to advance the cursor by one position. (Support for forth is of course one of the prime characteristics of a sequential table implementation.)
• after, a boolean-valued query to determine if the cursor has moved past the last element; this will be true after a start if the table was empty.
• found (x), a boolean-valued query to determine if the element at cursor position has value x.
At first sight, the routine text for has given above resembles the general routine pattern used at the beginning of this discussion, which covered searching in any table (not just sequential). But the new form is not a routine pattern any more; it is a true routine, expressed in a directly executable notation (the notation used to illustrate object-oriented concepts in part C of this book). Given appropriate implementations for the four operations start, forth, after and found which it calls, you can compile and execute the latest form of has.
For each possible sequential table representation you will need a representation for the cursor. Three example representations are by an array, a linked list and a file.
The first uses an array of capacity items, the table occupying positions 1 to count. Then you may represent the cursor simply as an integer index ranging from 1 to count + 1. (The last value is needed to represent a cursor that has moved “after” the last item.)
The second representation uses a linked list, where the first cell is accessible through a reference first_cell and each cell is linked to the next one through a reference right. Then you may represent the cursor as a reference cursor.
The third representation uses a sequential file, in which the cursor simply represents the current reading position.
The implementation of the four low-level operations start, forth, after and found will be different for each variant. The following table gives the implementation in each case. (The notation t @ i denotes the i-th element of array t, which would be written t [i] in Pascal or C; Void denotes a void reference; the Pascal notation f↑, for a file f, denotes the element at the current file reading position.)
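To make the three cursor representations concrete, here is one possible reading of them as a Python sketch. It is an illustration under stated assumptions, not a reproduction of the author’s table: the class names, the Cell helper, the count_items routine and the one-element-per-line file layout are all hypothetical. The names i, count, first_cell, right and cursor follow the descriptions above.

```python
import io


class ArrayCursorTable:
    """Array variant: the cursor is an integer index i from 1 to count + 1."""

    def __init__(self, items):
        self.t = list(items)
        self.count = len(self.t)
        self.i = 1

    def start(self):
        self.i = 1

    def forth(self):
        self.i += 1

    def after(self):
        return self.i > self.count

    def found(self, x):
        return self.t[self.i - 1] == x  # Python lists are 0-based


class Cell:
    """Hypothetical list cell holding an item and a reference to its right."""

    def __init__(self, item, right=None):
        self.item = item
        self.right = right


class LinkedCursorTable:
    """Linked variant: the cursor is a reference to a cell, or None (void)."""

    def __init__(self, items):
        self.first_cell = None
        for item in reversed(list(items)):
            self.first_cell = Cell(item, self.first_cell)
        self.cursor = self.first_cell

    def start(self):
        self.cursor = self.first_cell

    def forth(self):
        self.cursor = self.cursor.right

    def after(self):
        return self.cursor is None

    def found(self, x):
        return self.cursor.item == x


class FileCursorTable:
    """File variant (assumed: one element per line); the cursor is the
    reading position, with the current element buffered in `current`."""

    def __init__(self, f):
        self.f = f
        self.start()

    def start(self):
        self.f.seek(0)
        self.current = self.f.readline()

    def forth(self):
        self.current = self.f.readline()

    def after(self):
        return self.current == ""  # readline returns "" at end of file

    def found(self, x):
        return self.current.rstrip("\n") == x


def count_items(table):
    """Walk the cursor over any variant using only start, forth and after."""
    n = 0
    table.start()
    while not table.after():
        n += 1
        table.forth()
    return n
```

The routine count_items works unchanged on all three classes, for example on FileCursorTable(io.StringIO("a\nb\n")), because it relies only on the shared cursor operations.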
The challenge of reusability here is to avoid unneeded duplication of software by taking advantage of the commonality between variants. If identical or near-identical fragments appear in different modules, it will be difficult to guarantee their integrity and to ensure that changes or corrections get propagated to all the needed places; once again, configuration management problems may follow.
All sequential table variants share the has function, differing only by their implementation of the four lower-level operations. A satisfactory solution to the reusability problem must include the text of has in only one place, somehow associated with the general notion of sequential table independently of any choice of representation. To describe a new variant, you should not have to worry about has any more; all you will need to do is to provide the appropriate versions of start, forth, after and found.
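One way to picture such a solution is the following Python sketch, offered as an assumption-laden illustration rather than as the mechanism this book goes on to develop. The abstract class SequentialTable holds the text of has exactly once; the hypothetical variants ArrayTable and LinkedTable (the latter using nested pairs as cells) supply only start, forth, after and found.

```python
from abc import ABC, abstractmethod


class SequentialTable(ABC):
    """General notion of sequential table, independent of representation."""

    def has(self, x) -> bool:
        """Is there an occurrence of x? Written once, shared by all variants."""
        self.start()
        # Python's `or` short-circuits, like `or else` in the routine above.
        while not (self.after() or self.found(x)):
            self.forth()
        return not self.after()

    @abstractmethod
    def start(self): ...

    @abstractmethod
    def forth(self): ...

    @abstractmethod
    def after(self) -> bool: ...

    @abstractmethod
    def found(self, x) -> bool: ...


class ArrayTable(SequentialTable):
    """Variant with an integer cursor over an array (0-based in Python)."""

    def __init__(self, items):
        self.items = list(items)
        self.i = 0

    def start(self):
        self.i = 0

    def forth(self):
        self.i += 1

    def after(self) -> bool:
        return self.i >= len(self.items)

    def found(self, x) -> bool:
        return self.items[self.i] == x


class LinkedTable(SequentialTable):
    """Variant whose cells are nested pairs (item, right); None is void."""

    def __init__(self, items):
        self.first_cell = None
        for item in reversed(list(items)):
            self.first_cell = (item, self.first_cell)
        self.cursor = self.first_cell

    def start(self):
        self.cursor = self.first_cell

    def forth(self):
        self.cursor = self.cursor[1]

    def after(self) -> bool:
        return self.cursor is None

    def found(self, x) -> bool:
        return self.cursor[0] == x
```

Neither variant mentions has; a correction to the search loop would be made in one place and reach every variant automatically.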