【A Relational Model of Data for Large Shared Data Banks】E. F. Codd

Latest recommended article published on 2024-08-29 11:58:38

ck7233

Tags: databases, data structures and algorithms

Abstract
Future users of large data banks must be protected from having to know how the data is organized in the machine (the internal representation). A prompting service which supplies such information is not a satisfactory solution. Activities of users at terminals and most application programs should remain unaffected when the internal representation of data is changed and even when some aspects of the external representation are changed. Changes in data representation will often be needed as a result of changes in query, update, and report traffic and natural growth in the types of stored information.

Existing noninferential, formatted data systems provide users with tree-structured files or slightly more general network models of the data. In Section 1, inadequacies of these models are discussed. A model based on n-ary relations, a normal form for data base relations, and the concept of a universal data sublanguage are introduced. In Section 2, certain operations on relations (other than logical inference) are discussed and applied to the problems of redundancy and consistency in the user's model.

Key Words and Phrases
data bank, data base, data structure, data organization, hierarchies of data, networks of data, relations, derivability, redundancy, consistency, composition, join, retrieval language, predicate calculus, security, data integrity

1. Relational Model and Normal Form

--------------------------------------------------------------------------------

1.1. Introduction
This paper is concerned with the application of elementary relation theory to systems which provide shared access to large banks of formatted data. Except for a paper by Childs [1], the principal application of relations to data systems has been to deductive question-answering systems. Levien and Maron [2] provide numerous references to work in this area.
In contrast, the problems treated here are those of data independence - the independence of application programs and terminal activities from growth in data types and changes in data representation - and certain kinds of data inconsistency which are expected to become troublesome even in nondeductive systems.

The relational view (or model) of data described in Section 1 appears to be superior in several respects to the graph or network model [3, 4] presently in vogue for noninferential systems. It provides a means of describing data with its natural structure only -- that is, without superimposing any additional structure for machine representation purposes. Accordingly, it provides a basis for a high level data language which will yield maximal independence between programs on the one hand and machine representation and organization of data on the other.

A further advantage of the relational view is that it forms a sound basis for treating derivability, redundancy, and consistency of relations - these are discussed in Section 2. The network model, on the other hand, has spawned a number of confusions, not the least of which is mistaking the derivation of connections for the derivation of relations (see remarks in Section 2 on the "connection trap").

Finally, the relational view permits a clearer evaluation of the scope and logical limitations of present formatted data systems, and also the relative merits (from a logical standpoint) of competing representations of data within a single system. Examples of this clearer perspective are cited in various parts of this paper. Implementations of systems to support the relational model are not discussed.

1.2. Data Dependencies in Present Systems
The provision of data description tables in recently developed information systems represents a major advance toward the goal of data independence [5, 6, 7]. Such tables facilitate changing certain characteristics of the data representation stored in a data bank. However, the variety of data representation characteristics which can be changed without logically impairing some application programs is still quite limited. Further, the model of data with which users interact is still cluttered with representational properties, particularly in regard to the representation of collections of data (as opposed to individual items). Three of the principal kinds of data dependencies which still need to be removed are: ordering dependence, indexing dependence, and access path dependence. In some systems these dependencies are not clearly separable from one another.

1.2.1. Ordering Dependence
Elements of data in a data bank may be stored in a variety of ways, some involving no concern for ordering, some permitting each element to participate in one ordering only, others permitting each element to participate in several orderings. Let us consider those existing systems which either require or permit data elements to be stored in at least one total ordering which is closely associated with the hardware-determined ordering of addresses. For example, the records of a file concerning parts might be stored in ascending order by part serial number. Such systems normally permit application programs to assume that the order of presentation of records from such a file is identical to (or is a subordering of) the stored ordering. Those application programs which take advantage of the stored ordering of a file are likely to fail to operate correctly if for some reason it becomes necessary to replace that ordering by a different one. Similar remarks hold for a stored ordering implemented by means of pointers.
It is unnecessary to single out any system as an example, because all the well-known information systems that are marketed today fail to make a clear distinction between order of presentation on the one hand and stored ordering on the other. Significant implementation problems must be solved to provide this kind of independence.
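
The ordering dependence described above can be illustrated with a minimal Python sketch. The record layout, the sample parts, and both function names are assumptions made for illustration and do not come from the paper. The first lookup relies on the stored ascending order of serial numbers; the second makes no assumption about the order of presentation.

```python
from bisect import bisect_left

# Hypothetical part file: (part serial number, part name) records.
FILE_BY_SERIAL = [(101, "bolt"), (205, "nut"), (309, "washer")]
# The same records after a reorganization that changes the stored ordering.
FILE_REORDERED = [(309, "washer"), (205, "nut"), (101, "bolt")]


def find_assuming_stored_order(records, serial):
    """Order-dependent lookup: binary search is valid only if the records
    are presented in ascending serial-number order."""
    keys = [record[0] for record in records]
    i = bisect_left(keys, serial)
    if i < len(keys) and keys[i] == serial:
        return records[i][1]
    return None


def find_without_order_assumption(records, serial):
    """Order-independent lookup: correct for any order of presentation."""
    for record_serial, name in records:
        if record_serial == serial:
            return name
    return None
```

On the reordered file the order-dependent lookup fails to find part 101 even though the record is present, while the order-independent lookup still returns it. The second version gives up the speed of binary search, which hints at why the paper calls the implementation problems significant.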

1.2.2. Indexing Dependence
In the context of formatted data, an index is usually thought of as a purely performance-oriented component of the data representation. It tends to improve response to queries and updates and, at the same time, slow down response to insertions and deletions. From an informational standpoint, an index is a redundant component of the data representation. If a system uses indices at all and if it is to perform well in an environment with changing patterns of activity on the data bank, an ability to create and destroy indices from time to time will probably be necessary. The question then arises: Can application programs and terminal activities remain invariant as indices come and go?
Present formatted data systems take widely different approaches to indexing. TDMS [7] unconditionally provides indexing on all attributes. The presently released version of IMS [5] provides the user with a choice for each file: a choice between no indexing at all (the hierarchic sequential organization) or indexing on the primary key only (the hierarchic indexed sequential organization). In neither case is the user's application logic dependent on the existence of the unconditionally provided indices. IDS [8], however, permits the file designers to select attributes to be indexed and to incorporate indices into the file structure by means of additional chains. Application programs taking advantage of the performance benefit of these indexing chains must refer to those chains by name. Such programs do not operate correctly if these chains are later removed.
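
A rough Python sketch of the difference between naming an index and merely stating what is wanted follows. The `DataBank` class, its methods, and the chain name are hypothetical and are not modeled on the actual interfaces of TDMS, IMS, or IDS.

```python
class DataBank:
    """Hypothetical data bank holding records as dictionaries."""

    def __init__(self, records):
        self.records = list(records)
        self.indices = {}  # index name -> (attribute, {value: [records]})

    def create_index(self, name, attribute):
        index = {}
        for record in self.records:
            index.setdefault(record[attribute], []).append(record)
        self.indices[name] = (attribute, index)

    def drop_index(self, name):
        del self.indices[name]

    def select(self, attribute, value):
        """Index-independent request: an index is used if one happens to
        exist on the attribute, otherwise the records are scanned."""
        for indexed_attribute, index in self.indices.values():
            if indexed_attribute == attribute:
                return index.get(value, [])
        return [r for r in self.records if r[attribute] == value]


def lookup_by_chain(bank, chain_name, value):
    """Index-dependent request: the program names the index (chain), so it
    raises KeyError once that index has been removed."""
    _, index = bank.indices[chain_name]
    return index.get(value, [])
```

A program written against `select` keeps working as indices come and go, with only its response time changing. A program written against `lookup_by_chain` stops working as soon as the named chain is dropped.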

1.2.3. Access Path Dependence
Many of the existing formatted data systems provide users with tree-structured files or slightly more general network models of the data. Application programs developed to work with these systems tend to be logically impaired if the trees or networks are changed in structure. A simple example follows.
Suppose the data bank contains information about parts and projects. For each part, the part number, part name, part description, quantity-on-hand, and quantity-on-order are recorded. For each project, the project number, project name, and project description are recorded. Whenever a project makes use of a certain part, the quantity of that part committed to the given project is also recorded. Suppose that the system requires the user or file designer to declare or define the data in terms of tree structures. Then, any one of the hierarchical structures may be adopted for the information mentioned above (see Structures 1-5).

Now, consider the problem of printing out the part number, part name, and quantity committed for every part used in the project whose project name is "alpha." The following observations may be made regardless of which available tree-oriented information system is selected to tackle this problem. If a program P is developed for this problem assuming one of the five structures above - that is, P makes no test to determine which structure is in effect - then P will fail on at least three of the remaining structures. More specifically, if P succeeds with structure 5, it will fail with all the others; if P succeeds with structure 3 or 4, it will fail with at least 1, 2, and 5; if P succeeds with 1 or 2, it will fail with at least 3, 4, and 5. The reason is simple in each case. In the absence of a test to determine which structure is in effect, P fails because an attempt is made to execute a reference to a nonexistent file (available systems treat this as an error) or no attempt is made to execute a reference to a file containing needed information. The reader who is not convinced should develop sample programs for this simple problem.

Since, in general, it is not practical to develop application programs which test for all tree structurings permitted by the system, these programs fail when a change in structure becomes necessary.
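
The failure of program P can be sketched in Python. The two nested layouts below are illustrative assumptions (one with parts subordinate to projects, one with projects subordinate to parts) and are not claimed to reproduce the paper's numbered structures; all names and sample values are invented. A flat, relation-style version of the same data is included for contrast.

```python
# Layout A (assumed): parts stored subordinate to projects.
LAYOUT_A = {
    "alpha": {"parts": [
        {"part_no": 1, "part_name": "bolt", "qty_committed": 10},
        {"part_no": 2, "part_name": "nut", "qty_committed": 5},
    ]},
    "beta": {"parts": [
        {"part_no": 2, "part_name": "nut", "qty_committed": 7},
    ]},
}

# Layout B (assumed): projects stored subordinate to parts.
LAYOUT_B = {
    1: {"part_name": "bolt", "projects": [
        {"project_name": "alpha", "qty_committed": 10},
    ]},
    2: {"part_name": "nut", "projects": [
        {"project_name": "alpha", "qty_committed": 5},
        {"project_name": "beta", "qty_committed": 7},
    ]},
}

# The same information as flat sets of tuples, with no access path built in.
PART = {(1, "bolt"), (2, "nut")}                             # (part_no, part_name)
COMMIT = {(1, "alpha", 10), (2, "alpha", 5), (2, "beta", 7)}  # (part_no, project, qty)


def report_via_project_path(bank, project_name):
    """Program P written against layout A: it enters through the project
    and walks down to its parts, without testing which layout is in effect."""
    return [(p["part_no"], p["part_name"], p["qty_committed"])
            for p in bank[project_name]["parts"]]


def report_from_relations(part, commit, project_name):
    """The same report stated over flat relations; no path is assumed."""
    names = dict(part)
    return sorted((part_no, names[part_no], qty)
                  for part_no, project, qty in commit
                  if project == project_name)
```

Run against layout B, `report_via_project_path` raises `KeyError`, because the entry point it expects does not exist there. The flat version depends only on the attribute values, not on which record was made subordinate to which.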

Systems which provide users with a network model of the data run into similar difficulties. In both the tree and network cases, the user (or his program) is required to exploit a collection of user access paths to the data. It does not matter whether these paths are in close correspondence with pointer-defined paths in the stored representation - in IDS the correspondence is extremely simple, in TDMS it is just the opposite. The consequence, regardless of the stored representation, is that terminal activities and programs become dependent on the continued existence of the user access paths.

One solution to this is to adopt the policy that once a user access path is defined it will not be made obsolete until all application programs using that path have become obsolete. Such a policy is not practical, because the number of access paths in the total model for the community of users of a data bank would eventually become excessively large.