What is persistence?

Almost all applications require persistent data. Persistence is one of the fundamental concepts in application development. If an information system didn’t preserve data when it was powered off, the system would be of little practical use. When we talk about persistence in Java, we’re normally talking about storing data in a relational database using SQL. We’ll start by taking a brief look at the technology and how we use it with Java. Armed with that information, we’ll then continue our discussion of persistence and how it’s implemented in object-oriented applications.

1  Relational databases

You, like most other developers, have probably worked with a relational database. Most of us use a relational database every day. Relational technology is a known quantity, and this alone is sufficient reason for many organizations to choose it. But to say only this is to pay less respect than is due. Relational databases are entrenched because they’re an incredibly flexible and robust approach to data management. Due to the complete and consistent theoretical foundation of the relational data model, relational databases can effectively guarantee and protect the integrity of the data, among other desirable characteristics. Some people would even say that the last big invention in computing has been the relational concept for data management as first introduced by E.F. Codd (Codd, 1970) more than three decades ago.

Relational database management systems aren’t specific to Java, nor is a relational database specific to a particular application. This important principle is known as data independence. In other words, and we can’t stress this important fact enough, data lives longer than any application does. Relational technology provides a way of sharing data among different applications, or among different technologies that form parts of the same application (the transactional engine and the reporting engine, for example). Relational technology is a common denominator of many disparate systems and technology platforms. Hence, the relational data model is often the common enterprise-wide representation of business entities. 

Relational database management systems have SQL-based application programming interfaces; hence, we call today’s relational database products SQL database management systems or, when we’re talking about particular systems, SQL databases.

Before we go into more detail about the practical aspects of SQL databases, we have to mention an important issue: Although marketed as relational, a database system providing only an SQL data language interface isn’t really relational and in many ways isn’t even close to the original concept. Naturally, this has led to confusion. SQL practitioners blame the relational data model for shortcomings in the SQL language, and relational data management experts blame the SQL standard for being a weak implementation of the relational model and ideals. Application developers are stuck somewhere in the middle, with the burden to deliver something that works. We’ll highlight some important and significant aspects of this issue throughout the book, but generally we’ll focus on the practical aspects. If you’re interested in more background material, we highly recommend Practical Issues in Database Management: A Reference for the Thinking Practitioner by Fabian Pascal.


2  Understanding SQL

To use Hibernate effectively, a solid understanding of the relational model and SQL is a prerequisite. You need to understand the relational model and topics such as normalization to guarantee the integrity of your data, and you’ll need to use your knowledge of SQL to tune the performance of your Hibernate application.

Hibernate automates many repetitive coding tasks, but your knowledge of persistence technology must extend beyond Hibernate itself if you want to take advantage of the full power of modern SQL databases. Remember that the underlying goal is robust, efficient management of persistent data.

Let’s review some of the SQL terms used in this book. You use SQL as a data definition language (DDL) to create a database schema with CREATE and ALTER statements. After creating tables (and indexes, sequences, and so on), you use SQL as a data manipulation language (DML) to manipulate and retrieve data. The manipulation operations include insertions, updates, and deletions. You retrieve data by executing queries with restrictions, projections, and join operations (including the Cartesian product). For efficient reporting, you use SQL to group, order, and aggregate data as necessary. You can even nest SQL statements inside each other; this technique uses subselects.
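To make these terms concrete, here is a small sketch that lines up one SQL statement per term, held as Java string constants so that it fits with the other Java examples. The ITEM and BID tables are the auction tables mentioned later in this section; every column name and literal value is an assumption made for illustration, and the exact DDL syntax (ALTER in particular) varies between database products.

```java
/** One SQL statement per term; column names and values are illustrative assumptions. */
final class SqlVocabulary {

    // DDL: CREATE and ALTER define and change the schema.
    static final String CREATE_ITEM =
        "CREATE TABLE ITEM (ITEM_ID BIGINT PRIMARY KEY, NAME VARCHAR(255) NOT NULL)";
    static final String CREATE_BID =
        "CREATE TABLE BID (BID_ID BIGINT PRIMARY KEY, "
      + "ITEM_ID BIGINT NOT NULL REFERENCES ITEM (ITEM_ID), "
      + "AMOUNT DECIMAL(10,2) NOT NULL)";
    static final String ALTER_ITEM =
        "ALTER TABLE ITEM ADD COLUMN DESCRIPTION VARCHAR(4000)"; // syntax differs by dialect

    // DML: insertion, update, and deletion.
    static final String INSERT_ITEM =
        "INSERT INTO ITEM (ITEM_ID, NAME) VALUES (1, 'Lamp')";
    static final String UPDATE_ITEM =
        "UPDATE ITEM SET NAME = 'Desk lamp' WHERE ITEM_ID = 1";
    static final String DELETE_BID =
        "DELETE FROM BID WHERE AMOUNT < 1.00";

    // Query: projection (the select list), join (JOIN ... ON), restriction (WHERE).
    static final String QUERY =
        "SELECT i.NAME, b.AMOUNT "
      + "FROM ITEM i JOIN BID b ON b.ITEM_ID = i.ITEM_ID "
      + "WHERE b.AMOUNT > 10.00";

    // Reporting: grouping, aggregation, and ordering.
    static final String REPORT =
        "SELECT i.NAME, COUNT(*) AS BIDS, MAX(b.AMOUNT) AS HIGHEST "
      + "FROM ITEM i JOIN BID b ON b.ITEM_ID = i.ITEM_ID "
      + "GROUP BY i.NAME ORDER BY HIGHEST DESC";

    // Subselect: one query nested inside another.
    static final String SUBSELECT =
        "SELECT NAME FROM ITEM WHERE ITEM_ID IN "
      + "(SELECT ITEM_ID FROM BID WHERE AMOUNT > 100.00)";

    private SqlVocabulary() { }
}
```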

You’ve probably used SQL for many years and are familiar with the basic operations and statements written in this language. Still, we know from our own experience that SQL is sometimes hard to remember, and some terms vary in usage. To understand this book, we must use the same terms and concepts, so we advise you to read appendix A if any of the terms we’ve mentioned are new or unclear.

If you need more details, especially about any performance aspects and how SQL is executed, get a copy of the excellent book SQL Tuning by Dan Tow (Tow, 2003). Also read An Introduction to Database Systems by Chris Date (Date, 2003) for the theory, concepts, and ideals of (relational) database systems. The latter book is an excellent reference (it’s big) for all questions you may possibly have about databases and data management.     

Although the relational database is one part of ORM, the other part, of course, consists of the objects in your Java application that need to be persisted to and loaded from the database using SQL.

3  Using SQL in Java

When you work with an SQL database in a Java application, the Java code issues SQL statements to the database via the Java Database Connectivity (JDBC) API. Whether the SQL was written by hand and embedded in the Java code, or generated on the fly by Java code, you use the JDBC API to bind arguments to prepare query parameters, execute the query, scroll through the query result table, retrieve values from the result set, and so on. These are low-level data access tasks; as application developers, we’re more interested in the business problem that requires this data access. What we’d really like to write is code that saves and retrieves objects—the instances of our classes—to and from the database, relieving us of this low-level drudgery.
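The low-level steps just listed can be sketched with plain JDBC as follows. This is only an illustration under assumptions: the ITEM and BID tables with NAME, ITEM_ID, and AMOUNT columns are hypothetical, as is the ItemQueries class itself, and the caller is assumed to supply an open Connection to a database that has this schema.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hand-written JDBC data access; table and column names are assumptions. */
final class ItemQueries {

    static final String ITEMS_WITH_BID_ABOVE =
        "SELECT DISTINCT i.NAME FROM ITEM i "
      + "JOIN BID b ON b.ITEM_ID = i.ITEM_ID "
      + "WHERE b.AMOUNT > ? ORDER BY i.NAME";

    /** Returns the names of items that have at least one bid above the given amount. */
    static List<String> itemNamesWithBidAbove(Connection connection, BigDecimal amount)
            throws SQLException {
        List<String> names = new ArrayList<>();
        // Prepare the statement, then bind the argument to the '?' placeholder.
        try (PreparedStatement statement = connection.prepareStatement(ITEMS_WITH_BID_ABOVE)) {
            statement.setBigDecimal(1, amount);
            // Execute the query and scroll through the result table row by row.
            try (ResultSet rows = statement.executeQuery()) {
                while (rows.next()) {
                    // Retrieve a value from the current row of the result set.
                    names.add(rows.getString("NAME"));
                }
            }
        }
        return names;
    }

    private ItemQueries() { }
}
```

Even this single query needs a prepared statement, a parameter binding, a cursor loop, and explicit resource handling, and all it yields is a list of strings rather than objects of our own classes.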

Because the data access tasks are often so tedious, we have to ask: Are the relational data model and (especially) SQL the right choices for persistence in object-oriented applications? We answer this question immediately: Yes! There are many reasons why SQL databases dominate the computing industry—relational database management systems are the only proven data management technology, and they’re almost always a requirement in any Java project.

However, for the last 15 years, developers have spoken of a paradigm mismatch. This mismatch explains why so much effort is expended on persistence-related concerns in every enterprise project. The paradigms referred to are object modeling and relational modeling, or perhaps object-oriented programming and SQL.

Let’s begin our exploration of the mismatch problem by asking what persistence means in the context of object-oriented application development. First we’ll widen the simplistic definition of persistence stated at the beginning of this section to a broader, more mature understanding of what is involved in maintaining and using persistent data.

4  Persistence in object-oriented applications

In an object-oriented application, persistence allows an object to outlive the process that created it. The state of the object can be stored to disk, and an object with the same state can be re-created at some point in the future.

This isn’t limited to single objects—entire networks of interconnected objects can be made persistent and later re-created in a new process. Most objects aren’t persistent; a transient object has a limited lifetime that is bounded by the life of the process that instantiated it. Almost all Java applications contain a mix of persistent and transient objects; hence, we need a subsystem that manages our persistent data.
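As a minimal illustration of this definition, the sketch below stores a small network of objects as bytes and re-creates it. It uses Java's built-in serialization purely because it needs no database; this mechanism is our choice for the example, not the persistence approach discussed in the rest of this section. The Item and Bid classes and their fields are assumptions, and the byte array stands in for a file on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

final class SerializationSketch {

    static class Item implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final List<Bid> bids = new ArrayList<>();

        Item(String name) { this.name = name; }

        Bid placeBid(long amountInCents) {
            Bid bid = new Bid(this, amountInCents);
            bids.add(bid);
            return bid;
        }
    }

    static class Bid implements Serializable {
        private static final long serialVersionUID = 1L;
        final Item item;           // back-reference, so the object network has a cycle
        final long amountInCents;

        Bid(Item item, long amountInCents) {
            this.item = item;
            this.amountInCents = amountInCents;
        }
    }

    /** Writes the root object and everything reachable from it to a byte array. */
    static byte[] store(Serializable root) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        }
        return bytes.toByteArray();
    }

    /** Re-creates the object network from previously stored bytes. */
    static Object load(byte[] stored) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stored))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Item lamp = new Item("Lamp");
        lamp.placeBid(1500);
        lamp.placeBid(2000);

        byte[] stored = store(lamp);       // could equally be written to a file
        Item copy = (Item) load(stored);   // a different object with the same state

        System.out.println(copy.name + " has " + copy.bids.size() + " bids"); // Lamp has 2 bids
        System.out.println(copy.bids.get(0).item == copy);                    // true
    }
}
```

The transient counterpart is any object that is never passed to store: it simply disappears when the process ends.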

Modern relational databases provide a structured representation of persistent data, enabling the manipulating, sorting, searching, and aggregating of data. Database management systems are responsible for managing concurrency and data integrity; they’re responsible for sharing data between multiple users and multiple applications. They guarantee the integrity of the data through integrity rules that have been implemented with constraints. A database management system provides data-level security. When we discuss persistence in this book, we’re thinking of all these things: storage, organization, and retrieval of structured data; concurrency and data integrity; and data sharing.

And, in particular, we’re thinking of these problems in the context of an object-oriented application that uses a domain model.

An application with a domain model doesn’t work directly with the tabular representation of the business entities; the application has its own object-oriented model of the business entities. If the database of an online auction system has ITEM and BID tables, for example, the Java application defines Item and Bid classes.

Then, instead of directly working with the rows and columns of an SQL result set, the business logic interacts with this object-oriented domain model and its runtime realization as a network of interconnected objects. Each instance of a Bid has a reference to an auction Item, and each Item may have a collection of references to Bid instances. The business logic isn’t executed in the database (as an SQL stored procedure); it’s implemented in Java in the application tier. This allows business logic to make use of sophisticated object-oriented concepts such as inheritance and polymorphism. For example, we could use well-known design patterns such as Strategy, Mediator, and Composite (Gamma and others, 1995), all of which depend on polymorphic method calls.
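A domain model along these lines might look like the following sketch. Only the Item and Bid class names and the references between them come from the text; the fields, the method names, and the rule that a new bid must exceed the current highest bid are assumptions added to show business logic living in Java rather than in the database.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class DomainModelSketch {

    static class Item {
        private final String name;
        private final List<Bid> bids = new ArrayList<>();

        Item(String name) { this.name = name; }

        String getName() { return name; }

        /** Read-only view of the collection of references to Bid instances. */
        List<Bid> getBids() { return Collections.unmodifiableList(bids); }

        /** Hypothetical business rule: a new bid must exceed the current highest bid. */
        Bid placeBid(BigDecimal amount) {
            Bid highest = getHighestBid();
            if (highest != null && amount.compareTo(highest.getAmount()) <= 0) {
                throw new IllegalArgumentException("Bid must exceed " + highest.getAmount());
            }
            Bid bid = new Bid(this, amount);
            bids.add(bid);
            return bid;
        }

        /** Accepted bids are strictly increasing, so the last one is the highest. */
        Bid getHighestBid() {
            return bids.isEmpty() ? null : bids.get(bids.size() - 1);
        }
    }

    static class Bid {
        private final Item item;          // each Bid references its auction Item
        private final BigDecimal amount;

        Bid(Item item, BigDecimal amount) {
            this.item = item;
            this.amount = amount;
        }

        Item getItem() { return item; }

        BigDecimal getAmount() { return amount; }
    }

    public static void main(String[] args) {
        Item lamp = new Item("Lamp");
        lamp.placeBid(new BigDecimal("15.00"));
        Bid winning = lamp.placeBid(new BigDecimal("20.00"));
        // Navigate the object network in both directions.
        System.out.println(winning.getItem().getName() + ": "
            + lamp.getHighestBid().getAmount()); // Lamp: 20.00
    }
}
```

Note that nothing in these classes mentions tables, rows, or SQL; the business logic works purely on references between objects.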

Now a caveat: Not all Java applications are designed this way, nor should they be. Simple applications may be much better off without a domain model. Complex applications may have to reuse existing stored procedures. SQL and the JDBC API are perfectly serviceable for dealing with pure tabular data, and the JDBC RowSet makes CRUD operations even easier. Working with a tabular representation of persistent data is straightforward and well understood.

However, in the case of applications with nontrivial business logic, the domain model approach helps to improve code reuse and maintainability significantly. In practice, both strategies are common and needed. Many applications need to execute procedures that modify large sets of data, close to the data. At the same time, other application modules could benefit from an object-oriented domain model that executes regular online transaction processing logic in the application tier. An efficient way to bring persistent data closer to the application code is required.

If we consider SQL and relational databases again, we finally observe the mismatch between the two paradigms. SQL operations such as projection and join always result in a tabular representation of the resulting data. (This is known as closure; the result of an operation on relations is always a relation.) This is quite different from the network of interconnected objects used to execute the business logic in a Java application. These are fundamentally different models, not just different ways of visualizing the same model.
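The difference between the two shapes can be seen in a small sketch. A join of ITEM and BID yields flat rows in which the item data repeats once per bid, and application code has to reassemble those rows into objects that reference each other. The Row class below stands in for one row of such a result; it and all field names are assumptions, and the rows are built in memory rather than read from a database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MismatchSketch {

    /** One row of a hypothetical ITEM-join-BID result: flat, item data repeated per bid. */
    static class Row {
        final long itemId;
        final String itemName;
        final long bidAmountInCents;

        Row(long itemId, String itemName, long bidAmountInCents) {
            this.itemId = itemId;
            this.itemName = itemName;
            this.bidAmountInCents = bidAmountInCents;
        }
    }

    static class Item {
        final long id;
        final String name;
        final List<Bid> bids = new ArrayList<>();

        Item(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    static class Bid {
        final Item item;
        final long amountInCents;

        Bid(Item item, long amountInCents) {
            this.item = item;
            this.amountInCents = amountInCents;
        }
    }

    /** Turns flat rows into a network of objects: one Item per distinct item id. */
    static List<Item> toObjects(List<Row> rows) {
        Map<Long, Item> itemsById = new LinkedHashMap<>();
        for (Row row : rows) {
            // Create the Item the first time its id appears; reuse it afterwards.
            Item item = itemsById.computeIfAbsent(row.itemId, id -> new Item(id, row.itemName));
            item.bids.add(new Bid(item, row.bidAmountInCents));
        }
        return new ArrayList<>(itemsById.values());
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row(1, "Lamp", 1500),
            new Row(1, "Lamp", 2000),
            new Row(2, "Vase", 500));

        List<Item> items = toObjects(rows);
        System.out.println(rows.size() + " rows -> " + items.size() + " items"); // 3 rows -> 2 items
        System.out.println(items.get(0).name + " has "
            + items.get(0).bids.size() + " bids");                               // Lamp has 2 bids
    }
}
```

The toObjects method is exactly the kind of translation code that has to be written, in one form or another, wherever the two representations meet.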

With this realization, you can begin to see the problems—some well understood and some less well understood—that must be solved by an application that combines both data representations: an object-oriented domain model and a persistent relational model. Let’s take a closer look at this so-called paradigm mismatch.
