Information Integration: Rational Data Architect in the Hands of a DBA

jaminwm


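The original article is by Nelson King, a programmer with more than 25 years of development experience, who is now a contracted technical author for the publisher "widely".

So that specialists can get a more accurate picture of RDA, I have appended the original text to the translation for readers to compare.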

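RDA in the Hands of a DBA

If I had to name the best feature of IBM Rational Data Architect (RDA), I would say it is the product's versatility. Certainly, the ease of creating federated data sources and the polished Eclipse-based Workbench are excellent high-level features, but they only contribute to RDA's overall usefulness. What really sets RDA apart is its ability to move from logical to physical data models, to explore and reverse engineer existing databases, to analyze schemas, to apply business rules, to create stored procedures, to publish schemas and reports (in print or on the Web), and to handle a large number of other database-related tasks.

RDA is like the Swiss Army knife of enterprise database management. Does that mean it can do everything? No; a Swiss Army knife is no substitute for a forklift, either. But RDA is a well-organized toolkit that adapts quickly to all kinds of situations.

To make this more concrete, let us suppose we are using RDA to handle some of the everyday work of a DBA or data architect. The events are fictitious, but the scenarios all come up in real life.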

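8 a.m.: In Normal Form

After checking the server monitors and database status dashboards, a DBA or data architect will open RDA over the first cup of coffee of the day. For us, the day begins with continuing the design of a sensor database for the meteorology product line. The list of sensor attributes started out small and has grown with each new product. The design process has followed the typical "oh yeah" approach, as in, "Oh yeah, we need to carry a Fahrenheit conversion on that." We wince at this, because we don't do calculated attributes, right? We argue about it, but because a stored procedure will use this number in a busy loop, we decide to make an exception.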

Rational Data Architect eases the daily grind.

If I were asked to name the best feature of IBM Rational Data Architect (RDA), I’d have to say it’s the product’s versatility. Sure, the ease of creating federated data sources and the slick Eclipse-based Workbench are terrific high-level features, but they’re only contributors to RDA’s overall usefulness. It’s the ability to swing from logical to physical data models, explore and reverse engineer existing databases, analyze schemas, apply business rules, create stored procedures, publish schemas and reports (print or Web), and a raft of other database related tasks that make RDA stand out.

RDA is like the Swiss Army Knife of enterprise database management. Does this mean it does everything? No, and a Swiss Army Knife is not a substitute for a forklift either. But RDA is a well-organized toolkit that can quickly become second nature to use in all kinds of situations.

To show you what I mean, let’s pretend we’re using RDA to tackle jobs typical in a DBA or data architect’s day. These events are fictitious, but the situations are real enough.

8 a.m.: In Normal Form
After checking server monitors and database status dashboards, a DBA or data architect is likely to fire up RDA with the day’s first cup of coffee. For us, the day starts with the continuing design of a sensor database for the meteorology product line. What started with a small list of sensor attributes has grown with each product. The design process has taken the typical “oh yeah” approach — as in, “Oh yeah, we need to carry Fahrenheit conversion on that.” We wince at this because we don’t do calculated attributes, right? We argue, but because of a stored procedure that will use the number in a busy loop, we make an exception.

Jumping into RDA, we open the sensor database model, which is still in the logical design phase (see Figure 1), and then add the attribute with copious annotations. We use the RDA workbench to create a stub for the eventual stored procedure, with notes about the calculated value. RDA has a very good Analyze Model function, which checks for duplicate relations, normal forms, model and SQL syntax, and some business rules we built with Object Constraint Language (OCL). A validation routine may complain about this attribute, so we need useful documentation. RDA can store almost everything about a design, including text files, diagrams, bookmarks — whatever it takes to document the work.
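The trade-off behind that exception can be sketched in a few lines. This is an illustration only, not RDA output: it assumes the sensor reports Celsius (the article does not say), uses Python's built-in `sqlite3` as a stand-in for the real RDBMS, and the table and column names are hypothetical.

```python
import sqlite3


def celsius_to_fahrenheit(celsius: float) -> float:
    """Standard conversion: F = C * 9/5 + 32."""
    return celsius * 9.0 / 5.0 + 32.0


def store_reading(conn: sqlite3.Connection, sensor_id: str, temp_c: float) -> None:
    # The Fahrenheit value is computed once, at write time, and stored
    # next to the Celsius value. This is the "calculated attribute" that
    # strict normalization would normally forbid.
    conn.execute(
        "INSERT INTO sensor_reading (sensor_id, temp_c, temp_f) VALUES (?, ?, ?)",
        (sensor_id, temp_c, celsius_to_fahrenheit(temp_c)),
    )


# Hypothetical table; the names are assumptions made for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sensor_reading ("
    "sensor_id TEXT NOT NULL, temp_c REAL NOT NULL, temp_f REAL NOT NULL)"
)
store_reading(conn, "S-001", 100.0)
row = conn.execute("SELECT temp_c, temp_f FROM sensor_reading").fetchone()
```

Storing `temp_f` means a procedure that reads it in a tight loop does no arithmetic per row; the cost is that the two columns can drift apart if one is ever updated without the other, which is why the annotation matters.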

10 a.m.: A Rogue Database
The phone rings. It’s a new department manager who wants to be a good scout. She informs us of a “test database” she discovered in her department that’s actually being used to track important sales results (with hours of manual data entry). She thinks we need to retain the information but wonders if it shouldn’t be integrated with corporate databases. We don’t usually operate like the Borg; however, resistance is futile — this rogue database will be assimilated. After some touchy negotiations and an exchange of what might be called data definitions, RDA is put to work before noon. (It happens this fast only in fiction.)

To get at the data, we use the Data Connection Wizard. Not everyone likes them in general, but most of the wizards provided in RDA are helpful not only for novices but also for anyone who wants to do routine work quickly. We scramble to establish a connection to the rogue database. It’s on the intranet, but it isn’t in DB2. Fortunately, RDA can handle a fairly wide range of data sources with JDBC (DB2, Oracle, SQL Server, Informix, and Sybase). A phone call finally gets the right password.

In this case, there is no schema or DDL, so we use the Physical Data Model Wizard to reverse engineer the model from the database. Looking at this model, we can immediately see there are similarities with official schemas, but we need a precise mapping. Integration modeling — mapping between schemas to discover relationships — is something RDA does very well. We also set up some transformations and other routines that will help clean the incoming data. Even with the graphical RDA tools, mapping requires experienced eyeballing and a lot of manipulation of details (see Figure 2). It takes time.
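The idea behind reverse engineering a database that has no DDL on file can be sketched by reading the catalog of a live connection. This is not how the Physical Data Model Wizard is implemented; it is a minimal stand-in using Python's built-in `sqlite3`, and the `sales_result` table is a hypothetical example of the rogue database.

```python
import sqlite3


def reverse_engineer(conn: sqlite3.Connection) -> dict:
    """Return {table: [(column, declared_type, is_primary_key), ...]}
    built from the database catalog rather than from any DDL file."""
    model = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        quoted = table.replace('"', '""')
        columns = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
        model[table] = [
            (name, col_type, pk > 0) for _, name, col_type, _, _, pk in columns
        ]
    return model


# Hypothetical stand-in for the department's "test database".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_result (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
model = reverse_engineer(conn)
```

The resulting dictionary is the raw material for the mapping step: its table and column names can be compared against the official schemas, and the mismatches are where the experienced eyeballing comes in.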

2 p.m.: Nineteen Remote Programmers
During an at-the-workbench lunch session, we discuss the plight of 19 of our developers who were recently moved off-campus. They’re already complaining about being out of the loop with their new project, so we decide to publish the entire set of RDA schema and entity diagrams to the Web. We’ll give them a couple of days to study the schemas, and then we’ll run a training session to explain what and why. RDA almost makes the publishing a one-button affair.

Some of these developers have asked if they can handle RDA themselves. Sure, we say, RDA can alter the functionality and UI for different roles including the Eclipse Developer, Java Developer, and data specialist. We try to make it clear that it helps to understand basic database management and design, but newbies with some database background are welcome in RDA. They also need to get their team support — in this case CVS — up and running.

4 p.m.: CL_MXPRF_TO?
One of the best things about RDA is the relative ease of going from a UML modeling tool into an integrated database design tool. I’ve seen plenty of decent modeling tools, but few that can convert from the logical into the physical database components with so much control.

We’re back to working on the sensor database, which needs to become a real database very soon. We’ll let RDA generate a syntax-specific DDL schema from the model, which is saved as a script and fed to the host RDBMS (in this case DB2) to execute. But first, we continue to run validation checks on the model. This afternoon, one test kicks out an attribute name that doesn’t comply with the corporate standard: CL_MXPRF_TO. We don’t know who put it into the schema, and we certainly don’t know what it is, because it isn’t annotated. Could it be a ringer to test us?
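The sequence described here (validate names, then generate DDL) can be sketched as follows. Everything in it is an assumption made for illustration: the article does not state the corporate naming standard, so the rule below (lowercase words of at least two letters joined by underscores) is hypothetical, as are the model layout and the column types.

```python
import re

# Hypothetical corporate standard: lowercase words, two or more letters
# each, joined by single underscores.
NAME_RULE = re.compile(r"^[a-z]{2,}(_[a-z]{2,})*$")

# Hypothetical fragment of the sensor model; the types are assumptions.
MODEL = {
    "table": "sensor",
    "columns": [
        ("sensor_id", "INTEGER NOT NULL"),
        ("temp_celsius", "DOUBLE"),
        ("CL_MXPRF_TO", "VARCHAR(32)"),
    ],
}


def find_nonconforming(model: dict) -> list:
    """Return the attribute names that break the naming rule."""
    return [name for name, _ in model["columns"] if not NAME_RULE.match(name)]


def generate_ddl(model: dict) -> str:
    """Emit a CREATE TABLE script, refusing to run on an invalid model."""
    bad = find_nonconforming(model)
    if bad:
        raise ValueError(f"non-conforming attribute names: {bad}")
    cols = ",\n".join(f"    {name} {sql_type}" for name, sql_type in model["columns"])
    return f"CREATE TABLE {model['table']} (\n{cols}\n);"


flagged = find_nonconforming(MODEL)

# With the offending attribute set aside, the script can be generated.
clean = {**MODEL, "columns": [c for c in MODEL["columns"] if c[0] not in flagged]}
ddl = generate_ddl(clean)
```

Running validation as a gate in front of generation means a stray name is caught while it is still a line in a model, not a column in a production table.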

No, it’s something that we forgot to annotate; must’ve been late in the day. We figure out the attribute is the first of many that will come from WebSphere Information Integrator federated functions, but we haven’t worked out all of the connections. When we do, we’ll need to look at the whole thing with the RDA Impact Analysis window to study the dependencies, and make sure all the references are resolved. We put both of these items in the RDA Task List. We also make a note to think about the many different validation tests available in RDA (some are explicit, others are implicit) and decide priority and sequence for them.
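What an impact analysis has to compute is, in essence, a walk over a dependency graph. The sketch below is not the RDA Impact Analysis window; it is a plain breadth-first traversal over a hand-written graph, and every object name in it other than CL_MXPRF_TO is hypothetical.

```python
from collections import deque

# Hypothetical dependency edges: each key is used by the objects in its list.
USED_BY = {
    "sensor.CL_MXPRF_TO": ["view.sensor_profile", "proc.convert_reading"],
    "view.sensor_profile": ["report.daily_summary"],
    "proc.convert_reading": ["report.daily_summary"],
    "report.daily_summary": [],
}


def impacted_by(start: str, used_by: dict) -> list:
    """Return every object that depends on `start`, directly or
    transitively, in breadth-first order and without duplicates."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for dependent in used_by.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order


def unresolved_references(used_by: dict) -> list:
    """Return referenced objects that have no entry of their own in the graph."""
    referenced = {dep for deps in used_by.values() for dep in deps}
    return sorted(referenced - used_by.keys())


impacted = impacted_by("sensor.CL_MXPRF_TO", USED_BY)
dangling = unresolved_references(USED_BY)
```

The first function answers "what breaks if this attribute changes"; the second is a crude version of making sure all the references are resolved.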

Getting Carried Away
This brings up an important element of RDA: It’s so rich in features, including details of the UI, that it takes a while to learn them all and appreciate their use. I’m still exploring, for example, how RDA uses the Eclipse Modeling Framework (EMF) which, among other things, makes extending the functionality of RDA much easier (although not necessarily simple). I’m toying with the idea of learning how to build some data management Eclipse plug-ins. I’ve got some ideas. Just for fun, after hours, of course.…

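Jumping into RDA, we open the sensor database model, which is still in the logical design phase (see Figure 1), and add the attribute with plenty of annotations. We use the RDA workbench to create a stub for the eventual stored procedure, with notes about the calculated value. RDA has a very good Analyze Model function, which checks for duplicate relations, normal forms, model and SQL syntax, and some business rules we built with Object Constraint Language (OCL). A validation routine may complain about this attribute, so we need useful documentation. RDA can store almost everything about a design, including text files, diagrams, and bookmarks: whatever it takes to document the work.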

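10 a.m.: A Rogue Database

The phone rings. It is a well-meaning new department manager. She tells us that she has discovered a "test database" in her department that is actually being used to track important sales results (with hours of manual data entry). She thinks we need to retain the information, but wonders whether it should be integrated with the corporate databases. We don't usually operate like the Borg; however, resistance is futile, and this rogue database will be assimilated. After some difficult negotiations and an exchange of what might be called data definitions, RDA is put to work before noon. (Things move this fast only in fiction.)

To get at the data, we use the Data Connection Wizard. Not everyone likes wizards in general, but most of those provided in RDA are helpful not only for novices but also for anyone who wants to get routine work done quickly. We hurry to establish a connection to the rogue database. It is on the intranet, but it is not in DB2. Fortunately, RDA can handle a fairly wide range of data sources through JDBC (DB2, Oracle, SQL Server, Informix, and Sybase). A phone call finally gets us the right password.

Because this database has no schema or DDL, we use the Physical Data Model Wizard to reverse engineer the model from the database. Looking at the model, we can see at once that it resembles the official schemas, but we need a precise mapping. Integration modeling (mapping between schemas to discover relationships) is something RDA does very well. We also set up some transformations and other routines that will help clean the incoming data. Even with the graphical RDA tools, this mapping takes an experienced eye and a lot of detailed manipulation (see Figure 2). It takes time.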

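2 p.m.: Nineteen Remote Programmers

Over a lunch session at the workbench, we discuss the situation of 19 of our developers who were recently moved off-campus. They are already complaining about being out of the loop on their new project, so we decide to publish the entire set of RDA schemas and entity diagrams to the Web. We will give them a couple of days to study the schemas, and then hold a training session to explain the what and the why. RDA makes publishing almost a one-button affair.

Some of these developers have asked whether they can work with RDA themselves. Sure, we say; RDA can adapt its functionality and UI to different roles, including Eclipse developer, Java developer, and data specialist. We try to make it clear that it helps to understand basic database management and design, but newcomers with some database background are welcome in RDA. They also need to get their team support (in this case CVS) up and running.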

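4 p.m.: CL_MXPRF_TO?

One of the best things about RDA is how easy it is to go from a UML modeling tool to an integrated database design tool. I have seen plenty of decent modeling tools, but few that give so much control when converting logical database components into physical ones.

We are back to work on the sensor database, which needs to become a real database very soon. We will have RDA generate a syntax-specific DDL schema from the model, save it as a script, and feed it to the host RDBMS (DB2 in this case) to execute. But first, we continue to run validation checks on the model. This afternoon, one test flags an attribute name that does not comply with the corporate standard: CL_MXPRF_TO. We don't know who put it into the schema, and we certainly don't know what it is, because it is not annotated. Could it be a ringer planted to test us?

No, it is something we forgot to annotate; it must have been late in the day. We work out that the attribute is the first of many that will come from WebSphere Information Integrator federated functions, but we have not yet worked out all of the connections. When we do, we will need to look at the whole thing in the RDA Impact Analysis window to study the dependencies and make sure that all references are resolved. We put both of these items in the RDA Task List. We also make a note to think about the many different validation tests available in RDA (some explicit, others implicit) and to decide on their priority and sequence.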

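Getting Carried Away

RDA is so rich in features, including the details of the UI, that it takes a while to learn them all and appreciate how to use them. For example, I am still exploring how RDA uses the Eclipse Modeling Framework (EMF), which, among other things, makes extending RDA's functionality much easier (though not necessarily simple). I am toying with the idea of learning how to build some data management Eclipse plug-ins. I have some ideas. Just for fun, after hours, of course...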