ODI Knowledge Module (KM) Study Notes, Part 1 (KM Overview)

There are six main types of KM in ODI:


(1) Reverse-engineering KM (RKM):
Retrieves metadata into the Oracle Data Integrator work repository.
Used in models to perform a customized reverse-engineering.

 A typical RKM follows these steps:
1. Cleans up the SNP_REV_xx tables from previous executions using the OdiReverseResetTable
command
2. Retrieves sub models, datastores, columns, unique keys, foreign keys, conditions from the
metadata provider to SNP_REV_SUB_MODEL, SNP_REV_TABLE, SNP_REV_COL,
SNP_REV_KEY, SNP_REV_KEY_COL, SNP_REV_JOIN, SNP_REV_JOIN_COL,
SNP_REV_COND tables.
3. Updates the model in the work repository by calling the OdiReverseSetMetaData API

Summary: an RKM executes by first clearing the SNP_REV_xx staging tables via the ODI API OdiReverseResetTable, then loading the metadata from the provider into those staging tables, and finally updating the model through the ODI API OdiReverseSetMetaData.
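The three RKM steps above can be sketched with an in-memory SQLite database standing in for the work repository. The table layouts, sample metadata, and helper names below are simplified stand-ins for the real OdiReverseResetTable / OdiReverseSetMetaData tools, not ODI's actual implementation:

```python
import sqlite3

# Hypothetical, trimmed-down versions of two of the SNP_REV_xx staging tables.
repo = sqlite3.connect(":memory:")
repo.execute("CREATE TABLE SNP_REV_TABLE (table_name TEXT, table_type TEXT)")
repo.execute("CREATE TABLE SNP_REV_COL (table_name TEXT, col_name TEXT, col_type TEXT)")

def odi_reverse_reset_table(conn):
    # Step 1: clean up the SNP_REV_xx tables from previous executions.
    conn.execute("DELETE FROM SNP_REV_TABLE")
    conn.execute("DELETE FROM SNP_REV_COL")

def load_metadata(conn):
    # Step 2: retrieve datastores and columns from the metadata provider
    # into the staging tables (hard-coded sample rows here).
    conn.execute("INSERT INTO SNP_REV_TABLE VALUES ('CUSTOMER', 'T')")
    conn.executemany("INSERT INTO SNP_REV_COL VALUES (?, ?, ?)",
                     [("CUSTOMER", "CUST_ID", "NUMBER"),
                      ("CUSTOMER", "CUST_NAME", "VARCHAR2")])

def odi_reverse_set_metadata(conn):
    # Step 3: promote the staged metadata into the model; here we just
    # report how many columns would be applied.
    return conn.execute("SELECT COUNT(*) FROM SNP_REV_COL").fetchone()[0]

odi_reverse_reset_table(repo)
load_metadata(repo)
print(odi_reverse_set_metadata(repo))  # → 2
```

The point of the staging tables is that the reset/load/apply phases stay independent: any metadata source that can fill SNP_REV_xx can drive a reverse-engineering.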

(2) Check KM (CKM)
Checks consistency of data against constraints.
Used in models, submodels and datastores for data integrity audit.
Used in interfaces for flow control or static control.

 The CKM can be used in 2 ways:
- To check the consistency of existing data. This can be done on any datastore or within interfaces, by setting the STATIC_CONTROL option to "Yes". Data in the target datastore is checked after it is loaded.
- To check the consistency of incoming data before loading the records into a target datastore. This is done by using the FLOW_CONTROL option.
In summary: the CKM can check either an existing table or the temporary "I$" table created by an IKM. It creates an "E$" error table to which it writes all the rejected records.

Summary: the CKM is mainly used to check data consistency, in two modes: STATIC_CONTROL and FLOW_CONTROL. STATIC_CONTROL checks the consistency of data that already exists, with the check rules derived from the source table; FLOW_CONTROL checks the consistency of data about to be loaded, usually the data the IKM has placed in the staging area's "I$" table, with the check rules derived from the target table. The CKM creates an "E$" table in the staging area and writes the records that violate the constraints into it.
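A minimal sketch of the "I$"/"E$" mechanics, using SQLite and hypothetical table layouts (the real CKM generates this kind of SQL from the constraints declared on the model):

```python
import sqlite3

db = sqlite3.connect(":memory:")
# "I$" flow table loaded by the IKM, and the "E$" error table the CKM creates.
db.execute("CREATE TABLE I_CUSTOMER (cust_id INTEGER, cust_name TEXT)")
db.execute("CREATE TABLE E_CUSTOMER (cust_id INTEGER, cust_name TEXT, err_mess TEXT)")
db.executemany("INSERT INTO I_CUSTOMER VALUES (?, ?)",
               [(1, "Alice"), (2, None), (3, "Carol")])

def check_not_null(conn):
    # Isolate records violating a NOT NULL constraint into the E$ table...
    conn.execute("""INSERT INTO E_CUSTOMER
                    SELECT cust_id, cust_name, 'CUST_NAME is null'
                    FROM I_CUSTOMER WHERE cust_name IS NULL""")
    # ...then cleanse the I$ table so only valid rows reach the target.
    conn.execute("DELETE FROM I_CUSTOMER WHERE cust_name IS NULL")

check_not_null(db)
print(db.execute("SELECT COUNT(*) FROM I_CUSTOMER").fetchone()[0])  # → 2
print(db.execute("SELECT COUNT(*) FROM E_CUSTOMER").fetchone()[0])  # → 1
```

With FLOW_CONTROL this insert/delete pair runs before the IKM writes to the target; with STATIC_CONTROL an equivalent check runs against the existing table, filling "E$" without deleting anything.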

 

(3) Loading KM (LKM)
Loads heterogeneous data to a staging area.
Used in interfaces with heterogeneous sources.

 1. The LKM creates the "C$" temporary table in the staging area. This table will hold records loaded
from the source server.
2. The LKM obtains a set of pre-transformed records from the source server by executing the
appropriate transformations on the source. Usually, this is done by a single SQL SELECT query
when the source server is an RDBMS. When the source doesn't have SQL capabilities (such as flat
files or applications), the LKM simply reads the source data with the appropriate method (read file
or execute API).
3. The LKM loads the records into the "C$" table of the staging area.

Summary: an LKM is used when the source and target data reside on different servers; it stores the source data in a temporary "C$" table that it creates in the staging area. If the source and target data are on the same server, no LKM is used in the interface.
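The three LKM steps can be sketched for the flat-file case, with an in-memory SQLite database as a stand-in staging area and a made-up two-column file format:

```python
import csv
import io
import sqlite3

staging = sqlite3.connect(":memory:")
# Step 1: the LKM creates the "C$" temporary table in the staging area.
staging.execute("CREATE TABLE C_CUSTOMER (cust_id INTEGER, cust_name TEXT)")

# Step 2: a source without SQL capabilities (a flat file here) is simply
# read with the appropriate method; an RDBMS source would instead run a
# single SELECT with the source-side transformations.
source_file = io.StringIO("cust_id,cust_name\n1,Alice\n2,Bob\n")
rows = [(int(r["cust_id"]), r["cust_name"]) for r in csv.DictReader(source_file)]

# Step 3: load the records into the "C$" table.
staging.executemany("INSERT INTO C_CUSTOMER VALUES (?, ?)", rows)
print(staging.execute("SELECT COUNT(*) FROM C_CUSTOMER").fetchone()[0])  # → 2
```

Once the data sits in "C$" on the staging server, the rest of the interface (IKM, optional CKM) can work in pure set-oriented SQL regardless of what the source technology was.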

 

(4) Integration KM (IKM)
Integrates data from the staging area into a target. Used in interfaces.

 There are 2 types of IKMs: those that can be used when the staging area is on the same
server as the target datastore, and those that can be used when it is not.

When the staging area is on the target server, the IKM usually follows these steps:
1. The IKM executes a single set-oriented SELECT statement to carry out staging area and target
declarative rules on all "C$" tables and local tables (such as D in the figure). This generates a
result set.
2. Simple "append" IKMs directly write this result set into the target table. More complex IKMs create
an "I$" table to store this result set.
3. If the data flow needs to be checked against target constraints, the IKM calls a CKM to isolate
erroneous records and cleanse the "I$" table.
4. The IKM writes records from the "I$" table to the target following the defined strategy (incremental
update, slowly changing dimension, etc.).
5. The IKM drops the "I$" temporary table.
6. Optionally, the IKM can call the CKM again to check the consistency of the target datastore.

When the staging area is different from the target server, as shown in Figure 6, the IKM usually follows
these steps:
1. The IKM executes a single set-oriented SELECT statement to carry out declarative rules on all
"C$" tables and tables located on the staging area (such as D in the figure). This generates a
result set.
2. The IKM loads this result set into the target datastore, following the defined strategy (append or
incremental update).

Summary: the IKM writes the result set produced in the staging area, or the data gathered into the "I$" table, into the target table. If the staging area and the target table are on the same server, the IKM can call a CKM during this process to check consistency; if they are not, it cannot.
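The same-server IKM flow above can be sketched with SQLite and hypothetical "C$"/"I$" tables; the incremental-update strategy is reduced to an upsert, and the amount * 1.1 mapping is an invented example transformation:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE C_ORDERS (order_id INTEGER, amount REAL)")  # loaded by the LKM
db.execute("CREATE TABLE TARGET_ORDERS (order_id INTEGER PRIMARY KEY, amount REAL)")
db.executemany("INSERT INTO C_ORDERS VALUES (?, ?)", [(1, 10.0), (2, 20.0)])
db.execute("INSERT INTO TARGET_ORDERS VALUES (1, 5.0)")

# Steps 1-2: one set-oriented SELECT applies the declarative rules and
# materializes the result set into the "I$" table.
db.execute("""CREATE TABLE I_ORDERS AS
              SELECT order_id, amount * 1.1 AS amount FROM C_ORDERS""")

# (Step 3 would call a CKM here to cleanse I_ORDERS against target constraints.)

# Step 4: write "I$" to the target with an incremental-update strategy (upsert).
db.execute("INSERT OR REPLACE INTO TARGET_ORDERS SELECT order_id, amount FROM I_ORDERS")

# Step 5: drop the "I$" temporary table.
db.execute("DROP TABLE I_ORDERS")
print(db.execute("SELECT COUNT(*) FROM TARGET_ORDERS").fetchone()[0])  # → 2
```

In the remote-staging-area case, step 4 is replaced by loading the result set across servers into the target, and no "I$"/CKM stage is available.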

 

(5) Journalizing KM (JKM)
Creates the Change Data Capture framework objects in the source staging area.
Used in models, sub models and datastores to create, start
and stop journals and to register subscribers.

 JKMs create the infrastructure for Change Data Capture on a model, a sub model or a datastore. JKMs
are not used in interfaces, but rather within a model to define how the CDC infrastructure is initialized. This
infrastructure is composed of a subscribers table, a table of changes, views on this table and one or more
triggers or log capture programs as illustrated below.

Summary: the JKM provides ODI with Change Data Capture (CDC). Either through triggers (T$) automatically created on the source tables, or through mining of the source database's logs, it captures the primary keys of the net DML changes into the J$ journal table created by ODI, and exposes the complete change data through the JV$ journal view for direct use by the ELT. ODI calls this "Journalizing Models". Changes are retrieved through subscriptions, and after each consumption the data in the journal table is purged.
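The trigger-based variant of this infrastructure can be sketched in SQLite: a T$-style trigger records changed primary keys in a J$-style table, and a JV$-style view joins them back to the source rows. Table, trigger, and view names are simplified stand-ins, and 'SUNOPSIS' is used as a sample subscriber name:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE CUSTOMER (cust_id INTEGER PRIMARY KEY, cust_name TEXT)")
# The J$ journal table holds the primary keys of changed rows plus a subscriber tag.
db.execute("CREATE TABLE J_CUSTOMER (jrn_subscriber TEXT, cust_id INTEGER)")

# The T$-style trigger the JKM would create on the source table to capture changes.
db.execute("""CREATE TRIGGER T_CUSTOMER AFTER INSERT ON CUSTOMER
              BEGIN
                INSERT INTO J_CUSTOMER VALUES ('SUNOPSIS', NEW.cust_id);
              END""")

# The JV$-style view exposes full change data by joining back to the source table.
db.execute("""CREATE VIEW JV_CUSTOMER AS
              SELECT j.jrn_subscriber, c.*
              FROM J_CUSTOMER j JOIN CUSTOMER c ON c.cust_id = j.cust_id""")

db.execute("INSERT INTO CUSTOMER VALUES (1, 'Alice')")
rows = db.execute("SELECT cust_id, cust_name FROM JV_CUSTOMER").fetchall()
print(rows)  # → [(1, 'Alice')]

# After a subscriber consumes the changes, its journal entries are purged.
db.execute("DELETE FROM J_CUSTOMER WHERE jrn_subscriber = 'SUNOPSIS'")
print(db.execute("SELECT COUNT(*) FROM J_CUSTOMER").fetchone()[0])  # → 0
```

Because only primary keys are journalized, the view always reflects the current state of the changed rows (the "net" change), not every intermediate DML the row went through.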

(6) Service KM (SKM)
Generates data manipulation web services. Used in models and datastores.

SKMs are in charge of creating and deploying data manipulation Web Services to your Service Oriented
Architecture (SOA) infrastructure. SKMs are set on a Model. They define the different operations to
generate for each datastore's web service. Unlike other KMs, SKMs do not generate executable code,
but rather the Web Services deployment archive files. SKMs are designed to generate Java code using
Oracle Data Integrator's framework for Web Services. The code is then compiled and eventually deployed
in the Application Server's containers.

Summary: by publishing datastores as Web services through an SKM and integrating them into an SOA architecture, the data can be viewed and modified in real time through that architecture.