1,数据源的时间列Update_Date>SYSDATE-1,或者Source DB中记录上次抽取时间:
如:S_ETL_RUN记录上次抽取日期,S_ETL_CURR_RUN记录当前抽取日期,抽取SQL如下:
where Update_Date〉S_ETL_RUN.PREV_LOAD_DT and Update_Date<=S_ETL_CURR_RUN.LOAD_DT
2,数据源的时间列Update_Date > '$$INCREMENT_TS' ,INCREMENT_TS保存在Target DB或者INFA Repository DB中:
SETVARIABLE($$INCREMENT_TS,SESSSTARTTIME)
3,增量表记录抽取数据(DAC增量抽取SIEBEL数据源的方法)
像S_EVT_ACT此类活动表,Siebel中有时间字段来记录新更新的数据,DAC在数据源建立IMG表可以每天捕捉到增量数据,使用视图V_ETV_ACT进行增量抽取,
使用表S_EVT_ACT进行全量抽取。
DAC使用相关表:S_ETL_R_IMG_ (记录已经抽取过的源数据),S_ETL_I_IMG_ (记录增量源数据),S_ETL_D_IMG_ (记录删除的源数据)
1)第一次抽取后,在Prune Days参数期间的行就被抽取到相应的S_ETL_R_IMG_ 表中,
2)之后的Change Capture过程:
先将不在S_ETL_R_IMG_ 表中的行并且源表的LAST_UPD列比LAST_REFRESH_DATE减去Prune Days新的行插入S_ETL_I_IMG_ 表中(之前先truncate),再将S_ETL_D_IMG_中的数据加入S_ETL_I_IMG_中,然后S_ETL_I_IMG_ 表与base table Join在一起只抽取增量的行(即创建视图V_ETV_ACT用于增量抽取)。
3)ETL过程完成后,将S_ETL_I_IMG_ 的行push到然后S_ETL_R_IMG_ 表中,并删除S_ETL_R_IMG_ 表中早于last_refresh_date减去prune days的行。
4)S_ETL_D_IMG_表是Siebel Delete Trigger写入的表,该表的数据插入S_ETL_I_IMG_表中(operation='D')然后抽取到DW中,可以实现软删除或者硬删除(delete_flg='D')
5)SDEINC_FINDAUX_ 等Mapping 保证辅表更新主表不更新的时候,有其他表需要主表必须更新的情况下,自己写SQL将主表插入到S_ETL_I_IMG表中来捕捉到增量数据。
具体SQL过程:
Change Capture(IMG_BUILD)
# of prune days: 14
DBMS_STATS.GATHER_TABLE_STATS(ownname => 'SIEBEL', tabname => 'S_ETL_R_IMG_20', estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE, method_opt => 'FOR ALL INDEXED COLUMNS SIZE AUTO',cascade => false, degree => DBMS_STATS.DEFAULT_DEGREE)
DBMS_STATS.GATHER_TABLE_STATS(ownname => 'SIEBEL', tabname => 'S_ETL_D_IMG_20', estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE, method_opt => 'FOR ALL INDEXED COLUMNS SIZE AUTO',cascade => false, degree => DBMS_STATS.DEFAULT_DEGREE)
TRUNCATE TABLE S_ETL_I_IMG_20
2015-04-02 10:02:23.888 - Executing :
INSERT /*+APPEND*/ INTO S_ETL_I_IMG_20
(ROW_ID, MODIFICATION_NUM, OPERATION, LAST_UPD)
SELECT
ROW_ID
,MODIFICATION_NUM
,'I'
,LAST_UPD
FROM
S_EVT_ACT
WHERE
S_EVT_ACT.LAST_UPD > TO_DATE('2015-03-18 11:35:24', 'YYYY-MM-DD HH24:MI:SS')
AND NOT EXISTS
(
SELECT
ROW_ID
,MODIFICATION_NUM
,'I'
,LAST_UPD
FROM
S_ETL_R_IMG_20
WHERE
S_ETL_R_IMG_20.ROW_ID = S_EVT_ACT.ROW_ID
AND S_ETL_R_IMG_20.MODIFICATION_NUM = S_EVT_ACT.MODIFICATION_NUM
AND S_ETL_R_IMG_20.LAST_UPD = S_EVT_ACT.LAST_UPD
)
INSERT /*+APPEND*/ INTO S_ETL_I_IMG_20
(ROW_ID, MODIFICATION_NUM, OPERATION, LAST_UPD)
SELECT
ROW_ID
,MODIFICATION_NUM
,'D'
,LAST_UPD
FROM
S_ETL_D_IMG_20
WHERE NOT EXISTS
(
SELECT
'X'
FROM
S_ETL_I_IMG_20
WHERE
S_ETL_I_IMG_20.ROW_ID = S_ETL_D_IMG_20.ROW_ID
)
DBMS_STATS.GATHER_TABLE_STATS(ownname => 'SIEBEL', tabname => 'S_ETL_I_IMG_20', estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE, method_opt => 'FOR ALL INDEXED COLUMNS SIZE AUTO',cascade => false, degree => DBMS_STATS.DEFAULT_DEGREE)
DROP VIEW V_EVT_ACT
CREATE VIEW V_EVT_ACT AS
SELECT
*
FROM
S_EVT_ACT,
S_ETL_I_IMG_20
WHERE
S_EVT_ACT.ROW_ID = S_ETL_I_IMG_20.ROW_ID
Change Capture Sync(IMG_SYNC)
# of prune days: 14
DELETE
FROM S_ETL_D_IMG_20
WHERE
EXISTS
(SELECT
'X'
FROM
S_ETL_I_IMG_20
WHERE
S_ETL_D_IMG_20 .ROW_ID = S_ETL_I_IMG_20.ROW_ID
AND S_ETL_I_IMG_20.OPERATION = 'D'
)
2015-04-02 11:33:37.73 - Executing :
DELETE
FROM S_ETL_I_IMG_20
WHERE LAST_UPD < TO_DATE('2015-03-19 09:59:59', 'YYYY-MM-DD HH24:MI:SS')
DELETE
FROM S_ETL_I_IMG_20
WHERE LAST_UPD > TO_DATE('2015-04-02 09:59:59', 'YYYY-MM-DD HH24:MI:SS')
DELETE
FROM S_ETL_R_IMG_20
WHERE
EXISTS
(
SELECT
'X'
FROM
S_ETL_I_IMG_20
WHERE
S_ETL_R_IMG_20.ROW_ID = S_ETL_I_IMG_20.ROW_ID
)
INSERT /*+APPEND*/ INTO S_ETL_R_IMG_20
(ROW_ID, MODIFICATION_NUM, LAST_UPD)
SELECT
ROW_ID
,MODIFICATION_NUM
,LAST_UPD
FROM
S_ETL_I_IMG_20
DELETE FROM S_ETL_R_IMG_20 WHERE LAST_UPD < TO_DATE('2015-03-19 09:59:59', 'YYYY-MM-DD HH24:MI:SS')