目录
建SDBDDL、SDB、ODB库,分别用于存储表结构、存储增量数据、存储历史数据:
拉链算法是数仓常用的数据存储技术
建SDBDDL、SDB、ODB库,分别用于存储表结构、存储增量数据、存储历史数据:
CREATE DATABASE "SDBDDL"
AS PERM=104857600
FALLBACK
NO BEFORE JOURNAL
NO AFTER JOURNAL
CREATE DATABASE "SDB"
AS PERM=104857600
FALLBACK
NO BEFORE JOURNAL
NO AFTER JOURNAL
CREATE DATABASE "ODB"
AS PERM=104857600
FALLBACK
NO BEFORE JOURNAL
NO AFTER JOURNAL
SDBDDL库建表:
CREATE MULTISET TABLE SDBDDL.SW5_SALARY_INFO(
ID VARCHAR(10) CHARACTER SET LATIN CASESPECIFIC TITLE '账户ID',
NAME VARCHAR(10) CHARACTER SET LATIN CASESPECIFIC TITLE '户名',
BALANCE DECIMAL(12,2) TITLE '余额')
PRIMARY INDEX ( ID );
注意:
1.定义为SET表的表不存储重复记录,而MULTISET表可存储重复记录
2.CASESPECIFIC ——大小写敏感,NOT CASESPECIFIC ——大小写不敏感
ODB库建表:(状态类无删除表)
CREATE MULTISET TABLE ODB.SW5_SALARY_INFO(
ID VARCHAR(10) CHARACTER SET LATIN CASESPECIFIC TITLE '账户ID' NOT NULL,
NAME VARCHAR(10) CHARACTER SET LATIN CASESPECIFIC TITLE '户名' NOT NULL,
BALANCE DECIMAL(12,2) TITLE '余额' NOT NULL,
START_DT DATE FORMAT 'YYYYMMDD' TITLE '开始时间' NOT NULL,
END_DT DATE FORMAT 'YYYYMMDD' TITLE '结束时间' NOT NULL
)
PRIMARY INDEX ( ID );
给ODB铺个底:
INSERT INTO ODB.SW5_SALARY_INFO VALUES('001','张三','520',DATE'2020-01-02',DATE'2999-12-31');
INSERT INTO ODB.SW5_SALARY_INFO VALUES('002','李四','2100',DATE'2020-10-10',DATE'2999-12-31');
ODB有两条铺底数据了
20201016的下发增量数据:
张三的余额发生变化,变成1314
新增了王五的数据
入库 ——SDB.SW5_SALARY_INFO1016
INSERT INTO SDB.SW5_SALARY_INFO1016 VALUES('001','张三','1314');
INSERT INTO SDB.SW5_SALARY_INFO1016 VALUES('003','王五','5000');
将20201016的数据入导ODB:
建一张临时表VT_NEW(ODB表结构):
CREATE MULTISET VOLATILE TABLE VT_NEW, NO LOG AS ODB.SW5_SALARY_INFO WITH NO DATA ON COMMIT PRESERVE ROWS;
取出新增和修改的数据,导入临时表:
INSERT INTO VT_NEW(
ID
,NAME
,BALANCE
,START_DT
,END_DT)
SELECT
N.ID
,N.NAME
,N.BALANCE
,DATE'2020-10-16'
,DATE'2999-12-31'
FROM
(SELECT
COALESCE( TRIM( ID ) , '' ) AS ID
,COALESCE( TRIM( NAME ) , '' ) AS NAME
,COALESCE( TRIM( BALANCE ) , 0 ) AS BALANCE
FROM SDB.SW5_SALARY_INFO1016 ) N
LEFT JOIN
(SELECT
ID
,NAME
,BALANCE
FROM ODB.SW5_SALARY_INFO
WHERE END_DT = DATE'2999-12-31' ) T
ON
N.ID = T.ID
WHERE T.ID IS NULL
OR N.NAME<>T.NAME OR N.BALANCE<>T.BALANCE
;
查临时表:
SELECT * FROM VT_NEW;
张三是修改的数据
王五是新增的数据
修改目标表
UPDATE T FROM ODB.SW5_SALARY_INFO T, VT_NEW N
SET END_DT = DATE'2020-10-16'
WHERE T.END_DT = DATE'2999-12-31'
AND N.ID = T.ID
;
INSERT INTO ODB.SW5_SALARY_INFO
SELECT * FROM VT_NEW;
到这一步,就完成了20201016的增量数据入库到近源模型层ODB