一、概述
根据星型模型的概念,不存在渐变维度,数据存在冗,典型例子地域维度表,如国家,省,市这种树形数据结构。
OLTP数据结构:
id | pid | name |
1 | 中国 | |
2 | 1 | 广东省 |
3 | 2 | 深圳 |
期望的星型模型数据结构:
id | country | province | city | level |
1 | 中国 | UNKNOWN | UNKNOWN | 1 |
2 | 中国 | 广东省 | UNKNOWN | 2 |
3 | 中国 | 广东省 | 深圳 | 3 |
二、GreenPlum处理方案
gp底层是PostgreSQL, 支持递归,实现如下:
WITH RECURSIVE cte AS (
SELECT
"id",
"name",
1 AS "level"
FROM
t_location
UNION ALL
SELECT
"c"."id",
(
"p"."name" || '/' || "c"."name"
) :: VARCHAR (255) AS "name",
"p"."level" + 1 AS "level"
FROM
cte AS "p"
JOIN t_location AS "c" ON "c"."pid" = "p"."id"
)
SELECT
"id",
split_part("name", '/', 1) AS country,
split_part("name", '/', 2) AS province,
split_part("name", '/', 3) AS city,
"level"
FROM
(
SELECT
"c"."id",
"c"."name",
"c"."level"
FROM
cte "c"
INNER JOIN (
SELECT
"id",
MAX ("level") AS "lv"
FROM
cte
GROUP BY
"id"
) AS "d" ON "c"."id" = "d"."id"
AND "c"."level" = "d"."lv"
) AS "t"
源数据: