统计报表中,通常会有多个维度组合,对应一个指标的情况,如指标-DAU,维度可能是{(APP),(APP、终端),(APP、版本号)}等等。如果每一种维度组合都建一张表,开发需要维护多个结果表,增加开发成本;也可以把不同的维度组合UNION ALL,这样ETL的脚本比较臃肿。Grouping Sets可以解决这个问题。
假设有一张用户埋点日志表:
app | appver | form | os | device |
---|---|---|---|---|
淘宝 | 1.1.0 | app | 安卓 | 123 |
淘宝 | 1.2.0 | 小程序 | IOS | 234 |
京东 | 1.2.1 | M站 | IOS | 256 |
… | … | … | … | … |
SQL:
SELECT TO_CHAR(TO_DATE(ts,'yyyymmdd'),'yyyy-mm-dd') AS statistic_date,
app,
appver,
form,
os,
grouping(app) AS g_app,
grouping(appver) AS g_appver,
grouping(form) AS g_form,
grouping(os) AS g_os,
grouping_id(app,appver,form,os) AS grouping_id,
COUNT(DISTINCT device) AS DAU
FROM tracking_data
WHERE ts=20190820
GROUP BY app,
appver,
form,
os
GROUPING
SETS (
(app,form),
(app,form,appver),
(app,form,os),
(app,form,appver,os)
)
后续,可以根据g_app、g_appver、g_form、g_os
进行维度选择,不选则令g_*=1