近期做漏斗分析的功能开发,总结一下开发过程。
数据库使用的是 starrocks,由于支持 window_funnel 函数,所以直接使用,点击查看。
直接上 sql:
SELECT bi_device_id,
window_funnel(3600, datetime, 0, [(event_name like 'account_login_success') AND (`date` <='2023-12-27'), event_name like 'ad_click']) as `funnel`
FROM business_table
WHERE (date >= '2023-12-26' AND date <= '2023-12-28')
GROUP BY bi_device_id
window_funnel 参数说明:
- 3600 单位依赖第二个参数,如果是 date 类型,单位为天,如果是 datetime 类型,单位为秒
- 取的是 business_table 表中的事件时间字段
- 模式,查看上面链接有 0,1,2,4
- 事件数组,event_name 为表中事件名称的字段,漏斗事件的顺序,其中每一个事件都可以有自己的筛选条件
查询结果如下:
bi_device_id | funnel |
---|---|
e4b5175c-d6b0-45ad-817f-99e3b61e527a | 4 |
c10decef-a12b-4625-9169-4fc9d7c4dc27 | 4 |
00de37e6-cd7e-4414-ad04-daa7ef14e376 | 3 |
3b818e3b-bc6f-4171-a5de-aa8fd4ae7529 | 2 |
94e4f3f0-c972-46cb-a639-7608ae886718 | 1 |
06ec42a7-ffe6-406e-a136-f25424dfa084 | 0 |
2023-12-26 到 12-28 之间的所有事件中,
2023-12-27 号之前触发account_login_success事件后 3600 秒内设备触发 ad_click 事件的数量。
基于上述结果可以根据业务做转化率和流失率、到达率的计算,上 sql:
SELECT ifnull(sum(if(funnel >= 1, 1, 0)), 0) as `account_login_success_step_1`,
ifnull(sum(if(funnel >= 2, 1, 0)), 0) as `ad_click_step_2`,
ifnull(sum(if(funnel >= 3, 1, 0)), 0) as `ad_impression_step_3`,
ifnull(sum(if(funnel >= 4, 1, 0)), 0) as `ad_request_step_4`,
ifnull(round(sum(if(funnel >= 4, 1, 0)) / sum(if(funnel >= 1, 1, 0)), 4), 0) as `all_conversion_rate`,
'1.0' as `1->1_rate`,
ifnull(round(sum(if(funnel >= 2, 1, 0)) / sum(if(funnel >= 1, 1, 0)), 4), 0) as `1->2_rate`,
ifnull(round(sum(if(funnel >= 2, 1, 0)) / sum(if(funnel >= 1, 1, 0)), 4), 0) as `1->2_arrival_rate`,
ifnull(round(sum(if(funnel >= 3, 1, 0)) / sum(if(funnel >= 2, 1, 0)), 4), 0) as `2->3_rate`,
ifnull(round(sum(if(funnel >= 3, 1, 0)) / sum(if(funnel >= 1, 1, 0)), 4), 0) as `2->3_arrival_rate`,
ifnull(round(sum(if(funnel >= 4, 1, 0)) / sum(if(funnel >= 3, 1, 0)), 4), 0) as `3->4_rate`,
ifnull(round(sum(if(funnel >= 4, 1, 0)) / sum(if(funnel >= 1, 1, 0)), 4), 0) as `3->4_arrival_rate`
FROM (SELECT bi_device_id,
window_funnel(3600, datetime, 0,
[(event_name like 'account_login_success') AND (`date` <='2023-12-27'),
event_name like 'ad_click',
event_name like 'ad_impression',
event_name like 'ad_request']) as `funnel`
FROM business_table
WHERE (date >= '2023-12-26' AND date <= '2023-12-28')
GROUP BY bi_device_id) t
查询结果出来就是各步骤的转化率和到达率。
基于上述功能,新增了任意事件,任意事件使用 event_name like ‘%%’ 实现查询任意事件。