数仓开发:如何计算投放效果?

背景介绍

业务介绍:用户是通过低价引流进来,然后通过复购购买高价商品,可以多次购买。低价商品和高价商品均可以退款,高价商品由于各种复杂的场景,可能会有多次退款。低价商品如果退款是全退,不存在多次退款。
业务需求:计算低价引流的流量后续的产出,复购的期限是在 30天内,退款的期限也是按 30天(退款时间-复购时间),最终交付一个按引流订单维度。

架构设计

由于引流订单和复购订单的性质不同,在数仓贴源层处理的过程中,已经将二者分离独立成表,此处是在贴源层的基础上加以处理。
将引流订单记录、复购订单记录和退款订单记录以引流订单为主表整合到一起,输出不加业务逻辑的事实表:订单链路数据表。在此基础上,再根据不同的业务逻辑处理为不同聚合表单。

数据开发

说明:为方便测试,以下使用的是 MySQL 语法,实际开发使用的是阿里云的 MaxCompute SQL,本代码的差异点在于日期函数date_add(),在 MaxCompute SQL 中,语法有一定差异,后者是date_add(<date_col_name>, 30),如果时间字段包含日期和时间,则要使用dateadd(<datetime_col_name>,30,'dd')

订单链路数据表好处理,把三个表根据用户信息进行 JOIN 即可,伪代码参考如下:

select *
from <引流订单记录>
left join <复购订单记录> on <用户信息>
left join <退款订单记录> on <用户信息>

有了订单链路数据表,接下来按 30天的复购和30天的退款期限聚合为引流订单产出表。
抽象出一个表单,数据记录如下,有 10 个字段

  • low_price_order_id:低价订单号
  • low_price_paid_time:低价订单付款时间
  • low_price_paid_amount:低价订单付款金额
  • low_price_refund_time:低价订单退款时间
  • low_price_refund_amount:低价订单退款金额
  • high_price_order_id:高价订单号
  • high_price_paid_time:高价订单付款时间
  • high_price_paid_amount:高价订单付款金额
  • high_price_refund_time:高价订单退款时间
  • high_price_refund_amount:高价订单退款金额

with user_orders as(
-- 只有低价订单
select '10001' as "low_price_order_id",'2024-01-01' as "low_price_paid_time",1.0 as "low_price_paid_amount",null as "low_price_refund_time",0 as "low_price_refund_amount"
   ,null as "high_price_order_id",null as "high_price_paid_time",0 as "high_price_paid_amount",null as "high_price_refund_time",0 as "high_price_refund_amount"
union all
-- 只有低价订单,且退款
select '10002' as "low_price_order_id",'2024-01-01' as "low_price_paid_time",1.0 as "low_price_paid_amount",'2024-01-02' as "low_price_refund_time",1.0 as "low_price_refund_amount"
   ,null as "high_price_order_id",null as "high_price_paid_time",0 as "high_price_paid_amount",null as "high_price_refund_time",0 as "high_price_refund_amount"
union all
-- 低价订单+一个高价订单
select '10003' as "low_price_order_id",'2024-01-01' as "low_price_paid_time",1.0 as "low_price_paid_amount",null as "low_price_refund_time",0 as "low_price_refund_amount"
   ,'20001' as "high_price_order_id",'2024-01-01' as "high_price_paid_time",1000 as "high_price_paid_amount",null as "high_price_refund_time",0 as "high_price_refund_amount"
union all
-- 低价订单+2个高价订单
select '10004' as "low_price_order_id",'2024-01-01' as "low_price_paid_time",1.0 as "low_price_paid_amount",null as "low_price_refund_time",0 as "low_price_refund_amount"
   ,'20002' as "high_price_order_id",'2024-01-02' as "high_price_paid_time",2000 as "high_price_paid_amount",null as "high_price_refund_time",0 as "high_price_refund_amount"
union all

select '10004' as "low_price_order_id",'2024-01-02' as "low_price_paid_time",1.0 as "low_price_paid_amount",null as "low_price_refund_time",0 as "low_price_refund_amount"
   ,'20004' as "high_price_order_id",'2024-01-05' as "high_price_paid_time",1000 as "high_price_paid_amount",null as "high_price_refund_time",0 as "high_price_refund_amount"
union all
-- 低价订单+1个30天内高价订单+1个30天外高价订单
select '10005' as "low_price_order_id",'2024-01-02' as "low_price_paid_time",1.0 as "low_price_paid_amount",null as "low_price_refund_time",0 as "low_price_refund_amount"
   ,'20003' as "high_price_order_id",'2024-01-02' as "high_price_paid_time",2000 as "high_price_paid_amount",null as "high_price_refund_time",0 as "high_price_refund_amount"
union all

select '10005' as "low_price_order_id",'2024-01-02' as "low_price_paid_time",1.0 as "low_price_paid_amount",null as "low_price_refund_time",0 as "low_price_refund_amount"
   ,'20010' as "high_price_order_id",'2024-02-04' as "high_price_paid_time",1500 as "high_price_paid_amount",null as "high_price_refund_time",0 as "high_price_refund_amount"
union all

-- 低价订单+一个高价订单且一个退款
select '10006' as "low_price_order_id",'2024-01-02' as "low_price_paid_time",1.0 as "low_price_paid_amount",null as "low_price_refund_time",0 as "low_price_refund_amount"
   ,'20005' as "high_price_order_id",'2024-01-03' as "high_price_paid_time",2000 as "high_price_paid_amount",'2024-01-05' as "high_price_refund_time",2000 as "high_price_refund_amount"
union all
-- 低价订单+一个高价订单且2个退款
select '10007' as "low_price_order_id",'2024-01-02' as "low_price_paid_time",1.0 as "low_price_paid_amount",null as "low_price_refund_time",0 as "low_price_refund_amount"
   ,'20006' as "high_price_order_id",'2024-01-03' as "high_price_paid_time",2000 as "high_price_paid_amount",'2024-01-03' as "high_price_refund_time",1000 as "high_price_refund_amount"
union all

select '10007' as "low_price_order_id",'2024-01-02' as "low_price_paid_time",1.0 as "low_price_paid_amount",null as "low_price_refund_time",0 as "low_price_refund_amount"
   ,'20006' as "high_price_order_id",'2024-01-03' as "high_price_paid_time",2000 as "high_price_paid_amount",'2024-01-04' as "high_price_refund_time",1000 as "high_price_refund_amount"
union all
-- 低价订单+一个高价订单且1个30天内退款1个30天外退款
select '10008' as "low_price_order_id",'2024-01-03' as "low_price_paid_time",1.0 as "low_price_paid_amount",null as "low_price_refund_time",0 as "low_price_refund_amount"
   ,'20007' as "high_price_order_id",'2024-01-03' as "high_price_paid_time",2000 as "high_price_paid_amount",'2024-01-05' as "high_price_refund_time",100 as "high_price_refund_amount"
union all

select '10008' as "low_price_order_id",'2024-01-03' as "low_price_paid_time",1.0 as "low_price_paid_amount",null as "low_price_refund_time",0 as "low_price_refund_amount"
   ,'20007' as "high_price_order_id",'2024-01-03' as "high_price_paid_time",2000 as "high_price_paid_amount",'2024-02-05' as "high_price_refund_time",1900 as "high_price_refund_amount"
union all
-- 低价订单+1个30天内高价订单—+1个30天外高价订单且该单有30天内退款和30天外退款
select '10009' as "low_price_order_id",'2024-01-03' as "low_price_paid_time",1.0 as "low_price_paid_amount",null as "low_price_refund_time",0 as "low_price_refund_amount"
   ,'20008' as "high_price_order_id",'2024-01-03' as "high_price_paid_time",1500 as "high_price_paid_amount",null as "high_price_refund_time",0 as "high_price_refund_amount"
union all

select '10009' as "low_price_order_id",'2024-01-03' as "low_price_paid_time",1.0 as "low_price_paid_amount",null as "low_price_refund_time",0 as "low_price_refund_amount"
   ,'20009' as "high_price_order_id",'2024-02-04' as "high_price_paid_time",2000 as "high_price_paid_amount",'2024-02-05' as "high_price_refund_time",100 as "high_price_refund_amount"
union all

select '10009' as "low_price_order_id",'2024-01-03' as "low_price_paid_time",1.0 as "low_price_paid_amount",null as "low_price_refund_time",0 as "low_price_refund_amount"
   ,'20009' as "high_price_order_id",'2024-02-04' as "high_price_paid_time",2000 as "high_price_paid_amount",'2024-03-06' as "high_price_refund_time",1900 as "high_price_refund_amount"
)
select *
from user_orders;

数据参考如下:

  • 部分只有低价订单,可能有退款,退款可能是 30天内,可能是 30天外;
  • 部分有高价订单,高价订单可能是 30天内,可能是 30天外;
  • 部分有多笔高价订单,每笔可能是 30天内,可能是 30天外;
  • 部分高价订单有退款,退款可能是 30天内,可能是 30天外;
  • 部分高价订单有多笔退款,每笔退款可能是 30天内,可能是 30天外。

image.png

目标表字段记录如下:

  • low_price_order_id:低价订单号
  • low_price_paid_time:低价订单付款时间
  • low_price_paid_amount:低价订单付款金额
  • low_price_refund_amount:30天内低价订单退款金额
  • high_price_paid_amount:30天内高价订单付款金额
  • high_price_refund_amount:30天内高价订单退款金额

接下来开始聚合目标表。
保留低价订单唯一记录,那就直接按照低价订单聚合?
逻辑上没问题,直接聚合试试。

不加日期限制时,结构如下:

select uos.low_price_order_id,uos.low_price_paid_time,uos.low_price_paid_amount
  ,sum(uos.low_price_refund_amount)
  ,sum(uos.high_price_paid_amount)
  ,sum(uos.high_price_refund_amount)
from user_orders uos
group by uos.low_price_order_id,uos.low_price_paid_time,uos.low_price_paid_amount

按 30 天周期限制,对每个聚合字段进行界限判断:

select uos.low_price_order_id,uos.low_price_paid_time,uos.low_price_paid_amount
  ,sum(case when uos.low_price_refund_time<=date_add(uos.low_price_paid_time,interval 30 day) then uos.low_price_refund_amount else 0 end)     low_price_refund_amount
  ,sum(case when uos.high_price_paid_time<=date_add(uos.low_price_paid_time,interval 30 day) then uos.high_price_paid_amount else 0 end)       high_price_paid_amount
  ,sum(case when uos.high_price_paid_time<=date_add(uos.low_price_paid_time,interval 30 day) and uos.high_price_refund_time<=date_add(uos.high_price_paid_time,interval 30 day) then uos.high_price_refund_amount else 0 end)  high_price_refund_amount
from user_orders uos
group by uos.low_price_order_id,uos.low_price_paid_time,uos.low_price_paid_amount

查看结果,显然不行!当高价订单有多笔退款时,数据发散了,直接聚合时,会出现翻倍的异常。
image.png

既然高价订单发散了,为了保证唯一记录,需要分两段聚合,先按低价订单和高价订单聚合退款数据,然后再按低价订单聚合高价订单金额和退款金额,数据流转参考如下。
image.png

不加时间限制时,参考如下:

select uos.low_price_order_id,uos.low_price_paid_amount,uos.low_price_paid_time
    ,sum(uos.low_price_refund_amount)  low_price_refund_amount
    ,sum(uos.high_price_paid_amount)   high_price_paid_amount
    ,sum(uos.high_price_refund_amount) high_price_refund_amount
from(
    select uos.low_price_order_id,uos.low_price_paid_amount,uos.low_price_paid_time,uos.low_price_refund_amount,uos.low_price_refund_time
        ,uos.high_price_order_id,uos.high_price_paid_time,uos.high_price_paid_amount
        ,sum(uos.high_price_refund_amount) high_price_refund_amount
    from user_orders uos
    group by uos.low_price_order_id,uos.low_price_paid_amount,uos.low_price_paid_time,uos.low_price_refund_amount,uos.low_price_refund_time
      ,uos.high_price_order_id,uos.high_price_paid_time,uos.high_price_paid_amount
)uos
group by uos.low_price_order_id,uos.low_price_paid_amount,uos.low_price_paid_time

接下来加上限制。
先按低价订单和高价订单聚合退款数据,把高价订单的退款时间减去高价订单的付款时间在30天内的退款金额聚合。
然后再按低价订单,把高价订单的付款时间减去低价订单的付款时间在 30天内的高价订单的付款金额和高价订单的退款金额聚合,同时把低价订单的退款时间减去低价订单的付款时间在 30天内的低价订单的付款金额聚合,得到最终的目标表。

select uos.low_price_order_id,uos.low_price_paid_amount,uos.low_price_paid_time
    ,sum(case when uos.low_price_refund_time<=date_add(uos.low_price_paid_time,interval 30 day) then uos.low_price_refund_amount else 0 end)     low_price_refund_amount
    ,sum(case when uos.high_price_paid_time<=date_add(uos.low_price_paid_time,interval 30 day) then uos.high_price_paid_amount else 0 end)       high_price_paid_amount
    ,sum(case when uos.high_price_paid_time<=date_add(uos.low_price_paid_time,interval 30 day) then uos.high_price_refund_amount else 0 end)     high_price_refund_amount
from(
    select uos.low_price_order_id,uos.low_price_paid_amount,uos.low_price_paid_time,uos.low_price_refund_amount,uos.low_price_refund_time
        ,uos.high_price_order_id,uos.high_price_paid_time,uos.high_price_paid_amount
        ,sum(case when uos.high_price_refund_time<=date_add(uos.high_price_paid_time,interval 30 day) then uos.high_price_refund_amount else 0 end)  high_price_refund_amount
    from user_orders uos
    group by uos.low_price_order_id,uos.low_price_paid_amount,uos.low_price_paid_time,uos.low_price_refund_amount,uos.low_price_refund_time
      ,uos.high_price_order_id,uos.high_price_paid_time,uos.high_price_paid_amount
)uos
group by uos.low_price_order_id,uos.low_price_paid_amount,uos.low_price_paid_time

结果如下,符合需求的预期。
image.png

可视化

说明:实际业务表单还有很多其他的维度,此处仅抽象出时间维度。

有了目标表,便可以根据目标表按时间维度聚合每月、每周、每天的营收金额、退款率等数据指标。
由于案例数据较少,此处做一个按天查看营收金额和退款率的折线图,看看二者的关系走势。

  • 营收金额:低价订单付款金额-低价订单退款金额+高价订单付款金额-高价订单付款金额
  • 退款率:高价订单退款金额>0 的数量/总低价订单数,一个低价订单可以视为是一个用户,按用户数计算退款率。
select low_price_paid_time,sum(low_price_paid_amount-low_price_refund_amount+high_price_paid_amount-high_price_refund_amount) 营收金额,sum(if(high_price_refund_amount>0,1,0))/count(*) 退款率
from <引流订单产出表>
group by low_price_paid_time
order by low_price_paid_time

image.png

小结

本文介绍了怎么实现以引流的低价订单的为基本维度,按照业务 30天的间隔分别聚合低价订单退款、高价订单金额和高价订单退款金额。

采用了两层表单的设计,一层事实表,一层根据业务不同的口径进行聚合。

处理业务逻辑的时候,根据事实表的结构,采用分段聚合的逻辑先聚合【低价订单+高价订单】,然后再根据【低价订单】进行聚合,最终得到目标表。

目标表开发好之后,在该表的基础上,根据各类业务指标进行聚合并可视化,最终提交给业务方使用。

附录

附上 MaxCompute SQL 处理逻辑,此案例,将 date_add()语法修改即可。

select uos.low_price_order_id,uos.low_price_paid_amount,uos.low_price_paid_time
    ,sum(case when uos.low_price_refund_time<=date_add(uos.low_price_paid_time,30) then uos.low_price_refund_amount else 0 end)     low_price_refund_amount
    ,sum(case when uos.high_price_paid_time<=date_add(uos.low_price_paid_time,30) then uos.high_price_paid_amount else 0 end)       high_price_paid_amount
    ,sum(case when uos.high_price_paid_time<=date_add(uos.low_price_paid_time,30) then uos.high_price_refund_amount else 0 end)     high_price_refund_amount
from(
    select uos.low_price_order_id,uos.low_price_paid_amount,uos.low_price_paid_time,uos.low_price_refund_amount,uos.low_price_refund_time
        ,uos.high_price_order_id,uos.high_price_paid_time,uos.high_price_paid_amount
        ,sum(case when uos.high_price_refund_time<=date_add(uos.high_price_paid_time,30) then uos.high_price_refund_amount else 0 end)  high_price_refund_amount
    from user_orders uos
    group by uos.low_price_order_id,uos.low_price_paid_amount,uos.low_price_paid_time,uos.low_price_refund_amount,uos.low_price_refund_time
      ,uos.high_price_order_id,uos.high_price_paid_time,uos.high_price_paid_amount
)uos
group by uos.low_price_order_id,uos.low_price_paid_amount,uos.low_price_paid_time
  • 10
    点赞
  • 21
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

Xin学数据

为你点亮一盏灯,愿你前进无阻。

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值