hive练习题(二)计算用户的复购率和回购率
数据表ods_sales_orders前10行
复购率
定义:在某时间窗口内重复消费用户(消费两次及以上的用户)在总消费用户中占比
例如,按月作为统计窗口,当前月份购买2次及以上的客户占当前月份所有客户的比例。
第一步:计算每位客户在每月的消费次数
create table temp1 as SELECT customer_key,DATE_FORMAT(create_date,'YYYY-MM') AS umonth,
COUNT(customer_key) AS con
FROM ods_sales_orders GROUP BY customer_key,DATE_FORMAT(create_date,'YYYY-MM');
第二步:对消费2次及以上的客户进行统计,并计算复购率
SELECT umonth, SUM(IF(con>1,1,0))AS repurchase_con,COUNT(1) AS customs_con,
concat(round(SUM(IF(con>1,1,0))/COUNT(1)*100,2),'%') AS repurchase_ratio
FROM temp1
GROUP BY umonth;
回购率
定义:是某一个时间窗口内消费的用户,在下一个时间窗口仍旧消费的占比。
例如,按月作为统计窗口,1月份1000人购买了该产品,在2月份这1000人中的300人继续购买,则称1月份的回购率为30%。
第一步:将本月用户的消费情况和下月用户的消费情况做连接,连接方式采用左连接
涉及知识点 :concat函数,add_months函数
select * from
(SELECT customer_key,DATE_FORMAT(create_date,'YYYY-MM') AS umonth
FROM ods_sales_orders GROUP BY customer_key,DATE_FORMAT(create_date,'YYYY-MM')) as a
LEFT JOIN
(SELECT customer_key,DATE_FORMAT(create_date,'YYYY-MM') AS umonth
FROM ods_sales_orders GROUP BY customer_key,DATE_FORMAT(create_date,'YYYY-MM')) as b
ON a.customer_key = b.customer_key AND
CONCAT(a.umonth,'-01')=add_months(CONCAT(b.umonth,'-01'),-1);
第二步:计算本月和下月消费用户数以及相应的回购率
SELECT a.umonth,COUNT(a.customer_key) AS now_count,
COUNT(b.customer_key)AS next_count,
CONCAT(ROUND(COUNT(b.customer_key)/COUNT(a.customer_key)*100,2),'%')AS ratio
FROM
(SELECT customer_key,DATE_FORMAT(create_date,'YYYY-MM') AS umonth
FROM ods_sales_orders GROUP BY customer_key,DATE_FORMAT(create_date,'YYYY-MM'))a
LEFT JOIN
(SELECT customer_key,DATE_FORMAT(create_date,'YYYY-MM') AS umonth
FROM ods_sales_orders GROUP BY customer_key,DATE_FORMAT(create_date,'YYYY-MM'))b
ON a.customer_key = b.customer_key AND
CONCAT(a.umonth,'-01')=add_months(CONCAT(b.umonth,'-01'),-1)
GROUP BY a.umonth;