5. 查询首次下单后第二天连续下单的用户比率

最新推荐文章于 2023-12-01 12:20:18 发布

Dataer__

最新推荐文章于 2023-12-01 12:20:18 发布

阅读量451

点赞数 1

分类专栏： # HQL题目文章标签： sql

本文链接：https://blog.csdn.net/weixin_43676735/article/details/131469087

版权

HQL题目专栏收录该内容

50 篇文章 13 订阅

订阅专栏

文章目录

题目需求

从订单信息表(order_info)中查询首次下单后第二天仍然下单的用户占所有下单用户的比例，结果保留一位小数，使用百分数显示。

期望结果如下：

percentage
70.0%

需要用到的表

订单信息表：order_info

order_id (订单id)	user_id (用户id)	create_date (下单日期)	total_amount (订单金额)
1	101	2021-09-30	29000.00
10	103	2020-10-02	28000.00

实现一

-- 5） 使用 笛卡尔积将两个结果关联起来
select concat(count1 / all_haveOrder_userNum * 100, '%') as percentage
from (
         -- 3） 选出 create_date = first_date and datediff(last_one_date, first_date) = 1 的记录
         select count(*) count1
         from (
                  -- 2） 按照用户分组、每组按下单日期升序排序，开窗 计算 first_value（）、lead（create_date ， 1， ‘9999-99-99’）
                  select user_id,
                         create_date,
                         first_value(create_date) over (partition by user_id order by create_date) first_date,
                         lead(create_date, 1, '9999-99-99')
                              over (partition by user_id order by create_date)                     last_one_date
                  from (
                           -- 1） 对用户的下单日期进行滤重。原因：同一个用户在同一天可能存在多个订单记录
                           -- distinct 会同时作用于 user_id 和 create_date
                           select distinct user_id,
                                           create_date
                           from order_info
                       ) t1
              ) t2
         where create_date = first_date
           and datediff(last_one_date, first_date) = 1
     ) t3,
     (
         -- 4） 计算所有下单用户的个数
         select count(distinct user_id) all_haveOrder_userNum
         from order_info
     ) t4;

-- ） 按照用户分组，计算每个用户总的下单次数 sum_order，过滤掉 sum_order <=1 的用户（不需要了）

总结

	1. 思想: 在遇到需要对某组中排好序的 第一个和第二个 记录进行比较的时候，不要总是想着把 第一个和第二个 取出来，可以使用 over() 开窗函数，计算每组的 first_value()、lead(,1,)。
	2. 在Hive中，distinct只能放到所有字段的最前面，且当distinct应用到多个字段时，其作用范围是后面的所有字段，而不只是紧挨着它的一个字段。
	3. distinct不能和聚合函数一起使用，否则会报错。
		eg. select distinct user_id, count(xxx) from xxx;

题目来源

http://practice.atguigu.cn/#/question/5/desc?qType=SQL

Dataer__

关注

1
点赞
踩
2

收藏

觉得还不错? 一键收藏
打赏
0
评论
5. 查询首次下单后第二天连续下单的用户比率

1. 思想: 在遇到需要对某组中排好序的第一个和第二个记录进行比较的时候，不要总是想着把第一个和第二个取出来，可以使用 over() 开窗函数，计算每组的 first_value()、lead(,1,)。2. 在Hive中，distinct只能放到所有字段的最前面，且当distinct应用到多个字段时，其作用范围是后面的所有字段，而不只是紧挨着它的一个字段。3. distinct不能和聚合函数一起使用，否则会报错。
复制链接

扫一扫