有50w
个京东店铺,每个顾客访问任何一个店铺的任何一个商品时都会产生一条访问日志,访问日志存储的表名为Visit
,访客的用户id
为user_id
,被访问的店铺名称为shop
,数据如下:
请统计:
(1) 每个店铺的UV
(访客数)
(2) 每个店铺访问次数top3
的访客信息。输出店铺名称、访客id
、访问次数
实现
(1) 每个店铺的UV
统计,最简单的分组统计即可
select shop,
count(distinct user_id) as uv
from test2 group by shop;
(2) 每个店铺访问次数top3
的访客信息。输出店铺名称、访客id
、访问次数
第一步:先统计每个店铺每个访客的访问次数
select user_id,shop,
count(*) as cnt
from test2
group by user_id,shop
order by shop,user_id
接下来,我们对访客的访问次数进行倒序排序,并筛选访问次数top3
的访客信息,嵌套子查询比较多,可读性不好,因此使用with as
with t1 as (
select user_id,
shop,
count(*) as cnt
from test2
group by user_id, shop
order by shop, user_id
),
t2 as (
select shop,
user_id,
cnt,
row_number() over (partition by shop order by cnt desc) rn
from t1
)
select *from t2 where rn<=3 order by shop ,cnt desc;