第1题:访问量统计
userId visitDate visitCount
u01 2017/1/21 5
u02 2017/1/23 6
u03 2017/1/22 8
u04 2017/1/20 3
u01 2017/1/23 6
u01 2017/2/21 8
U02 2017/1/23 6
U01 2017/2/22 4
使用SQL统计出每个用户的累积访问次数,如下表所示:
用户id 月份 小计 累积
u01 2017-01 11 11
u01 2017-02 12 23
u02 2017-01 12 12
u03 2017-01 8 8
u04 2017-01 3 3
with t0 as (select *, date_format(regexp_replace(v.visitdate, '/', '-'), 'yyyy-MM') month from visit v),
t1 as (select userid, month, sum(visitcount) visitTotal from t0 group by userid, month),
t2 as (select *, sum(visitTotal) over (partition by userid order by visitTotal) visitSub from t1)
第2题:电商场景TopN统计
user_id shop
u1 a
u2 b
u1 b
u1 a
u3 c
u4 b
u1 a
u2 c
u5 b
u4 b
u6 c
u2 c
u1 b
u2 a
u2 a
u3 a
u5 a
u5 a
u5 a
(1)每个店铺的UV(访客数)
with t0 as (select shop, count(distinct user_id) usercnt from shopvisit group by shop)
select *
from t0;
(2)每个店铺访问次数top3的访客信息。输出店铺名称、访客id、访问次数
with t1 as (select shop, user_id, count(user_id) cnt from shopvisit group by shop, user_id),
t2 as (select *, row_number() over (partition by shop order by cnt desc) rk from t1),
t3 as (select * from t2 where rk <= 3)
第3题:订单量统计
2017-01-01,10029028,1000003251,33.57
2017-01-01,10029029,1000003251,33.57
2017-01-01,100290288,1000003252,33.57
2017-02-02,10029088,1000003251,33.57
2017-02-02,100290281,1000003251,33.57
2017-02-02,100290282,1000003253,33.57
2017-11-02,10290282,100003253,234.00
2018-11-02,10290284,100003243,234.00
(1)给出2017年每个月的订单数、用户数、总成交金额
with t0 as (select *, date_format(dt, 'yyyy-MM') month from orders where year(dt) = 2017),
t1 as (select month, count(order_id) order_cnt, count(user_id) user_cnt, sum(amount) amount_sum from t0 group by month)
(2)给出2017年11月的新客数(指在11月才有第一笔订单)
with t0 as (select * from orders where date_format(dt, 'yyyy-MM') = '2017-11'),
t1 as (select count(distinct user_id) new_user from t0)
第4题:大数据排序统计
movie_user
user_id name age
001 u1 10
002 u2 15
003 u3 15
004 u4 20
005 u5 25
006 u6 35
007 u7 40
008 u8 45
009 u9 50
0010 u10 65
movie_log
user_id url
001 url1
002 url1
003 url2
004 url3
005 url3
006 url1
007 url5
008 url7
009 url5
0010 url1
根据年龄段观看电影的次数进行排序
with t0 as (select *, concat(`floor`(age / 10) * 10, '-', `floor`((age / 10) + 1) * 10) age_phase from movie_user mu join movie_log ml on mu.user_id = ml.user_id),
t1 as (select age_phase, count(t0.user_id) view_cnt from t0 group by age_phase)
第5题:活跃用户统计
日期 用户 年龄
2019-02-11 test_1 23
2019-02-11 test_2 19
2019-02-11 test_3 39
2019-02-11 test_1 23
2019-02-11 test_3 39
2019-02-11 test_1 23
2019-02-12 test_2 19
2019-02-13 test_1 23
2019-02-15 test_2 19
2019-02-16 test_2 19
求所有用户和活跃用户的总数及平均年龄(活跃用户指连续2天都有访问记录的用户)
with t0 as (select count(distinct user_id) users, avg(age) avg_age from user_log),
t1 as (select user_id, dt, age from user_log group by user_id, dt, age),
t2 as (select *, row_number() over (partition by user_id order by dt) rn from t1),
t3 as (select *, date_sub(dt, rn) newdt from t2),
t4 as (select user_id, newdt, age, count(*) login_days from t3 group by user_id, newdt, age having login_days >= 2),
t5 as (select user_id, age from t4 group by user_id, age)
select '所有用户' type, users, avg_age from t0
union
select '活跃用户', count(user_id), avg(age) from t5;