牛客网SQL大厂面试真题

目录

1、各个视频的平均完播率

2、平均播放进度大于60%的视频类别

3、每类视频近一个月的转发量/率

4、每个创作者每月的涨粉率及截止当前的总粉丝量

5、国庆期间每类视频点赞量和转发量 

6、近一个月发布的视频中热度最高的top3视频

7、2021年11月每天的人均浏览文章时长

8、每篇文章同一时刻最大在看人数

9、 2021年11月每天新用户的次日留存率

10、每天的日活数及新用户占比

11、 统计活跃间隔对用户分级结果

12、连续签到领金币

14、统计2021年10月每个退货率不大于0.5的商品各项指标

15、某店铺的各商品毛利率及店铺整体毛利率

16、零食类商品中复购率top3高的商品

17、10月的新户客单价和获客成本

18、店铺901国庆期间的7日动销率和滞销率


  • 1、各个视频的平均完播率

视频完播率是指完成播放次数占总播放次数的比例。

简单起见,结束观看时间与开始播放时间的差>=视频时长时,视为完成播放。

avg()此刻的用法相当于满足条件的数量 / 个数,例如 1 0 0 1 1 1 0 0    那么avg()等于4/8

select video_id,
round(avg(if(timestampdiff(second,start_time,end_time) >= duration,1,0)),3)  as	avg_comp_play_rate
from tb_user_video_log left join tb_video_info using(video_id)
where year(start_time) =2021
group by video_id
order by avg_comp_play_rate DESC

  • 2、平均播放进度大于60%的视频类别

播放进度=播放时长÷视频时长*100%,当播放时长大于视频时长时,播放进度均记为100%。

计算各类视频的平均播放进度,将进度大于60%的类别输出。

先乘以100后的数据更准确,先用函数后处理和先乘以100后用函数处理的位数是不一样的: 
例如0.61345先用函数处理保留两位小数 就变成了0.61,再乘以100就变成61.00了,
先乘以100 就是61.345,四舍五入就变成61.35,所以结果更加准确。

select tag,
concat(
    round(
        avg(
            if(timestampdiff(second,start_time,end_time) >= duration,duration,timestampdiff(second,start_time,end_time))
        / duration * 100),
    2),
 "%")  as avg_play_progress
from tb_user_video_log join tb_video_info using(video_id)
group by tag
having avg_play_progress  > "60.00%"
order by avg_play_progress DESC;

# having replace(avg_play_progress,"%","") > 60
# SUBSTRING_INDEX(avg_play_progress,'%',1)>60

  • 3、每类视频近一个月的转发量/率

统计在有用户互动的最近一个月(按包含当天在内的近30天算,比如10月31日的近30天为10.2~10.31之间的数据)中,每类视频的转发量和转发率。转发率=转发量÷播放量。

近一个月:作为筛选条件

# where timestampdiff(
#     day,
#     date(start_time),
#     date((select max(start_time)from tb_user_video_log)))<30


select tag, 
sum(if_retweet) as retweet_cut,
round(avg(if_retweet) , 3) as retweet_rate
from tb_user_video_log join tb_video_info using(video_id)
where datediff(date((select max(start_time) from tb_user_video_log)),date(start_time)) < 30
group by tag
order by  retweet_rate DESC 

# where timestampdiff(
#     day,
#     date(start_time),
#     date((select max(start_time)from tb_user_video_log)))<30

  • 4、每个创作者每月的涨粉率及截止当前的总粉丝量

涨粉率=(加粉量 - 掉粉量) / 播放量。

if_follow-是否关注:为1表示用户观看视频中关注了视频创作者,为0表示此次互动前后关注状态未发生变化,为2表示本次观看过程中取消了关注。

窗口函数:第一个sum是求每月的涨粉数量,第二个sum是对每月涨粉量的累计求和!

# sum(sum(if(if_follow=1,1,if(if_follow=2,-1,0))))

select author,
date_format(start_time,"%Y-%m") as month,
round(sum(case when if_follow = 0 then 0
         when if_follow = 1 then 1
         else -1 end ) / count(if_follow) 
         ,3) as 	fans_growth_rate,
sum(sum((case when if_follow = 0 then 0
         when if_follow = 1 then 1
         else -1 end )))
over (partition by author order by date_format(start_time,"%Y-%m")) as total_fans
from tb_user_video_log join tb_video_info using(video_id)
where year(start_time) = "2021"
group by author,month
order by author,total_fans;



# round(avg(if(if_follow=1,1,if(if_follow=2,-1,0))),3) as fans_growth_rate,
# sum(sum(if(if_follow=1,1,if(if_follow=2,-1,0))))
# over (partition by author order by date_format(start_time,"%Y-%m")) as total_fans

  • 5、国庆期间每类视频点赞量和转发量 

统计2021年国庆头3天 每类视频每天的近一周总点赞量和一周内最大单天转发量。

近一周:作为值

sum(sum(if_like)) over (partition by tag 
order by DATE_FORMAT(start_time,'%Y-%m-%d') rows 6 preceding)

select * from 
(
select tag,
DATE_FORMAT(start_time,'%Y-%m-%d') dt,

sum(sum(if_like)) over (partition by tag 
order by DATE_FORMAT(start_time,'%Y-%m-%d') rows 6 preceding) as sum_like_cnt_7d,

max(sum(if_retweet)) over (partition by tag 
order by DATE_FORMAT(start_time,'%Y-%m-%d') rows 6 preceding) as max_retweet_cnt_7d 

from tb_user_video_log join tb_video_info using(video_id)
group by tag,DATE_FORMAT(start_time,'%Y-%m-%d')
) t

where dt between "2021-10-01" and "2021-10-03"
order by tag DESC,dt ASC

  • 6、近一个月发布的视频中热度最高的top3视频

热度=(a*视频完播率+b*点赞数+c*评论数+d*转发数)*新鲜度;

新鲜度=1/(最近无播放天数+1);

当前配置的参数a,b,c,d分别为100、5、3、2。

最近播放日期以end_time-结束观看时间为准,假设为T,则最近一个月按[T-29, T]闭区间统计。

select video_id,
round((100*complete_rate+5*like_cnt+3*comment_cnt+2*retweet_cnt)/(1+day_cnt),0) as hot_index
from 

(select video_id,
avg(if(timestampdiff(second,start_time,end_time)>=duration,1,0)) as complete_rate,
sum(if_like) as like_cnt,
count(comment_id) as comment_cnt,
sum(if_retweet) as retweet_cnt,
timestampdiff(day,max(date(end_time)),(select max(date(end_time)) from tb_user_video_log)) as day_cnt

from tb_user_video_log join tb_video_info using(video_id)

where timestampdiff(day,date(release_time),(select max(date(end_time)) from tb_user_video_log))<30
group by video_id
) t
order by hot_index DESC
limit 3
;

  • 7、2021年11月每天的人均浏览文章时长
select date(in_time) as dt,
round( 
    sum(timestampdiff(second,in_time,out_time)) / count(distinct uid)  
    ,1) as avg_viiew_len_sec
from tb_user_log
where date_format(in_time,"%Y-%m") = "2021-11" and artical_id!=0
group by dt
order by avg_viiew_len_sec ASC;

  • 8、每篇文章同一时刻最大在看人数

如果同一时刻有进入也有离开时,先记录用户数增加再记录减少。

1代表进入,-1代表离开。

select artical_id,max(cnt) as max_uv from
(
select artical_id,
sum(num) over( partition by artical_id order by dt asc,num desc) cnt
from
(
select artical_id,in_time as dt,1 as num from tb_user_log
where artical_id!=0 
union all
select artical_id,out_time as dt,-1 as num from tb_user_log
where artical_id!=0 
) a
) b
group by artical_id
order by max_uv DESC;

  • 9、 2021年11月每天新用户的次日留存率

次日留存率为当天新增的用户数中第二天又活跃了的用户数占比。

如果in_time-进入时间和out_time-离开时间跨天了,在两天里都记为该用户活跃过。

在表连接时,条件语句放在on和where的位置不同所产生不同的结果:

on用于连接操作时的匹配条件;where是在连接操作之后用于筛选出满足一定条件的最终结果行。

outer join有一个特点:就是以一侧的表为基,假如另一侧的表没有符合on筛选条件的记录,则以null替代。在这次的查询中,这一步的作用就是将那条原本应该被过滤掉的记录给添加了回来

select t1.dt,
round(count(distinct t2.uid)/count( t1.uid),2) as uv_left_rate
from 

(select uid,min(date(in_time)) as dt from tb_user_log group by uid
) as t1

left join 

(select uid,date(in_time) as dt from tb_user_log
union 
select uid,date(out_time) as dt from tb_user_log
) as t2 
on t1.uid=t2.uid and timestampdiff(day,t1.dt,t2.dt)=1 次日是否留存表,如果留存,t2.uid匹配,如果不留存,t2.uid为null

where date_format(t1.dt,"%Y-%m")="2021-11"
group by t1.dt
order by t1.dt ASC;

  • 10、每天的日活数及新用户占比

新用户占比=当天的新用户数÷当天活跃用户数(日活数)。

如果in_time-进入时间和out_time-离开时间跨天了,在两天里都记为该用户活跃过。

select t1.dt,count(distinct t1.uid) as dau,
round(count(t2.uid) / count(distinct  t1.uid) , 2) as uv_new_ratio
from
(
select uid,date(in_time) as dt from tb_user_log
union 
select uid,date(out_time) as dt from tb_user_log
) t1  #用户活跃表

left join 

(
select uid,min(date(in_time)) as dt from tb_user_log
group by uid
) t2  # 新用户表

 on t1.uid = t2.uid and t1.dt = t2.dt
group by dt
order by dt;

  • 11、 统计活跃间隔对用户分级结果

统计活跃间隔对用户分级后,各活跃等级用户占比。

用户等级标准简化为:忠实用户(近7天活跃过且非新晋用户)、新晋用户(近7天新增)、沉睡用户(近7天未活跃但更早前活跃过)、流失用户(近30天未活跃但更早前活跃过)。

假设就是数据中所有日期的最大值。

近7天表示包含当天T的近7天,即闭区间[T-6, T]。

SELECT user_grade, round(count(uid)/(select count(distinct uid) from tb_user_log),2) ratio
FROM (SELECT uid, 
            (CASE WHEN DATEDIFF(DATE((SELECT MAX(out_time) FROM tb_user_log)),date(max(in_time)))<=6
            AND DATEDIFF(DATE((SELECT MAX(out_time) FROM tb_user_log)),date(min(in_time)))>6
            THEN '忠实用户'
            WHEN DATEDIFF(DATE((SELECT MAX(out_time) FROM tb_user_log)),date(min(in_time)))<=6
            THEN '新晋用户'
            WHEN DATEDIFF(DATE((SELECT MAX(out_time) FROM tb_user_log)),date(max(in_time))) BETWEEN 7 AND 29
            THEN '沉睡用户'
            WHEN DATEDIFF(DATE((SELECT MAX(out_time) FROM tb_user_log)),date(max(in_time)))>29
            THEN '流失用户' END) AS user_grade
            FROM tb_user_log
            GROUP BY uid) a
GROUP BY user_grade
ORDER BY ratio DESC;

  • 12、连续签到领金币

从2021年7月7日0点开始,10月底结束,用户每天签到可以领1金币,并可以开始累积签到天数,连续签到的第3、7天分别可额外领2、6金币。每连续签到7天后重新累积签到天数(即重置签到天数:连续第8天签到时记为新的一轮签到的第一天,领1金币)。计算每个用户2021年7月以来每月获得的金币数

:如果签到记录的in_time-进入时间和out_time-离开时间跨天了,也只记作in_time对应的日期签到了。

WITH t1 AS( -- t1表筛选出活动期间内的数据,并且为了防止一天有多次签到活动,distinct 去重
	SELECT
		DISTINCT uid,DATE(in_time) dt,
		DENSE_RANK() over(PARTITION BY uid ORDER BY DATE(in_time)) rn -- 编号  # 根据uid分组dt排序得到rn
	FROM
		tb_user_log
	WHERE
		DATE(in_time) BETWEEN '2021-07-07' AND '2021-10-31' AND artical_id = 0 AND sign_in = 1
),
t2 AS (
	SELECT *,
	DATE_SUB(dt,INTERVAL rn day) dt_tmp, 
	case DENSE_RANK() over(PARTITION BY DATE_SUB(dt,INTERVAL rn day),uid ORDER BY dt )%7 -- 再次编号    # 以uid分组、dt_tmp排序再次dense_rank获得连续签到的天数
		WHEN 3 THEN 3
		WHEN 0 THEN 7
		ELSE 1
	END as day_coin -- 用户当天签到时应该获得的金币数
	FROM
	t1
)
	SELECT
		uid,DATE_FORMAT(dt,'%Y%m') `month`, sum(day_coin) coin  -- 总金币数
	FROM
		t2
	GROUP BY
		uid,DATE_FORMAT(dt,'%Y%m')
	ORDER BY
		DATE_FORMAT(dt,'%Y%m'),uid;

 

  • 13、计算商城中2021年每月的GMV

请计算商城中2021年每月的GMV,输出GMV大于10w的每月GMV,值保留到整数。

:GMV为已付款订单和未付款订单两者之和。结果按GMV升序排序。

select date_format(event_time,"%Y-%m") as month,
round(sum(if(status=2,0,total_amount)) ,0) as GMV
from tb_order_overall
where year(event_time)  = "2021"
group by month
having GMV>100000
order by GMV

  • 14、统计2021年10月每个退货率不大于0.5的商品各项指标

商品点展比=点击数÷展示数;

加购率=加购数÷点击数;

成单率=付款数÷加购数;退货率=退款数÷付款数,

当分母为0时整体结果记为0,结果中各项指标保留3位小数,并按商品ID升序排序。

select product_id,
round( avg(if_click) ,3) as ctr,
round(sum(if_cart) / sum(if_click) , 3) as cart_rate,
round( sum(if_payment) /sum(if_cart) ,3) as payment_rate,
round( sum(if_refund) / sum(if_payment) , 3) as refund_rate
from tb_user_event
where date_format(event_time,"%Y-%m") = "2021-10"
group by product_id
having refund_rate<=0.5
order by product_id ASC;

  • 15、某店铺的各商品毛利率及店铺整体毛利率

请计算2021年10月以来店铺901中商品毛利率大于24.9%的商品信息及店铺整体毛利率。

:商品毛利率=(1-进价/平均单件售价)*100%;

店铺毛利率=(1-总进价成本/总销售收入)*100%。

结果先输出店铺毛利率,再按商品ID升序输出各商品毛利率,均保留1位小数。

(
select "店铺汇总" as product_id,
concat(round( (1-sum(in_price * cnt) / sum(price * cnt))*100,1)  ,"%")
as profit_rate
from tb_product_info join tb_order_detail using(product_id) join tb_order_overall using(order_id)
where shop_id="901" and date_format(event_time,"%Y-%m") >= "2021-10"
) 

union all
(
select product_id,
concat(round( (1- sum(in_price * cnt) / sum(price * cnt))*100,1)  ,"%")
as profit_rate
from tb_product_info join tb_order_detail using(product_id) join tb_order_overall using(order_id)
where shop_id="901" and date_format(event_time,"%Y-%m") >= "2021-10"
group by product_id
having substring_index(profit_rate ,"%",1) >24.9
order by product_id
)

  • 16、零食类商品中复购率top3高的商品

复购率指用户在一段时间内对某商品的重复购买比例,复购率越大,则反映出消费者对品牌的忠诚度就越高,也叫回头率

此处我们定义:某商品复购率 = 近90天内购买它至少两次的人数 ÷ 购买它的总人数

近90天指包含最大日期(记为当天)在内的近90天。结果中复购率保留3位小数,并按复购率倒序、商品ID升序排序

select product_id,
round( count(distinct if(num >1 ,uid,null)) / count(distinct uid) , 3) as repurchase_rate
from
(
select product_id,uid,
count(order_id) as num  # 打标签 购买次数
from tb_product_info join tb_order_detail using(product_id) join tb_order_overall using(order_id)
where timestampdiff(day,date(event_time),(select date(max(event_time)) from tb_order_overall)) < 90 
and tag = "零食"
group by product_id,uid

) t
GROUP BY product_id
ORDER BY repurchase_rate DESC,product_id
LIMIT 3;

  • 17、10月的新户客单价和获客成本

问题:请计算2021年10月商城里所有新用户的首单平均交易金额(客单价)和平均获客成本(保留一位小数)。

:订单的优惠金额 = 订单明细里的{该订单各商品单价×数量之和} - 订单总表里的{订单总金额} 

with
    main as(
        #统计每个订单的商品总金额
        select
            order_id,
            sum(price*cnt) as uid_cost
        from tb_order_detail
        group by order_id
    )

select
    round(avg(total_amount),1) as avg_amount,
    round(avg(uid_cost-total_amount),1) as avg_cost
from tb_order_overall
join main using(order_id)
where date_format(event_time,'%Y%m') = '202110'
and status = 1
and (uid,event_time) in(
    select
        uid,
        min(event_time) as event_time
    from tb_order_overall
    group by uid
)

  • 18、店铺901国庆期间的7日动销率和滞销率

  • 11
    点赞
  • 9
    收藏
    觉得还不错? 一键收藏
  • 1
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值