MySql窗口函数学习

窗口函数

分析函数 over(partition by 列名 order by 列名 rows between 开始位置 and 结束位置)
Partition by 是分区函数
Order by 排序


窗口范围
指定窗口范围 between and
经常使用的是 rows between unbounded preceding and current row 从第一行到当前行
Unbounded preceding 开始行
2 preceding 往前推两行
Current row 当前行
3 following 向下取多少行
Unbounded following 结尾

排序

row_number() 1,2,3
rank() 1,2,2,4
dense_rank() 1,2,2,3

聚合类

Avg() sum() max() min()

其他类
Lag(列名,往前的行数),可以计算用户上次购买时间或者用户下次购买时间
Lead(列名,往后的行数)
Ntitle(n)把有序分区中的行数分发到指定数据的组中,每个组有从1开始的编号,ntitle返回该行所属的组的编号 例如n=5就是把表平均分成5份

select name, orderdate, cost,
lag(orderdate,1) over(partition by name order by orderdate) 
from business

分布函数
percent_rank() 分(组内当前行的RANK值-1)/(分组内总行数-1)
Cume_dist()=小于等于当前值的行数/分组内总行数

范例
1.求七天累加金额

select visited_on,sum(amount) over(order by visited_on rows between 6 preceding and current rows) as amount,#七天的累加金额
round(avg(amount) over (order by visited_on rows between 6 preceding and current row),2) as average_amount
from
(select visited_on, sum(amount) as amount from customer group by visited_on) s t #每个访问对应的日期
where timesdiff(day,select(min(visited_on) from customer),visited_on)>=6

2.求环比、同比

#环比
select 
date(year_month) as year_month,
sales,
lag(sales,1) (order by year_month) as lag_1_sales,
sales/lag(sales,1) (order by year_month)-1 as mom_growth
from new_table

#同比
select 
date(year_month) as year_month,
sales,
lag(sales,12) (order by year_month) as lag_1_sales,
sales/lag(sales,12) (order by year_month)-1 as yoy_growth
from new_table

3.统计用户最近第二次活动的信息,如果只有一次那么就主要第一次就可以
法一:判断条件,对不同条件,用不同选择标准

#首先计算有没有第二次(临时表)
(select username,case when count(startDate)>1 then 1 else 0 end as over_1
from UserActivity
group by username
)
#之后连接筛选
select t.username,activity,startDate,endDate from
(select username,startDate,endDate,rank() over (partition by username order by startDate desc) as rank1
from UserActivity) t
left join tmp a on t.username=a.username
where case when over_1=1 then rank1=2
when over_1=0 then rank=1 end

法2:根据不同选择条件更改取值

select username, activity, startDate,
case when cas=0 then 2 else dense_rank() over(partition by username order by Startdate) as rank from 
(select *, case when count(*)<=1 then 0 else 1 end  as ca from UserActivity group by username) as t 
where rank=2

4.不使用窗口函数自己构建排序

select p1.product,p1.price,
(select count(p2.price) from product_info p2
where p2.price>p1.price)+1 as rank_1
from
product_info p1

5.三种取第二名的方法
limit;窗口函数;非等值自连接

select student_id, score from student_score order by score desc limit 1,1
#Limit[offset],rows 默认偏移量是0
 select dense_rank() over(order by score desc) as r
 from student_id where r=2
select p1.product,p1.price,
 (select count(distinct p2.price) from product_info p2
where p2.price>p1.price)+1 as rank_1
from
product_info p1
 where rank_1=2

6.求销量top10%的商品信息
percent_Rank() (分组内当前的rank值-1)/(分组内总行数-1)
cume_rank()小于等于当前值的行数/分组内总行数
row_number()

with tmp as (
select product,
row_number() over (order by sales desc) as `rank`
from sku_Sales)

select*from tmp a 
left join (
select ceiling(count(1)*0.1) as Rank_bar
from sku_sales) b on 1=1 where `rank`<=rank_bar

#ntile将排序分区中的行划分为特定数量的组,对于每一行,ntile()返回一个桶号,表示行所属的组

select priduct,
ntile(10) over (order by sales desc) as ntile_rank
from sku_sales where ntile_rank=1
#如果分区行的数量不能被整除n,则ntile()函数将生成两个大小的组
  • 0
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值