以下有一份直播间的数据
room_id | user_id | start_time | end_time |
---|---|---|---|
001 | 23435 | 2021-11-01 12:09:23 | 2021-11-01 12:24:54 |
001 | 44625 | 2021-11-01 18:34:26 | 2021-11-01 19:35:13 |
002 | 35780 | 2021-11-01 16:24:43 | 2021-11-01 15:54:22 |
004 | 43325 | 2021-11-01 19:04:21 | 2021-11-01 21:13:23 |
006 | 67895 | 2021-11-01 20:13:42 | 2021-11-01 23:09:26 |
… |
字段解释:
- room_id为每个直播间的唯一ID标识
- user_id为每个用户的唯一ID标识
- start_time为每个用户进入直播间的时间
- end_time为每个用户退出直播间的时间
问题:求每个直播间的最高在线人数
思路:
- 每一条数据其实就是对应一条用户的数据,每个start_time相当于直播间用户数加1,每个end_time直播间的用户数减1.
- 一般的想法就是先按直播间分类,对于每个直播间的数据按时间顺序排列.定义一个变量user_num和max_user_num,每进入一个用户将user_num值加1并和max_user_num进行比较,如果user_num>max_user_num则将user_num的值赋给max_user_num,每退出一个用户将user_num减1,遍历所有的数据,最后的max_user_num就是最高在线人数
- 对于sql,可以借鉴上面的思路,具体代码如下:
select room_id
,max(user_num) as max_user_num
from (
select room_id
,start_time
,sum(flag) over(partition by room_id order by start_time asc) user_num
from (
select room_id
,start_time
,1 as flag
from temp
union all
select room_id
,end_time as start_time
,-1 as flag
from temp
) a
) b
group by room_id