需求说明 : 如下为某直播平台主播开播及关播时间,根据该数据计算出平台最高峰同时在线的主播
-- 数据准备
-- DDL
create table test8 (
`id` string comment '主播id',
`stt` string comment '主播登入时间',
`edt` string comment '主播登出时间')
comment '主播登入记录表'
row format delimited fields terminated by '\t'
lines terminated by '\n' stored as orc;
insert overwrite table test8
select '1001', '2021-06-14 12:12:12', '2021-06-14 18:12:12' union all
select '1003', '2021-06-14 13:12:12', '2021-06-14 16:12:12' union all
select '1004', '2021-06-14 13:15:12', '2021-06-14 20:12:12' union all
select '1002', '2021-06-14 15:12:12', '2021-06-14 16:12:12' union all
select '1005', '2021-06-14 15:18:12', '2021-06-14 20:12:12' union all
select '1001', '2021-06-14 20:12:12', '2021-06-14 23:12:12' union all
select '1006', '2021-06-14 21:12:12', '2021-06-14 23:15:12' union all
select '1007', '2021-06-14 22:12:12', '2021-06-14 23:10:12' union all
select '1008', '2021-06-14 11:12:12', '2021-06-14 19:10:12' ;
说明 : 如下为某直播平台主播开播及关播时间,根据该数据计算出平台最高峰同时在线的主播人数
方案1 :
1. 思路分析
1. 每条数据与本表自关联(笛卡尔积)
2. 筛选出 同时在线的数据
同时在线的条件 :
1001 1----------5
2-----------6
t1.stt <= t2.edt
and t1.edt >= t2.stt
-- 查询sql
select
t1.id
,t1.stt
,t1.edt
,count(distinct t2.id) as cnt
from test8 as t1
left outer join test8 as t2
where t1.stt <= t2.edt
and t1.edt >= t2.stt
group by t1.id
,t1.stt
,t1.edt
;
-- 查询结果
t1.id t1.stt t1.edt cnt
1001 2021-06-14 12:12:12 2021-06-14 18:12:12 6
1001 2021-06-14 20:12:12 2021-06-14 23:12:12 5
1002 2021-06-14 15:12:12 2021-06-14 16:12:12 6
1003 2021-06-14 13:12:12 2021-06-14 16:12:12 6
1004 2021-06-14 13:15:12 2021-06-14 20:12:12 6
1005 2021-06-14 15:18:12 2021-06-14 20:12:12 6
1006 2021-06-14 21:12:12 2021-06-14 23:15:12 3
1007 2021-06-14 22:12:12 2021-06-14 23:10:12 3
1008 2021-06-14 11:12:12 2021-06-14 19:10:12 6
方案2 :
1-----3
2-----4
5----6
7-----8
1 1
2 1
3 -1
4 -1
5 1
6 -1
7 1
8 -1
-- 查询sql
with t1 as (
select id
, stt as ds
, 1 as login_flag
from test8
union all
select id
, edt as ds
, -1 as login_flag
from test8
)
select
id
,ds
,login_flag
,sum(login_flag) over(order by ds) as sum_login
from t1
;
-- 查询结果
id ds login_flag sum_login
1008 2021-06-14 11:12:12 1 1
1001 2021-06-14 12:12:12 1 2
1003 2021-06-14 13:12:12 1 3
1004 2021-06-14 13:15:12 1 4
1002 2021-06-14 15:12:12 1 5
1005 2021-06-14 15:18:12 1 6
1002 2021-06-14 16:12:12 -1 4
1003 2021-06-14 16:12:12 -1 4
1001 2021-06-14 18:12:12 -1 3
1008 2021-06-14 19:10:12 -1 2
1001 2021-06-14 20:12:12 1 1
1005 2021-06-14 20:12:12 -1 1
1004 2021-06-14 20:12:12 -1 1
1006 2021-06-14 21:12:12 1 2
1007 2021-06-14 22:12:12 1 3
1007 2021-06-14 23:10:12 -1 2
1001 2021-06-14 23:12:12 -1 1
1006 2021-06-14 23:15:12 -1 0
方案3(实时) :
开启变量 login_cnt 为在线人数
新登用户 + 1
退出用户 - 1