
对每个hall_id单独做计算,计算顺序依据start_day升序(体现在窗口函数OVER部分中);
作辅助列,对每一行,计算截至该行的最大的end_day时间值,记作max_end_day_so_far;
检查当前行start_day是否小于上一行的max_end_day_so_far,(注意到我们的计算都基于按start_day作升序排序)如果是说明这一行的start_day被包含在上一行的时间区间中,那么设置值为0,反之,设置值为1,记作range_start;
窗口累加range_start,记作range_grp,用于作为分组的标志;
按hall_id和range_grp进行分组,对于每个分组,MIN(d1)即为最终结果区间的开始,MAX(d2)即为最终结果区间的结束。
SELECT
hall_id,
start_day,
end_day,
MAX(end_day) OVER (PARTITION BY hall_id
ORDER BY
start_day) AS max_end_day_so_far
FROM
HallEvents
| hall_id | start_day | end_day | max_end_day_so_far |
|---|---|---|---|
| 1 | 2023-01-13 | 2023-01-14 | 2023-01-14 |
| 1 | 2023-01-14 | 2023-01-17 | 2023-01-17 |
| 1 | 2023-01-18 | 2023-01-25 | 2023-01-25 |
| 2 | 2022-12-09 | 2022-12-23 | 2022-12-23 |
| 2 | 2022-12-13 | 2022-12-17 | 2022-12-23 |
| 3 | 2022-12-01 | 2023-01-30 | 2023-01-30 |
WITH t1 AS (
SELECT
hall_id,
start_day,
end_day,
MAX(end_day) OVER (PARTITION BY hall_id
ORDER BY
start_day) AS max_end_day_so_far
FROM
HallEvents)
SELECT
*,
CASE
WHEN start_day <= LAG(max_end_day_so_far) OVER (PARTITION BY hall_id
ORDER BY
start_day) THEN 0
ELSE 1
END AS range_start
FROM
t1
| end_day | max_end_day_so_far | range_start | ||
|---|---|---|---|---|
| 1 | 2023-01-13 | 2023-01-14 | 2023-01-14 | 1 |
| 1 | 2023-01-14 | 2023-01-17 | 2023-01-17 | 0 |
| 1 | 2023-01-18 | 2023-01-25 | 2023-01-25 | 1 |
| 2 | 2022-12-09 | 2022-12-23 | 2022-12-23 | 1 |
| 2 | 2022-12-13 | 2022-12-17 | 2022-12-23 | 0 |
| 3 | 2022-12-01 | 2023-01-30 | 2023-01-30 | 1 |
WITH t1 AS (
SELECT
hall_id,
start_day,
end_day,
MAX(end_day) OVER (PARTITION BY hall_id
ORDER BY
start_day) AS max_end_day_so_far
FROM
HallEvents),
t2 AS (SELECT
*,
CASE
WHEN start_day <= LAG(max_end_day_so_far) OVER (PARTITION BY hall_id
ORDER BY
start_day) THEN 0
ELSE 1
END AS range_start
FROM
t1)
SELECT *,SUM(range_start) OVER (PARTITION BY hall_id ORDER BY start_day) AS range_grp FROM t2
| hall_id | start_day | end_day | max_end_day_so_far | range_start | range_grp |
|---|---|---|---|---|---|
| 1 | 2023-01-13 | 2023-01-14 | 2023-01-14 | 1 | 1 |
| 1 | 2023-01-14 | 2023-01-17 | 2023-01-17 | 0 | 1 |
| 1 | 2023-01-18 | 2023-01-25 | 2023-01-25 | 1 | 2 |
| 2 | 2022-12-09 | 2022-12-23 | 2022-12-23 | 1 | 1 |
| 2 | 2022-12-13 | 2022-12-17 | 2022-12-23 | 0 | 1 |
| 3 | 2022-12-01 | 2023-01-30 | 2023-01-30 | 1 | 1 |
SELECT hall_id,
MIN(start_day) AS start_day,
MAX(end_day) AS end_day
FROM (
SELECT *,
SUM(range_start) OVER (PARTITION BY hall_id ORDER BY start_day) AS range_grp
FROM (
SELECT *,
CASE WHEN start_day <= LAG(max_end_day_so_far) OVER (PARTITION BY hall_id ORDER BY start_day) THEN 0
ELSE 1 END AS range_start
FROM (
SELECT hall_id,
start_day,
end_day,
MAX(end_day) OVER (PARTITION BY hall_id ORDER BY start_day) AS max_end_day_so_far
FROM HallEvents
) t
) t1
) t2
GROUP BY hall_id, range_grp;
浙公网安备 33010602011771号