题目很简单:
已知表名:user_login,表中有两个字段:user_id,login_time
问:用户最大连续登录天数。
整个表先理一下思路:
| user_id | log_time |
|---|---|
| 224332 | 2022/12/12 |
| 224332 | 2022/12/13 |
| 224332 | 2022/12/22 |
| 224332 | 2022/12/23 |
| 224332 | 2022/12/24 |
| 224332 | 2022/12/27 |
| 224332 | 2022/12/31 |
| 224561 | 2022/12/2 |
| 224561 | 2022/12/8 |
| 224561 | 2022/12/11 |
| 224561 | 2022/12/14 |
| 224561 | 2022/12/18 |
| 224561 | 2022/12/23 |
| 224561 | 2022/12/27 |
| 224561 | 2022/12/28 |
| 224561 | 2022/12/30 |
第一步:用row_number 对每个用户进行分组依据登录时间先后排序
WITH login_days AS (
SELECT
user_id,
login_time,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_time) AS rn
FROM
user_login
),
| user_id | log_time | rn |
|---|---|---|
| 224332 | 2022/12/12 | 1 |
| 224332 | 2022/12/13 | 2 |
| 224332 | 2022/12/22 | 3 |
| 224332 | 2022/12/23 | 4 |
| 224332 | 2022/12/24 | 5 |
| 224332 | 2022/12/27 | 6 |
| 224332 | 2022/12/31 | 7 |
| 224561 | 2022/12/2 | 1 |
| 224561 | 2022/12/8 | 2 |
| 224561 | 2022/12/11 | 3 |
| 224561 | 2022/12/14 | 4 |
| 224561 | 2022/12/18 | 5 |
| 224561 | 2022/12/23 | 6 |
| 224561 | 2022/12/27 | 7 |
| 224561 | 2022/12/28 | 8 |
| 224561 | 2022/12/30 | 9 |
第二步:计算辅助列,用登录时间-排序 得到一个辅助列,如果得到的辅助列值相同,那么值相同的这几天就是连续的。这一步可以说是关键。
login_days_diff AS (
SELECT
user_id,
login_time,
rn,
login_time - rn as diff
FROM
login_days
),
| user_id | log_time | rn | diff |
|---|---|---|---|
| 224332 | 2022/12/12 | 1 | 2022/12/11 |
| 224332 | 2022/12/13 | 2 | 2022/12/11 |
| 224332 | 2022/12/22 | 3 | 2022/12/19 |
| 224332 | 2022/12/23 | 4 | 2022/12/19 |
| 224332 | 2022/12/24 | 5 | 2022/12/19 |
| 224332 | 2022/12/27 | 6 | 2022/12/21 |
| 224332 | 2022/12/31 | 7 | 2022/12/24 |
| 224561 | 2022/12/2 | 1 | 2022/12/1 |
| 224561 | 2022/12/8 | 2 | 2022/12/6 |
| 224561 | 2022/12/11 | 3 | 2022/12/8 |
| 224561 | 2022/12/14 | 4 | 2022/12/10 |
| 224561 | 2022/12/18 | 5 | 2022/12/13 |
| 224561 | 2022/12/23 | 6 | 2022/12/17 |
| 224561 | 2022/12/27 | 7 | 2022/12/20 |
| 224561 | 2022/12/28 | 8 | 2022/12/20 |
| 224561 | 2022/12/30 | 9 | 2022/12/21 |
第三步,求出每个用户的所有连续登录天数
consecutive_day AS (SELECT
user_id,
diff,
COUNT(diff) AS consecutive_day
FROM
login_days_diff
GROUP BY
user_id,diff
)
| user_id | diff | consecutive_day |
| 224332 | 2022/12/11 | 2 |
| 224332 | 2022/12/19 | 3 |
| 224332 | 2022/12/21 | 1 |
| 224332 | 2022/12/24 | 1 |
| 224561 | 2022/12/1 | 1 |
| 224561 | 2022/12/6 | 1 |
| 224561 | 2022/12/8 | 1 |
| 224561 | 2022/12/10 | 1 |
| 224561 | 2022/12/13 | 1 |
| 224561 | 2022/12/17 | 1 |
| 224561 | 2022/12/20 | 2 |
| 224561 | 2022/12/21 | 1 |
第四步,得到用户的最大登录天数
SELECT user_id,MAX(consecutive_day) as max_consecutive_day
FROM consecutive_day
GROUP BY user_id;
| user_id | diff | consecutive_day |
| 224332 | 2022/12/19 | 3 |
| 224561 | 2022/12/20 | 2 |
一道类似的题目:
更复杂,因为胜利的场次对应的日期并不是像本题一样是连续的,所以要进行两次排序,最后筛选的过程中使用sum(if)排除null的情况
用的rank()
浙公网安备 33010602011771号