全表数据列值转换 漏斗分析 漏斗函数实现原理
CASE WHEN user_id%2=0 THEN user_id END AS t, t的值有2中0、NULL
SELECT
user_id,
MIN(CASE WHEN event_name = 'view' THEN event_time END) AS view_time,
MIN(CASE WHEN event_name = 'cart' THEN event_time END) AS cart_time,
MIN(CASE WHEN event_name = 'purchase' THEN event_time END) AS purchase_time,
MIN(CASE WHEN event_name = 'payment' THEN event_time END) AS payment_time
FROM user_events
WHERE event_name IN ('view', 'cart', 'purchase', 'payment')
示例SQL代码:
假设有一个用户事件表 user_events,包含 user_id, event_name, event_time,需要分析从 浏览商品(view)→ 加入购物车(cart)→ 下单(purchase)→ 支付(payment) 的漏斗,各步骤时间窗口为1天。
sql
复制
WITH user_steps AS (
SELECT
user_id,
MIN(CASE WHEN event_name = 'view' THEN event_time END) AS view_time,
MIN(CASE WHEN event_name = 'cart' THEN event_time END) AS cart_time,
MIN(CASE WHEN event_name = 'purchase' THEN event_time END) AS purchase_time,
MIN(CASE WHEN event_name = 'payment' THEN event_time END) AS payment_time
FROM user_events
WHERE event_name IN ('view', 'cart', 'purchase', 'payment')
GROUP BY user_id
),
funnel AS (
SELECT
user_id,
view_time,
-- 计算有效的加入购物车时间(在浏览后1天内)
CASE
WHEN cart_time >= view_time AND cart_time <= view_time + INTERVAL '1 day'
THEN cart_time
ELSE NULL
END AS cart_valid,
-- 计算有效的下单时间(在加入购物车后1天内)
CASE
WHEN cart_time >= view_time AND cart_time <= view_time + INTERVAL '1 day'
AND purchase_time >= cart_time AND purchase_time <= cart_time + INTERVAL '1 day'
THEN purchase_time
ELSE NULL
END AS purchase_valid,
-- 计算有效的支付时间(在下单后1天内)
CASE
WHEN cart_time >= view_time AND cart_time <= view_time + INTERVAL '1 day'
AND purchase_time >= cart_time AND purchase_time <= cart_time + INTERVAL '1 day'
AND payment_time >= purchase_time AND payment_time <= purchase_time + INTERVAL '1 day'
THEN payment_time
ELSE NULL
END AS payment_valid
FROM user_steps
WHERE view_time IS NOT NULL -- 仅统计有浏览行为的用户
)
SELECT
COUNT(user_id) AS view_users,
COUNT(cart_valid) AS cart_users,
COUNT(purchase_valid) AS purchase_users,
COUNT(payment_valid) AS payment_users,
-- 计算各步骤转化率(相对于前一步)
ROUND(COUNT(cart_valid) * 100.0 / COUNT(user_id), 2) AS view_to_cart_rate,
ROUND(COUNT(purchase_valid) * 100.0 / COUNT(cart_valid), 2) AS cart_to_purchase_rate,
ROUND(COUNT(payment_valid) * 100.0 / COUNT(purchase_valid), 2) AS purchase_to_payment_rate,
-- 计算整体转化率(相对于第一步)
ROUND(COUNT(cart_valid) * 100.0 / COUNT(user_id), 2) AS step2_rate,
ROUND(COUNT(purchase_valid) * 100.0 / COUNT(user_id), 2) AS step3_rate,
ROUND(COUNT(payment_valid) * 100.0 / COUNT(user_id), 2) AS step4_rate
FROM funnel;
输出说明:
-
view_users:完成浏览步骤的用户数。
-
cart_users:在浏览后1天内加入购物车的用户数。
-
purchase_users:在加入购物车后1天内下单的用户数。
-
payment_users:在下单后1天内支付的用户数。
-
view_to_cart_rate:浏览到加入购物车的转化率。
-
cart_to_purchase_rate:加入购物车到下单的转化率。
-
purchase_to_payment_rate:下单到支付的转化率。
-
step2_rate、step3_rate、step4_rate:各步骤相对于浏览用户的转化率。
注意事项:
-
时间窗口调整:根据需求修改 INTERVAL '1 day' 部分。
-
事件名称匹配:确保 event_name 的值与实际数据一致。
-
性能优化:如果数据量大,可为 event_time 和 event_name 添加索引。
-
扩展性:如需增加步骤,按相同模式添加新字段和条件。
问题:
上述sql
如果每一步骤的条件相同或者存在包含关系,则筛选失效。
1)
event=‘buy’
event=‘buy’
2)
event=‘buy’ and a>1
event=‘buy’ and a>1 and b>1
解决办法:
CASE WHEN event_name = 'view' THEN event_time END
独立成表,即每一步独立成表,再UNION ALL形成user_steps