FILTER优化案例
1、问题描述
某网友在群里发了一条SQL,执行耗时40多分钟,返回22条数据
SQL文本如下
SELECT *
FROM UCR_RIGHT.tf_rc_order
WHERE telnum IN
(SELECT f.phone_number
FROM UCR_RIGHT.tf_rc_order c,
UCR_RIGHT.tf_rc_bill_record f,
UCR_RIGHT.tf_rc_partition_discount g
WHERE c.user_id IN
(SELECT user_id
FROM (SELECT DISTINCT t.user_id, t.telnum
FROM UCR_RIGHT.tf_rc_order t
WHERE t.telnum IN
(SELECT telnum
FROM (SELECT count(*), b.telnum
FROM (SELECT count(*),
t.telnum,
t.user_id
FROM UCR_RIGHT.tf_rc_order t
GROUP BY t.telnum, t.user_id) b
GROUP BY b.telnum
HAVING count(*) > 1))
AND t.create_date >=
to_date('2020-12-01 00:00:00',
'YYYY-MM-DD HH24:mi:ss')
ORDER BY telnum))
AND c.id = f.order_id
AND g.partiton_start = '1'
AND g.right_id = f.right_code
AND g.partiton_price <> f.fact_fee)
执行计划如下
Plan hash value: 1935457672
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 273 | | 17M (1)| 00:11:24 |
| 1 | NESTED LOOPS | | 1 | 273 | | 17M (1)| 00:11:24 |
| 2 | NESTED LOOPS | | 1 | 273 | | 17M (1)| 00:11:24 |
| 3 | VIEW | VW_NSO_1 | 1 | 10 | | 17M (1)| 00:11:24 |
| 4 | HASH UNIQUE | | 1 | 146 | | | |
|* 5 | FILTER | | | | | | |
|* 6 | HASH JOIN | | 3111 | 443K| | 14683 (1)| 00:00:01 |
|* 7 | TABLE ACCESS FULL | TF_RC_ORDER | 14299 | 516K| | 6348 (1)| 00:00:01 |
|* 8 | HASH JOIN | | 284K| 29M| | 8334 (1)| 00:00:01 |
|* 9 | HASH JOIN RIGHT SEMI | | 11349 | 831K| | 1990 (1)| 00:00:01 |
|* 10 | TABLE ACCESS FULL | TF_RC_PARTITION_DISCOUNT | 17 | 408 | | 5 (0)| 00:00:01 |
| 11 | TABLE ACCESS FULL | TF_RC_BILL_RECORD | 295K| 14M| | 1984 (1)| 00:00:01 |
| 12 | TABLE ACCESS FULL | TF_RC_ORDER | 613K| 19M| | 6342 (1)| 00:00:01 |
|* 13 | FILTER | | | | | | |
| 14 | HASH GROUP BY | | 1 | 12 | | 11232 (1)| 00:00:01 |
| 15 | VIEW | | 613K| 7192K| | 11232 (1)| 00:00:01 |
| 16 | SORT GROUP BY | | 613K| 16M| 23M| 11232 (1)| 00:00:01 |
| 17 | TABLE ACCESS FULL | TF_RC_ORDER | 613K| 16M| | 6342 (1)| 00:00:01 |
|* 18 | INDEX RANGE SCAN | ORDER_TELNUM_IDX | 1 | | | 2 (0)| 00:00:01 |
| 19 | TABLE ACCESS BY INDEX ROWID| TF_RC_ORDER | 1 | 263 | | 4 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - filter( EXISTS (SELECT 0 FROM (SELECT COUNT(*) "COUNT(*)","T"."TELNUM" "TELNUM","T"."USER_ID"
"USER_ID" FROM "UCR_RIGHT"."TF_RC_ORDER" "T" GROUP BY "T"."TELNUM","T"."USER_ID") "B" GROUP BY
"B"."TELNUM" HAVING "B"."TELNUM"=:B1 AND COUNT(*)>1))
6 - access("C"."USER_ID"="T"."USER_ID")
7 - filter("T"."CREATE_DATE">=TO_DATE(' 2020-12-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
8 - access("C"."ID"="F"."ORDER_ID")
9 - access("G"."RIGHT_ID"="F"."RIGHT_CODE")
filter("G"."PARTITON_PRICE"<>TO_NUMBER("F"."FACT_FEE"))
10 - filter("G"."PARTITON_START"=1)
13 - filter("B"."TELNUM"=:B1 AND COUNT(*)>1)
18 - access("TELNUM"="PHONE_NUMBER")
2、问题分析
了解到SQL中最大的表才184M,那所有表就算走HASH,也要不了40分钟才对。那问题出在哪了?通过观察执行计划ID=5这里感觉是个坑,filter类似NL,如果上面执行计划估算的ROWS是准确的话,就不会有性能问题了。
3、问题处理
通常filter是没法通过hint消除的,这里我们通过等价改写消除filter

另外问题SQL中执行计划通过等价改写产生了filter,通过不展开group by,让子查询作为NL的驱动表,执行计全走NL,也可以达到优化效果,这里网友对改写后SQL性能满意,就没有进行测试
想了解SQL优化原理的同学,可以添加微信号:chenjunbee

或关注微信公众号,定期更新优化案例


浙公网安备 33010602011771号