小铁匠ME

导航

主查询谓词无法内推导致的问题

这个测试是在11g R2测试的,主查询的谓词无法内推到子查询的第二层导致的性能问题。
由于下面两个SQL性能问题太明显,这里只分析执行计划。
重现过程如下:
--测试环境
create table t1 as select object_id,object_name from dba_objects;
create table t2 as select object_id,object_name from user_objects;
create table t3 as select rownum object_id,table_name object_name from user_tables;
analyze table t1 compute statistics for table for all columns;
analyze table t2 compute statistics for table for all columns;
analyze table t3 compute statistics for table for all columns;
t1--49830;t2--37;t3--16
开始测试:
1.比较简单的子查询:子查询中表直接和主表就有关联关系
select /*+gather_plan_statistics*/
(select t2.object_name from t2 where t2.object_id = t1.object_id) object_name
from t1;
执行计划:
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
---------------------------------------------------------------------------------------------
|* 1 | TABLE ACCESS FULL| T2 | 49830 | 1 | 36 |00:00:01.75 | 149K| 0 |
| 2 | TABLE ACCESS FULL| T1 | 1 | 49830 | 49830 |00:00:00.30 | 3546 | 231 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("T2"."OBJECT_ID"=:B1)
这个执行计划很简单,首先执行t1,然后将t1中返回的object_id在t2中进行filter,t2的执行次数由t1返回行数决定(t1和t2的关联类似于嵌套循环,而t2则相当于内部表)。
其实这种SQL的写法是比较低效的,可以将子查询中表t2放到from后面跟t1做left outer join,这样可以远远提升执行效率。
2.复杂点的子查询:主查询中的表跟子查询的第二层有关联关系
select /*+gather_plan_statistics*/
(select t2.object_name
from t2
where t2.object_name in
(select t3.object_name from t3 where t1.object_id = t3.object_id)) object_name
from t1;
执行计划:
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
----------------------------------------------------------------------------------------------
|* 1 | FILTER | | 49830 | | 15 |00:00:39.89 | 5680K| 2 |
| 2 | TABLE ACCESS FULL| T2 | 49830 | 37 | 1843K|00:00:07.89 | 149K| 0 |
|* 3 | TABLE ACCESS FULL| T3 | 1843K| 1 | 15 |00:00:27.95 | 5531K| 2 |
| 4 | TABLE ACCESS FULL | T1 | 1 | 49830 | 49830 |00:00:00.20 | 3546 | 0 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT /*+ */ 0 FROM "T3" "T3" WHERE
"T3"."OBJECT_NAME"=:B1 AND "T3"."OBJECT_ID"=:B2))
3 - filter("T3"."OBJECT_NAME"=:B1 AND "T3"."OBJECT_ID"=:B2)
首先执行t1,然后执行t2,最后执行t3(从predicate中3可以看出,t3依赖于t1传过来的参数object_id和t2传过来的参数object_name)。
仔细分析下Starts和A-Rows发现,执行t1返回49830行,然后执行t2,由于t2和t1没有任何关联条件,导致执行t2后返回的行数是t2数据行数与t2执行次数(t1返回行数)相乘;
这也就直接影响到t3执行次数。也就是说不管t3中有多少行满足条件的数据,都会对表t3扫描t1和t2行数相乘次。
特别是,如果t1、t2的行数足够大时,性能就更可想而知了。
可以将该SQL做如下改写:
select t2.object_name
from t1
left outer join (select distinct t3.object_id, t2.object_name
from t2, t3
where t2.object_name = t3.object_name) temp
on t1.object_id = t3.object_id
 

posted on 2013-08-04 19:10  小铁匠ME  阅读(379)  评论(0编辑  收藏  举报