标量子查询加聚合函数sql改写一
标量子查询的语句:
select /*+ GATHER_PLAN_STATISTICS dwtest */ empno,
(select count(*) from DEPT1 b where b.id = a.id) as d,
(select sum(x) from DEPT1 b where b.id = a.id) as e
from EMP1 a --where a.id in (1,2,3,4,5,6,7,8,9,11,12,13)
创建表:
create table DEPT1 as select * from DEPT; insert into DEPT1 select * from DEPT1; ---多执行几次 create table emp1 as select * from emp; insert into emp1 select * from emp1; ---多执行几次 alter table DEPT1 add id number; update DEPT1 set id=rownum; alter table emp1 add id number; update emp1 set id=rownum;
create index idx_emp1 on emp1(id);
create index idx_DEPT1 on dept1(id);
改写后的语句:
select /*+ GATHER_PLAN_STATISTICS dwtes2 */ empno,d,e from EMP1 a left join (select count(1)d,id,sum(x) e from DEPT1 group by id) b on a.id=b.id --where a.id in (1,2,3,4,5,6,7,8,9,11,12,13);
在放开where条件时,标量子查询时的多次索引范围扫描,导致cost较高,性能比左连接方式要差些,从下图看的不是很明显,如果从monitor看耗时更直观些。得出结论是:左连接改写后效率比标量子查询稍好,但是没有明显提升。
----
SQL_ID 8mgcw7x9x16rq, child number 0
-------------------------------------
select /*+ GATHER_PLAN_STATISTICS dwtest */ empno, (select
count(*) from DEPT1 b where b.id = a.id) as d, (select sum(x)
from DEPT1 b where b.id = a.id) as e from EMP1 a where a.id in
(1,2,3,4,5,6,7,8,9,11,12,13)
Plan hash value: 3470857716
-------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
-------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 12 |00:00:00.01 | 9 | 4 |
| 1 | SORT AGGREGATE | | 12 | 1 | 12 |00:00:00.01 | 8 | 1 |
|* 2 | INDEX RANGE SCAN | IDX_DEPT1 | 12 | 1 | 12 |00:00:00.01 | 8 | 1 |
| 3 | SORT AGGREGATE | | 12 | 1 | 12 |00:00:00.01 | 10 | 0 |
| 4 | TABLE ACCESS BY INDEX ROWID| DEPT1 | 12 | 1 | 12 |00:00:00.01 | 10 | 0 |
|* 5 | INDEX RANGE SCAN | IDX_DEPT1 | 12 | 1 | 12 |00:00:00.01 | 8 | 0 |
| 6 | INLIST ITERATOR | | 1 | | 12 |00:00:00.01 | 9 | 4 |
| 7 | TABLE ACCESS BY INDEX ROWID| EMP1 | 12 | 12 | 12 |00:00:00.01 | 9 | 4 |
|* 8 | INDEX RANGE SCAN | IDX_EMP1 | 12 | 12 | 12 |00:00:00.01 | 8 | 4 |
-------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("B"."ID"=:B1)
5 - access("B"."ID"=:B1)
8 - access(("A"."ID"=1 OR "A"."ID"=2 OR "A"."ID"=3 OR "A"."ID"=4 OR "A"."ID"=5 OR "A"."ID"=6 OR
"A"."ID"=7 OR "A"."ID"=8 OR "A"."ID"=9 OR "A"."ID"=11 OR "A"."ID"=12 OR "A"."ID"=13))
-----
SQL_ID dpwyqsf1rch2g, child number 0
-------------------------------------
select /*+ GATHER_PLAN_STATISTICS dwtes2 */ empno,d,e from EMP1 a left
join (select count(1)d,id,sum(x) e from DEPT1 group by id) b on
a.id=b.id where a.id in (1,2,3,4,5,6,7,8,9,11,12,13)
Plan hash value: 1193336691
----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 12 |00:00:00.01 | 19 | | | |
|* 1 | HASH JOIN OUTER | | 1 | 12 | 12 |00:00:00.01 | 19 | 1969K| 1969K| 1407K (0)|
| 2 | INLIST ITERATOR | | 1 | | 12 |00:00:00.01 | 9 | | | |
| 3 | TABLE ACCESS BY INDEX ROWID | EMP1 | 12 | 12 | 12 |00:00:00.01 | 9 | | | |
|* 4 | INDEX RANGE SCAN | IDX_EMP1 | 12 | 12 | 12 |00:00:00.01 | 8 | | | |
| 5 | VIEW | | 1 | 12 | 12 |00:00:00.01 | 10 | | | |
| 6 | HASH GROUP BY | | 1 | 12 | 12 |00:00:00.01 | 10 | 1116K| 1116K| 2222K (0)|
| 7 | INLIST ITERATOR | | 1 | | 12 |00:00:00.01 | 10 | | | |
| 8 | TABLE ACCESS BY INDEX ROWID| DEPT1 | 12 | 12 | 12 |00:00:00.01 | 10 | | | |
|* 9 | INDEX RANGE SCAN | IDX_DEPT1 | 12 | 12 | 12 |00:00:00.01 | 8 | | | |
----------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"."ID"="B"."ID")
4 - access(("A"."ID"=1 OR "A"."ID"=2 OR "A"."ID"=3 OR "A"."ID"=4 OR "A"."ID"=5 OR "A"."ID"=6 OR "A"."ID"=7 OR
"A"."ID"=8 OR "A"."ID"=9 OR "A"."ID"=11 OR "A"."ID"=12 OR "A"."ID"=13))
9 - access(("ID"=1 OR "ID"=2 OR "ID"=3 OR "ID"=4 OR "ID"=5 OR "ID"=6 OR "ID"=7 OR "ID"=8 OR "ID"=9 OR "ID"=11 OR
"ID"=12 OR "ID"=13))
在没有where条件时,上例的索引选择性非常好,标量子查询时的多次索引范围扫描,与左连接方式的对两表的全表扫描的hash排序cost差不多,导致两种方式效率相差无几。得出结论是:在关联索引选择性非常好时,左连接改写后效率比标量子查询差不多,但是如果索引选择性一般时,左连接效果要好。
但是在关联条件没有索引时,emp1表多少行,就要对dept是乘2次的全表扫描,此时就会导致cost非常高。
所以,尽量使用左连接加分组来优化
浙公网安备 33010602011771号