利用递归WITH子查询进行优化的实例
2011-06-16 17:29 Tracy. 阅读(759) 评论(0) 收藏 举报利用递归WITH子查询进行优化的实例
递归WITH子查询我已经用过不少了,不过都是当作玩具,没有在实践中用过。昨天碰到了一个实用例子。   
在一个OLTP中有一张表,主键是随着创建时间递增的。每天产生大约1万条记录,全表大约有几百万,没有分区,创建时间没有索引。    
现在要求取出最近两天的数据并且从中过滤出部分记录。过滤条件上也没有索引。    
原来的查询为全表扫描,效率太低,有没有办法改善?前提:因为原表索引够多了,不能新增索引;原查询来自JAVA程序,修改之后必须还是一个SQL, 交由JAVA程序执行。    
测试数据:    
CREATE TABLE items AS    
SELECT 3000000 - LEVEL AS item_id    
,SYSDATE-LEVEL/10000 AS created_date    
,TRUNC(DBMS_RANDOM.VALUE(1,101)) AS item_type    
,SYS_GUID() AS DESCRIPTION    
FROM DUAL     
CONNECT BY LEVEL<=1000000;    
ALTER TABLE items ADD CONSTRAINT items_pk PRIMARY KEY (item_id) USING INDEX;    
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER,'ITEMS');    
原表要复杂得多,现在简化为100万的一张表。查询大约相当于:    
SELECT * FROM items WHERE created_date >= TRUNC(SYSDATE)-2 AND item_type=14;    
294 rows selected.    
Elapsed: 00:00:00.90    
Execution Plan    
----------------------------------------------------------    
Plan hash value: 446380563    
---------------------------------------------------------------------------    
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |    
---------------------------------------------------------------------------    
| 0 | SELECT STATEMENT | | 258 | 8772 | 1501 (1)| 00:00:19 |    
|* 1 | TABLE ACCESS FULL| ITEMS | 258 | 8772 | 1501 (1)| 00:00:19 |    
---------------------------------------------------------------------------    
Predicate Information (identified by operation id):    
---------------------------------------------------    
1 - filter("ITEM_TYPE"=14 AND "CREATED_DATE">=TRUNC(SYSDATE@!)-2)    
Statistics    
----------------------------------------------------------    
0 recursive calls    
0 db block gets    
5429 consistent gets    
5406 physical reads    
0 redo size    
13636 bytes sent via SQL*Net to client    
625 bytes received via SQL*Net from client    
21 SQL*Net roundtrips to/from client    
0 sorts (memory)    
0 sorts (disk)    
294 rows processed    
例子表简化了很多, 所以只花了0.9秒,5429个一致读。实际表要胖得多,查询需要几十秒。    
改写:    
因为主键是随着时间递增的,所以“取最近的数据”可转换为:取ID最大的N条数据,如果其中最早的一条还是两天之内就增大N, 直到日期落在两天之外停止查询。    
WITH t (item_id,cnt) AS (    
SELECT max(item_id),1 FROM items ---先取最近的    
UNION ALL     
select (SELECT MIN(item_id) FROM (SELECT item_id FROM items ORDER BY item_id DESC) WHERE ROWNUM<=t.cnt+2000) ---- 跳跃取2000行之后的ID    
,cnt+2000 ----- 当日期还在区间内则递增取ID的范围。根据每天的数据两选取合适的步长,这里定为2000    
FROM t    
WHERE (SELECT MAX(created_date) FROM items WHERE item_id=t.item_id)>=TRUNC(SYSDATE)-2) ---- 当取到的ID落在区间外则停止递归    
CYCLE item_id SET cycle_flag TO 'Y' DEFAULT 'N' ---- 虽然ID都不重复但是ORACLE会报告有循环数据,所以在这里加上CYCLE语句    
select *     
FROM items     
WHERE item_id>=(SELECT min(item_id) from t) ----- 利用前面的搜索结果    
AND item_type=14    
AND created_date >= TRUNC(SYSDATE)-2;    
294 rows selected.    
Elapsed: 00:00:00.16    
Execution Plan    
----------------------------------------------------------    
Plan hash value: 1844953721    
---------------------------------------------------------------------------------------------------------    
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |    
---------------------------------------------------------------------------------------------------------    
| 0 | SELECT STATEMENT | | 13 | 442 | 81 (0)| 00:00:01 |    
|* 1 | TABLE ACCESS BY INDEX ROWID | ITEMS | 13 | 442 | 72 (0)| 00:00:01 |    
|* 2 | INDEX RANGE SCAN | ITEMS_PK | 9000 | | 23 (0)| 00:00:01 |    
| 3 | SORT AGGREGATE | | 1 | 13 | | |    
| 4 | VIEW | | 2 | 26 | 9 (0)| 00:00:01 |    
| 5 | UNION ALL (RECURSIVE WITH) BREADTH FIRST| | | | | |    
| 6 | SORT AGGREGATE | | 1 | 6 | | |    
| 7 | INDEX FULL SCAN (MIN/MAX) | ITEMS_PK | 1 | 6 | 3 (0)| 00:00:01 |    
| 8 | SORT AGGREGATE | | 1 | 13 | | |    
|* 9 | COUNT STOPKEY | | | | | |    
| 10 | VIEW | | 1000K| 12M| 2238 (1)| 00:00:27 |    
| 11 | INDEX FULL SCAN DESCENDING | ITEMS_PK | 1000K| 5859K| 2238 (1)| 00:00:27 |    
|* 12 | FILTER | | | | | |    
| 13 | RECURSIVE WITH PUMP | | | | | |    
| 14 | SORT AGGREGATE | | 1 | 14 | | |    
| 15 | TABLE ACCESS BY INDEX ROWID | ITEMS | 1 | 14 | 3 (0)| 00:00:01 |    
|* 16 | INDEX UNIQUE SCAN | ITEMS_PK | 1 | | 2 (0)| 00:00:01 |    
---------------------------------------------------------------------------------------------------------    
Predicate Information (identified by operation id):    
---------------------------------------------------    
1 - filter("ITEM_TYPE"=14 AND "CREATED_DATE">=TRUNC(SYSDATE@!)-2)    
2 - access("ITEM_ID">= (SELECT MIN("ITEM_ID") FROM "T" "T"))    
9 - filter(ROWNUM<=:B1+2000)    
12 - filter( (SELECT MAX("CREATED_DATE") FROM "ITEMS" "ITEMS" WHERE    
"ITEM_ID"=:B1)>=TRUNC(SYSDATE@!)-2)    
16 - access("ITEM_ID"=:B1)    
Statistics    
----------------------------------------------------------    
0 recursive calls    
0 db block gets    
722 consistent gets    
0 physical reads    
0 redo size    
13636 bytes sent via SQL*Net to client    
625 bytes received via SQL*Net from client    
21 SQL*Net roundtrips to/from client    
15 sorts (memory)    
0 sorts (disk)    
294 rows processed    
主键索引用得上了,一致读下降为722。
本文来自博客园,作者:Tracy.,转载请注明原文链接:https://www.cnblogs.com/tracy/archive/2011/06/16/2082914.html
                    
                
                
            
        
浙公网安备 33010602011771号