组合索引的前导列,要使用选择率高的列?
Oracle Concept上有一句话,大致的意思是:应该选择选择率高的列作为组合索引的前导列
其实这句话,并不严谨,前导列的选择(索引的设计)应该决定于SQL的逻辑,而不是统一的固定原则,比如下面的案例:
Z:\>sqlplus mydb
SQL*Plus: Release 10.2.0.4.0 - Production on 星期二 3月 10 15:15:49 2009
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
输入口令:
连接到:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
MYDB > select * from testxyz order by a, b;
A B C
---------- ---------- ----------
1 10 ABC
2 15 ABC
3 15 ABC
4 21 ABC
4 25 ABC
5 20 ABC
5 22 ABC
6 27 ABC
9 10 ABC
10 15 ABC
17 3 ABC
18 21 ABC
已选择12行。
已用时间: 00: 00: 00.00
MYDB > select * from testxyz order by b, a;
A B C
---------- ---------- ----------
17 3 ABC
1 10 ABC
9 10 ABC
2 15 ABC
3 15 ABC
10 15 ABC
5 20 ABC
4 21 ABC
18 21 ABC
5 22 ABC
4 25 ABC
6 27 ABC
已选择12行。
已用时间: 00: 00: 00.00
MYDB > create index testxyz_ab on testxyz (a,b) pctfree 97;
索引已创建。
已用时间: 00: 00: 00.03
MYDB > create index testxyz_ba on testxyz (b,a) pctfree 97;
索引已创建。
已用时间: 00: 00: 00.06
MYDB > select object_name,object_id from user_objects
2 where object_name in ('TESTXYZ_AB','TESTXYZ_BA');
OBJECT_NAME OBJECT_ID
---------------------------------------- ----------
TESTXYZ_AB 13218
TESTXYZ_BA 13219
已用时间: 00: 00: 00.01
A,B两列的选择率不通,我们前面创建了2个不通前导列的两列组合索引;那么对于下面的SQL来说哪个SQL的效率更高呢?
select /*+ index(testxyz testxyz_ba) */ a, b from testxyz
where a=4 and b between 20 and 25;
这要先看看2个索引中,索引key是如何分布在节点上的~
1.先看TESTXYZ_AB:
MYDB > alter session set events 'immediate trace name treedump level 13218';
会话已更改。
已用时间: 00: 00: 00.26
/*
----- begin tree dump
branch: 0x1000c8c 16780428 (0: nrow: 3, level: 1)
leaf: 0x1000c8d 16780429 (-1: nrow: 4 rrow: 4)
leaf: 0x1000c8e 16780430 (0: nrow: 4 rrow: 4)
leaf: 0x1000c8f 16780431 (1: nrow: 4 rrow: 4)
----- end tree dump
*/
MYDB > SELECT dbms_utility.data_block_address_file(to_number(ltrim('0x1000c8d',
2 '0x'),
3 'xxxxxxxx')) file#,
4 dbms_utility.data_block_address_block(to_number(ltrim('0x1000c8d',
5 '0x'),
6 'xxxxxxxx')) block#
7 FROM dual
8 /
FILE# BLOCK#
---------- ----------
4 3213
已用时间: 00: 00: 00.00
MYDB > SELECT dbms_utility.data_block_address_file(to_number(ltrim('0x1000c8e',
2 '0x'),
3 'xxxxxxxx')) file#,
4 dbms_utility.data_block_address_block(to_number(ltrim('0x1000c8e',
5 '0x'),
6 'xxxxxxxx')) block#
7 FROM dual
8 /
FILE# BLOCK#
---------- ----------
4 3214
已用时间: 00: 00: 00.01
MYDB > SELECT dbms_utility.data_block_address_file(to_number(ltrim('0x1000c8f',
2 '0x'),
3 'xxxxxxxx')) file#,
4 dbms_utility.data_block_address_block(to_number(ltrim('0x1000c8f',
5 '0x'),
6 'xxxxxxxx')) block#
7 FROM dual
8 /
FILE# BLOCK#
---------- ----------
4 3215
已用时间: 00: 00: 00.00
MYDB > alter system dump datafile 4 block 3213;
系统已更改。
已用时间: 00: 00: 01.34
MYDB > alter system dump datafile 4 block 3214;
系统已更改。
已用时间: 00: 00: 00.01
MYDB > alter system dump datafile 4 block 3215;
系统已更改。
已用时间: 00: 00: 00.01
/*
row#0[8021] flag: ------, lock: 0, len=15
col 0; len 2; (2): c1 02 --1
col 1; len 2; (2): c1 0b --10
col 2; len 6; (6): 01 00 0c 86 00 01
row#1[8006] flag: ------, lock: 0, len=15
col 0; len 2; (2): c1 03 --2
col 1; len 2; (2): c1 10 --15
col 2; len 6; (6): 01 00 0c 86 00 00
row#2[7991] flag: ------, lock: 0, len=15
col 0; len 2; (2): c1 04 --3
col 1; len 2; (2): c1 10 --15
col 2; len 6; (6): 01 00 0c 86 00 02
row#3[7976] flag: ------, lock: 0, len=15
col 0; len 2; (2): c1 05 --4
col 1; len 2; (2): c1 16 --21
col 2; len 6; (6): 01 00 0c 86 00 04
----- end of leaf block dump -----
End dump data blocks tsn: 4 file#: 4 minblk 3213 maxblk 3213
row#0[8021] flag: ------, lock: 0, len=15
col 0; len 2; (2): c1 05 --4
col 1; len 2; (2): c1 1a --25
col 2; len 6; (6): 01 00 0c 86 00 03
row#1[8006] flag: ------, lock: 0, len=15
col 0; len 2; (2): c1 06 --5
col 1; len 2; (2): c1 15 --20
col 2; len 6; (6): 01 00 0c 86 00 05
row#2[7991] flag: ------, lock: 0, len=15
col 0; len 2; (2): c1 06 --5
col 1; len 2; (2): c1 17 --22
col 2; len 6; (6): 01 00 0c 86 00 06
row#3[7976] flag: ------, lock: 0, len=15
col 0; len 2; (2): c1 07 --6
col 1; len 2; (2): c1 1c --27
col 2; len 6; (6): 01 00 0c 86 00 07
----- end of leaf block dump -----
End dump data blocks tsn: 4 file#: 4 minblk 3214 maxblk 3214
row#0[8021] flag: ------, lock: 0, len=15
col 0; len 2; (2): c1 0a --9
col 1; len 2; (2): c1 0b --10
col 2; len 6; (6): 01 00 0c 86 00 08
row#1[8006] flag: ------, lock: 0, len=15
col 0; len 2; (2): c1 0b --10
col 1; len 2; (2): c1 10 --15
col 2; len 6; (6): 01 00 0c 86 00 09
row#2[7991] flag: ------, lock: 0, len=15
col 0; len 2; (2): c1 12 --17
col 1; len 2; (2): c1 04 --3
col 2; len 6; (6): 01 00 0c 86 00 0a
row#3[7976] flag: ------, lock: 0, len=15
col 0; len 2; (2): c1 13 --18
col 1; len 2; (2): c1 16 --21
col 2; len 6; (6): 01 00 0c 86 00 0b
----- end of leaf block dump -----
End dump data blocks tsn: 4 file#: 4 minblk 3215 maxblk 3215
*/
从dump的结果上看,TESTXYZ_AB的存储结构和select a,b from testxyz order by a,b相同~
由于pctfree 97,12个索引entry,存储在4个block中,3个leafblock的数据分布,如下图:
|
L1 a b |
|
L2 a b |
|
L3 a b |
|||
|
1 |
10 |
4 |
25 |
9 |
10 |
||
|
2 |
15 |
5 |
20 |
10 |
15 |
||
|
3 |
15 |
5 |
22 |
17 |
3 |
||
|
4 |
21 |
6 |
27 |
18 |
21 |
||
强制使用TESTXYZ_AB索引,以a=4 and b between 14 and 25来查询testxyz.a,testxyz.b,如下:
MYDB > select /*+ index(testxyz testxyz_ab) */ a, b from testxyz
2 where a=4 and b between 20 and 25;
A B
---------- ----------
4 21
4 25
已用时间: 00: 00: 00.00
执行计划
----------------------------------------------------------
Plan hash value: 624396574
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 52 | 3 (0)| 00:00:01 |
|* 1 | INDEX RANGE SCAN| TESTXYZ_AB | 2 | 52 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"=4 AND "B">=20 AND "B"<=25)
Note
-----
- dynamic sampling used for this statement
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
498 bytes sent via SQL*Net to client
400 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed
2.再来看看TESTXYZ_BA:
同样得到索引的存储结构如下图:
|
b a |
|
b a |
|
b a |
|||
|
3 |
17 |
15 |
3 |
21 |
18 |
||
|
10 |
1 |
15 |
10 |
22 |
5 |
||
|
10 |
9 |
20 |
5 |
25 |
4 |
||
|
15 |
2 |
21 |
4 |
27 |
6 |
||
强制使用TESTXYZ_AB索引,以a=4 and b between 14 and 25来查询testxyz.a,testxyz.b,如下:
MYDB > select /*+ index(testxyz testxyz_ba) */ a, b from testxyz
2 where a=4 and b between 20 and 25;
A B
---------- ----------
4 21
4 25
已用时间: 00: 00: 00.01
执行计划
----------------------------------------------------------
Plan hash value: 1233347511
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 52 | 4 (0)| 00:00:01 |
|* 1 | INDEX RANGE SCAN| TESTXYZ_BA | 2 | 52 | 4 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("B">=20 AND "A"=4 AND "B"<=25)
filter("A"=4)
Note
-----
- dynamic sampling used for this statement
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
498 bytes sent via SQL*Net to client
400 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed
因为b between 20 and 25 后 a的值在存储上不连续,所以无法access,只能filter得到a=4...
根据concept上的说法:distinct值比较高的列(selective)要放在组合索引的前面,但是如果是对于我们这种SQL的查询,concpt上的说法明显是错的
我们可以从下面大数据量的表,类似SQL执行中逻辑读的数量来证实:
C:\Documents and Settings\RequieM>sqlplus mydb
SQL*Plus: Release 9.2.0.8.0 - Production on Mon Mar 9 20:39:33 2009
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Enter password:
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.8.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production
SQL> select count(*) from test_abc where object_id=10;
COUNT(*)
----------
24976
Elapsed: 00:00:00.00
SQL> update test_abc set object_id=20 where rownum<16000;
15999 rows updated.
Elapsed: 00:00:00.02
SQL> update test_abc set object_id=20 where rownum<8000;
7999 rows updated.
Elapsed: 00:00:00.00SQL> select object_id,count(*) from test_abc group by object_id;
OBJECT_ID COUNT(*)
---------- ----------
10 8977
20 8000
30 7999
SQL> create index ind_test_abc on test_abc(data_object_id,object_id);
Index created.
Elapsed: 00:00:00.00
SQL> select count(*)from test_abc where object_id=20 and data_object_id between 20 and 10000;
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT ptimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 INDEX (RANGE SCAN) OF 'IND_TEST_ABC' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
12 consistent gets
0 physical reads
0 redo size
407 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> explain plan for
2 select count(*)from test_abc where object_id=20 and data_object_id between 20 and 10000;
Explained.
Elapsed: 00:00:00.00
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
| 1 | SORT AGGREGATE | | | | |
|* 2 | INDEX RANGE SCAN | IND_TEST_ABC | | | |
----------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("TEST_ABC"."DATA_OBJECT_ID">=20 AND
"TEST_ABC"."OBJECT_ID"=20 AND "TEST_ABC"."DATA_OBJECT_ID"<=10000)
filter("TEST_ABC"."OBJECT_ID"=20)
Note: rule based optimization
17 rows selected.
Elapsed: 00:00:00.00
SQL> create index ind_test_abc on test_abc (object_id,data_object_id);
Index created.
Elapsed: 00:00:00.02
SQL> set autotrace traceonly
SQL> select count(*)from test_abc where object_id=20 and data_object_id between 20 and 10000;
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT ptimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 INDEX (RANGE SCAN) OF 'IND_TEST_ABC' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
5 consistent gets
4 physical reads
0 redo size
407 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> explain plan for
2 select count(*)from test_abc where object_id=20 and data_object_id between 20 and 10000;
Explained.
Elapsed: 00:00:00.00
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
| 1 | SORT AGGREGATE | | | | |
|* 2 | INDEX RANGE SCAN | IND_TEST_ABC | | | |
----------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("TEST_ABC"."OBJECT_ID"=20 AND
"TEST_ABC"."DATA_OBJECT_ID">=20 AND
"TEST_ABC"."DATA_OBJECT_ID"<=10000)
Note: rule based optimization
17 rows selected.
Elapsed: 00:00:00.00

浙公网安备 33010602011771号