位图索引:
- 统计条数奋勇夺冠:
SQL> drop table t purge; Table dropped. SQL> create table t as select * from dba_objects; Table created. SQL> insert into t select * from t; 2314432 rows created. SQL> update t set object_id=rownum; 4628864 rows updated. SQL> commit; SQL> select count(*) from t; COUNT(*) ---------- 4628864 ------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 19312 (8)| 00:03:52 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | TABLE ACCESS FULL| T | 289M| 19312 (8)| 00:03:52 | ------------------------------------------------------------------- Statistics ---------------------------------------------------------- 65727 consistent gets 12258 physical reads
SQL> create index idx_t_1 on t(object_id);
Index created.
SQL> alter table t modify object_id not null;
Table altered.
SQL> set autotrace on
SQL> select count(*) from t;
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2998 (1)| 00:00:36 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST FULL SCAN| IDX_T_1 | 5518K| 2998 (1)| 00:00:36 |
-------------------------------------------------------------------------
Statistics
----------------------------------------------------------
10865 consistent getsSQL> create bitmap index idx_map_t on t(status);
Index created.
SQL> select count(*) from t;
COUNT(*)
----------
4628864SQL> select count(*) from t;
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 113 (0)| 00:00:02 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | BITMAP CONVERSION COUNT | | 5518K| 113 (0)| 00:00:02 |
| 3 | BITMAP INDEX FAST FULL SCAN| IDX_MAP_T | | | |
Statistics
----------------------------------------------------------
123 consistent gets - 即席查询一骑绝尘:
- 人口普查中,我们会使用性别,年龄范围,出生地等多维的分析统计,这类多维查询可称为即席查询:
select * from t where col1=xxx and col2=xxx and col3=xxx and col4=xxx...
- 构造表t如下,有性别,年龄范围,出生地等字段:
create table t (id, gender not null,location not null,age not null,data) as select rownum, decode(ceil(dbms_random.value(0,2)),1,'M',2,'F') gender, ceil(dbms_random.value(1,50)) location, decode(ceil(dbms_random.value(0,3)),1,'child',2,'young',3,'middle_age',4,'old'), rpad('*',20,'*') from dual connect by rownum<=100000; set autotrace traceonly SQL> select * from t where gender='M' and location in (1,10,30) and age='child'; 658 rows selected. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 671 | 31537 | 171 (1)| 00:00:03 | |* 1 | TABLE ACCESS FULL| T | 671 | 31537 | 171 (1)| 00:00:03 | 1 - filter("GENDER"='M' AND ("LOCATION"=1 OR "LOCATION"=10 OR "LOCATION"=30) AND "AGE"='child') Statistics ---------------------------------------------------------- 652 consistent gets 16290 bytes sent via SQL*Net to client --列高度重复,建立联合索引;梁在书里说的走全表扫描的情况没有出现。oracle还是做了优化。 SQL> create index idx_union on t(gender,location,age); Index created. SQL> select * from t where gender='M' and location in (1,10,30) and age='child'; 658 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 306189815 ------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 671 | 31537 | 6 (0)| 00:00:01 | | 1 | INLIST ITERATOR | | | | | | | 2 | TABLE ACCESS BY INDEX ROWID| T | 671 | 31537 | 6 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | IDX_UNION | 1 | | 5 (0)| 00:00:01 | Statistics ---------------------------------------------------------- 635 consistent gets 36405 bytes sent via SQL*Net to client --建位图索引 SQL> create bitmap index idx_gender on t(gender); SQL> create bitmap index idx_age on t(age); SQL> create bitmap index idx_loc on t(location); SQL> select * from t where gender='M' and location in (1,10,30) and age='young'; 657 rows selected. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 622 | 29234 | 6 (0)| 00:00:01 | | 1 | INLIST ITERATOR | | | | | | | 2 | TABLE ACCESS BY INDEX ROWID| T | 622 | 29234 | 6 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | IDX_UNION | 1 | | 5 (0)| 00:00:01 | ------------------------------------------------------------------------------------------ 3 - access("GENDER"='M' AND ("LOCATION"=1 OR "LOCATION"=10 OR "LOCATION"=30) AND "AGE"='young') Statistics ---------------------------------------------------------- 637 consistent gets 36355 bytes sent via SQL*Net to client
- bitmap索引的缺点:
- 当向建国bitmap索引的列插入值没有提交时,不允许修改或插入这个特定值的行。
- 对于id值绝对不能建立bitmap索引,对比如下
SQL> create bitmap index idx_btid on t(id); SQL> set autotrace traceonly SQL> select count(*) from t; | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 1675 (1)| 00:00:21 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | TABLE ACCESS FULL| T | 1081K| 1675 (1)| 00:00:21 |---走全表扫描: Statistics ---------------------------------------------------------- 6140 consistent gets 6051 physical reads --强制走索引 SQL> select /*+index(t,idx_btid)*/ count(*) from t; | Id | Operation | Name | Rows | Cost (%CPU)| Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 3563 (1)| 00:00:43 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | BITMAP CONVERSION COUNT| | 1081K| 3563 (1)| 00:00:43 | | 3 | BITMAP INDEX FULL SCAN| IDX_BTID | | | | ----------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 3563 consistent gets 3494 physical reads
- 原理:
- 将特定字段的所有的可能值都做成一个一个位图,每个位图以0和1存放值的分布。
- 举例:如性别这一字段,值为男和女。则分别有男和女两张位图,在所有的位图里,每一个位bit顺序对应这张表的特定记录(假定是rowid),男位图若为0011100,则表示这张表有7条记录,只有第3-5条记录的值为男,这些位可以直接找到rowid,而为0的位就找不到rowid。相对的,女:1100011。
- 如果这张表还有另外一个字段地区,分别有3个值A区,B区,C区,则就有A,B和C3张位图,假定他们的值为
- A:1000100;
- B:0100001;
- C:0011010;那么男和女的加入:
- 男:0011100;
- 女:1100011;
- 如果要找在A区的女人,则把上面的行1和行5做与运算,得新的位图:1000000表示第一条记录符合条件,那么通过位图,就直接得到相对应的rowid。
- 位图的压缩:位图的算法使得位图后面的所有0都可以省略,如A:1000100=A:10001。
- 位图的锁定:从5可以知道只有当有新的1加入时,位图才会发生改变,位图发生改变时它所指向的rowid会被全部锁定,不可以修改。因为这个原因,位图只适合做查询。
- 重复度低的位图索引为变得巨大:如果某列有100个不同的值,100万列,那么这个位图索引的大小大约是1亿位。若它有100万不同的值,100万列,那么大小大约是1万亿位/2=5千亿位,这是非常庞大的数字。这个就是位图索引的软肋。
函数索引:
- 列运算让索引失去作用: 比较如下
SQL> create table t as select * from dba_objects; SQL> create index idx_id on t(object_id); SQL> create index idx_name on t(object_name); SQL> create index idx_crt on t(created); --对列运用函数,列的索引失效 SQL> select * from t where upper(object_name)='T'; | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 12 | 2484 | 289 (1)| 00:00:04 | |* 1 | TABLE ACCESS FULL| T | 12 | 2484 | 289 (1)| 00:00:04 | -------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 1036 consistent gets --所以需要建立函数索引 SQL> create index idx_fx on t(upper(object_name)); SQL> select * from t where upper(object_name)='T'; | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 707 | 188K| 151 (0)| 00:00:02 | | 1 | TABLE ACCESS BY INDEX ROWID| T | 707 | 188K| 151 (0)| 00:00:02 | |* 2 | INDEX RANGE SCAN | IDX_FX | 283 | | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 5 consistent gets
- 避免列运算的经典案例
select * from t where object_id-10<30;--完全没有比较的列运算 set autotrace traceonly set linesize 1000 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 12 | 2484 | 289 (1)| 00:00:04 | |* 1 | TABLE ACCESS FULL| T | 12 | 2484 | 289 (1)| 00:00:04 | -------------------------------------------------------------------------- select * from t where object_name like 'CLUS%';--用这个,不要用下面的: select * from t where substr(object_name,1,4)='CLUS'; select * from t where created>=to_date('2010-10-02','YYYY-MM-DD') AND created<to_date('2010-10-03','YYYY-MM-DD')+1; --好的sql select * from t where trunc(created)>=to_date('2010-10-02','YYYY-MM-DD') and trunc(created)<=to_date('2010-10-03','YYYY-MM-DD'); --注:created的值为:2011/12/7 22:18:33, trunc(created)为:2011/12/7
索引经典案例:
- count(* )查询多,加上索引加速。
- 少量返回,频繁操作的列,加上组合索引。
- Max/Min的列加索引。
- 分区索引,查询的sql应该加上分区条件,否则会遍历各个分区索引。
- 去掉没有必要的order by操作
- 对排序的列加索引
- 位图索引是否建立在经常修改的列?
- 去除没有必要的索引