将数列唯一值化后再求中值的效率比较 第一方案胜出,加索引后在近两百万数据中查出中值耗时0.176秒

建表及充值:

create table hy_emp(
    id number(9,0) primary key,
    name nvarchar2(20) not null,
    salary integer not null);
    
insert into hy_emp
select rownum,dbms_random.string('*',dbms_random.value(2,20)),dbms_random.value(1000,10000)
from dual
connect by level<2000000
order by dbms_random.random;

方案一(运行时间0.95秒,cost3082)

SQL:

with soloemp as (select distinct salary from hy_emp)
select avg(salary) from
(select row_number() over (order by salary) as seq, row_number() over (order by salary desc) as revseq,salary from soloemp) a
where abs(a.seq-a.revseq)<=1

解释计划:

Plan hash value: 2542413663
 
----------------------------------------------------------------------------------
| Id  | Operation               | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |        |     1 |    39 |  3082   (3)| 00:00:01 |
|   1 |  SORT AGGREGATE         |        |     1 |    39 |            |          |
|*  2 |   VIEW                  |        |  9001 |   342K|  3082   (3)| 00:00:01 |
|   3 |    WINDOW SORT          |        |  9001 |   114K|  3082   (3)| 00:00:01 |
|   4 |     WINDOW SORT         |        |  9001 |   114K|  3082   (3)| 00:00:01 |
|   5 |      VIEW               |        |  9001 |   114K|  3081   (3)| 00:00:01 |
|   6 |       HASH UNIQUE       |        |  9001 | 36004 |  3081   (3)| 00:00:01 |
|   7 |        TABLE ACCESS FULL| HY_EMP |  1999K|  7812K|  3026   (1)| 00:00:01 |
----------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter(ABS("A"."SEQ"-"A"."REVSEQ")<=1)

 

方案二(运行时间0.85秒,cost:3089)

SQL:

with soloemp as (select distinct salary from hy_emp)
select avg(b.salary) from
(select a.salary,abs(a.seq-a.revseq) as diff from (select salary,dense_rank() over (order by salary asc) as seq,dense_rank() over (order by salary desc) as revseq from soloemp order by salary) a ) b
where b.diff=(select min(c.diff) from
(select abs(a.seq-a.revseq) as diff from (select dense_rank() over (order by salary asc) as seq,dense_rank() over (order by salary desc) as revseq from soloemp order by salary) a ) c)

解释计划:

Plan hash value: 2351668320
 
----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                           |     1 |    39 |  3089   (3)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION               |                           |       |       |            |          |
|   2 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9DE4A2_18C7CE |       |       |            |          |
|   3 |    HASH UNIQUE                           |                           |  9001 | 36004 |  3081   (3)| 00:00:01 |
|   4 |     TABLE ACCESS FULL                    | HY_EMP                    |  1999K|  7812K|  3026   (1)| 00:00:01 |
|   5 |   SORT AGGREGATE                         |                           |     1 |    39 |            |          |
|*  6 |    VIEW                                  |                           |  9001 |   342K|     4  (25)| 00:00:01 |
|   7 |     WINDOW SORT                          |                           |  9001 | 36004 |     4  (25)| 00:00:01 |
|   8 |      WINDOW SORT                         |                           |  9001 | 36004 |     4  (25)| 00:00:01 |
|   9 |       VIEW                               |                           |  9001 | 36004 |     3   (0)| 00:00:01 |
|  10 |        TABLE ACCESS FULL                 | SYS_TEMP_0FD9DE4A2_18C7CE |  9001 | 36004 |     3   (0)| 00:00:01 |
|  11 |     SORT AGGREGATE                       |                           |     1 |    26 |            |          |
|  12 |      VIEW                                |                           |  9001 |   228K|     4  (25)| 00:00:01 |
|  13 |       WINDOW SORT                        |                           |  9001 | 36004 |     4  (25)| 00:00:01 |
|  14 |        WINDOW SORT                       |                           |  9001 | 36004 |     4  (25)| 00:00:01 |
|  15 |         VIEW                             |                           |  9001 | 36004 |     3   (0)| 00:00:01 |
|  16 |          TABLE ACCESS FULL               | SYS_TEMP_0FD9DE4A2_18C7CE |  9001 | 36004 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   6 - filter(ABS("A"."SEQ"-"A"."REVSEQ")= (SELECT MIN(ABS("A"."SEQ"-"A"."REVSEQ")) FROM  (SELECT 
              DENSE_RANK() OVER ( ORDER BY "SALARY") "SEQ",DENSE_RANK() OVER ( ORDER BY INTERNAL_FUNCTION("SALARY") DESC ) 
              "REVSEQ" FROM  (SELECT /*+ CACHE ("T1") */ "C0" "SALARY" FROM "SYS"."SYS_TEMP_0FD9DE4A2_18C7CE" "T1") 
              "SOLOEMP" ORDER BY "SALARY") "A"))

 

方案三(用时1.663秒,Cost:3089):

 SQL:

with soloemp as (select distinct salary from hy_emp)
select avg(a.salary) from
(select 
salary,row_number() over (order by salary asc) as seq,row_number() over (order by salary desc) as revseq,(select ceil(count(*)/2) from soloemp) as ceil 
from soloemp) a
where a.seq=a.ceil or a.revseq=a.ceil

解释计划:

Plan hash value: 940756949
 
----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                           |     1 |    52 |  3089   (3)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION               |                           |       |       |            |          |
|   2 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9DE4A4_18C7CE |       |       |            |          |
|   3 |    HASH UNIQUE                           |                           |  9001 | 36004 |  3081   (3)| 00:00:01 |
|   4 |     TABLE ACCESS FULL                    | HY_EMP                    |  1999K|  7812K|  3026   (1)| 00:00:01 |
|   5 |   SORT AGGREGATE                         |                           |     1 |    52 |            |          |
|   6 |    SORT AGGREGATE                        |                           |     1 |       |            |          |
|   7 |     VIEW                                 |                           |  9001 |       |     3   (0)| 00:00:01 |
|   8 |      TABLE ACCESS FULL                   | SYS_TEMP_0FD9DE4A4_18C7CE |  9001 | 36004 |     3   (0)| 00:00:01 |
|*  9 |    VIEW                                  |                           |  9001 |   457K|     8  (25)| 00:00:01 |
|  10 |     WINDOW SORT                          |                           |  9001 |   114K|     8  (25)| 00:00:01 |
|  11 |      WINDOW SORT                         |                           |  9001 |   114K|     8  (25)| 00:00:01 |
|  12 |       VIEW                               |                           |  9001 |   114K|     3   (0)| 00:00:01 |
|  13 |        TABLE ACCESS FULL                 | SYS_TEMP_0FD9DE4A4_18C7CE |  9001 | 36004 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   9 - filter("A"."SEQ"="A"."CEIL" OR "A"."REVSEQ"="A"."CEIL")

从目前看来,第一种方案以耗时少和cost低稳居榜首。

 

让我们增加一个salary列上的索引:

create index idx_hyemp_salary on hy_emp(salary);

再看三种方案的耗时和cost:

方案一(耗时0.176秒,Cost:1198):

with soloemp as (select distinct salary from hy_emp)
select avg(salary) from
(select row_number() over (order by salary) as seq, row_number() over (order by salary desc) as revseq,salary from soloemp) a
where abs(a.seq-a.revseq)<=1

解释计划:

Plan hash value: 3967344228
 
-----------------------------------------------------------------------------------------------
| Id  | Operation                  | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                  |     1 |    39 |  1198   (6)| 00:00:01 |
|   1 |  SORT AGGREGATE            |                  |     1 |    39 |            |          |
|*  2 |   VIEW                     |                  |  9001 |   342K|  1198   (6)| 00:00:01 |
|   3 |    WINDOW SORT             |                  |  9001 |   114K|  1198   (6)| 00:00:01 |
|   4 |     WINDOW SORT            |                  |  9001 |   114K|  1198   (6)| 00:00:01 |
|   5 |      VIEW                  |                  |  9001 |   114K|  1196   (6)| 00:00:01 |
|   6 |       HASH UNIQUE          |                  |  9001 | 36004 |  1196   (6)| 00:00:01 |
|   7 |        INDEX FAST FULL SCAN| IDX_HYEMP_SALARY |  1999K|  7812K|  1142   (1)| 00:00:01 |
-----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter(ABS("A"."SEQ"-"A"."REVSEQ")<=1)

 

方案二:(耗时0.252秒,Cost:1205)

SQL:

with soloemp as (select distinct salary from hy_emp)
select avg(b.salary) from
(select a.salary,abs(a.seq-a.revseq) as diff from (select salary,dense_rank() over (order by salary asc) as seq,dense_rank() over (order by salary desc) as revseq from soloemp order by salary) a ) b
where b.diff=(select min(c.diff) from
(select abs(a.seq-a.revseq) as diff from (select dense_rank() over (order by salary asc) as seq,dense_rank() over (order by salary desc) as revseq from soloemp order by salary) a ) c)

解释计划:

Plan hash value: 3915164046
 
----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                           |     1 |    39 |  1205   (6)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION               |                           |       |       |            |          |
|   2 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9DE4A6_18C7CE |       |       |            |          |
|   3 |    HASH UNIQUE                           |                           |  9001 | 36004 |  1196   (6)| 00:00:01 |
|   4 |     INDEX FAST FULL SCAN                 | IDX_HYEMP_SALARY          |  1999K|  7812K|  1142   (1)| 00:00:01 |
|   5 |   SORT AGGREGATE                         |                           |     1 |    39 |            |          |
|*  6 |    VIEW                                  |                           |  9001 |   342K|     4  (25)| 00:00:01 |
|   7 |     WINDOW SORT                          |                           |  9001 | 36004 |     4  (25)| 00:00:01 |
|   8 |      WINDOW SORT                         |                           |  9001 | 36004 |     4  (25)| 00:00:01 |
|   9 |       VIEW                               |                           |  9001 | 36004 |     3   (0)| 00:00:01 |
|  10 |        TABLE ACCESS FULL                 | SYS_TEMP_0FD9DE4A6_18C7CE |  9001 | 36004 |     3   (0)| 00:00:01 |
|  11 |     SORT AGGREGATE                       |                           |     1 |    26 |            |          |
|  12 |      VIEW                                |                           |  9001 |   228K|     4  (25)| 00:00:01 |
|  13 |       WINDOW SORT                        |                           |  9001 | 36004 |     4  (25)| 00:00:01 |
|  14 |        WINDOW SORT                       |                           |  9001 | 36004 |     4  (25)| 00:00:01 |
|  15 |         VIEW                             |                           |  9001 | 36004 |     3   (0)| 00:00:01 |
|  16 |          TABLE ACCESS FULL               | SYS_TEMP_0FD9DE4A6_18C7CE |  9001 | 36004 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   6 - filter(ABS("A"."SEQ"-"A"."REVSEQ")= (SELECT MIN(ABS("A"."SEQ"-"A"."REVSEQ")) FROM  (SELECT 
              DENSE_RANK() OVER ( ORDER BY "SALARY") "SEQ",DENSE_RANK() OVER ( ORDER BY INTERNAL_FUNCTION("SALARY") DESC ) 
              "REVSEQ" FROM  (SELECT /*+ CACHE ("T1") */ "C0" "SALARY" FROM "SYS"."SYS_TEMP_0FD9DE4A6_18C7CE" "T1") 
              "SOLOEMP" ORDER BY "SALARY") "A"))

 

方案三(0.24秒,Cost:1205):

SQL:

with soloemp as (select distinct salary from hy_emp)
select avg(a.salary) from
(select 
salary,row_number() over (order by salary asc) as seq,row_number() over (order by salary desc) as revseq,(select ceil(count(*)/2) from soloemp) as ceil 
from soloemp) a
where a.seq=a.ceil or a.revseq=a.ceil

解释计划:

Plan hash value: 234565008
 
----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                           |     1 |    52 |  1205   (6)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION               |                           |       |       |            |          |
|   2 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9DE4A8_18C7CE |       |       |            |          |
|   3 |    HASH UNIQUE                           |                           |  9001 | 36004 |  1196   (6)| 00:00:01 |
|   4 |     INDEX FAST FULL SCAN                 | IDX_HYEMP_SALARY          |  1999K|  7812K|  1142   (1)| 00:00:01 |
|   5 |   SORT AGGREGATE                         |                           |     1 |    52 |            |          |
|   6 |    SORT AGGREGATE                        |                           |     1 |       |            |          |
|   7 |     VIEW                                 |                           |  9001 |       |     3   (0)| 00:00:01 |
|   8 |      TABLE ACCESS FULL                   | SYS_TEMP_0FD9DE4A8_18C7CE |  9001 | 36004 |     3   (0)| 00:00:01 |
|*  9 |    VIEW                                  |                           |  9001 |   457K|     8  (25)| 00:00:01 |
|  10 |     WINDOW SORT                          |                           |  9001 |   114K|     8  (25)| 00:00:01 |
|  11 |      WINDOW SORT                         |                           |  9001 |   114K|     8  (25)| 00:00:01 |
|  12 |       VIEW                               |                           |  9001 |   114K|     3   (0)| 00:00:01 |
|  13 |        TABLE ACCESS FULL                 | SYS_TEMP_0FD9DE4A8_18C7CE |  9001 | 36004 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   9 - filter("A"."SEQ"="A"."CEIL" OR "A"."REVSEQ"="A"."CEIL")

还是第一种方案占优。

 

最终的状元是第一种方案,它的成绩是两百万数据中查询中值用时0.176秒:

with soloemp as (select distinct salary from hy_emp)
select avg(salary) from
(select row_number() over (order by salary) as seq, row_number() over (order by salary desc) as revseq,salary from soloemp) a
where abs(a.seq-a.revseq)<=1

--2020-04-14--

posted @ 2020-04-14 17:05  逆火狂飙  阅读(149)  评论(0编辑  收藏  举报
生当作人杰 死亦为鬼雄 至今思项羽 不肯过江东