【oracle】同是选出每个城市最年长女性,not exits 方案和 分析函数rank方案的巨大差距

以下实验版本:

Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
PL/SQL Release 11.2.0.1.0 - Production

不保证在其它环境也是同样效果。

 

表结构:

create table customer(
    id number(8),
    name nvarchar2(20),
    gender char(1),
    cityname nvarchar2(20),
    birthday timestamp,
    primary key(id)
)

可以使用以下程序给它充值:

declare 
   g integer;
   c integer;
begin
for i in 1..100000 loop
    g:=dbms_random.value(1,0);
    c:=dbms_random.value(1,5);

    insert into customer values(i,
                                dbms_random.string('*',dbms_random.value(6,20)),
                decode(g,0,'m',1,'f'),
                decode(c,1,'兴城',2,'绥中',3,'山海关',4,'北戴河',5,'津滨'),
                DATE'1970-01-01'+i);
end loop;

commit;

end;
/

现在需求是把每个城市最年长女性找出来。

这个方案不止一种,以下是 not exists 方案:

select id,name,gender,cityname,to_char(birthday,'yyyy-MM-dd') as birthymd from customer a
where a.gender='f' and
not exists
(
    select null from customer b
    where b.gender='f' and b.cityname=a.cityname and b.birthday<a.birthday
)

这个方案只是简单实现需求,其执行起来还是挺费时的:

SQL> select id,name,gender,cityname,to_char(birthday,'yyyy-MM-dd') as birthymd from customer a
  2  where a.gender='f' and
  3  not exists
  4  (
  5      select null from customer b
  6      where b.gender='f' and b.cityname=a.cityname and b.birthday<a.birthday
  7  );

        ID NAME                                     G CITYNAME                                 BIRTHYMD
---------- ---------------------------------------- - ---------------------------------------- ----------
         1 XGZLORILKOGWELLJI                        f 山海关                                   1970-01-02
         2 JFMOXKKFDJPSNBNQS                        f 绥中                                     1970-01-03
         3 WKMIILRQAKY                              f 北戴河                                   1970-01-04
        23 MKSREQXPKJTWSK                           f 兴城                                     1970-01-24
        30 ILXYETNXOKXSB                            f 津滨                                     1970-01-31

已用时间:  00: 00: 16.38

十万数据查询,用时16秒,还是挺长的。

分组求两端的极值,比较合适的方案是用分析函数rank

select id,name,gender,cityname,birthymd from (
select 
id,name,gender,cityname,to_char(birthday,'yyyy-MM-dd') as birthymd, 
rank() over (partition by cityname order by birthday) as seq
from customer
where gender='f')
where seq=1
order by id

看看这个方案的耗时:

SQL> select id,name,gender,cityname,birthymd from (
  2  select
  3  id,name,gender,cityname,to_char(birthday,'yyyy-MM-dd') as birthymd,
  4  rank() over (partition by cityname order by birthday) as seq
  5  from customer
  6  where gender='f')
  7  where seq=1
  8  order by id;

        ID NAME                                     G CITYNAME                                 BIRTHYMD
---------- ---------------------------------------- - ---------------------------------------- ----------
         1 XGZLORILKOGWELLJI                        f 山海关                                   1970-01-02
         2 JFMOXKKFDJPSNBNQS                        f 绥中                                     1970-01-03
         3 WKMIILRQAKY                              f 北戴河                                   1970-01-04
        23 MKSREQXPKJTWSK                           f 兴城                                     1970-01-24
        30 ILXYETNXOKXSB                            f 津滨                                     1970-01-31

已用时间:  00: 00: 00.10

这个方案秒出,比较结果很明显,就不用看解释计划的cost了。

--END--

posted @ 2021-10-05 09:42  逆火狂飙  阅读(43)  评论(0)    收藏  举报
生当作人杰 死亦为鬼雄 至今思项羽 不肯过江东