SQL夯实基础(四):子查询及sql优化案例

 

首先我们先明确一下sql语句的执行顺序,如下有前至后执行:

(1)from  (2) on   (3) join  (4) where  (5)group by  (6) avg,sum...  (7)having  (8) select  (9) distinct  (10) order by

1、独立子查询

独立子查询:顾名思义:就是子查询和外层查询不存在任何联系,是独立于外层查询的:
下面就看一个例子:
有一张订单表 Sales.Order 和一张 客户表 Sales.Customer
下面的sql 语句是为了查询出Sales.Customer里 custid(用户id)不在 Sales.Order 的custid

select custid 
from [Sales.Customers]
where custid not in
(
select custid
from [Sales.Orders]
)

2、相关子查询

顾名思义:就是子查询里面的条件依赖于外层查询的数据
下面我再来举一个例子:
业务要求:查询出每个客户的订单的数量:

select distinct custid,
 (
select COUNT(*)
from [Sales.Orders]
--相关子查询:依赖于外层查询结果;;是外层和内层相互结合的操作
where [Sales.Orders].custid=[Sales.Customers].custid
) as orderNum
from [Sales.Customers]

查询的结果:

 

所以我们不难看出:相关子查询比独立子查询实现的功能强大的多

但是复杂度也会有提升,相当于笛卡尔积

 

3、案例演示

一个我之前开发时候写过的错误案例

SELECT
    DISTINCT m.id AS PolicyId,
    m.PolicyNo,
    m.PolicyType,
    m.PolicyStartTime,
    m.PolicyEndTime,
    m.BelogOrganizationName,
CASE
    ( SELECT count( 0 ) FROM policy_people AS b WHERE b.ProposalNo = m.ProposalNo AND b.Relation = 2 ) 
    WHEN 1 THEN
    ( SELECT GROUP_CONCAT( b.`Name` ) FROM policy_people AS b WHERE b.ProposalNo = m.ProposalNo AND b.Relation = 2 ) ELSE
 concat(
 ( SELECT b.NAME FROM policy_people AS b WHERE b.ProposalNo = m.ProposalNo AND b.Relation = 2 LIMIT 1 ),",",(SELECT b.NAME FROM policy_people AS b WHERE b.ProposalNo = m.ProposalNo AND b.Relation = 2 LIMIT 1,1 ),''
 ) 
    END AS InsuredPeople,
    ( SELECT count( 0 ) FROM report WHERE report.ProposalNo = m.ProposalNo ) AS CompenCount 
FROM
    policy_people AS p
    INNER JOIN policy_index AS m ON p.ProposalNo = m.ProposalNo 
WHERE
    p.Relation = 1 
    AND p.CertificateNo = "57016086-6

注意这个语句中的子查询,CASE嵌套了一层子查询,之后还用后面的逻辑继续执行子查询。再加上使用了distinct ,很容易造成选出过多的数据,执行过多的无效子查询,最后被distinct筛选掉。

暴力测试数据量上w之后,上面的语句执行时间长达几十秒。

       优化之后的修改,首先时使用group by代替了distinct,因为sql执行顺序的问题。distinct的级别很低,会造成先去选取数据,然后再去distinct,所以尽量去用group by而不是distinct

然后将子查询中过多的逻辑放到逻辑层去解决,而不是再sql中强行使用。

优化后的sql

SELECT
    m.ProposalNo,
    ANY_VALUE ( m.PolicyNo ) AS PolicyNo,
    ANY_VALUE ( m.PolicyType ) AS PolicyType,
    ANY_VALUE ( m.PolicyStartTime ) AS PolicyStartTime,
    ANY_VALUE ( m.PolicyEndTime ) AS PolicyEndTime,
    ANY_VALUE ( m.BelogOrganizationName ) AS BelogOrganizationName,
    d.InsuredPeople,
    d.InsuredPeolpleCount,
    ( SELECT count( 0 ) FROM report WHERE report.ProposalNo = p.ProposalNo ) AS CompenCount 
FROM
    policy_people AS p
    INNER JOIN policy_index AS m ON p.ProposalNo = m.ProposalNo
    INNER JOIN
 (SELECT GROUP_CONCAT( b.NAME ) AS InsuredPeople,count( 1 ) AS InsuredPeolpleCount,b.ProposalNo FROM policy_people AS b  WHERE b.Relation = 2  GROUP BY b.ProposalNo ) AS d 
ON p.ProposalNo = d.ProposalNo 
WHERE
    p.Relation = 1 
    AND p.CertificateNo = ? certificateNo 
    AND m.PolicyStartTime >? YEAR 
GROUP BY
    p.ProposalNo 
    LIMIT ? pagebegin,? pageend

使用了三表联查,减少了子查询,效率大幅提升,目前已经可以满足几十万级别的数据量,等项目运作一段时间后,再继续考虑是否进一步优化。

     

 

posted @ 2018-02-25 12:45  SeedQi  阅读(1220)  评论(0编辑  收藏  举报