self join 专题

self join 理解上偏抽象,这里做记录。

*sql 面向集合,二维表实质是集合。

1. cross join:

自连接:将同一张表复制成两张,通过 cross join 产生笛卡尔积。

select p1.name,p2.name
from products as p1,products as p2

2. 去掉相同元素对:

即在cross join中去掉自己对自己。

select p1.name,p2.name
from products as p1,products as p2
where p1.name != p2.name

力扣182:

select p1.name,p2.name
from products as p1,products as p2
where p1.name != p2.name

力扣612:

关键点1:自联表

with tmp as
(select p1.x as x1,p1.y as y1,
    p2.x as x2,p2.y as y2
from Point2D as p1,Point2D as p2
where (p1.x,p1.y)<>(p2.x,p2.y))

关键点2:新建距离列

select round(min(sqrt((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1))),2)
    as shortest 
from tmp

题解:

select round(min(sqrt((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1))),2) as shortest 
from
    (select p1.x as x1,p1.y as y1,
        p2.x as x2,p2.y as y2
    from Point2D as p1,Point2D as p2
    where (p1.x,p1.y)<>(p2.x,p2.y)) as tmp

3. 去掉重复组合:

从无序角度看,cross join 会产生重复集合。

select p1.name,p2.name
from products as p1,products as p2
where p1.name > p2.name

*> 除了可比较数值,还可比较字符串字典顺序、日期。

力扣1445:

select distinct s1.sale_date,s1.sold_num-s2.sold_num as diff
from Sales s1,Sales s2
where s1.sale_date = s2.sale_date
and s1.fruit < s2.fruit

select s1.sale_date,s1.sold_num-s2.sold_num as diff         
from Sales s1,Sales s2
where s1.sale_date = s2.sale_date
and s1.fruit = 'apples'
and s2.fruit = 'oranges'

---这里 distinct 可加可不加,仅两类水果且已过滤重复组合。

 4. 删除重复记录:

力扣196:

delete p1 from Person as p1,Person as p2
where p1.email = p2.email
and p1.id > p2.id

---若仅第一个条件会产生相同元素的笛卡尔积,含序号相同与不同的组合;第二个条件保留序号不同的组合。

*4/5/6点代码类似,涉及重复 / 相同问题;单纯重复可用 group by,见力扣1050。

5. 找到错误记录:

select distinct a1.address,a2.address
from address as a1,address as a2
where a1.family_id = a2.family_id
and a1.address != a2.address

---原则上,相同 id 的家庭地址应相同。

6. 找到同价商品:

select distinct p1.name,p2.name
from product as p1,product as p2
where p1.price = p2.price
and p1.name <> p2.name

*distinct 在结果中去掉 cross join 底层产生的重复,比如 “草莓”,“草莓”。

7. 要求温差时间差:

力扣197:

select w1.id
from Weather as w1,Weather as w2
where w1.Temperature > w2.Temperature
and datediff(w1.recordDate,w2.recordDate) = 1

*7/8/9点代码类似,存在时间滑动。

8. 和上年相比持平:

select s1.year,s1.sale
from sales as s1,sales as s2
where s1.year-s2.year = 1
and s1.sale = s2.sale

---第一个 where 条件 cross join 去年和今年;第二个 where 条件过滤销售额相等。

9. 住宿重叠时间:

select a.reserver,a.start_date,a.end_date
from reservations as a,reservations as b
where a.reserver <> b.reserver
and (a.start_date between b.start_date and b.end_date
or a.end_date between b.start_date and b.end_date)
order by a.start_date,a.end_date

---第一个 where 条件排除自己对自己;第二组 where 条件判别时间重叠。 

(select a.reserver,a.start_date,a.end_date
from reservations as a,reservations as b
where a.reserver <> b.reserver
and (a.start_date between b.start_date and b.end_date
or a.end_date between b.start_date and b.end_date)
order by a.start_date,a.end_date)
union
(select b.reserver,b.start_date,b.end_date
from reservations as a,reservations as b
where a.reserver <> b.reserver
and (a.start_date between b.start_date and b.end_date
or a.end_date between b.start_date and b.end_date)
order by a.start_date,a.end_date)

---更严谨的解法,union 默认去重。

10. 父属 / 上下级关系:

力扣608:

select distinct t1.id,
case when t1.p_id is null then 'Root' 
    when t1.id not in 
    (select distinct t1.id
    from tree t1,tree t2
    where t1.id = t2.p_id) then 'Leaf'
    else 'Inner' end as type
from tree t1,tree t2
order by t1.id

力扣570:

select e1.name 
from Employee e1,Employee e2
where e1.id=e2.managerId
group by e2.managerId
having count(e2.managerId) >= 5

select name 
from Employee
where id in
(select managerId
from Employee
group by managerId
having count(managerId) >= 5)

力扣181:

select employee_table.name as Employee
from Employee as manager_table,Employee as employee_table
where manager_table.id = employee_table.managerId
and employee_table.salary > manager_table.salary

力扣614:

# 抽出 follower 字段,需要去重(或聚合)!
with tmp1 as
(select distinct follower
from follow)

# 抽出 followee 字段
,tmp2 as
(select followee
from follow)

# 联表
,tmp3 as
(select follower,followee
from tmp1,tmp2
where follower=followee)

# 聚合计数
select follower,count(followee) as num
from tmp3
group by follower
order by follower

力扣1241:

select s1.sub_id as post_id,
  count(distinct s2.sub_id) as number_of_comments from Submissions s1 left join Submissions s2 on s1.sub_id = s2.parent_id where s1.parent_id is null group by s1.sub_id order by post_id

力扣1148:(无需 self join )

select distinct author_id as id
from Views
where author_id = viewer_id
order by author_id

-END

posted @ 2022-09-11 10:00  找回那所有、  阅读(77)  评论(0)    收藏  举报
这里到底了哦~(●'◡'●)