self join 专题
self join 理解上偏抽象,这里做记录。
*sql 面向集合,二维表实质是集合。
1. cross join:
自连接:将同一张表复制成两张,通过 cross join 产生笛卡尔积。
select p1.name,p2.name from products as p1,products as p2
2. 去掉相同元素对:
即在cross join中去掉自己对自己。
select p1.name,p2.name from products as p1,products as p2 where p1.name != p2.name
力扣182:
select p1.name,p2.name from products as p1,products as p2 where p1.name != p2.name
力扣612:
关键点1:自联表
with tmp as (select p1.x as x1,p1.y as y1, p2.x as x2,p2.y as y2 from Point2D as p1,Point2D as p2 where (p1.x,p1.y)<>(p2.x,p2.y))
关键点2:新建距离列
select round(min(sqrt((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1))),2) as shortest from tmp
题解:
select round(min(sqrt((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1))),2) as shortest from (select p1.x as x1,p1.y as y1, p2.x as x2,p2.y as y2 from Point2D as p1,Point2D as p2 where (p1.x,p1.y)<>(p2.x,p2.y)) as tmp
3. 去掉重复组合:
从无序角度看,cross join 会产生重复集合。
select p1.name,p2.name from products as p1,products as p2 where p1.name > p2.name
*> 除了可比较数值,还可比较字符串字典顺序、日期。
力扣1445:
select distinct s1.sale_date,s1.sold_num-s2.sold_num as diff from Sales s1,Sales s2 where s1.sale_date = s2.sale_date and s1.fruit < s2.fruit select s1.sale_date,s1.sold_num-s2.sold_num as diff from Sales s1,Sales s2 where s1.sale_date = s2.sale_date and s1.fruit = 'apples' and s2.fruit = 'oranges'
---这里 distinct 可加可不加,仅两类水果且已过滤重复组合。
4. 删除重复记录:
力扣196:
delete p1 from Person as p1,Person as p2 where p1.email = p2.email and p1.id > p2.id
---若仅第一个条件会产生相同元素的笛卡尔积,含序号相同与不同的组合;第二个条件保留序号不同的组合。
*4/5/6点代码类似,涉及重复 / 相同问题;单纯重复可用 group by,见力扣1050。
5. 找到错误记录:
select distinct a1.address,a2.address from address as a1,address as a2 where a1.family_id = a2.family_id and a1.address != a2.address
---原则上,相同 id 的家庭地址应相同。
6. 找到同价商品:
select distinct p1.name,p2.name from product as p1,product as p2 where p1.price = p2.price and p1.name <> p2.name
*distinct 在结果中去掉 cross join 底层产生的重复,比如 “草莓”,“草莓”。
7. 要求温差时间差:
力扣197:
select w1.id from Weather as w1,Weather as w2 where w1.Temperature > w2.Temperature and datediff(w1.recordDate,w2.recordDate) = 1
*7/8/9点代码类似,存在时间滑动。
8. 和上年相比持平:
select s1.year,s1.sale from sales as s1,sales as s2 where s1.year-s2.year = 1 and s1.sale = s2.sale
---第一个 where 条件 cross join 去年和今年;第二个 where 条件过滤销售额相等。
9. 住宿重叠时间:
select a.reserver,a.start_date,a.end_date from reservations as a,reservations as b where a.reserver <> b.reserver and (a.start_date between b.start_date and b.end_date or a.end_date between b.start_date and b.end_date) order by a.start_date,a.end_date
---第一个 where 条件排除自己对自己;第二组 where 条件判别时间重叠。
(select a.reserver,a.start_date,a.end_date from reservations as a,reservations as b where a.reserver <> b.reserver and (a.start_date between b.start_date and b.end_date or a.end_date between b.start_date and b.end_date) order by a.start_date,a.end_date) union (select b.reserver,b.start_date,b.end_date from reservations as a,reservations as b where a.reserver <> b.reserver and (a.start_date between b.start_date and b.end_date or a.end_date between b.start_date and b.end_date) order by a.start_date,a.end_date)
---更严谨的解法,union 默认去重。
10. 父属 / 上下级关系:
力扣608:
select distinct t1.id, case when t1.p_id is null then 'Root' when t1.id not in (select distinct t1.id from tree t1,tree t2 where t1.id = t2.p_id) then 'Leaf' else 'Inner' end as type from tree t1,tree t2 order by t1.id
力扣570:
select e1.name from Employee e1,Employee e2 where e1.id=e2.managerId group by e2.managerId having count(e2.managerId) >= 5 select name from Employee where id in (select managerId from Employee group by managerId having count(managerId) >= 5)
力扣181:
select employee_table.name as Employee from Employee as manager_table,Employee as employee_table where manager_table.id = employee_table.managerId and employee_table.salary > manager_table.salary
力扣614:
# 抽出 follower 字段,需要去重(或聚合)! with tmp1 as (select distinct follower from follow) # 抽出 followee 字段 ,tmp2 as (select followee from follow) # 联表 ,tmp3 as (select follower,followee from tmp1,tmp2 where follower=followee) # 聚合计数 select follower,count(followee) as num from tmp3 group by follower order by follower
力扣1241:
select s1.sub_id as post_id,
count(distinct s2.sub_id) as number_of_comments from Submissions s1 left join Submissions s2 on s1.sub_id = s2.parent_id where s1.parent_id is null group by s1.sub_id order by post_id
力扣1148:(无需 self join )
select distinct author_id as id from Views where author_id = viewer_id order by author_id
-END

浙公网安备 33010602011771号