department top three salary---leetcode 解题记录

这部分参考：

http://bookshadow.com/weblog/2015/01/24/leetcode-department-top-three-salaries/

原题：

https://leetcode.com/problems/department-top-three-salaries/

题目描述：

The Employee table holds all employees. Every employee has an Id, and there is also a column for the department Id.

+----+-------+--------+--------------+

+----+-------+--------+--------------+

| 1 | Joe | 70000 | 1 |

| 2 | Henry | 80000 | 2 |

| 3 | Sam | 60000 | 2 |

| 4 | Max | 90000 | 1 |

| 5 | Janet | 69000 | 1 |

| 6 | Randy | 85000 | 1 |

+----+-------+--------+--------------+

The Department table holds all departments of the company.

+----+----------+

| Id | Name |

+----+----------+

| 1 | IT |

| 2 | Sales |

+----+----------+

Write a SQL query to find employees who earn the top three salaries in each of the department. For the above tables, your SQL query should return the following rows.

+------------+----------+--------+

| Department | Employee | Salary |

+------------+----------+--------+

| IT | Max | 90000 |

| IT | Randy | 85000 |

| IT | Joe | 70000 |

| Sales | Henry | 80000 |

| Sales | Sam | 60000 |

+------------+----------+--------+

题目大意：

雇员表Employee保存了雇员的Id，姓名，薪水，以及部门Id。部门表Department保存了部门的Id和名称。

编写一个SQL查询找出每一个部门薪水排名前3位的雇员信息（排名可以并列），样例如上。

解题思路：

参考StackOverflow的问答（http://stackoverflow.com/questions/12113699/get-top-n-records-for-each-group-of-grouped-results）

LeetCode OJ将此题的难度标记为Hard，可见题目的确有一定的难度。

解题步骤：

首先将雇员表按照雇员的DepartmentId和Salary分别正序和倒序排列；

然后利用MySQL用户定义变量对雇员标记排名rank值，同一部门的rank值从1开始，按照薪水降序递增（注意薪水相同时rank值不变）；

最后筛选rank值不大于3的雇员信息，并与Department表做关联即可。

解题过程：

无意之间看到，最开始没注意到hard标记，看完成率还是挺高的，就点开这道题，开始分析，第一时间的想法就是：先根据部门进行分组，然后每组选前三名即可，写了一个类似于：

select top 3 from a group by departmentId order by salary

但是MySQL根本没有top关键字，其次，这语法也不对，换写法；

接下来想到，可以先针对原始记录进行排序，按照部门id 和薪水两个字段进行排序，然后根据部门分组，最后用limit进行筛选，语句如下：

select * from (select * from Employee e order by e.DepartmentId asc,e.Salary desc)t group by t.DepartmentId order by t.Salary desc limit 3;

事实证明也不行，MySQL不支持组内排序，这时有点疑问，这题难度不小啊，去看了原题的标记，才知道是hard，这时才想起来，leetcode的完成率跟难易程度完全没参考性；

想了半天，也不知其所以然，开始搜索别人的思路，大部分都是写的，针对不同的人进行部门排序，即增加一个临时字段记录rank记录其排名，下一步针对排名进行筛选，根据这个思路就开始写：

select q.name as Department,p.Name as Employee,p.Salary as Salary from(
select DepartmentId,
       Name,
       Salary,
       @rank:=IF(@preDepartmentId <> DepartmentId,1,IF(@preSalary=Salary,@rank,@rank+1) )as rank,
       @preDepartmentId := DepartmentId as preDepartmentId,
       @preSalary := Salary as preSalary
from (select * from Employee e order by e.DepartmentId asc,e.Salary desc) t,(select @DepartmentId:=null,@preSalary:=null,@rank:=0)r
)p
inner join Department q on p.DepartmentId=q.Id
where p.rank<4;

写这个sql还真是学到不少东西，

第一：关于自定义变量的使用，之前只在存储过程写过，还真没在查询sql中写过；

第二：关于join 不同类型，审题以为Employee 表肯定会记录 departmentId，上来就写了一个left join ,一般的test case 都能通过，结果提交之后，发现雇员部门id为空时此条记录应该过滤，看了错误提示反应过来是join 类型不对；

第三：别人写的答案是：

SELECT d.NAME AS Department, t.NAME AS Employee, Salary FROM

( SELECT DepartmentId,

NAME, Salary,

@rank := IF(@prevDeptId != DepartmentId, 1, IF(@prevSalary = Salary, @rank, @rank + 1) ) AS Rank,

@prevDeptId := DepartmentId AS prevDeptId, @prevSalary := Salary AS prevSalary

FROM Employee e, (SELECT @rank := 0, @prevDeptId := NULL, @prevSalary := NULL) r ORDER BY DepartmentId ASC, Salary DESC ) t

INNER JOIN Department d ON t.DepartmentId = d.ID

WHERE t.rank <= 3

跟我的不同主要在于：ORDER BY DepartmentId ASC, Salary DESC 这一句使用的地方，本人写的直接在内层查询，网友写的是

FROM Employee e, (SELECT @rank := 0, @prevDeptId := NULL, @prevSalary := NULL) r

ORDER BY DepartmentId ASC, Salary DESC

按照字面的逻辑理解就是：先进行rank计算，然后根据部门id 和salary 进行排序，此处最开始不甚明白，使用explain 查看了执行计划，发现filesort 先于全表遍历查询；猜测此处MySQL作了优化，只是优化后看起来比较难理解（请教了别人后给出的原因，原谅我不求甚解，哪位知道原因还请告知，不甚感激）。

PS：leetcode提交答案时应该注意：leetcode对答案大小写敏感。

posted @ 2015-09-30 10:26 轻言微雨阅读(122) 评论(0) 收藏举报

刷新页面返回顶部

轻言微雨

department top three salary---leetcode 解题记录

公告