【SQL】Having的作用
WHERE 子句只能指定行的条件,而不能指定组的条件(这里面有个“阶”的概念,可以查阅:神奇的 SQL 之层级 → 为什么 GROUP BY 之后不能直接引用原表中的列),
因此就有了 HAVING 子句,它用来指定组的条件
举例1:
有个学生表,需要找出班级学生人数为3的记录,这里就需要用到Having
DROP TABLE IF EXISTS tbl_student_class; CREATE TABLE tbl_student_class ( id int primary key NOT NULL identity(1,1), sno varchar(12) NOT NULL, cno varchar(5) NOT NULL, cname varchar(50) NOT NULL, ); -- ---------------------------- -- Records of tbl_student_class -- ---------------------------- INSERT INTO tbl_student_class(sno, cno, cname) VALUES ('20190607001', '0607', '影视7班'); INSERT INTO tbl_student_class(sno, cno, cname) VALUES ('20190607002', '0607', '影视7班'); INSERT INTO tbl_student_class(sno, cno, cname) VALUES ('20190608003', '0608', '影视8班'); INSERT INTO tbl_student_class(sno, cno, cname) VALUES ('20190608004', '0608', '影视8班'); INSERT INTO tbl_student_class(sno, cno, cname) VALUES ('20190609005', '0609', '影视9班'); INSERT INTO tbl_student_class(sno, cno, cname) VALUES ('20190609006', '0609', '影视9班'); INSERT INTO tbl_student_class(sno, cno, cname) VALUES ('20190609007', '0609', '影视9班');
具体的查询语句
select cno,count(*) as numbers from tbl_student_class group by cno HAVING COUNT(*) = 3;
既然 HAVING 操作的对象是组,那么其使用的要素是有一定限制的,能够使用的要素有 3 种: 常数 、 聚合函数 和 聚合键 ,聚合键也就是 GROUP BY 子句中指定的列名
示例中的 HAVING COUNT(*) = 3 , COUNT(*) 是聚合函数,3 是常数,都在 3 要素之中;
举例2:
随便删除几条记录,然后找到id字段,缺失的记录
DELETE FROM tbl_student_class WHERE id IN(2,5,6); SELECT * FROM tbl_student_class;
--判断是否存在缺失编号
select '存在缺失编号' as cnt from tbl_student_class having count(*) <> (max(id)-min(id) + 1)
--找出实际缺失的编号值
SELECT number FROM MASTER..spt_values WHERE TYPE='P' and number>= (select min(id) as number from tbl_student_class) and number<=(select max(id) as number from tbl_student_class) except select id as number from tbl_student_class group by id
举例3:有一个工资表
DROP TABLE IF EXISTS tbl_student_salary; CREATE TABLE tbl_student_salary ( id int identity(1,1) NOT NULL , name nvarchar(5) NOT NULL , salary DECIMAL(15,2) NOT NULL, PRIMARY KEY (id) ); insert into tbl_student_salary values ('李小龙', 1000000); insert into tbl_student_salary values ('李四', 50000); insert into tbl_student_salary values ('王五', 50000); insert into tbl_student_salary values ('赵六', 50000); insert into tbl_student_salary values ('张三', 70000); insert into tbl_student_salary values ('张一三', 70000); insert into tbl_student_salary values ('张二三', 70000); insert into tbl_student_salary values ('张三三', 60000); insert into tbl_student_salary values ('张三四', 40000); insert into tbl_student_salary values ('张三丰', 30000);
求众数
-- 使用谓词 ALL 求众数 >=ALL谓词表示大于等于子查询中的所有值 (子查询结果有1,3 所以大于等于所有值就必须大于等于3) -- ALL 谓词用于 NULL 或空集时会出现问题 SELECT salary, COUNT(*) AS cnt FROM tbl_student_salary GROUP BY salary HAVING COUNT(*) >= ALL ( SELECT COUNT(*) FROM tbl_student_salary GROUP BY salary);
--使用极值来获取众数 select salary,count(*) as cnt from tbl_student_salary group by salary having count(*) >= (select max(counts) from (select count(*) as counts from tbl_student_salary group by salary) a)
求中位数
--1、先取出各自的集合,然后求交集,最后再取平均数
SELECT AVG(DISTINCT salary) from( SELECT T1.salary FROM tbl_student_salary T1, tbl_student_salary T2 GROUP BY T1.salary -- S1集合的条件 HAVING SUM(CASE WHEN T2.salary >= T1.salary THEN 1 ELSE 0 END) >= COUNT(*) / 2 InterSect --求交集 SELECT T1.salary FROM tbl_student_salary T1, tbl_student_salary T2 GROUP BY T1.salary -- S2集合的条件 HAVING SUM(CASE WHEN T2.salary <= T1.salary THEN 1 ELSE 0 END) >= COUNT(*) / 2 ) as temp
-- 2、求中位数的SQL 语句:在HAVING 子句中使用非等值自连接 SELECT AVG(DISTINCT salary) FROM ( SELECT T1.salary FROM tbl_student_salary T1, tbl_student_salary T2 GROUP BY T1.salary -- S1 的条件 HAVING SUM(CASE WHEN T2.salary >= T1.salary THEN 1 ELSE 0 END) >= COUNT(*) / 2 -- S2 的条件 AND SUM(CASE WHEN T2.salary <= T1.salary THEN 1 ELSE 0 END) >= COUNT(*) / 2 ) TMP;
这条 SQL 语句的要点在于比较条件 >= COUNT(*)/2 里的等号,加上等号并不是为了清晰地分开子集 S1 和 S2,而是为了让这 2 个子集拥有共同部分
如果去掉等号,将条件改成 > COUNT(*)/2 ,那么当元素个数为偶数时,S1 和 S2 就没有共同的元素了,也就无法求出中位数了;加上等号是为了写出通用性更高的 SQL
举例4:有一个学生记录提交表,没有提交时,字段为null,提交后submit_date有值
DROP TABLE IF EXISTS tbl_student_submit_log; CREATE TABLE tbl_student_submit_log ( id int identity(1,1) NOT NULL , sno varchar(12) NOT NULL , dept varchar(50) NOT NULL , submit_date DATE, PRIMARY KEY (id) ) ; insert into tbl_student_submit_log values ('20200607001', '理学院', '2020-12-12'), ('20200607002', '理学院', '2020-12-13'), ('20200608001', '文学院', null), ('20200608002', '文学院', '2020-12-22'), ('20200608003', '文学院', '2020-12-22'), ('20200612001', '工学院', null), ('20200617001', '经济学院', '2020-12-23');
现在我们需要从这张表里找出哪些学院的学生全部都提交了报告,这个 SQL 该怎么写?
--1、使用差集 select dept from tbl_student_submit_log group by dept except select dept from tbl_student_submit_log where submit_date is null
--2、利用了count(列名)的形式 SELECT dept,COUNT(*) as numbers FROM tbl_student_submit_log GROUP BY dept --COUNT(*) 可以用于 NULL ,而 COUNT(列名) 与其他聚合函数一样,要先排除掉 NULL 的行再进行统计(工学院和文学院有null值所以被屏蔽了) SELECT dept,COUNT(*) as numbers FROM tbl_student_submit_log GROUP BY dept HAVING COUNT(*) = COUNT(submit_date);
--3、使用 CASE 表达式也可以实现同样的功能 --计算出来的原始表达式 SELECT dept,COUNT(*) as numbers,SUM( CASE WHEN submit_date IS NOT NULL THEN 1 ELSE 0 END ) as numbers1 FROM tbl_student_submit_log GROUP BY dept --最终的结果 SELECT dept,COUNT(*) as numbers FROM tbl_student_submit_log GROUP BY dept HAVING COUNT(*) = SUM( CASE WHEN submit_date IS NOT NULL THEN 1 ELSE 0 END );
后续扩展知识点:等值连接,自连接,非等值连接

浙公网安备 33010602011771号