[Oracle数据库学习]十七、对Group By子句的扩展

D12

组函数

组函数处理多行,返回一行。

SELECT [column,] group_function(column). . .
FROM table
[WHERE condition]
[GROUP BY group_by_expression]
[ORDER BY column];

示例:

SELECT AVG(salary), STDDEV(salary), COUNT(commission_pct), MAX(hire_date)
FROM hr.employees
WHERE job_id LIKE 'SA%';

 

AVG(SALARY)STDDEV(SALARY)COUNT(COMMISSION_PCT)MAX(HIRE_DATE)
8900 2030.6475350894122813255591942051990066 35 21-APR-08

Group By子句

SELECT [column,]
FROM table
[WHERE condition]
[GROUP BY group_by_expression]
[ORDER BY column];

示例:

SELECT department_id, job_id, SUM(salary), COUNT(employee_id)
FROM employees
GROUP BY department_id, job_id ;
DEPARTMENT_IDJOB_IDSUM(SALARY)COUNT(EMPLOYEE_ID)
90 AD_VP 34000 2
100 FI_MGR 12008 1
80 SA_REP 243500 29
- SA_REP 7000 1
90 AD_PRES 24000 1
20 MK_REP 6000 1
110 AC_MGR 12008 1
60 IT_PROG 28800 5
30 PU_CLERK 13900 5
80 SA_MAN 61000 5
50 SH_CLERK 64300 20
20 MK_MAN 13000 1
30 PU_MAN 11000 1
50 ST_CLERK 55700 20
70 PR_REP 10000 1
110 AC_ACCOUNT 8300 1
50 ST_MAN 36400 5
100 FI_ACCOUNT 39600 5
10 AD_ASST 4400 1
40 HR_REP 6500 1

 

HAVING子句

使用组函数对组函数进行限制,对查询进行二次限制。

SELECT [column,] group_function(column)... 
FROM table
[WHERE condition]
[GROUP BY group_by_expression]
[HAVING having_expression] 
[ORDER BYcolumn];

示例:

SELECT department_id, job_id, SUM(salary), COUNT(employee_id)
FROM hr.employees
GROUP BY department_id, job_id
HAVING SUM(salary) > 50000;
DEPARTMENT_IDJOB_IDSUM(SALARY)COUNT(EMPLOYEE_ID)
80 SA_REP 243500 29
80 SA_MAN 61000 5
50 SH_CLERK 64300 20
50 ST_CLERK 55700 20

 

带有ROLLUP和CUBE操作的GROUP BY子句

使用带有ROLLUP和CUBE操作的GROUP BY子句产生多种分组结果:
1)ROLLUP产生n+1种分组结果
2)CUBE产生2的n次方种分组结果

 

ROLLUP操作符

ROLLUP是对GROUP BY子句的扩展,产生n + 1种分组结果,顺序是从右向左

SELECT [column,] group_function(column). . .
FROM table
[WHERE condition]
[GROUP BY [ROLLUP] group_by_expression]
[HAVING having_expression]
[ORDER BY column];

示例:

不使用ROLLUP:

SELECT department_id, job_id, SUM(salary)
FROM hr.employees 
WHERE department_id < 60
GROUP BY department_id, job_id
ORDER BY department_id, job_id;
DEPARTMENT_IDJOB_IDSUM(SALARY)
10 AD_ASST 4400
20 MK_MAN 13000
20 MK_REP 6000
30 PU_CLERK 13900
30 PU_MAN 11000
40 HR_REP 6500
50 SH_CLERK 64300
50 ST_CLERK 55700
50 ST_MAN 36400

9 rows selected.

注:此处为了对比添加了order by子句。

使用ROLLUP:

SELECT department_id, job_id, SUM(salary)
FROM employees 
WHERE department_id < 60
GROUP BY ROLLUP(department_id, job_id);
DEPARTMENT_IDJOB_IDSUM(SALARY)
10 AD_ASST 4400
10 - 4400
20 MK_MAN 13000
20 MK_REP 6000
20 - 19000
30 PU_MAN 11000
30 PU_CLERK 13900
30 - 24900
40 HR_REP 6500
40 - 6500
50 ST_MAN 36400
50 SH_CLERK 64300
50 ST_CLERK 55700
50 - 156400
- - 211200

15 rows selected.

注:这里说ROLLUP的顺序是从右向左,即从job_id开始,然后是department_id。结果集像Excel中的分类汇总,包含了不使用ROLLUP的结果集。

 

CUBE操作符

CUBE是对GROUP BY子句的扩展,产生类似于笛卡尔集的分组结果。

SELECT [column,] group_function(column)...
FROM table
[WHERE condition]
[GROUP BY [CUBE] group_by_expression]
[HAVING having_expression]
[ORDER BY column];

示例:

SELECT department_id, job_id, SUM(salary)
FROM hr.employees 
WHERE department_id < 60
GROUP BY CUBE (department_id, job_id) ;
DEPARTMENT_IDJOB_IDSUM(SALARY)
- - 211200
- HR_REP 6500
- MK_MAN 13000
- MK_REP 6000
- PU_MAN 11000
- ST_MAN 36400
- AD_ASST 4400
- PU_CLERK 13900
- SH_CLERK 64300
- ST_CLERK 55700
10 - 4400
10 AD_ASST 4400
20 - 19000
20 MK_MAN 13000
20 MK_REP 6000
30 - 24900
30 PU_MAN 11000
30 PU_CLERK 13900
40 - 6500
40 HR_REP 6500
50 - 156400
50 ST_MAN 36400
50 SH_CLERK 64300
50 ST_CLERK 55700

24 rows selected.

注:与不使用ROLLUP、使用ROLLUP对比,CUBE用department_id和job_id分别进行了分类汇总,包含了ROLLUP的结果集。

 

GROUPING函数

GROUPING函数可以和CUBE或ROLLUP结合使用
使用GROUPING函数,可以找到哪些列在该行中参加了分组,可以区分空值产生的原因
GROUPING函数返回0或1

SELECT [column,] group_function(column) ... ,
    GROUPING(expr)
FROM table
[WHERE condition]
[GROUP BY [ROLLUP][CUBE] group_by_expression]
[HAVING having_expression]
[ORDER BY column];

示例1:

SELECT department_id DEPTID, job_id JOB, SUM(salary), 
    GROUPING(department_id) GRP_DEPT,
    GROUPING(job_id) GRP_JOB
FROM hr.employees
WHERE department_id < 50
GROUP BY ROLLUP(department_id, job_id);
DEPTIDJOBSUM(SALARY)GRP_DEPTGRP_JOB
10 AD_ASST 4400 0 0
10 - 4400 0 1
20 MK_MAN 13000 0 0
20 MK_REP 6000 0 0
20 - 19000 0 1
30 PU_MAN 11000 0 0
30 PU_CLERK 13900 0 0
30 - 24900 0 1
40 HR_REP 6500 0 0
40 - 6500 0 1
- - 54800 1 1

11 rows selected.

注:可以看到GROUPING函数结果为1的是汇总行。

示例2:

SELECT department_id DEPTID, job_id JOB, SUM(salary), 
    GROUPING(department_id) GRP_DEPT,
    GROUPING(job_id) GRP_JOB
FROM hr.employees
WHERE department_id < 50
GROUP BY CUBE(department_id, job_id);
DEPTIDJOBSUM(SALARY)GRP_DEPTGRP_JOB
- - 54800 1 1
- HR_REP 6500 1 0
- MK_MAN 13000 1 0
- MK_REP 6000 1 0
- PU_MAN 11000 1 0
- AD_ASST 4400 1 0
- PU_CLERK 13900 1 0
10 - 4400 0 1
10 AD_ASST 4400 0 0
20 - 19000 0 1
20 MK_MAN 13000 0 0
20 MK_REP 6000 0 0
30 - 24900 0 1
30 PU_MAN 11000 0 0
30 PU_CLERK 13900 0 0
40 - 6500 0 1
40 HR_REP 6500 0 0

17 rows selected.

 

GROUPING SETS

GROUPING SETS是对GROUP BY子句的进一步扩充。

使用GROUPING SETS在同一个查询中定义多个分组集,Oracle对GROUPING SETS子句指定的分组集进行分组后,用UNION ALL操作将各分组结果结合起来。

优点:
1)只进行一次分组即可;
2)不必书写复杂的UNION语句;
3)GROUPING SETS中包含的分组项越多,性能越好。

SELECT department_id, job_id, manager_id, avg(salary)
FROM hr.employees
GROUP BY GROUPING SETS ((department_id, job_id), (job_id, manager_id));
DEPARTMENT_IDJOB_IDMANAGER_IDAVG(SALARY)
90 AD_VP - 17000
100 FI_MGR - 12008
80 SA_REP - 8396.551724137931034482758620689655172414
- SA_REP - 7000
90 AD_PRES - 24000
20 MK_REP - 6000
110 AC_MGR - 12008
60 IT_PROG - 5760
30 PU_CLERK - 2780
80 SA_MAN - 12200
50 SH_CLERK - 3215
20 MK_MAN - 13000
30 PU_MAN - 11000
50 ST_CLERK - 2785
70 PR_REP - 10000
110 AC_ACCOUNT - 8300
50 ST_MAN - 7280
100 FI_ACCOUNT - 7920
10 AD_ASST - 4400
40 HR_REP - 6500
- ST_CLERK 124 2925
- SA_MAN 100 12200
- AC_MGR 101 12008
- FI_ACCOUNT 108 7920
- MK_REP 201 6000
- PR_REP 101 10000
- AD_VP 100 17000
- PU_MAN 100 11000
- AD_ASST 101 4400
- HR_REP 101 6500
- PU_CLERK 114 2780
- SH_CLERK 120 2900
- ST_CLERK 123 3000
- SH_CLERK 122 3200
- AC_ACCOUNT 205 8300
- ST_CLERK 120 2625
- ST_CLERK 121 2675
- SA_REP 145 8500
- SA_REP 147 7766.666666666666666666666666666666666667
- SA_REP 148 8650
- SH_CLERK 121 3675
- SH_CLERK 124 2825
- IT_PROG 103 4950
- FI_MGR 101 12008
- ST_MAN 100 7280
- SH_CLERK 123 3475
- AD_PRES - 24000
- IT_PROG 102 9000
- ST_CLERK 122 2700
- SA_REP 146 8500

Rows 1 - 50. More rows exist.

注:可以看到前面一部分是按照department_id和job_id分组,后面是按照job_id和manager_id分组。

 

复合列

复合列是被作为整体处理的一组列的集合

ROLLUP(a,(b,c),d);

注:这里b列和c列形成复合列;
使用括号将若干列组成复合列,在ROLLUP或CUBE中作为整体进行操作,可以避免复合列产生不必要的分组结果。

示例:

SELECT department_id, job_id, manager_id, SUM(salary)
FROM hr.employees
GROUP BY ROLLUP(department_id, (job_id, manager_id) );
DEPARTMENT_IDJOB_IDMANAGER_IDSUM(SALARY)
- SA_REP 149 7000
- - - 7000
10 AD_ASST 101 4400
10 - - 4400
20 MK_MAN 100 13000
20 MK_REP 201 6000
20 - - 19000
30 PU_MAN 100 11000
30 PU_CLERK 114 13900
30 - - 24900
40 HR_REP 101 6500
40 - - 6500
50 ST_MAN 100 36400
50 SH_CLERK 120 11600
50 SH_CLERK 121 14700
50 SH_CLERK 122 12800
50 SH_CLERK 123 13900
50 SH_CLERK 124 11300
50 ST_CLERK 120 10500
50 ST_CLERK 121 10700
50 ST_CLERK 122 10800
50 ST_CLERK 123 12000
50 ST_CLERK 124 11700
50 - - 156400
60 IT_PROG 102 9000
60 IT_PROG 103 19800
60 - - 28800
70 PR_REP 101 10000
70 - - 10000
80 SA_MAN 100 61000
80 SA_REP 145 51000
80 SA_REP 146 51000
80 SA_REP 147 46600
80 SA_REP 148 51900
80 SA_REP 149 43000
80 - - 304500
90 AD_VP 100 34000
90 AD_PRES - 24000
90 - - 58000
100 FI_MGR 101 12008
100 FI_ACCOUNT 108 39600
100 - - 51608
110 AC_MGR 101 12008
110 AC_ACCOUNT 205 8300
110 - - 20308
- - - 691416

46 rows selected.

注:结果集不将job_id和manager_id分别汇总。

 

连接分组集

连接分组集可以产生有用的对分组项的结合:
将各分组集、ROLLUP和CUBE用逗号连接,Oracle自动在GROUP BY子句中将各分组集进行连接;
连接的结果是对各分组生成笛卡尔集。

GROUP BY GROUPING SETS(a, b), GROUPING SETS(c, d)

示例:

SELECT department_id, job_id, manager_id, SUM(salary)
FROM hr.employees
GROUP BY department_id, ROLLUP(job_id), CUBE(manager_id)
ORDER BY department_id, job_id, manager_id ;
DEPARTMENT_IDJOB_IDMANAGER_IDSUM(SALARY)
10 AD_ASST 101 4400
10 AD_ASST - 4400
10 - 101 4400
10 - - 4400
20 MK_MAN 100 13000
20 MK_MAN - 13000
20 MK_REP 201 6000
20 MK_REP - 6000
20 - 100 13000
20 - 201 6000
20 - - 19000
30 PU_CLERK 114 13900
30 PU_CLERK - 13900
30 PU_MAN 100 11000
30 PU_MAN - 11000
30 - 100 11000
30 - 114 13900
30 - - 24900
40 HR_REP 101 6500
40 HR_REP - 6500
40 - 101 6500
40 - - 6500
50 SH_CLERK 120 11600
50 SH_CLERK 121 14700
50 SH_CLERK 122 12800
50 SH_CLERK 123 13900
50 SH_CLERK 124 11300
50 SH_CLERK - 64300
50 ST_CLERK 120 10500
50 ST_CLERK 121 10700
50 ST_CLERK 122 10800
50 ST_CLERK 123 12000
50 ST_CLERK 124 11700
50 ST_CLERK - 55700
50 ST_MAN 100 36400
50 ST_MAN - 36400
50 - 100 36400
50 - 120 22100
50 - 121 25400
50 - 122 23600
50 - 123 25900
50 - 124 23000
50 - - 156400
60 IT_PROG 102 9000
60 IT_PROG 103 19800
60 IT_PROG - 28800
60 - 102 9000
60 - 103 19800
60 - - 28800
70 PR_REP 101 10000

Rows 1 - 50. More rows exist.

 

总结:

本节介绍了分组Group By子句的各种扩展:

1)ROLLUP:产生n+1个结果(增加了一行汇总);

2) CUBE:产生类似笛卡尔集的结果;

3) GROUPING:与ROLLUP和CUBE配合使用,处理他们产生的空值行;

4) GROUPING SETS:创建分组集;

5) 在Group By子句中组合分组:复合列(用括号将若干列组合起来,避免不想要的分组)、连接分组集(将使用ROLLUP、CUBE、Grouping Sets结合起来进行分组)。

 

posted @ 2020-07-24 09:43  workingdiary  阅读(121)  评论(0)    收藏  举报