[Oracle数据库学习]十七、对Group By子句的扩展
D12
组函数
组函数处理多行,返回一行。
SELECT [column,] group_function(column). . . FROM table [WHERE condition] [GROUP BY group_by_expression] [ORDER BY column];
示例:
SELECT AVG(salary), STDDEV(salary), COUNT(commission_pct), MAX(hire_date) FROM hr.employees WHERE job_id LIKE 'SA%';
| AVG(SALARY) | STDDEV(SALARY) | COUNT(COMMISSION_PCT) | MAX(HIRE_DATE) |
|---|---|---|---|
| 8900 | 2030.6475350894122813255591942051990066 | 35 | 21-APR-08 |
Group By子句
SELECT [column,] FROM table [WHERE condition] [GROUP BY group_by_expression] [ORDER BY column];
示例:
SELECT department_id, job_id, SUM(salary), COUNT(employee_id) FROM employees GROUP BY department_id, job_id ;
| DEPARTMENT_ID | JOB_ID | SUM(SALARY) | COUNT(EMPLOYEE_ID) |
|---|---|---|---|
| 90 | AD_VP | 34000 | 2 |
| 100 | FI_MGR | 12008 | 1 |
| 80 | SA_REP | 243500 | 29 |
| - | SA_REP | 7000 | 1 |
| 90 | AD_PRES | 24000 | 1 |
| 20 | MK_REP | 6000 | 1 |
| 110 | AC_MGR | 12008 | 1 |
| 60 | IT_PROG | 28800 | 5 |
| 30 | PU_CLERK | 13900 | 5 |
| 80 | SA_MAN | 61000 | 5 |
| 50 | SH_CLERK | 64300 | 20 |
| 20 | MK_MAN | 13000 | 1 |
| 30 | PU_MAN | 11000 | 1 |
| 50 | ST_CLERK | 55700 | 20 |
| 70 | PR_REP | 10000 | 1 |
| 110 | AC_ACCOUNT | 8300 | 1 |
| 50 | ST_MAN | 36400 | 5 |
| 100 | FI_ACCOUNT | 39600 | 5 |
| 10 | AD_ASST | 4400 | 1 |
| 40 | HR_REP | 6500 | 1 |
HAVING子句
使用组函数对组函数进行限制,对查询进行二次限制。
SELECT [column,] group_function(column)... FROM table [WHERE condition] [GROUP BY group_by_expression] [HAVING having_expression] [ORDER BYcolumn];
示例:
SELECT department_id, job_id, SUM(salary), COUNT(employee_id) FROM hr.employees GROUP BY department_id, job_id HAVING SUM(salary) > 50000;
| DEPARTMENT_ID | JOB_ID | SUM(SALARY) | COUNT(EMPLOYEE_ID) |
|---|---|---|---|
| 80 | SA_REP | 243500 | 29 |
| 80 | SA_MAN | 61000 | 5 |
| 50 | SH_CLERK | 64300 | 20 |
| 50 | ST_CLERK | 55700 | 20 |
带有ROLLUP和CUBE操作的GROUP BY子句
使用带有ROLLUP和CUBE操作的GROUP BY子句产生多种分组结果:
1)ROLLUP产生n+1种分组结果
2)CUBE产生2的n次方种分组结果
ROLLUP操作符
ROLLUP是对GROUP BY子句的扩展,产生n + 1种分组结果,顺序是从右向左
SELECT [column,] group_function(column). . . FROM table [WHERE condition] [GROUP BY [ROLLUP] group_by_expression] [HAVING having_expression] [ORDER BY column];
示例:
不使用ROLLUP:
SELECT department_id, job_id, SUM(salary) FROM hr.employees WHERE department_id < 60 GROUP BY department_id, job_id ORDER BY department_id, job_id;
| DEPARTMENT_ID | JOB_ID | SUM(SALARY) |
|---|---|---|
| 10 | AD_ASST | 4400 |
| 20 | MK_MAN | 13000 |
| 20 | MK_REP | 6000 |
| 30 | PU_CLERK | 13900 |
| 30 | PU_MAN | 11000 |
| 40 | HR_REP | 6500 |
| 50 | SH_CLERK | 64300 |
| 50 | ST_CLERK | 55700 |
| 50 | ST_MAN | 36400 |
9 rows selected.
注:此处为了对比添加了order by子句。
使用ROLLUP:
SELECT department_id, job_id, SUM(salary) FROM employees WHERE department_id < 60 GROUP BY ROLLUP(department_id, job_id);
| DEPARTMENT_ID | JOB_ID | SUM(SALARY) |
|---|---|---|
| 10 | AD_ASST | 4400 |
| 10 | - | 4400 |
| 20 | MK_MAN | 13000 |
| 20 | MK_REP | 6000 |
| 20 | - | 19000 |
| 30 | PU_MAN | 11000 |
| 30 | PU_CLERK | 13900 |
| 30 | - | 24900 |
| 40 | HR_REP | 6500 |
| 40 | - | 6500 |
| 50 | ST_MAN | 36400 |
| 50 | SH_CLERK | 64300 |
| 50 | ST_CLERK | 55700 |
| 50 | - | 156400 |
| - | - | 211200 |
15 rows selected.
注:这里说ROLLUP的顺序是从右向左,即从job_id开始,然后是department_id。结果集像Excel中的分类汇总,包含了不使用ROLLUP的结果集。
CUBE操作符
CUBE是对GROUP BY子句的扩展,产生类似于笛卡尔集的分组结果。
SELECT [column,] group_function(column)... FROM table [WHERE condition] [GROUP BY [CUBE] group_by_expression] [HAVING having_expression] [ORDER BY column];
示例:
SELECT department_id, job_id, SUM(salary) FROM hr.employees WHERE department_id < 60 GROUP BY CUBE (department_id, job_id) ;
| DEPARTMENT_ID | JOB_ID | SUM(SALARY) |
|---|---|---|
| - | - | 211200 |
| - | HR_REP | 6500 |
| - | MK_MAN | 13000 |
| - | MK_REP | 6000 |
| - | PU_MAN | 11000 |
| - | ST_MAN | 36400 |
| - | AD_ASST | 4400 |
| - | PU_CLERK | 13900 |
| - | SH_CLERK | 64300 |
| - | ST_CLERK | 55700 |
| 10 | - | 4400 |
| 10 | AD_ASST | 4400 |
| 20 | - | 19000 |
| 20 | MK_MAN | 13000 |
| 20 | MK_REP | 6000 |
| 30 | - | 24900 |
| 30 | PU_MAN | 11000 |
| 30 | PU_CLERK | 13900 |
| 40 | - | 6500 |
| 40 | HR_REP | 6500 |
| 50 | - | 156400 |
| 50 | ST_MAN | 36400 |
| 50 | SH_CLERK | 64300 |
| 50 | ST_CLERK | 55700 |
24 rows selected.
注:与不使用ROLLUP、使用ROLLUP对比,CUBE用department_id和job_id分别进行了分类汇总,包含了ROLLUP的结果集。
GROUPING函数
GROUPING函数可以和CUBE或ROLLUP结合使用
使用GROUPING函数,可以找到哪些列在该行中参加了分组,可以区分空值产生的原因
GROUPING函数返回0或1
SELECT [column,] group_function(column) ... , GROUPING(expr) FROM table [WHERE condition] [GROUP BY [ROLLUP][CUBE] group_by_expression] [HAVING having_expression] [ORDER BY column];
示例1:
SELECT department_id DEPTID, job_id JOB, SUM(salary), GROUPING(department_id) GRP_DEPT, GROUPING(job_id) GRP_JOB FROM hr.employees WHERE department_id < 50 GROUP BY ROLLUP(department_id, job_id);
| DEPTID | JOB | SUM(SALARY) | GRP_DEPT | GRP_JOB |
|---|---|---|---|---|
| 10 | AD_ASST | 4400 | 0 | 0 |
| 10 | - | 4400 | 0 | 1 |
| 20 | MK_MAN | 13000 | 0 | 0 |
| 20 | MK_REP | 6000 | 0 | 0 |
| 20 | - | 19000 | 0 | 1 |
| 30 | PU_MAN | 11000 | 0 | 0 |
| 30 | PU_CLERK | 13900 | 0 | 0 |
| 30 | - | 24900 | 0 | 1 |
| 40 | HR_REP | 6500 | 0 | 0 |
| 40 | - | 6500 | 0 | 1 |
| - | - | 54800 | 1 | 1 |
11 rows selected.
注:可以看到GROUPING函数结果为1的是汇总行。
示例2:
SELECT department_id DEPTID, job_id JOB, SUM(salary), GROUPING(department_id) GRP_DEPT, GROUPING(job_id) GRP_JOB FROM hr.employees WHERE department_id < 50 GROUP BY CUBE(department_id, job_id);
| DEPTID | JOB | SUM(SALARY) | GRP_DEPT | GRP_JOB |
|---|---|---|---|---|
| - | - | 54800 | 1 | 1 |
| - | HR_REP | 6500 | 1 | 0 |
| - | MK_MAN | 13000 | 1 | 0 |
| - | MK_REP | 6000 | 1 | 0 |
| - | PU_MAN | 11000 | 1 | 0 |
| - | AD_ASST | 4400 | 1 | 0 |
| - | PU_CLERK | 13900 | 1 | 0 |
| 10 | - | 4400 | 0 | 1 |
| 10 | AD_ASST | 4400 | 0 | 0 |
| 20 | - | 19000 | 0 | 1 |
| 20 | MK_MAN | 13000 | 0 | 0 |
| 20 | MK_REP | 6000 | 0 | 0 |
| 30 | - | 24900 | 0 | 1 |
| 30 | PU_MAN | 11000 | 0 | 0 |
| 30 | PU_CLERK | 13900 | 0 | 0 |
| 40 | - | 6500 | 0 | 1 |
| 40 | HR_REP | 6500 | 0 | 0 |
17 rows selected.
GROUPING SETS
GROUPING SETS是对GROUP BY子句的进一步扩充。
使用GROUPING SETS在同一个查询中定义多个分组集,Oracle对GROUPING SETS子句指定的分组集进行分组后,用UNION ALL操作将各分组结果结合起来。
优点:
1)只进行一次分组即可;
2)不必书写复杂的UNION语句;
3)GROUPING SETS中包含的分组项越多,性能越好。
SELECT department_id, job_id, manager_id, avg(salary) FROM hr.employees GROUP BY GROUPING SETS ((department_id, job_id), (job_id, manager_id));
| DEPARTMENT_ID | JOB_ID | MANAGER_ID | AVG(SALARY) |
|---|---|---|---|
| 90 | AD_VP | - | 17000 |
| 100 | FI_MGR | - | 12008 |
| 80 | SA_REP | - | 8396.551724137931034482758620689655172414 |
| - | SA_REP | - | 7000 |
| 90 | AD_PRES | - | 24000 |
| 20 | MK_REP | - | 6000 |
| 110 | AC_MGR | - | 12008 |
| 60 | IT_PROG | - | 5760 |
| 30 | PU_CLERK | - | 2780 |
| 80 | SA_MAN | - | 12200 |
| 50 | SH_CLERK | - | 3215 |
| 20 | MK_MAN | - | 13000 |
| 30 | PU_MAN | - | 11000 |
| 50 | ST_CLERK | - | 2785 |
| 70 | PR_REP | - | 10000 |
| 110 | AC_ACCOUNT | - | 8300 |
| 50 | ST_MAN | - | 7280 |
| 100 | FI_ACCOUNT | - | 7920 |
| 10 | AD_ASST | - | 4400 |
| 40 | HR_REP | - | 6500 |
| - | ST_CLERK | 124 | 2925 |
| - | SA_MAN | 100 | 12200 |
| - | AC_MGR | 101 | 12008 |
| - | FI_ACCOUNT | 108 | 7920 |
| - | MK_REP | 201 | 6000 |
| - | PR_REP | 101 | 10000 |
| - | AD_VP | 100 | 17000 |
| - | PU_MAN | 100 | 11000 |
| - | AD_ASST | 101 | 4400 |
| - | HR_REP | 101 | 6500 |
| - | PU_CLERK | 114 | 2780 |
| - | SH_CLERK | 120 | 2900 |
| - | ST_CLERK | 123 | 3000 |
| - | SH_CLERK | 122 | 3200 |
| - | AC_ACCOUNT | 205 | 8300 |
| - | ST_CLERK | 120 | 2625 |
| - | ST_CLERK | 121 | 2675 |
| - | SA_REP | 145 | 8500 |
| - | SA_REP | 147 | 7766.666666666666666666666666666666666667 |
| - | SA_REP | 148 | 8650 |
| - | SH_CLERK | 121 | 3675 |
| - | SH_CLERK | 124 | 2825 |
| - | IT_PROG | 103 | 4950 |
| - | FI_MGR | 101 | 12008 |
| - | ST_MAN | 100 | 7280 |
| - | SH_CLERK | 123 | 3475 |
| - | AD_PRES | - | 24000 |
| - | IT_PROG | 102 | 9000 |
| - | ST_CLERK | 122 | 2700 |
| - | SA_REP | 146 | 8500 |
Rows 1 - 50. More rows exist.
注:可以看到前面一部分是按照department_id和job_id分组,后面是按照job_id和manager_id分组。
复合列
复合列是被作为整体处理的一组列的集合
ROLLUP(a,(b,c),d);
注:这里b列和c列形成复合列;
使用括号将若干列组成复合列,在ROLLUP或CUBE中作为整体进行操作,可以避免复合列产生不必要的分组结果。
示例:
SELECT department_id, job_id, manager_id, SUM(salary) FROM hr.employees GROUP BY ROLLUP(department_id, (job_id, manager_id) );
| DEPARTMENT_ID | JOB_ID | MANAGER_ID | SUM(SALARY) |
|---|---|---|---|
| - | SA_REP | 149 | 7000 |
| - | - | - | 7000 |
| 10 | AD_ASST | 101 | 4400 |
| 10 | - | - | 4400 |
| 20 | MK_MAN | 100 | 13000 |
| 20 | MK_REP | 201 | 6000 |
| 20 | - | - | 19000 |
| 30 | PU_MAN | 100 | 11000 |
| 30 | PU_CLERK | 114 | 13900 |
| 30 | - | - | 24900 |
| 40 | HR_REP | 101 | 6500 |
| 40 | - | - | 6500 |
| 50 | ST_MAN | 100 | 36400 |
| 50 | SH_CLERK | 120 | 11600 |
| 50 | SH_CLERK | 121 | 14700 |
| 50 | SH_CLERK | 122 | 12800 |
| 50 | SH_CLERK | 123 | 13900 |
| 50 | SH_CLERK | 124 | 11300 |
| 50 | ST_CLERK | 120 | 10500 |
| 50 | ST_CLERK | 121 | 10700 |
| 50 | ST_CLERK | 122 | 10800 |
| 50 | ST_CLERK | 123 | 12000 |
| 50 | ST_CLERK | 124 | 11700 |
| 50 | - | - | 156400 |
| 60 | IT_PROG | 102 | 9000 |
| 60 | IT_PROG | 103 | 19800 |
| 60 | - | - | 28800 |
| 70 | PR_REP | 101 | 10000 |
| 70 | - | - | 10000 |
| 80 | SA_MAN | 100 | 61000 |
| 80 | SA_REP | 145 | 51000 |
| 80 | SA_REP | 146 | 51000 |
| 80 | SA_REP | 147 | 46600 |
| 80 | SA_REP | 148 | 51900 |
| 80 | SA_REP | 149 | 43000 |
| 80 | - | - | 304500 |
| 90 | AD_VP | 100 | 34000 |
| 90 | AD_PRES | - | 24000 |
| 90 | - | - | 58000 |
| 100 | FI_MGR | 101 | 12008 |
| 100 | FI_ACCOUNT | 108 | 39600 |
| 100 | - | - | 51608 |
| 110 | AC_MGR | 101 | 12008 |
| 110 | AC_ACCOUNT | 205 | 8300 |
| 110 | - | - | 20308 |
| - | - | - | 691416 |
46 rows selected.
注:结果集不将job_id和manager_id分别汇总。
连接分组集
连接分组集可以产生有用的对分组项的结合:
将各分组集、ROLLUP和CUBE用逗号连接,Oracle自动在GROUP BY子句中将各分组集进行连接;
连接的结果是对各分组生成笛卡尔集。
GROUP BY GROUPING SETS(a, b), GROUPING SETS(c, d)
示例:
SELECT department_id, job_id, manager_id, SUM(salary) FROM hr.employees GROUP BY department_id, ROLLUP(job_id), CUBE(manager_id) ORDER BY department_id, job_id, manager_id ;
| DEPARTMENT_ID | JOB_ID | MANAGER_ID | SUM(SALARY) |
|---|---|---|---|
| 10 | AD_ASST | 101 | 4400 |
| 10 | AD_ASST | - | 4400 |
| 10 | - | 101 | 4400 |
| 10 | - | - | 4400 |
| 20 | MK_MAN | 100 | 13000 |
| 20 | MK_MAN | - | 13000 |
| 20 | MK_REP | 201 | 6000 |
| 20 | MK_REP | - | 6000 |
| 20 | - | 100 | 13000 |
| 20 | - | 201 | 6000 |
| 20 | - | - | 19000 |
| 30 | PU_CLERK | 114 | 13900 |
| 30 | PU_CLERK | - | 13900 |
| 30 | PU_MAN | 100 | 11000 |
| 30 | PU_MAN | - | 11000 |
| 30 | - | 100 | 11000 |
| 30 | - | 114 | 13900 |
| 30 | - | - | 24900 |
| 40 | HR_REP | 101 | 6500 |
| 40 | HR_REP | - | 6500 |
| 40 | - | 101 | 6500 |
| 40 | - | - | 6500 |
| 50 | SH_CLERK | 120 | 11600 |
| 50 | SH_CLERK | 121 | 14700 |
| 50 | SH_CLERK | 122 | 12800 |
| 50 | SH_CLERK | 123 | 13900 |
| 50 | SH_CLERK | 124 | 11300 |
| 50 | SH_CLERK | - | 64300 |
| 50 | ST_CLERK | 120 | 10500 |
| 50 | ST_CLERK | 121 | 10700 |
| 50 | ST_CLERK | 122 | 10800 |
| 50 | ST_CLERK | 123 | 12000 |
| 50 | ST_CLERK | 124 | 11700 |
| 50 | ST_CLERK | - | 55700 |
| 50 | ST_MAN | 100 | 36400 |
| 50 | ST_MAN | - | 36400 |
| 50 | - | 100 | 36400 |
| 50 | - | 120 | 22100 |
| 50 | - | 121 | 25400 |
| 50 | - | 122 | 23600 |
| 50 | - | 123 | 25900 |
| 50 | - | 124 | 23000 |
| 50 | - | - | 156400 |
| 60 | IT_PROG | 102 | 9000 |
| 60 | IT_PROG | 103 | 19800 |
| 60 | IT_PROG | - | 28800 |
| 60 | - | 102 | 9000 |
| 60 | - | 103 | 19800 |
| 60 | - | - | 28800 |
| 70 | PR_REP | 101 | 10000 |
Rows 1 - 50. More rows exist.
总结:
本节介绍了分组Group By子句的各种扩展:
1)ROLLUP:产生n+1个结果(增加了一行汇总);
2) CUBE:产生类似笛卡尔集的结果;
3) GROUPING:与ROLLUP和CUBE配合使用,处理他们产生的空值行;
4) GROUPING SETS:创建分组集;
5) 在Group By子句中组合分组:复合列(用括号将若干列组合起来,避免不想要的分组)、连接分组集(将使用ROLLUP、CUBE、Grouping Sets结合起来进行分组)。

浙公网安备 33010602011771号