第7章 透视、逆透视及分组集

7.1 透视

7.1.1 使用标准SQL进行透视转换

SELECT empid,
  SUM(CASE WHEN custid = 'A' THEN qty END) AS A,
  SUM(CASE WHEN custid = 'B' THEN qty END) AS B,
  SUM(CASE WHEN custid = 'C' THEN qty END) AS C,
  SUM(CASE WHEN custid = 'D' THEN qty END) AS D
FROM dbo.Orders
GROUP BY empid;

1、分组阶段:处理相关的分组或行元素。用GROUP BY子句实现,本例的分组元素为 GROUP BY empid。

2、扩展阶段:处理相关的扩展或列元素。为每个目标列指定case表达式来实现,本例中的扩展元素为custid。

3、聚合阶段:处理相关的聚合元素和聚合函数。为每个case表达式的结果应用相关的聚合函数,本例为SUM。当然,可根据不同的需求使用MAX、MIN和COUNT等。 

       据自己理解,将过程分析如下:

---------------------------------------
--按empid分组,此时只有empid 1列
---------------------------------------
SELECT empid
FROM dbo.Orders
GROUP BY empid;

----------------------------------------
--添加4个新的列,比如新列的值都取空
--从只有empid 1列,扩展成了5列
----------------------------------------
SELECT empid,
  '' AS A,
  '' AS B,
  '' AS C,
  '' AS D
FROM dbo.Orders
GROUP BY empid;

----------------------------------------
--下一步考虑的是在行列交叉处填什么值
--用case将符合条件行的qty全部筛出来
--用sum将这些筛出来的值求和
----------------------------------------
--先填充一列的值看看
SELECT empid,
  SUM(CASE WHEN custid = 'A' THEN qty END) AS A
FROM dbo.Orders
GROUP BY empid;
--再填充所有列 ...
View Code

7.1.2 使用 T-SQL PIVOT 运算符进行透视转换

SQL server2005引入了一个T-SQL 特有的运算符,和其他表运算符(如,JOIN)类似,PIVOT也是在查询FROM子句的上下文执行操作。

同样涉及三个阶段,但使用的是SQL Sever 自己的语法。

SELECT empid, A, B, C, D
FROM (SELECT empid, custid, qty
      FROM dbo.Orders) AS D
  PIVOT(SUM(qty) FOR custid IN(A, B, C, D)) AS P;

 

7.2 逆透视

7.2.1 使用标准SQL进行逆透视转换

SELECT *
FROM (SELECT empid, custid,
        CASE custid
          WHEN 'A' THEN A
          WHEN 'B' THEN B
          WHEN 'C' THEN C
          WHEN 'D' THEN D   
        END AS qty
      FROM dbo.EmpCustOrders
        CROSS JOIN (VALUES('A'),('B'),('C'),('D')) AS Custs(custid)) AS D
WHERE qty IS NOT NULL;

3个逻辑处理阶段:生成副本、提取元素、删除不相关的交叉:

 

---------------------------------------------------
--Step 1: 生成副本
--(1)使用cros join进行交叉连接
--(2)按values子句的格式来创建一个虚拟表
---------------------------------------------------
SELECT *
FROM dbo.EmpCustOrders
  CROSS JOIN (VALUES('A'),('B'),('C'),('D')) AS Custs(custid);
  
--
SELECT *
FROM dbo.EmpCustOrders
  CROSS JOIN (SELECT 'A' AS custid
              UNION ALL SELECT 'B'
              UNION ALL SELECT 'C'
              UNION ALL SELECT 'D') AS Custs;

----------------------------------------------------
--Step 2: 提取元素,生成一个数据列
----------------------------------------------------
SELECT empid, custid,
  CASE custid
    WHEN 'A' THEN A
    WHEN 'B' THEN B
    WHEN 'C' THEN C
    WHEN 'D' THEN D    
  END AS qty
FROM dbo.EmpCustOrders
  CROSS JOIN (VALUES('A'),('B'),('C'),('D')) AS Custs(custid);
  
------------------------------------------------------
-- Unpivot Step 3: 过滤掉不需要的值
------------------------------------------------------
SELECT *
FROM (SELECT empid, custid,
        CASE custid
          WHEN 'A' THEN A
          WHEN 'B' THEN B
          WHEN 'C' THEN C
          WHEN 'D' THEN D    
        END AS qty
      FROM dbo.EmpCustOrders
        CROSS JOIN (VALUES('A'),('B'),('C'),('D')) AS Custs(custid)) AS D
WHERE qty IS NOT NULL;
View Code

 7.2.2 使用 T-SQL 的 UNPIVOT 运算符进行逆透视转换

SELECT empid, custid, qty
FROM dbo.EmpCustOrders
  UNPIVOT(qty FOR custid IN(A, B, C, D)) AS U;

 

7.3 分组集

分组集就是分组(GROUP BY子句)使用的一组属性(或列名)。

如果将基于多个分组集的查询放在同一张表里呢?

SELECT empid, custid, SUM(qty) AS sumqty
FROM dbo.Orders
GROUP BY empid, custid

UNION ALL

SELECT empid, NULL, SUM(qty) AS sumqty
FROM dbo.Orders
GROUP BY empid

UNION ALL

SELECT NULL, custid, SUM(qty) AS sumqty
FROM dbo.Orders
GROUP BY custid

UNION ALL

SELECT NULL, NULL, SUM(qty) AS sumqty
FROM dbo.Orders;

 实现了,但代码长,执行效率低。应该用下面的方式(sqlserver2008新加入的,且是标准SQL):

7.3.1 GROUPING SETS 从属子句

SELECT empid, custid, SUM(qty) AS sumqty
FROM dbo.Orders
GROUP BY
  GROUPING SETS
  (
    (empid, custid),
    (empid),
    (custid),
    ()
  );

 7.3.2 CUBE从属子句

CUBE(a, b, c)与GROUPING SETS((a, b, c), (a,b), (a,c), (b, c), (a), (b), (c), ())等效。

在集合论中,由给定集合的所有子集构成的集合称为幂集(power set)。对于由给定元素构成的一个分组集(如,(a, b, c)),可以把CUBE从属子句看作是用于生成这个分组集的幂集。

SELECT empid, custid, SUM(qty) AS sumqty
FROM dbo.Orders
GROUP BY CUBE(empid, custid);

7.3.3 ROLLUP从属子句

ROLLUP认为输入成员之间存在一定的层次关系,从而生成让这种层次关系有意义的所有分组集。

ROLLUP(a, b, c)与GROUPING SETS((a, b, c), (a,b),  (a), ())等效。

SELECT 
  YEAR(orderdate) AS orderyear,
  MONTH(orderdate) AS ordermonth,
  DAY(orderdate) AS orderday,
  SUM(qty) AS sumqty
FROM dbo.Orders
GROUP BY ROLLUP(YEAR(orderdate), MONTH(orderdate), DAY(orderdate));

7.3.4 GROUPING 和 GROUPING_ID 函数

从总的结果中,判断具体的一条记录属于哪个分组集。

 

posted @ 2018-04-12 22:58  seaidler  阅读(190)  评论(0)    收藏  举报