多维分析-hive
官网链接 : https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup 案例链接 : https://blog.csdn.net/qq_29232943/article/details/106505717 -- grouping sets 作用: 一个select语句中,设置多种维度组合 要求: 在group by 后使用,且维度组合必须为group by组合 示例: group by a, b grouping sets( 维度组合 ) 相当于 group by a, b 维度组合: (a,b) 相当于 group by a, b a 相当于 select a,NULL group by a b 相当于 select NULL,b group by b () 相当于 select NULL,NULL 示例: select channel,id,sum(pv) from t1 group by channel,id union all select channel,NULL,sum(pv) from t1 group by channel union all select NULL,id,sum(pv) from t1 group by id union all select NULL,NULL,sum(pv) from t1 channel id pv oppo 005 1 oppo 006 2 oppo 007 3 oppo 008 4 vivo 001 8 vivo 002 7 vivo 003 6 NULL 001 8 NULL 002 7 NULL 003 6 NULL 005 1 NULL 006 2 NULL 007 3 NULL 008 4 oppo NULL 10 vivo NULL 21 NULL NULL 31 -- select channel,id,sum(pv) from t1 group by channel,id grouping sets ( channel,id,(channel,id),() ) channel id pv NULL NULL 31 NULL 001 8 NULL 002 7 NULL 003 6 NULL 005 1 NULL 006 2 NULL 007 3 NULL 008 4 oppo NULL 10 oppo 005 1 oppo 006 2 oppo 007 3 oppo 008 4 vivo NULL 21 vivo 001 8 vivo 002 7 vivo 003 6 2.grouping__id 作用: 判断维度是否参与计算,区分是维度内容为NULL还是多维设置为NULL 返回值:此函数返回一个与每列是否存在相对应的位向量(10进制数表示) 对应维度 参与为1 不参与为0 with t1 as ( select 'vivo' as channel ,'001' as id,'1983' as iphone,8 as pv ) select channel,id,iphone,grouping__id,sum(pv) from t1 group by channel,id,iphone grouping sets ( channel,id,iphone,(channel,id,iphone),() ) channel id iphone grouping__id _c4 NULL NULL NULL 0 8 NULL NULL 1983 4 8 NULL 001 NULL 2 8 vivo NULL NULL 1 8 vivo 001 1983 7 8 1 2 4 8 3. with rollup(上卷) 作用: GROUP BY a, b, c, WITH ROLLUP 相当于 GROUP BY a, b, c GROUPING SETS ( (a, b, c), (a, b), (a), ( )) 示例: with t1 as ( select '2021' as year ,'202109' as month,'20210909' as day,8 as pv ) select year,month,day,grouping__id,sum(pv) as total from t1 group by year,month,day with rollup year month day grouping__id total NULL NULL NULL 0 8 2021 NULL NULL 1 8 2021 202109 NULL 3 8 2021 202109 20210909 7 8 4. with cube(多维) 作用: 计算出所有的维度组合 GROUP BY a, b, c WITH CUBE is equivalent to GROUP BY a, b, c GROUPING SETS ( (a, b, c), (a, b), (b, c), (a, c), (a), (b), (c), ( )) 维度组合数: 2的x次幂, x = 维度个数 示例: with t1 as ( select '2021' as year ,'202109' as month,'20210909' as day,8 as pv ) select year,month,day,grouping__id,sum(pv) as total from t1 group by year,month,day with cube year month day grouping__id total NULL NULL NULL 0 8 NULL NULL 20210909 4 8 NULL 202109 NULL 2 8 NULL 202109 20210909 6 8 2021 NULL NULL 1 8 2021 NULL 20210909 5 8 2021 202109 NULL 3 8 2021 202109 20210909 7 8