Hive经典案例:求出每个用户到当月为止的最大当月访问次数和累积到当月的总访问次数
一、数据准备
有如下数据:用户名,月份,访问次数
A,2015-01,5
A,2015-01,15
B,2015-01,5
A,2015-01,8
B,2015-01,25
A,2015-01,5
A,2015-02,4
A,2015-02,6
B,2015-02,10
B,2015-02,5
A,2015-03,16
A,2015-03,22
B,2015-03,23
B,2015-03,10
B,2015-03,1
二、需求分析
1、创建表
create table requirement( name string, month string, num int ) row format delimited fields terminated by ',';
2、上传数据
load data local inpath '/usr/mydir/data/requirement.txt' into table requirement;
3、验证数据是否正确
select * from requirement;
4、查询数据
SELECT name, month, month_sum, MAX(month_sum) over(partition by name order by month) max, SUM(month_sum) over(partition by name order by month) sum from( SELECT name,month,SUM(num) as month_sum from requirement group by name,month ) t1;
三、结果