HIVE 开窗函数


first_value 用法:

select
id, money, first_value(money) over(partition by id order by money) from winfunc;
id
money first_value_window_0 1001 100 100 1001 150 100 1001 150 100 1001 200 100 1002 50 50 1002 100 50 1002 200 50 1002 200 50 1002 300 50 1002 400 50 1003 50 50 1003 400 50 1004 60 60

SUM 开窗函数用法:
select id, money, sum(money) over (partition by id order by money) from winfunc;  缺window frame子句,range between unbounded precending and current row是默认值,因此本查询语句等于👆的查询语句的结果。

== select id, money, sum(money) over (partition by id order by money RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) from winfunc;
#中间略 id money sum_window_0 1001 100 100 1001 150 400 --- ⚠️ 相同的值,被同时sum了。即第2行150,第3行150的第三列都是400,100+150+150. 1001 150 400 1001 200 600 1002 50 50 1002 100 150 1002 200 550 1002 200 550 1002 300 850 1002 400 1250 1003 50 50 1003 400 450 1004 60 60
如果改成用rows取代range,  结果不同:

⚠️用rows不会考虑相同的值,此时的sum相当于累加计算。


select id, money, sum(money) over (partition by id order by money rows BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)from winfunc;

id    money    sum_window_0
    NULL    NULL
1001    100    100
1001    150    250
1001    150    400
1001    200    600
1002    50    50
1002    100    150
1002    200    350
1002    200    550
1002    300    850
1002    400    1250
1003    50    50
1003    400    450
1004    60    60
 

看mysql文档关于rows和range:


  • ROWS: The frame is defined by beginning and ending row positions. Offsets are differences in row numbers from the current row number.框架范围是从开始到行结束位置。考虑的是行号。

  • RANGE: The frame is defined by rows within a value range. Offsets are differences in row values from the current row value.范围是关于值的范围。考虑的是这行的取值。如果当前行的上下行的值和当前行的相同,那么会被算作范围frame之内。


lead(列名,offset值, [default]) 用法:返回当前行的下面的指定行数的值。


select id, money,lead(money,2) over (partition by id order by money) from winfunc;
id    money    lead_window_0
NULL    NULL
1001    100    150
1001    150    200
1001    150    NULL
1001    200    NULL
1002    50    200
1002    100    200
1002    200    300
1002    200    400
1002    300    NULL
1002    400    NULL
1003    50    NULL
1003    400    NULL
1004    60    NULL

lag(列名, offset, [default]) :返回前offset行的值,如果没有,则返回defalut参数。default可以自己定义。

MYSQL开窗函数:https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html#function_lag

 

posted on 2022-09-21 00:38  大鹏的鸿鹄之志  阅读(107)  评论(0编辑  收藏  举报