Hive窗口函数详细介绍2,rank(),dense_rank() ,row_number()
在hive中,有三种窗口函数,rank(),dense_rank() 和row_number() 可以在窗口内实现对数据的排序。现在主要介绍这三个窗口函数的区别
1.rank() :生成数据项在分组内的排名,排名相等时会在名次中留下空位。
2. dense_rank() :生成数据项在分组内的排名,排名相等不会在名次中留下空位。
3.row_number() : 从1开始,按照顺序,生成分组内记录的序列
下面通过一个实例展示它们之间的区别。
create table if not exists buy_info ( name string, buy_date string, buy_num int ) row format delimited fields terminated by '|'; select * from buy_info;
| liulei | 2015-04-11 | 5 |
| liulei | 2015-04-12 | 7 |
| liulei | 2015-04-13 | 3 |
| liulei | 2015-04-14 | 2 |
| liulei | 2015-04-15 | 4 |
| liulei | 2015-04-16 | 4 |
select name,buy_date, buy_num , ranK() over(partition by name order by buy_num desc) as rank1, dense_rank() over(partition by name order by buy_num desc) as rank2, row_number() over(partition by name order by buy_num desc) as rank3
from buy_info;
| name | buy_date | buy_num | rank1 | rank2 | rank3 |
| liulei | 2015-04-12 | 7 | 1 | 1 | 1 |
| liulei | 2015-04-11 | 5 | 2 | 2 | 2 |
| liulei | 2015-04-15 | 4 | 3 | 3 | 3 |
| liulei | 2015-04-16 | 4 | 3 | 3 | 4 |
| liulei | 2015-04-13 | 3 | 5 | 4 | 5 |
| liulei | 2015-04-14 | 2 | 6 | 5 | 6 |

浙公网安备 33010602011771号