postgres —— 窗口函数入门

 注:测试数据在 postgres —— 分组集与部分聚集 中

聚集将多行转变成较少、聚集的行。而窗口则不同,它把当前行与分组中的所有行对比,并且返回的行数没有变化。

组合当前行与 production 的均值

SELECT country, year, production,comsumption, 
avg(production) over() 
from t_oil;

  

1.分组

AVG(production) 计算的是每年(year)所有国家的平均石油产量(production

用途:对比各国产量与年度平均水平的差异

SELECT country, year, production,comsumption, 
avg(production) over(partition by year) 
from t_oil;

AVG(production) 计算的是当前行所属时间段的(2000年前或2000年后)的平均石油产量(production

用途:对比各国产量与2000年内(后)的平均水平的差异

SELECT country, year, production,comsumption, 
avg(production) over(partition by year > 2000) 
from t_oil;

  

2.排序

MIN(production) 计算的是 ​从分组开始到当前行的逐年累计最小值​(滚动最小值)。

例如:伊朗1978-1980年的产量为 [100, 90, 110],则结果中的min(production)列会依次为 100, 90, 90

特点:每行的min(production)是该国家 ​截至当前年份的历史最低产量

SELECT country, year, production, 
min(production) over (partition by country order by year)
from t_oil
where year between 1978 and 1983
and country in ('Iran', 'Oman')

 

 注:可以看到,随着行数增加,最小值一直在改变。比如,到第5行为止时,最小值为 1321。

 

3.滑动窗口

① 移动窗口应与 order by 一起使用。此窗口的范围是当前行,当前行的前一行,当前行的下一行

SELECT country, year, production,
min(production) over 
(PARTITION by country ORDER BY year ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
from t_oil
WHERE year BETWEEN 1978 AND 1983
AND country IN ('Iran', 'Oman');

 

 

② 窗口的范围与上面一样。为上一行,当前行与下一行。注:array_agg 的作用是将范围内的该字段值转变成一个 postgresql 数组

SELECT *, array_agg(id) OVER 
(ORDER BY id ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
FROM generate_series(1, 5) as id;

 

 

③ 此窗口的范围是:当前行与当前行之前的行

SELECT *, array_agg(id) OVER 
(ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND 0 FOLLOWING)
FROM generate_series(1, 5) as id;

 

 

④ 此窗口的范围是:当前行的前两行,当前行以及当前行之后的行

SELECT *, array_agg(id) OVER 
(ORDER BY id ROWS BETWEEN 2 PRECEDING AND UNBOUNDED FOLLOWING)
FROM generate_series(1, 5) as id;

 

 

4.提取窗口子句

组合当前行,按年份排序后 到当前行为止 最小与最大的产量

SELECT country, year, production, min(production) OVER (w), max(production) OVER (w)
FROM t_oil
WHERE country = 'Canada' AND year BETWEEN 1980 AND 1985
WINDOW w AS (ORDER BY year);

 

 5.使用内建窗口函数

四大排名函数

-- rank 
SELECT year, production, rank() over (order by production)
from t_oil
where country = 'Other Middle East'
order by rank
limit 7;

-- dense_rank
SELECT year, production, dense_rank() over (order by production)
from t_oil
where country = 'Other Middle East'
order by dense_rank
limit 7;

-- ntile
SELECT year, production, ntile(4) OVER (order by production)
from t_oil
where country = 'Iraq'
and year between 2000 and 2006;

-- row_number
select country, production, row_number() over (order by production)
from t_oil
limit 3;

  

233

 

posted on 2019-12-17 23:41  Lemo_wd  阅读(678)  评论(0)    收藏  举报

导航