OVER窗口函数

OVER简介:
  OVER窗口(OVER Window)是传统数据库的标准开窗,不同于Group By Window,OVER窗口中每1个元素都对应1个窗口,每个元素触发一次计算。OVER窗口可以按照实际元素的行或实际的元素值(时间戳值,支持事件时间(eventtime)和处理时间(proctime))确定窗口,因此流数据元素可能分布在多个窗口中。
语法:
select sum(amount) OVER (definition) AS amount,
avg(age) OVER (definition) AS age from table;
函数类型:
1、ROWS OVER Window
释义:每1行元素都被视为新的计算行,即每1行都是一个新的窗口;分为Unbounded(无界流)和Bounded(有界流)。
语法:
SELECT
   agg1(col1) OVER(
  [PARTITION BY (value_expression1,..., value_expressionN)]
  ORDER BY timeCol
  ROWS BETWEEN (UNBOUNDED | rowCount) PRECEDING AND CURRENT ROW) AS colName, ...
FROM Tab1;
例子:
统计当前商品上架之前同类的3个商品的最高价格(有界流)
CREATE TEMPORARY TABLE tmall_item(
   itemID VARCHAR,
   itemType VARCHAR,
   eventtime varchar,
   onSellTime AS TO_TIMESTAMP(eventtime),
   price DOUBLE, WATERMARK FOR onSellTime AS onSellTime - INTERVAL '0' SECOND --为Rowtime定义Watermark。
) WITH ('connector' = 'sls', ...);
SELECT itemID, itemType, onSellTime, price,
MAX(price) OVER ( PARTITION BY itemType ORDER BY onSellTime ROWS BETWEEN 2 preceding AND CURRENT ROW)
AS maxPrice
FROM tmall_item;
2、RANGE OVER Window
释义:具有相同时间值的所有元素行视为同一计算行,即具有相同时间值的所有行都是同一个窗口。
语法:
SELECT
  agg1(col1) OVER(
  [PARTITION BY (value_expression1,..., value_expressionN)]
  ORDER BY timeCol
  RANGE
  BETWEEN (UNBOUNDED | timeInterval) PRECEDING AND CURRENT ROW) AS colName,...
FROM Tab1;
例子:  
统计当前商品上架时间早2分钟的同类商品中的最高价格

CREATE TEMPORARY TABLE tmall_item(
  itemID VARCHAR,
  itemType VARCHAR,
  eventtime varchar,
  onSellTime AS TO_TIMESTAMP(eventtime),
  price DOUBLE,
  WATERMARK FOR onSellTime AS onSellTime - INTERVAL '0' SECOND --为Rowtime定义Watermark。
) WITH (
  'connector' = 'sls',
  ...
);
SELECT itemID, itemType, onSellTime, price,
MAX(price) OVER (
PARTITION BY itemType
ORDER BY onSellTime
RANGE BETWEEN INTERVAL '2' MINUTE preceding AND CURRENT ROW) AS maxPrice
FROM tmall_item;

posted @ 2023-08-29 11:04  梦呓空间  阅读(37)  评论(0编辑  收藏  举报