Flink Window Trigger

What a Trigger does

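The English word "trigger" means to set something off. As a noun it is the trigger of a gun, so it also carries the sense of firing. In Flink, a window operation is always paired with the logic that processes the data in the window, namely the window function. The Trigger decides when that window function is executed.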

The Trigger abstract class

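Flink defines an abstract class, Trigger. Every trigger must extend it and implement its abstract methods: onElement(), onProcessingTime(), onEventTime() and clear(). Flink ships several commonly used trigger implementations, and users can also write custom triggers to fit their needs. Part of the Trigger class is shown below: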

public abstract class Trigger<T, W extends Window> implements Serializable {

	private static final long serialVersionUID = -4104633972991191369L;

	/**
	 * Called for every element that gets added to a pane. The result of this will determine
	 * whether the pane is evaluated to emit results.
	 *
	 * @param element The element that arrived.
	 * @param timestamp The timestamp of the element that arrived.
	 * @param window The window to which the element is being added.
	 * @param ctx A context object that can be used to register timer callbacks.
	 */
	public abstract TriggerResult onElement(T element, long timestamp, W window, TriggerContext ctx) throws Exception;

	/**
	 * Called when a processing-time timer that was set using the trigger context fires.
	 *
	 * @param time The timestamp at which the timer fired.
	 * @param window The window for which the timer fired.
	 * @param ctx A context object that can be used to register timer callbacks.
	 */
	public abstract TriggerResult onProcessingTime(long time, W window, TriggerContext ctx) throws Exception;

	/**
	 * Called when an event-time timer that was set using the trigger context fires.
	 *
	 * @param time The timestamp at which the timer fired.
	 * @param window The window for which the timer fired.
	 * @param ctx A context object that can be used to register timer callbacks.
	 */
	public abstract TriggerResult onEventTime(long time, W window, TriggerContext ctx) throws Exception;

	/**
	 * Returns true if this trigger supports merging of trigger state and can therefore
	 * be used with a
	 * {@link org.apache.flink.streaming.api.windowing.assigners.MergingWindowAssigner}.
	 *
	 * <p>If this returns {@code true} you must properly implement
	 * {@link #onMerge(Window, OnMergeContext)}
	 */
	public boolean canMerge() {
		return false;
	}

	/**
	 * Called when several windows have been merged into one window by the
	 * {@link org.apache.flink.streaming.api.windowing.assigners.WindowAssigner}.
	 *
	 * @param window The new window that results from the merge.
	 * @param ctx A context object that can be used to register timer callbacks and access state.
	 */
	public void onMerge(W window, OnMergeContext ctx) throws Exception {
		throw new UnsupportedOperationException("This trigger does not support merging.");
	}

	/**
	 * Clears any state that the trigger might still hold for the given window. This is called
	 * when a window is purged. Timers set using {@link TriggerContext#registerEventTimeTimer(long)}
	 * and {@link TriggerContext#registerProcessingTimeTimer(long)} should be deleted here as
	 * well as state acquired using {@link TriggerContext#getPartitionedState(StateDescriptor)}.
	 */
	public abstract void clear(W window, TriggerContext ctx) throws Exception;
	
}

  • onElement() is called once for every element that is added to the window.
  • onProcessingTime() is called when a processing-time timer fires.
  • onEventTime() is called when an event-time timer fires.
  • clear() is called when the window is cleared.

onElement(), onProcessingTime() and onEventTime() all return a TriggerResult, which is an enum:

public enum TriggerResult {

	/**
	 * No action is taken on the window.
	 */
	CONTINUE(false, false),

	/**
	 * {@code FIRE_AND_PURGE} evaluates the window function and emits the window
	 * result.
	 */
	FIRE_AND_PURGE(true, true),

	/**
	 * On {@code FIRE}, the window is evaluated and results are emitted.
	 * The window is not purged, though, all elements are retained.
	 */
	FIRE(true, false),

	/**
	 * All elements in the window are cleared and the window is discarded,
	 * without evaluating the window function or emitting any elements.
	 */
	PURGE(false, true);

	// ------------------------------------------------------------------------

	private final boolean fire;
	private final boolean purge;

	TriggerResult(boolean fire, boolean purge) {
		this.purge = purge;
		this.fire = fire;
	}

	public boolean isFire() {
		return fire;
	}

	public boolean isPurge() {
		return purge;
	}
}

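TriggerResult has four values:

  • CONTINUE: no action is taken on the window.
  • FIRE: the data in the window is evaluated with the window function and the result is emitted. Note that the window's data is not cleared after the computation; all elements are retained.
  • PURGE: the elements in the window are cleared and the window is discarded, without evaluating the window function or emitting any result.
  • FIRE_AND_PURGE: the data is evaluated and the result emitted first, then the elements in the window and the window itself are cleared.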
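
To make the difference between the four values concrete, here is a minimal sketch in plain Java. It is a simplified model, not Flink's window operator: the class TriggerResultModel, the toy window and its sum "window function" are all hypothetical names invented for illustration. Only the (fire, purge) flag pairs are taken from the TriggerResult enum above.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model only: this is NOT Flink's WindowOperator. All names are hypothetical.
final class TriggerResultModel {

    // Mirrors the (fire, purge) flag pairs of the TriggerResult enum shown above.
    enum Result {
        CONTINUE(false, false),
        FIRE(true, false),
        PURGE(false, true),
        FIRE_AND_PURGE(true, true);

        final boolean fire;
        final boolean purge;

        Result(boolean fire, boolean purge) {
            this.fire = fire;
            this.purge = purge;
        }
    }

    // A toy window whose "window function" sums the buffered elements.
    static final class ToyWindow {
        final List<Integer> buffer = new ArrayList<>();
        final List<Integer> emitted = new ArrayList<>();

        void add(int value) {
            buffer.add(value);
        }

        // Reacts to a trigger result: evaluate first if fire is set, then clear if purge is set.
        void apply(Result result) {
            if (result.fire) {
                int sum = 0;
                for (int v : buffer) {
                    sum += v;
                }
                emitted.add(sum);
            }
            if (result.purge) {
                buffer.clear();
            }
        }
    }

    public static void main(String[] args) {
        ToyWindow w = new ToyWindow();
        w.add(1);
        w.add(2);
        w.apply(Result.CONTINUE);       // nothing emitted, buffer unchanged
        w.apply(Result.FIRE);           // emits 3, keeps [1, 2]
        w.add(4);
        w.apply(Result.FIRE_AND_PURGE); // emits 7, then empties the buffer
        w.add(5);
        w.apply(Result.PURGE);          // emits nothing, drops 5
        System.out.println(w.emitted + " " + w.buffer); // prints: [3, 7] []
    }
}
```

The second FIRE-type result emits 7 rather than 4 because the earlier FIRE retained the elements 1 and 2, which is exactly the retention behaviour described for FIRE above.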

The ProcessingTimeTrigger class

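Flink provides several commonly used trigger types. If the user does not set a Trigger, Flink falls back to a default one: for windows based on event time the default is EventTimeTrigger, and for windows based on processing time it is ProcessingTimeTrigger. If the user specifies a trigger, it replaces the default, which then has no effect. The source of ProcessingTimeTrigger is as follows: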

public class ProcessingTimeTrigger extends Trigger<Object, TimeWindow> {
	private static final long serialVersionUID = 1L;

	private ProcessingTimeTrigger() {}

	@Override
	public TriggerResult onElement(Object element, long timestamp, TimeWindow window, TriggerContext ctx) {
		ctx.registerProcessingTimeTimer(window.maxTimestamp());
		return TriggerResult.CONTINUE;
	}

	@Override
	public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) throws Exception {
		return TriggerResult.CONTINUE;
	}

	@Override
	public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) {
		return TriggerResult.FIRE;
	}

	@Override
	public void clear(TimeWindow window, TriggerContext ctx) throws Exception {
		ctx.deleteProcessingTimeTimer(window.maxTimestamp());
	}

	@Override
	public boolean canMerge() {
		return true;
	}

	@Override
	public void onMerge(TimeWindow window,
			OnMergeContext ctx) {
		// only register a timer if the time is not yet past the end of the merged window
		// this is in line with the logic in onElement(). If the time is past the end of
		// the window onElement() will fire and setting a timer here would fire the window twice.
		long windowMaxTimestamp = window.maxTimestamp();
		if (windowMaxTimestamp > ctx.getCurrentProcessingTime()) {
			ctx.registerProcessingTimeTimer(windowMaxTimestamp);
		}
	}

	@Override
	public String toString() {
		return "ProcessingTimeTrigger()";
	}

	/**
	 * Creates a new trigger that fires once system time passes the end of the window.
	 */
	public static ProcessingTimeTrigger create() {
		return new ProcessingTimeTrigger();
	}

}

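In onElement(), ctx.registerProcessingTimeTimer(window.maxTimestamp()) registers a processing-time timer at window.maxTimestamp(), the largest timestamp that still belongs to the window (for a TimeWindow, the end time minus 1 ms). When processing time reaches that timestamp, the timer fires and onProcessingTime() is called. It returns TriggerResult.FIRE, which triggers the computation over the window's data.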
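
As a worked example of the timer timestamp, the sketch below computes the window bounds and the timer for one element. It assumes a 5-second tumbling window aligned to the epoch with zero offset and non-negative timestamps. The class WindowTimerArithmetic and its methods are hypothetical helpers, not Flink API; only the "end minus 1" rule reflects TimeWindow.maxTimestamp().

```java
// Worked arithmetic under stated assumptions; class and method names are hypothetical.
final class WindowTimerArithmetic {

    // Start of the tumbling window containing `timestamp`, assuming non-negative
    // timestamps and windows aligned to the epoch (zero offset).
    static long windowStart(long timestamp, long sizeMillis) {
        return timestamp - (timestamp % sizeMillis);
    }

    // The window end is exclusive, so the last timestamp that belongs to the
    // window is end - 1. This is the value the trigger registers its timer at.
    static long maxTimestamp(long start, long sizeMillis) {
        return start + sizeMillis - 1;
    }

    public static void main(String[] args) {
        long size = 5_000L;      // 5-second window
        long arrival = 12_345L;  // processing time at which an element arrives
        long start = windowStart(arrival, size);
        long timer = maxTimestamp(start, size);
        System.out.println("window [" + start + ", " + (start + size) + "), timer at " + timer);
        // prints: window [10000, 15000), timer at 14999
    }
}
```

Every element that falls into the window [10000, 15000) yields the same timer timestamp, 14999, so the window is evaluated once, at the end of its interval.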

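Note that ProcessingTimeTrigger fires the window function only when the window's final timestamp is reached, and it returns FIRE rather than FIRE_AND_PURGE, so the trigger itself never clears the window's data. The elements stay in window state until a trigger returns PURGE or FIRE_AND_PURGE, or until Flink removes the window once it has expired.
In fact, of the Trigger classes that ship with Flink, only PurgingTrigger clears the window's data; none of the others do.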
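
PurgingTrigger wraps another trigger and turns its firing results into FIRE_AND_PURGE. In a Flink job this is typically written as PurgingTrigger.of(ProcessingTimeTrigger.create()). The sketch below models only that result mapping in plain Java. The class PurgingModel and its method are hypothetical and do not reproduce the real class, which also delegates timers, state and merging to the nested trigger.

```java
// Simplified model of the result mapping only; names are hypothetical, not Flink API.
final class PurgingModel {

    enum Result {
        CONTINUE, FIRE, PURGE, FIRE_AND_PURGE;

        boolean isFire() {
            return this == FIRE || this == FIRE_AND_PURGE;
        }
    }

    // Upgrades any firing result of the nested trigger to FIRE_AND_PURGE
    // and passes every other result through unchanged.
    static Result purging(Result nested) {
        return nested.isFire() ? Result.FIRE_AND_PURGE : nested;
    }

    public static void main(String[] args) {
        for (Result r : Result.values()) {
            System.out.println(r + " -> " + purging(r));
        }
        // prints:
        // CONTINUE -> CONTINUE
        // FIRE -> FIRE_AND_PURGE
        // PURGE -> PURGE
        // FIRE_AND_PURGE -> FIRE_AND_PURGE
    }
}
```

With this mapping, a window that would otherwise keep its elements after each evaluation is emptied every time it fires.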

posted @ 2020-01-06 14:17  wangxiaofan~