1. Tumbling Windows with EventTime
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

// Record format:
// timestamp,flink,2
// timestamp,spark,3
DataStream<String> lines = env.socketTextStream("192.168.87.130", 8888)
    // Only extracts the timestamp field; the records themselves are not modified.
    .assignTimestampsAndWatermarks(new BoundedOutOfOrdernessTimestampExtractor<String>(Time.seconds(0)) {
        private static final long serialVersionUID = -4441231666252017557L;

        // Extract the time field from the record and convert it to long.
        @Override
        public long extractTimestamp(String element) {
            String[] time = element.split(",");
            return Long.parseLong(time[0]);
        }
    });

SingleOutputStreamOperator<Tuple2<String, Integer>> word = lines.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
    private static final long serialVersionUID = 7179469626039725354L;

    @Override
    public void flatMap(String value, Collector<Tuple2<String, Integer>> out) throws Exception {
        String[] splitStr = value.split(",");
        out.collect(Tuple2.of(splitStr[1], Integer.parseInt(splitStr[2])));
    }
});

// Key the stream first, then assign windows.
KeyedStream<Tuple2<String, Integer>, Tuple> keyed = word.keyBy(0);
WindowedStream<Tuple2<String, Integer>, Tuple, TimeWindow> windowStream = keyed.window(TumblingEventTimeWindows.of(Time.seconds(5)));
SingleOutputStreamOperator<Tuple2<String, Integer>> summed = windowStream.sum(1);
summed.print();
Input:
1000,a,1
2000,a,1
4999,a,1
6666,a,1
7777,a,1
9998,a,1
10001,a,1
Output:
3> (a,3)
3> (a,3)
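
The firing points above follow from two small calculations: which 5-second window a timestamp falls into, and whether the watermark has reached the last millisecond of that window. The following is a minimal plain-Java sketch of that arithmetic, not Flink's own code. The class and method names (`TumblingWindowSketch`, `windowStart`, `fires`) are made up for illustration, and it assumes windows without an offset and a watermark equal to the largest timestamp seen so far (zero out-of-orderness).

```java
final class TumblingWindowSketch {

    /** Start of the tumbling window (no offset) that contains the timestamp. */
    static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - Math.floorMod(timestampMs, sizeMs);
    }

    /** An event-time window [start, end) fires once the watermark reaches end - 1. */
    static boolean fires(long windowEndMs, long watermarkMs) {
        return watermarkMs >= windowEndMs - 1;
    }

    public static void main(String[] args) {
        long size = 5000L;
        long[] timestamps = {1000, 2000, 4999, 6666, 7777, 9998, 10001};
        for (long ts : timestamps) {
            long start = windowStart(ts, size);
            // Print the window [start, end) each sample record is assigned to.
            System.out.println(ts + " -> [" + start + ", " + (start + size) + ")");
        }
        // Assumption: zero delay, so the watermark is the largest timestamp seen.
        System.out.println("[0, 5000) fires at watermark 4999: " + fires(5000, 4999));
        System.out.println("[5000, 10000) fires at watermark 9998: " + fires(10000, 9998));
        System.out.println("[5000, 10000) fires at watermark 10001: " + fires(10000, 10001));
    }
}
```

Under these assumptions the first three records fall into [0, 5000), which fires when 4999 arrives, and the next three fall into [5000, 10000), which fires only when 10001 arrives. That matches the two `(a,3)` results.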

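If the source has a parallelism greater than 1, the window fires only after the event time seen by every parallel source subtask has passed the 5 s window boundary.
This applies to any parallel source. For example, with a KafkaSource reading a topic that was created with several partitions, every source partition must satisfy the trigger condition before the window as a whole fires.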
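
The reason is that an operator with several inputs advances its event-time clock to the smallest watermark among those inputs. A minimal plain-Java sketch of that rule, with hypothetical names (`MinWatermarkSketch`, `combinedWatermark`, `windowFires`) and hand-picked watermark values, could look like this:

```java
final class MinWatermarkSketch {

    /** The downstream watermark is the minimum over all input channels. */
    static long combinedWatermark(long[] channelWatermarksMs) {
        long min = Long.MAX_VALUE;
        for (long w : channelWatermarksMs) {
            min = Math.min(min, w);
        }
        return min;
    }

    /** A window [start, end) fires once the combined watermark reaches end - 1. */
    static boolean windowFires(long windowEndMs, long[] channelWatermarksMs) {
        return combinedWatermark(channelWatermarksMs) >= windowEndMs - 1;
    }

    public static void main(String[] args) {
        // Hypothetical values: partition 0 is already at 6000, partition 1 lags at 3000.
        long[] lagging = {6000, 3000};
        // Both partitions have now passed the last millisecond of [0, 5000).
        long[] caughtUp = {6000, 4999};
        System.out.println("lagging fires:   " + windowFires(5000, lagging));
        System.out.println("caught up fires: " + windowFires(5000, caughtUp));
    }
}
```

In this sketch the window [0, 5000) stays open while one partition is still at 3000, even though the other is already at 6000. A partition that receives no data at all would hold the window back in the same way.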
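2. Delaying Window Triggering with Watermarks
Using the same code as in section 1, input:
1000,1,1
1001,2,1
1005,1,1
4999,1,1
6666,1,1
10005,1,1
7777,1,1
Output:
2> (1,3)
1> (2,1)
2> (1,1)
The last record, 7777,1,1, is dropped: by the time it arrives, 10005 has already pushed the watermark past the end of its window [5000, 10000), so that window has fired.
This can be solved by setting a delay (the maximum allowed out-of-orderness):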
DataStream<String> lines = env.socketTextStream("192.168.87.130", 8888)
    // Only extracts the timestamp field; the records themselves are not modified.
    .assignTimestampsAndWatermarks(new BoundedOutOfOrdernessTimestampExtractor<String>(Time.seconds(2)) {
        private static final long serialVersionUID = -4441231666252017557L;

        // Extract the time field from the record and convert it to long.
        @Override
        public long extractTimestamp(String element) {
            String[] time = element.split(",");
            return Long.parseLong(time[0]);
        }
    });
Input:
1000,1,a
1000,a,1
2000,a,1
5000,a,1
6999,a,1
Output:
3> (a,2)
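
With a 2 s delay the watermark trails the largest timestamp by 2000 ms, so the window [0, 5000) fires only once a record with timestamp 6999 or later arrives. The plain-Java sketch below replays a list of timestamps and reports which records would be discarded as late. It is a simplification, not Flink's implementation: the names (`LatenessSketch`, `droppedTimestamps`) are invented, keys are ignored, and the watermark is updated after every record, whereas Flink emits watermarks periodically.

```java
import java.util.ArrayList;
import java.util.List;

final class LatenessSketch {

    /**
     * Returns the timestamps that arrive after their tumbling window has already fired,
     * given a watermark of (largest timestamp seen) - maxOutOfOrdernessMs.
     */
    static List<Long> droppedTimestamps(long[] timestampsMs, long windowSizeMs, long maxOutOfOrdernessMs) {
        List<Long> dropped = new ArrayList<>();
        long maxSeen = Long.MIN_VALUE;
        long watermark = Long.MIN_VALUE;
        for (long ts : timestampsMs) {
            long windowEnd = ts - Math.floorMod(ts, windowSizeMs) + windowSizeMs;
            // The record is late if its window's last millisecond is not after the watermark.
            if (windowEnd - 1 <= watermark) {
                dropped.add(ts);
                continue;
            }
            maxSeen = Math.max(maxSeen, ts);
            watermark = maxSeen - maxOutOfOrdernessMs;
        }
        return dropped;
    }

    public static void main(String[] args) {
        // Timestamps of the out-of-order input from the start of this section.
        long[] input = {1000, 1001, 1005, 4999, 6666, 10005, 7777};
        System.out.println("delay 0 s drops: " + droppedTimestamps(input, 5000, 0));
        System.out.println("delay 2 s drops: " + droppedTimestamps(input, 5000, 2000));
    }
}
```

For the out-of-order input at the start of this section, the sketch drops 7777 with no delay and keeps it with a 2 s delay, because the watermark after 10005 is then only 8005 and the window [5000, 10000) is still open.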

3. EventTime and SlidingWindow
The code is the same as above; only the window assignment changes:
WindowedStream<Tuple2<String, Integer>, Tuple, TimeWindow> windowStream = keyed.window(SlidingEventTimeWindows.of(Time.seconds(6), Time.seconds(2)));
With no delay set, input:
1000,a,1
1999,a,1
1000,a,1
1999,a,1
2222,b,1
2999,a,1
4000,a,1
Output:
3> (a,2)
3> (a,3)
1> (b,1)
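
With a 6 s window sliding every 2 s, each record belongs to three overlapping windows, and the earliest of them can start before time 0. The plain-Java sketch below shows one way to enumerate the windows for a timestamp. The names (`SlidingWindowSketch`, `assignWindows`) are hypothetical and it assumes windows without an offset.

```java
import java.util.ArrayList;
import java.util.List;

final class SlidingWindowSketch {

    /** Returns every sliding window [start, end) that contains the timestamp, newest first. */
    static List<long[]> assignWindows(long timestampMs, long sizeMs, long slideMs) {
        List<long[]> windows = new ArrayList<>();
        // Start of the most recent window that began at or before the timestamp.
        long lastStart = timestampMs - Math.floorMod(timestampMs, slideMs);
        // Step back one slide at a time while the window still covers the timestamp.
        for (long start = lastStart; start > timestampMs - sizeMs; start -= slideMs) {
            windows.add(new long[] {start, start + sizeMs});
        }
        return windows;
    }

    public static void main(String[] args) {
        for (long[] w : assignWindows(1000, 6000, 2000)) {
            System.out.println("[" + w[0] + ", " + w[1] + ")");
        }
    }
}
```

For timestamp 1000 this yields [0, 6000), [-2000, 4000) and [-4000, 2000). The last of these closes first, which is consistent with the first result `(a,2)` appearing as soon as the record with timestamp 1999 arrives.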