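1. reduce
Performs a "rolling" reduce on a keyed data stream: combines the current element with the last reduced value for the same key and emits the new value.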
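StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Read lines of text from a socket; each line is treated as one word.
DataStreamSource<String> source = env.socketTextStream("192.168.87.130", 8888);
// Turn every word into a (word, 1) pair.
SingleOutputStreamOperator<Tuple2<String, Integer>> res = source.map(new MapFunction<String, Tuple2<String, Integer>>() {
    @Override
    public Tuple2<String, Integer> map(String value) throws Exception {
        return Tuple2.of(value, 1);
    }
});
// Key the stream by tuple field 0 (the word).
KeyedStream<Tuple2<String, Integer>, Tuple> keyed = res.keyBy(0);
// value1 is the last reduced value for the key, value2 is the incoming element.
SingleOutputStreamOperator<Tuple2<String, Integer>> reduce = keyed.reduce(new ReduceFunction<Tuple2<String, Integer>>() {
    @Override
    public Tuple2<String, Integer> reduce(Tuple2<String, Integer> value1, Tuple2<String, Integer> value2) throws Exception {
        String key = value1.f0;
        Integer c1 = value1.f1;
        Integer c2 = value2.f1;
        return Tuple2.of(key, c1 + c2);
    }
});
reduce.print();
// Call env.execute() afterwards to actually start the job.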
Input:
flink
flink
hadoop
spark
hadoop
Output (the "N>" prefix is the index of the parallel subtask that printed the record):
4> (flink,1)
4> (flink,2)
4> (hadoop,1)
1> (spark,1)
4> (hadoop,2)
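The rolling behaviour can be reproduced without a cluster. The following is a minimal plain-Java sketch of the semantics only, not Flink's implementation: the class name RollingReduceSketch and the method rollingReduce are hypothetical, and a HashMap stands in for Flink's per-key state. It assumes, as in the example above, that every word arrives with a count of 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

class RollingReduceSketch {

    // Simulates a rolling reduce: each incoming word carries the count 1, which is
    // combined with the last reduced count stored for the same key and emitted.
    static List<String> rollingReduce(List<String> words, BinaryOperator<Integer> reducer) {
        Map<String, Integer> lastReduced = new HashMap<>(); // stand-in for keyed state
        List<String> emitted = new ArrayList<>();
        for (String word : words) {
            Integer previous = lastReduced.get(word);
            // The first element of a key is emitted unchanged; the reducer is not called.
            int current = (previous == null) ? 1 : reducer.apply(previous, 1);
            lastReduced.put(word, current);
            emitted.add("(" + word + "," + current + ")");
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("flink", "flink", "hadoop", "spark", "hadoop");
        // Prints one record per input element, in arrival order.
        for (String record : rollingReduce(input, Integer::sum)) {
            System.out.println(record);
        }
    }
}
```

With the five input words shown above, the sketch emits the same records in the same order as the Flink job, minus the subtask prefixes: one result per input element, never a single final total.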
2. Aggregations
Performs rolling aggregations (such as max) on a keyed data stream.
// Each input line has the form: spark,10
SingleOutputStreamOperator<Tuple2<String, Integer>> res = source.map(new MapFunction<String, Tuple2<String, Integer>>() {
    @Override
    public Tuple2<String, Integer> map(String value) throws Exception {
        String[] word = value.split(",");
        return Tuple2.of(word[0], Integer.parseInt(word[1]));
    }
});
// Key by the word (field 0), then keep a running maximum of the number (field 1).
KeyedStream<Tuple2<String, Integer>, Tuple> keyed = res.keyBy(0);
SingleOutputStreamOperator<Tuple2<String, Integer>> max = keyed.max(1);
max.print();
Input:
spark,220
spark,20
hadoop,30
hadoop,29
hadoop,33
Output:
1> (spark,220)
1> (spark,220)
4> (hadoop,30)
4> (hadoop,30)
4> (hadoop,33)
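To make the "running maximum per key" behaviour concrete, here is a small plain-Java sketch. It only imitates what the output above shows and is not how Flink implements max: the names RollingMaxSketch and rollingMax are hypothetical, and a HashMap again plays the role of keyed state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RollingMaxSketch {

    // Parses "word,number" lines and emits, for every line, the largest number
    // seen so far for that word.
    static List<String> rollingMax(List<String> lines) {
        Map<String, Integer> maxSoFar = new HashMap<>(); // stand-in for keyed state
        List<String> emitted = new ArrayList<>();
        for (String line : lines) {
            String[] parts = line.split(",");
            String key = parts[0];
            int value = Integer.parseInt(parts[1].trim());
            // merge stores the value for a new key, otherwise keeps the larger one,
            // and returns whatever is now stored.
            int max = maxSoFar.merge(key, value, Integer::max);
            emitted.add("(" + key + "," + max + ")");
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "spark,220", "spark,20", "hadoop,30", "hadoop,29", "hadoop,33");
        for (String record : rollingMax(input)) {
            System.out.println(record);
        }
    }
}
```

A smaller value such as spark,20 still produces an output record; it simply repeats the current maximum for that key.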
3. fold
A "rolling" fold on a keyed data stream with an initial value. Combines the current element with the last folded value and emits the new value.
Note that the type of the initial value passed to fold must match the type the fold function returns.
SingleOutputStreamOperator<Tuple2<String, Integer>> res = source.map(new MapFunction<String, Tuple2<String, Integer>>() {
    @Override
    public Tuple2<String, Integer> map(String value) throws Exception {
        return Tuple2.of(value, 1);
    }
});
KeyedStream<Tuple2<String, Integer>, Tuple> keyed = res.keyBy(0);
SingleOutputStreamOperator<Tuple2<String, Integer>> foldResult = keyed.fold(new Tuple2<String, Integer>("", 0), new FoldFunction<Tuple2<String, Integer>, Tuple2<String, Integer>>() {
    @Override
    public Tuple2<String, Integer> fold(Tuple2<String, Integer> accumulator, Tuple2<String, Integer> value) throws Exception {
        String key = value.f0;
        Integer count = value.f1;
        accumulator.f0 = key;
        accumulator.f1 += count;
        return accumulator;
    }
});
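The relationship between the initial value and the result type is easier to see in a generic form. The sketch below is plain Java and only imitates the semantics described above: the names RollingFoldSketch and rollingFold are hypothetical, and it assumes each key starts from its own copy of the initial value. The type parameter R is shared by the initial value, the accumulator, and every emitted result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

class RollingFoldSketch {

    // Rolling fold over words keyed by the word itself. R is the accumulator type:
    // the initial value, the folder's return value, and the output all share it.
    static <R> List<R> rollingFold(List<String> words, R initial,
                                   BiFunction<R, String, R> folder) {
        Map<String, R> accumulators = new HashMap<>(); // stand-in for keyed state
        List<R> emitted = new ArrayList<>();
        for (String word : words) {
            // A key seen for the first time starts from the initial value.
            R previous = accumulators.getOrDefault(word, initial);
            R next = folder.apply(previous, word);
            accumulators.put(word, next);
            emitted.add(next);
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("flink", "flink", "hadoop");
        // Integer initial value, so the results are Integers: prints [1, 2, 1]
        System.out.println(rollingFold(input, 0, (count, word) -> count + 1));
        // String initial value, so the results are Strings:
        // prints [start-flink, start-flink-flink, start-hadoop]
        System.out.println(rollingFold(input, "start", (acc, word) -> acc + "-" + word));
    }
}
```

Unlike reduce, where the first element of a key is emitted unchanged, the fold function runs on every element, including the first, because there is always an accumulator to combine it with.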