1. reduce

A "rolling" reduce on a keyed data stream: it combines the current element with the last reduced value and emits the new value.

        // Read lines of text from a socket source
        DataStreamSource<String> source = env.socketTextStream("192.168.87.130", 8888);

        // Map every line to a (word, 1) tuple
        SingleOutputStreamOperator<Tuple2<String, Integer>> res = source.map(new MapFunction<String, Tuple2<String, Integer>>() {
            @Override
            public Tuple2<String, Integer> map(String value) throws Exception {
                return Tuple2.of(value, 1);
            }
        });

        // Key the stream by the word (tuple field 0)
        KeyedStream<Tuple2<String, Integer>, Tuple> keyed = res.keyBy(0);

        // Rolling reduce: add the current count to the last reduced count for the same key
        SingleOutputStreamOperator<Tuple2<String, Integer>> reduce = keyed.reduce(new ReduceFunction<Tuple2<String, Integer>>() {

            @Override
            public Tuple2<String, Integer> reduce(Tuple2<String, Integer> value1, Tuple2<String, Integer> value2) throws Exception {
                String key = value1.f0;
                Integer c1 = value1.f1;
                Integer c2 = value2.f1;
                return Tuple2.of(key, c1 + c2);
            }
        });

        reduce.print();

Input:

flink
flink
hadoop
spark
hadoop

Output:

4> (flink,1)
4> (flink,2)
4> (hadoop,1)
1> (spark,1)
4> (hadoop,2)
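
As a side note, the index-based keyBy(0) used above has been deprecated in newer Flink releases in favor of a typed KeySelector. A minimal sketch of the same rolling word count written that way (reusing res from the example above; lambda-based type extraction is assumed to work here):

        // Key by the word with a KeySelector lambda instead of keyBy(0);
        // the key type is then String rather than the untyped Tuple.
        KeyedStream<Tuple2<String, Integer>, String> keyedByWord = res.keyBy(value -> value.f0);
        keyedByWord
                .reduce((v1, v2) -> Tuple2.of(v1.f0, v1.f1 + v2.f1))
                .print();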

2. Aggregations

Aggregation operations (such as sum, min and max) on a keyed data stream.

        // Input lines look like "spark,10": a word and a count separated by a comma
        SingleOutputStreamOperator<Tuple2<String, Integer>> res = source.map(new MapFunction<String, Tuple2<String, Integer>>() {
            @Override
            public Tuple2<String, Integer> map(String value) throws Exception {
                String[] word = value.split(",");
                return Tuple2.of(word[0], Integer.parseInt(word[1]));
            }
        });

        // Key by the word, then keep a rolling maximum of the count (tuple field 1)
        KeyedStream<Tuple2<String, Integer>, Tuple> keyed = res.keyBy(0);
        SingleOutputStreamOperator<Tuple2<String, Integer>> max = keyed.max(1);

        max.print();

Input:

spark,220
spark,20
hadoop,30
hadoop,29
hadoop,33

Output:

1> (spark,220)
1> (spark,220)
4> (hadoop,30)
4> (hadoop,30)
4> (hadoop,33)
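
The other built-in aggregations follow the same pattern. A short illustrative sketch on the same keyed stream: sum(1) and min(1) keep a running sum or rolling minimum per key, and maxBy(1) returns the whole element holding the maximum, which only differs from max(1) when the tuple has additional fields besides the key and the aggregated field.

        // Sketch: other aggregations on the same keyed (word, count) stream
        SingleOutputStreamOperator<Tuple2<String, Integer>> sum = keyed.sum(1);     // running sum per key
        SingleOutputStreamOperator<Tuple2<String, Integer>> min = keyed.min(1);     // rolling minimum per key
        SingleOutputStreamOperator<Tuple2<String, Integer>> maxBy = keyed.maxBy(1); // element with the max count
        sum.print();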

3. fold

A "rolling" fold on a keyed data stream with an initial value. Combines the current element with the last folded value and emits the new value.

Note: the type of the initial value passed to fold must be consistent with the return type of the fold function.

        // Map each line to (word, 1), as in the reduce example
        SingleOutputStreamOperator<Tuple2<String, Integer>> res = source.map(new MapFunction<String, Tuple2<String, Integer>>() {
            @Override
            public Tuple2<String, Integer> map(String value) throws Exception {
                return Tuple2.of(value, 1);
            }
        });

        // Key by the word, then fold starting from the initial value ("", 0);
        // the initial value's type must match the fold function's return type
        KeyedStream<Tuple2<String, Integer>, Tuple> keyed = res.keyBy(0);
        SingleOutputStreamOperator<Tuple2<String, Integer>> foldResult = keyed.fold(new Tuple2<String, Integer>("", 0),
                new FoldFunction<Tuple2<String, Integer>, Tuple2<String, Integer>>() {
            @Override
            public Tuple2<String, Integer> fold(
                    Tuple2<String, Integer> accumulator,
                    Tuple2<String, Integer> value) throws Exception {
                // Accumulate the count for this key into the running accumulator
                String key = value.f0;
                Integer count = value.f1;
                accumulator.f0 = key;
                accumulator.f1 += count;
                return accumulator;
            }
        });
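
The snippet above does not sink the result yet. Also note that fold on KeyedStream has been deprecated in newer Flink releases; the same per-key counting can be expressed with a rolling reduce (no separate initial value needed), as in section 1. A minimal sketch:

        // Print the folded counts
        foldResult.print();

        // Non-deprecated alternative: the same per-key count with a rolling reduce
        SingleOutputStreamOperator<Tuple2<String, Integer>> counted = keyed.reduce(
                (v1, v2) -> Tuple2.of(v1.f0, v1.f1 + v2.f1));
        counted.print();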