1. PrintSink
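The print sink's default parallelism depends on the number of CPU cores (a local environment defaults to the core count). The identifier passed to print() is prepended to every output line, ahead of the subtask index.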
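// Assumes a socket server is already listening on this host and port.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStreamSource<String> source = env.socketTextStream("192.168.87.130", 8888);
// "res" is the identifier that prefixes every printed record.
source.print("res");
env.execute("PrintSink");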
Input:
aaaa
bbb
ccccc
Output:
res:4> aaaa
res:1> bbb
res:2> ccccc
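The prefix in the output above (`res:4>`) combines the identifier with the number of the subtask that printed the record. The following plain-Java sketch approximates how such a prefix can be assembled; it does not call Flink, the class name `PrintPrefixSketch` and method `format` are hypothetical, and the rule that the subtask number is omitted at parallelism 1 reflects my understanding of the print sink rather than anything stated in these notes.

```java
// Sketch only: approximates how a print sink could build its line prefix.
// PrintPrefixSketch and format(...) are illustrative names, not Flink API.
final class PrintPrefixSketch {

    /**
     * @param identifier   the string passed to print(...); may be empty
     * @param subtaskIndex zero-based index of the sink subtask
     * @param parallelism  number of parallel sink subtasks
     * @param value        the record being printed
     */
    static String format(String identifier, int subtaskIndex, int parallelism, String value) {
        StringBuilder prefix = new StringBuilder(identifier);
        if (parallelism > 1) {
            if (prefix.length() > 0) {
                prefix.append(':');
            }
            // The printed subtask number is one-based.
            prefix.append(subtaskIndex + 1);
        }
        if (prefix.length() > 0) {
            prefix.append("> ");
        }
        return prefix + value;
    }

    public static void main(String[] args) {
        System.out.println(format("res", 3, 4, "aaaa")); // res:4> aaaa
        System.out.println(format("res", 0, 1, "aaaa")); // res> aaaa
        System.out.println(format("", 1, 4, "ccccc"));   // 2> ccccc
    }
}
```

Because records are spread across the sink subtasks, consecutive input lines can carry different subtask numbers, as in the output above.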
2. Custom sink (addSink)
DataStream<String> lines = env.socketTextStream("192.168.87.130", 8888);
SingleOutputStreamOperator<Tuple2<String, Integer>> wordAndOne = lines.flatMap(
        new FlatMapFunction<String, Tuple2<String, Integer>>() {
    @Override
    public void flatMap(String line, Collector<Tuple2<String, Integer>> out) throws Exception {
        String[] words = line.split(" ");
        for (String word : words) {
            Tuple2<String, Integer> tp = Tuple2.of(word, 1);
            out.collect(tp);
        }
    }
});
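// Key by the word (tuple field 0) and keep a running sum of the counts (field 1).
SingleOutputStreamOperator<Tuple2<String, Integer>> summed = wordAndOne.keyBy(0).sum(1);
summed.addSink(new RichSinkFunction<Tuple2<String, Integer>>() {
    @Override
    public void invoke(Tuple2<String, Integer> value, Context context) throws Exception {
        // Zero-based index of the parallel sink subtask handling this record.
        int index = getRuntimeContext().getIndexOfThisSubtask();
        System.out.println(index + ">" + value);
    }
});
// Submit the job; nothing runs until execute() is called.
env.execute();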
Input:
aaaa
bbb
ccccc
spark flink spring spark
spring spring flink
Output:
0>(spark,1)
3>(flink,1)
2>(spring,1)
0>(spark,2)
3>(flink,2)
2>(spring,2)
2>(spring,3)
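In the output above, a given word always appears with the same subtask index and its count grows by one each time, which is what keying by the word and summing should produce. The plain-Java sketch below imitates that behaviour without Flink. The class `KeyedSumSketch` and its methods are hypothetical, and `subtaskFor` uses a simple modulo of `hashCode()` as a stand-in; Flink's real key-to-subtask assignment is different, so the indices this sketch prints will not necessarily match the ones shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: a sequential imitation of keyBy(word).sum(count) followed by
// a sink that prints "index>(word,count)". Not Flink code.
final class KeyedSumSketch {

    // Simplified stand-in for keyed partitioning; NOT Flink's actual hash function.
    static int subtaskFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // Splits each line on spaces and emits one record per word with its running count.
    static List<String> run(List<String> lines, int parallelism) {
        Map<String, Integer> counts = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                int count = counts.merge(word, 1, Integer::sum);
                out.add(subtaskFor(word, parallelism) + ">(" + word + "," + count + ")");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("spark flink spring spark", "spring spring flink");
        run(lines, 4).forEach(System.out::println);
    }
}
```

The sketch emits records strictly in input order. In a real parallel job each subtask prints independently, so records for different words may interleave differently, while records for the same word keep their order.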
3. writeAsCsv
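writeAsCsv can only be used on data streams of tuple type. Output is buffered: by default, records are written to the CSV file only once 4096 bytes have accumulated, so a small amount of output may not show up in the file right away.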
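The buffering effect can be reproduced with standard Java I/O. The sketch below is not Flink code: it assumes the sink writes through a 4096-byte buffered stream, as the note above suggests, and uses an in-memory `ByteArrayOutputStream` in place of the CSV file. The class `CsvBufferSketch` and method `visibleBytes` are hypothetical names.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Sketch only: shows why rows may not reach the file until the buffer fills.
final class CsvBufferSketch {

    /**
     * Writes rowCount rows of "spark,1\n" (8 bytes each) through a 4096-byte buffer.
     * Returns {bytes visible in the target before flush, bytes visible after flush}.
     */
    static int[] visibleBytes(int rowCount) {
        ByteArrayOutputStream target = new ByteArrayOutputStream(); // stands in for the CSV file
        byte[] row = "spark,1\n".getBytes(StandardCharsets.UTF_8);
        try (BufferedOutputStream buffered = new BufferedOutputStream(target, 4096)) {
            for (int i = 0; i < rowCount; i++) {
                buffered.write(row);
            }
            int before = target.size(); // only data pushed out by a full buffer
            buffered.flush();
            int after = target.size();  // everything written so far
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] small = visibleBytes(3);   // 24 bytes: stays in the buffer until flush
        int[] large = visibleBytes(600); // 4800 bytes: part is pushed out early
        System.out.println(small[0] + " -> " + small[1]);
        System.out.println(large[0] + " -> " + large[1]);
    }
}
```

With only a few rows, nothing reaches the target until the stream is flushed or closed, which matches the observation that short test runs can leave the CSV file empty for a while.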