1. Basis for dividing a job into Tasks
Example:
// env is the StreamExecutionEnvironment of the job
DataStream<String> lines = env.socketTextStream(args[0], Integer.parseInt(args[1]));

SingleOutputStreamOperator<Tuple2<String, Integer>> wordAndOne = lines.flatMap(
        new FlatMapFunction<String, Tuple2<String, Integer>>() {
            @Override
            public void flatMap(String line, Collector<Tuple2<String, Integer>> out) throws Exception {
                // Split each line on spaces and emit (word, 1) for every word
                String[] words = line.split(" ");
                for (String word : words) {
                    Tuple2<String, Integer> tp = Tuple2.of(word, 1);
                    out.collect(tp);
                }
            }
        });

SingleOutputStreamOperator<Tuple2<String, Integer>> sumned = wordAndOne.keyBy(0).sum(1);

// Call a sink (a sink must be called)
sumned.print();
The Flink web UI shows the following view:

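There are 3 Tasks (stages) and 9 subtasks in total. From source to flatMap the parallelism changes, so a Task boundary is drawn there; from flatMap to keyBy, keyBy partitions the data by hash, so another Task boundary is drawn there.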
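The counting rule described above can be sketched as a small stand-alone Java model. This is not Flink's implementation: the class `TaskCountSketch`, its `Op` type and the `Input` enum are hypothetical names made up for illustration, and the downstream parallelism of 4 is an assumption inferred from the 9 subtasks reported above (1 for the socket source plus 4 + 4). An operator joins the previous Task only when data is forwarded one-to-one and the parallelism is unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model of Task division; not Flink's actual code.
final class TaskCountSketch {

    // How an operator receives data from the operator before it.
    enum Input { NONE, FORWARD, REDISTRIBUTE }

    static final class Op {
        final String name;
        final int parallelism;
        final Input input;

        Op(String name, int parallelism, Input input) {
            this.name = name;
            this.parallelism = parallelism;
            this.input = input;
        }
    }

    // Groups operators into Tasks: an operator is chained to the previous one
    // only if data is forwarded and both run with the same parallelism.
    static List<List<Op>> toTasks(List<Op> pipeline) {
        List<List<Op>> tasks = new ArrayList<>();
        for (Op op : pipeline) {
            boolean chain = false;
            if (!tasks.isEmpty()) {
                List<Op> last = tasks.get(tasks.size() - 1);
                Op prev = last.get(last.size() - 1);
                chain = op.input == Input.FORWARD && prev.parallelism == op.parallelism;
            }
            if (chain) {
                tasks.get(tasks.size() - 1).add(op);
            } else {
                List<Op> task = new ArrayList<>();
                task.add(op);
                tasks.add(task);
            }
        }
        return tasks;
    }

    // Every Task contributes as many subtasks as its parallelism.
    static int countSubtasks(List<List<Op>> tasks) {
        int total = 0;
        for (List<Op> task : tasks) {
            total += task.get(0).parallelism;
        }
        return total;
    }

    // The word count job from the example; the parallelism value is assumed.
    static List<Op> wordCount(int parallelism) {
        return Arrays.asList(
                new Op("socket source", 1, Input.NONE),
                new Op("flatMap", parallelism, Input.REDISTRIBUTE),   // parallelism changes
                new Op("keyBy + sum", parallelism, Input.REDISTRIBUTE), // hash partitioning
                new Op("print", parallelism, Input.FORWARD));
    }

    public static void main(String[] args) {
        List<List<Op>> tasks = toTasks(wordCount(4));
        // prints "3 tasks, 9 subtasks"
        System.out.println(tasks.size() + " tasks, " + countSubtasks(tasks) + " subtasks");
    }
}
```

The print sink stays in the same Task as the sum because nothing between them changes the parallelism or repartitions the data.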
2. startNewChain and disableChaining (methods that change how Tasks are divided)
Official documentation: https://ci.apache.org/projects/flink/flink-docs-release-1.10/concepts/runtime.html

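startNewChain: starts a new chain at this operator. The operator is cut off from the operators before it, so data crosses a Task boundary at that point, but it can still be chained with the operators after it.
disableChaining: puts this operator into a Task of its own; it is chained with neither the operators before it nor the operators after it.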
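The two definitions above can be illustrated with a small stand-alone Java model of a flatMap, filter, map segment in which all operators have the same parallelism. It is only a sketch: `ChainingFlagSketch`, `Op`, `toTasks` and `plan` are hypothetical names, and the three `Strategy` values are chosen to mirror the behavior described here (ALWAYS for the default, HEAD for startNewChain, NEVER for disableChaining), not taken from Flink's code.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of chaining flags; not Flink's actual code.
final class ChainingFlagSketch {

    // ALWAYS: default, HEAD: after startNewChain(), NEVER: after disableChaining()
    enum Strategy { ALWAYS, HEAD, NEVER }

    static final class Op {
        final String name;
        final Strategy strategy;

        Op(String name, Strategy strategy) {
            this.name = name;
            this.strategy = strategy;
        }
    }

    // Returns one string per Task, listing the operators chained inside it.
    static List<String> toTasks(List<Op> ops) {
        List<String> tasks = new ArrayList<>();
        Strategy prev = null;
        for (Op op : ops) {
            // Chain only if the previous operator allows successors
            // and this operator allows a predecessor.
            boolean chain = prev != null
                    && prev != Strategy.NEVER
                    && op.strategy == Strategy.ALWAYS;
            if (chain) {
                int last = tasks.size() - 1;
                tasks.set(last, tasks.get(last) + " -> " + op.name);
            } else {
                tasks.add(op.name);
            }
            prev = op.strategy;
        }
        return tasks;
    }

    // The flatMap -> filter -> map segment, with a configurable filter strategy.
    static List<String> plan(Strategy filterStrategy) {
        List<Op> ops = new ArrayList<>();
        ops.add(new Op("flatMap", Strategy.ALWAYS));
        ops.add(new Op("filter", filterStrategy));
        ops.add(new Op("map", Strategy.ALWAYS));
        return toTasks(ops);
    }

    public static void main(String[] args) {
        System.out.println(plan(Strategy.ALWAYS)); // [flatMap -> filter -> map]
        System.out.println(plan(Strategy.HEAD));   // [flatMap, filter -> map]
        System.out.println(plan(Strategy.NEVER));  // [flatMap, filter, map]
    }
}
```

In this model startNewChain splits the segment into two Tasks and disableChaining into three, which is the difference the job plan graphs in this section are meant to show.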
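import org.apache.flink.api.common.functions.FilterFunction;
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

public class StreamWordCount {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // args[0] = host, args[1] = port of the socket to read from
        DataStream<String> lines = env.socketTextStream(args[0], Integer.parseInt(args[1]));

        // Split each line into words
        SingleOutputStreamOperator<String> word = lines.flatMap(new FlatMapFunction<String, String>() {
            @Override
            public void flatMap(String line, Collector<String> out) throws Exception {
                String[] words = line.split(" ");
                for (String word : words) {
                    out.collect(word);
                }
            }
        });

        // Keep only words that start with "h"
        SingleOutputStreamOperator<String> filterd = word.filter(new FilterFunction<String>() {
            @Override
            public boolean filter(String value) throws Exception {
                return value.startsWith("h");
            }
        });

        // Turn each word into (word, 1)
        SingleOutputStreamOperator<Tuple2<String, Integer>> wordAndOne = filterd.map(
                new MapFunction<String, Tuple2<String, Integer>>() {
                    @Override
                    public Tuple2<String, Integer> map(String value) throws Exception {
                        return Tuple2.of(value, 1);
                    }
                });

        // Group by the word (field 0) and sum the counts (field 1)
        SingleOutputStreamOperator<Tuple2<String, Integer>> sumned = wordAndOne.keyBy(0).sum(1);

        sumned.print();

        env.execute("StreamWordCount");
    }
}
Job plan graph: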

After calling startNewChain on the filter operator:
SingleOutputStreamOperator<String> filterd = word.filter(new FilterFunction<String>() {
    @Override
    public boolean filter(String value) throws Exception {
        return value.startsWith("h");
    }
}).startNewChain();
Job plan graph:

After calling disableChaining on the filter operator:
SingleOutputStreamOperator<String> filterd = word.filter(new FilterFunction<String>() {
    @Override
    public boolean filter(String value) throws Exception {
        return value.startsWith("h");
    }
}).disableChaining();
Job plan graph:

3. Slot sharing groups

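If no slot sharing group name is set, it defaults to "default". Once a name is set, the current operator and the operators after it use the new slot sharing group, unless they set a group of their own. Example:
SingleOutputStreamOperator<String> word = lines.flatMap(new FlatMapFunction<String, String>() {
    @Override
    public void flatMap(String line, Collector<String> out) throws Exception {
        String[] words = line.split(" ");
        for (String word : words) {
            out.collect(word);
        }
    }
}).slotSharingGroup("doit");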
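After the slot sharing group is renamed, the parallelism has to be lowered to 3 before the job can run: there are only 4 slots in total, and the source takes one of them in the default group, which leaves 3 for the operators in the "doit" group.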
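The slot arithmetic behind that statement can be written out as a short Java sketch. It assumes that subtasks share slots only within the same slot sharing group, so the job needs the sum, over all groups, of the highest parallelism in each group. `SlotSharingSketch`, `requiredSlots` and `slotsForExample` are hypothetical names, and the sketch also assumes that every operator after flatMap inherits the "doit" group.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified, hypothetical slot calculation; not Flink's scheduler.
final class SlotSharingSketch {

    // Slots needed = sum over slot sharing groups of the highest parallelism in each group.
    static int requiredSlots(Map<String, Integer> maxParallelismPerGroup) {
        int total = 0;
        for (int parallelism : maxParallelismPerGroup.values()) {
            total += parallelism;
        }
        return total;
    }

    // The example job: the socket source (parallelism 1) stays in "default",
    // flatMap and everything after it run in "doit" (assumed).
    static int slotsForExample(int downstreamParallelism) {
        Map<String, Integer> groups = new LinkedHashMap<>();
        groups.put("default", 1);
        groups.put("doit", downstreamParallelism);
        return requiredSlots(groups);
    }

    public static void main(String[] args) {
        int availableSlots = 4; // total slots stated in the text
        for (int parallelism : new int[] {4, 3}) {
            int needed = slotsForExample(parallelism);
            System.out.println("parallelism " + parallelism + " needs " + needed
                    + " slots, fits: " + (needed <= availableSlots));
        }
    }
}
```

With a parallelism of 4 the job would need 5 slots, one more than the 4 available; with 3 it needs exactly 4.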
