Flink : Setup Flink on Yarn

  • Yarn Session: start a long-running YARN application that launches the Job Manager and Task Managers in separate containers, forming a Flink cluster on YARN; every Flink app is then submitted to this shared cluster
  • Single Job on Yarn: no long-running Flink cluster is needed; each run of a Flink program is an independent YARN application that starts its own JobManager and TaskManager, runs the Flink app, and exits completely when the program finishes

Comparing the two modes:

  • A Yarn Session must be given enough resources up front when the cluster is started on YARN, since it may have to support many jobs. Once the session's resources are exhausted, no more jobs can be submitted to it, even if YARN itself still has free capacity; and if anything goes wrong with the session, every job in it is affected. The upside is that you do not have to start a Flink cluster for each job, and a single Job Manager manages all jobs
  • Single Job mode is more flexible with resources: each job is sized entirely to its own needs, nothing is reserved while no job is running, and even with many jobs running, more can be submitted as long as YARN has capacity, each in its own independent YARN application. The downside is that a Flink cluster has to be started for every job, and the jobs are all independent, with no single Job Manager managing them
  • Yarn Session is generally used in test environments, or for small, short-running jobs
  • Single Job is generally used in production, or for compute-heavy, long-running jobs
  • The two modes can coexist in the same environment (since the submit commands differ)
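The two submission styles can be sketched side by side. These commands need a live YARN cluster, so this is an illustrative cheatsheet rather than a runnable script (the jar path is the bundled WordCount example used later in this post):

```shell
# Yarn Session mode: one long-running cluster, many jobs
./bin/yarn-session.sh -d                       # start the session cluster on YARN
./bin/flink run examples/batch/WordCount.jar   # submitted to the running session
./bin/flink run examples/batch/WordCount.jar   # ...and so is every later job

# Single Job (per-job) mode: one YARN application per job
./bin/flink run -m yarn-cluster examples/batch/WordCount.jar
```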

Installation

Download the Flink package (https://flink.apache.org/downloads.html):

wget https://archive.apache.org/dist/flink/flink-1.10.0/flink-1.10.0-bin-scala_2.12.tgz

tar xzf flink-1.10.0-bin-scala_2.12.tgz

Flink 1.10 does not bundle Hadoop, so to run on YARN the shaded Hadoop uber jar needs to be dropped into lib/:

cd flink-1.10.0/lib
wget https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.7.5-9.0/flink-shaded-hadoop-2-uber-2.7.5-9.0.jar

Flink on Yarn does not require any configuration of the masters and slaves files.

Yarn Session

Start it with the following command:

./bin/yarn-session.sh

The command accepts the following options:

Usage:
   Optional
     -at,--applicationType <arg>     Set a custom application type for the application on YARN
     -D <property=value>             use value for given property
     -d,--detached                   If present, runs the job in detached mode
     -h,--help                       Help for the Yarn session CLI.
     -id,--applicationId <arg>       Attach to running YARN session
     -j,--jar <arg>                  Path to Flink jar file
     -jm,--jobManagerMemory <arg>    Memory for JobManager Container with optional unit (default: MB)
     -m,--jobmanager <arg>           Address of the JobManager (master) to which to connect. Use this flag to connect to a different JobManager than the one specified in the configuration.
     -nl,--nodeLabel <arg>           Specify YARN node label for the YARN application
     -nm,--name <arg>                Set a custom name for the application on YARN
     -q,--query                      Display available YARN resources (memory, cores)
     -qu,--queue <arg>               Specify YARN queue.
     -s,--slots <arg>                Number of slots per TaskManager
     -t,--ship <arg>                 Ship files in the specified directory (t for transfer)
     -tm,--taskManagerMemory <arg>   Memory per TaskManager Container with optional unit (default: MB)
     -yd,--yarndetached              If present, runs the job in detached mode (deprecated; use non-YARN specific option instead)
     -z,--zookeeperNamespace <arg>   Namespace to create the Zookeeper sub-paths for high availability mode
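
Combining a few of the common options above, a detached session with a custom name, explicit memory sizes, and 2 slots per Task Manager might be started like this (the values are illustrative, not recommendations):

```shell
# Detached session with a custom name, explicit container memory, and 2 slots per TM
./bin/yarn-session.sh -d -nm "my-flink-session" -jm 1024m -tm 2048m -s 2
```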

The command's output looks like this:

lin@Ubuntu-VM-1:$ ./bin/yarn-session.sh
2020-05-16 08:28:41,872 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.address, localhost
2020-05-16 08:28:41,877 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.port, 6123
2020-05-16 08:28:41,878 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.heap.size, 1024m
2020-05-16 08:28:41,879 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.memory.process.size, 1568m
2020-05-16 08:28:41,879 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2020-05-16 08:28:41,882 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: parallelism.default, 1
2020-05-16 08:28:41,885 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.execution.failover-strategy, region
2020-05-16 08:28:42,579 WARN  org.apache.flink.runtime.util.HadoopUtils                     - Could not find Hadoop configuration via any of the supported methods (Flink configuration, environment variables).
2020-05-16 08:28:44,641 WARN  org.apache.hadoop.util.NativeCodeLoader                       - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2020-05-16 08:28:45,093 INFO  org.apache.flink.runtime.security.modules.HadoopModule        - Hadoop user set to lin (auth:SIMPLE)
2020-05-16 08:28:45,169 INFO  org.apache.flink.runtime.security.modules.JaasModule          - Jaas file will be created as /tmp/jaas-8610345087872643912.conf.
2020-05-16 08:28:45,278 WARN  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - The configuration directory ('/home/lin/myTest/flink/flink-1.10.0/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2020-05-16 08:28:45,592 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at /0.0.0.0:8032
2020-05-16 08:28:46,495 INFO  org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils  - The derived from fraction jvm overhead memory (156.800mb (164416719 bytes)) is less than its min value 192.000mb (201326592 bytes), min value will be used instead
2020-05-16 08:28:47,184 WARN  org.apache.flink.yarn.YarnClusterDescriptor                   - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
2020-05-16 08:28:47,621 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=1568, slotsPerTaskManager=1}
2020-05-16 08:28:49,579 WARN  org.apache.flink.yarn.YarnClusterDescriptor                   - The file system scheme is 'file'. This indicates that the specified Hadoop configuration path is wrong and the system is using the default Hadoop configuration values.The Flink YARN client needs to store its files in a distributed file system
2020-05-16 08:28:56,822 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Submitting application master application_1589570585722_0007
2020-05-16 08:28:57,435 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Submitted application application_1589570585722_0007
2020-05-16 08:28:57,436 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Waiting for the cluster to be allocated
2020-05-16 08:28:57,464 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Deploying cluster, current state ACCEPTED
2020-05-16 08:29:29,514 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - YARN application has been deployed successfully.
2020-05-16 08:29:29,516 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Found Web Interface ubuntu-vm-1:45993 of application 'application_1589570585722_0007'.
JobManager Web Interface: http://ubuntu-vm-1:45993

Use the yarn command to check the running application:

lin@Ubuntu-VM-1:$ ./bin/yarn application -list
20/05/16 09:51:46 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):1
                Application-Id      Application-Name        Application-Type          User           Queue                   State             Final-State             Progress                        Tracking-URL
application_1589570585722_0007  Flink session cluster           Apache Flink           lin         default                 RUNNING               UNDEFINED                 100%            http://Ubuntu-VM-1:45993

Note that both the yarn-session.sh output and the yarn application -list output contain http://Ubuntu-VM-1:45993. This is the Job Manager's Web UI, which you can open in a browser.

The jps -ml command shows that the process started is:

org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint

Also note that yarn-session.sh does not exit on its own; if you want detached mode, add the -d (detached) option:

./bin/yarn-session.sh -d

When not running detached, yarn-session.sh accepts a stop command on stdin to shut down:

stop
2020-05-16 09:58:54,146 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Deleted Yarn properties file at /tmp/.yarn-properties-lin
2020-05-16 09:58:54,759 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Application application_1589570585722_0007 finished with state FINISHED and final state SUCCEEDED at 1589594332039

In detached mode, you can re-attach with:

./bin/yarn-session.sh -id <appId>

and shut it down with:

echo "stop" | ./bin/yarn-session.sh -id <appId>

At this point no Task Manager has actually been started. In Flink on Yarn mode, the number of Task Managers is determined by the system from the parallelism of the submitted jobs and the configured number of slots per Task Manager, and they are created and stopped dynamically.

As the official documentation puts it:
The example invocation starts a single container for the ApplicationMaster which runs the Job Manager.
The session cluster will automatically allocate additional containers which run the Task Managers when jobs are submitted to the cluster.
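
As a rough back-of-envelope sketch (an approximation, not Flink's exact scheduling logic), the session needs about ceil(parallelism / slots-per-TaskManager) Task Manager containers for a job:

```shell
# Approximate number of TaskManager containers needed for one job:
# ceil(parallelism / slots_per_tm). With -p 5 and taskmanager.numberOfTaskSlots: 1
# (the defaults shown in the logs above), that works out to 5 TaskManagers.
parallelism=5
slots_per_tm=1
tms=$(( (parallelism + slots_per_tm - 1) / slots_per_tm ))
echo "$tms"
```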

So that job submission can find the Yarn Session's application ID, the ID is recorded in a file whose location is configured by:

# flink-conf.yaml
yarn.properties-file.location

If this is not configured, the location falls back to:

System.getProperty("java.io.tmpdir");

which by default is the /tmp directory:

lin@Ubuntu-VM-1:$ cat /tmp/.yarn-properties-lin
#Generated YARN properties file
#Sat May 16 10:02:09 CST 2020
dynamicPropertiesString=
applicationID=application_1589570585722_0008
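
A minimal sketch of what the client does with this file: read the applicationID key. The demo path and file contents below are made up to mirror the real file above:

```shell
# Recreate a properties file like the one above and extract applicationID,
# which is how `flink run` locates the session cluster (demo path, not the real one).
props=/tmp/.yarn-properties-demo
cat > "$props" <<'EOF'
#Generated YARN properties file
dynamicPropertiesString=
applicationID=application_1589570585722_0008
EOF
app_id=$(grep '^applicationID=' "$props" | cut -d'=' -f2)
echo "$app_id"
```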

Submitting a Job

lin@Ubuntu-VM-1:$ ./bin/flink run -p 5 examples/batch/WordCount.jar
2020-05-16 08:03:07,531 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Found Yarn properties file under /tmp/.yarn-properties-lin.
2020-05-16 08:03:07,531 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Found Yarn properties file under /tmp/.yarn-properties-lin.
2020-05-16 08:03:08,080 WARN  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - The configuration directory ('/home/lin/myTest/flink/flink-1.10.0/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2020-05-16 08:03:08,080 WARN  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - The configuration directory ('/home/lin/myTest/flink/flink-1.10.0/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
Executing WordCount example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
2020-05-16 08:03:09,443 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at /0.0.0.0:8032
2020-05-16 08:03:09,775 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2020-05-16 08:03:09,783 WARN  org.apache.flink.yarn.YarnClusterDescriptor                   - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set.The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
2020-05-16 08:03:10,052 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Found Web Interface ubuntu-vm-1:35973 of application 'application_1589570585722_0006'.
Job has been submitted with JobID 6d0be776c392f5408e2d611e5f71a011
Program execution finished
Job with JobID 6d0be776c392f5408e2d611e5f71a011 has finished.
Job Runtime: 27096 ms
Accumulator Results:
- c2f8af07593812e7ae866e6da873aa95 (java.util.ArrayList) [170 elements]

(after,1)
(bare,1)
(coil,1)
......

The run command finds the Yarn Session's application ID from the /tmp/.yarn-properties-lin file and submits the job to that session.

After the job is submitted, the jps -ml command shows one or more additional processes like this:

org.apache.flink.yarn.YarnTaskExecutorRunner

These processes are the Task Managers.

A short while after the program finishes, these Task Managers are gone.

The run command stays in the foreground until the job completes; if you want it to return right after submission, use the -d (detached) option.

Some of the run command's options:

Action "run" compiles and runs a program.

  Syntax: run [OPTIONS] <jar-file> <arguments>
  "run" action options:
     -c,--class <classname>               Class with the program entry point
                                          ("main()" method). Only needed if the
                                          JAR file does not specify the class in
                                          its manifest.

     -C,--classpath <url>                 Adds a URL to each user code
                                          classloader  on all nodes in the cluster. 

     -d,--detached                        If present, runs the job in detached mode

     -n,--allowNonRestoredState           Allow to skip savepoint state that
                                          cannot be restored. You need to allow
                                          this if you removed an operator from
                                          your program that was part of the
                                          program when the savepoint was
                                          triggered.

     -p,--parallelism <parallelism>       The parallelism with which to run the
                                          program. Optional flag to override the
                                          default value specified in the configuration.

     -py,--python <pythonFile>            Python script with the program entry
                                          point. The dependent resources can be
                                          configured with the `--pyFiles` option.

     -pyarch,--pyArchives <arg>           Add python archive files for job. 

     -pyfs,--pyFiles <pythonFiles>        Attach custom python files for job.

     -pym,--pyModule <pythonModule>       Python module with the program entry point.

     -pyreq,--pyRequirements <arg>        Specify a requirements.txt file

     -s,--fromSavepoint <savepointPath>   Path to a savepoint to restore the job
                                          from (for example hdfs:///flink/savepoint-1537).

     -sae,--shutdownOnAttachedExit        If the job is submitted in attached
                                          mode, perform a best-effort cluster
                                          shutdown when the CLI is terminated
                                          abruptly, e.g., in response to a user
                                          interrupt, such as typing Ctrl + C.

  Options for yarn-cluster mode:
     -d,--detached                        If present, runs the job in detached mode
     -m,--jobmanager <arg>                Address of the JobManager
     -yat,--yarnapplicationType <arg>     Set a custom application type for the application on YARN
     -yD <property=value>                 use value for given property
     -yh,--yarnhelp                       Help for the Yarn session CLI.
     -yid,--yarnapplicationId <arg>       Attach to running YARN session
     -yj,--yarnjar <arg>                  Path to Flink jar file
     -yjm,--yarnjobManagerMemory <arg>    Memory for JobManager Container with optional unit (default: MB)
     -ynl,--yarnnodeLabel <arg>           Specify YARN node label for the YARN application
     -ynm,--yarnname <arg>                Set a custom name for the application on YARN
     -yq,--yarnquery                      Display available YARN resources (memory, cores)
     -yqu,--yarnqueue <arg>               Specify YARN queue.
     -ys,--yarnslots <arg>                Number of slots per TaskManager
     -yt,--yarnship <arg>                 Ship files in the specified directory (t for transfer)
     -ytm,--yarntaskManagerMemory <arg>   Memory per TaskManager Container with optional unit (default: MB)
     -yz,--yarnzookeeperNamespace <arg>   Namespace to create the Zookeeper sub-paths for high availability mode
     -z,--zookeeperNamespace <arg>        Namespace to create the Zookeeper sub-paths for high availability mode

  Options for executor mode:
     -D <property=value>   Generic configuration options for
                           execution/deployment and for the configured executor.
                           The available options can be found at
                           https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html
     -e,--executor <arg>   The name of the executor to be used for executing the
                           given job, which is equivalent to the
                           "execution.target" config option. The currently
                           available executors are: "remote", "local",
                           "kubernetes-session", "yarn-per-job", "yarn-session".

  Options for default mode:
     -m,--jobmanager <arg>           Address of the JobManager (master) to which to connect.
     -z,--zookeeperNamespace <arg>   Namespace to create the Zookeeper sub-paths for high availability mode

If the program throws an exception, it can be seen in the Web UI, but more detailed logs seem hard to retrieve, because the Task Managers have already exited.
In non-detached mode, the flink run command shows all the logs.

Single Job

There is no need to start a Flink cluster first; just submit the job directly:

./bin/flink run \
            -m yarn-cluster \
            -ynm "Word Counter Test" \
            examples/batch/WordCount.jar \
            --input /home/lin/myTest/flink/flink-1.10.0/README.txt

The output is:

lin@Ubuntu-VM-1:$ ./bin/flink run -m yarn-cluster -ynm "Word Counter Test" examples/batch/WordCount.jar --input /home/lin/myTest/flink/flink-1.10.0/README.txt
2020-05-16 16:23:49,429 WARN  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - The configuration directory ('/home/lin/myTest/flink/flink-1.10.0/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2020-05-16 16:23:49,429 WARN  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - The configuration directory ('/home/lin/myTest/flink/flink-1.10.0/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
Printing result to stdout. Use --output to specify output path.
2020-05-16 16:23:50,316 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at /0.0.0.0:8032
2020-05-16 16:23:50,680 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2020-05-16 16:23:51,042 WARN  org.apache.flink.yarn.YarnClusterDescriptor                   - Neither the HADOOP_CONF_DIR nor
the YARN_CONF_DIR environment variable is set. The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
2020-05-16 16:23:51,159 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=1568, slotsPerTaskManager=1}
2020-05-16 16:23:51,534 WARN  org.apache.flink.yarn.YarnClusterDescriptor                   - The file system scheme is 'file'. This indicates that the specified Hadoop configuration path is wrong and the system is using the default Hadoop configuration values.The Flink YARN client needs to store its files in a distributed file system
2020-05-16 16:23:52,847 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Submitting application master application_1589570585722_0014
2020-05-16 16:23:52,957 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Submitted application application_1589570585722_0014
2020-05-16 16:23:52,958 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Waiting for the cluster to be allocated
2020-05-16 16:23:52,964 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Deploying cluster, current state ACCEPTED
2020-05-16 16:24:06,594 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - YARN application has been deployed successfully.
2020-05-16 16:24:06,596 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Found Web Interface ubuntu-vm-1:33437 of application 'application_1589570585722_0014'.
Job has been submitted with JobID f0c62fc833814a2dfc59eecd360bc2a8
Program execution finished
Job with JobID f0c62fc833814a2dfc59eecd360bc2a8 has finished.
Job Runtime: 16036 ms
Accumulator Results:
- 8a1ab0e55aed34e646fc443e00ab961d (java.util.ArrayList) [111 elements]

(1,1)
(13,1)
(5d002,1)
(740,1)
(about,1)
(account,1)
(administration,1)
......

Check the YARN application:

lin@Ubuntu-VM-1:$ ./bin/yarn application -list
20/05/16 16:21:16 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):1
                Application-Id      Application-Name        Application-Type          User           Queue                   State             Final-State             Progress                        Tracking-URL
application_1589570585722_0014     Word Counter Test            Apache Flink           lin         default                 RUNNING               UNDEFINED                 100%            http://Ubuntu-VM-1:36203

View the logs:

./bin/yarn logs -applicationId application_1589570585722_0014

However, log output that the job printed via System.out.println does not seem to show up there.

In non-detached mode, the flink run command shows all the logs.

Recovery Behaviour

Flink’s YARN client has the following configuration parameters to control how to behave in case of container failures. These parameters can be set either from the conf/flink-conf.yaml or when starting the YARN session, using -D parameters.

  • yarn.application-attempts: The number of ApplicationMaster (+ its TaskManager containers) attempts. If this value is set to 1 (default), the entire YARN session will fail when the Application master fails. Higher values specify the number of restarts of the ApplicationMaster by YARN.

Setup for application priority on YARN

Flink’s YARN client has the following configuration parameters to setup application priority. These parameters can be set either from the conf/flink-conf.yaml or when starting the YARN session, using -D parameters.

  • yarn.application.priority: A non-negative integer indicating the priority for submitting a Flink YARN application. It will only take effect if YARN priority scheduling setting is enabled. Larger integer corresponds with higher priority. If priority is negative or set to ‘-1’(default), Flink will unset yarn priority setting and use cluster default priority. Please refer to YARN’s official documentation for specific settings required to enable priority scheduling for the targeted YARN version.
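
Both parameters can be passed with -D when starting the session; the values here are arbitrary examples:

```shell
# Survive up to 4 ApplicationMaster failures and raise the YARN priority
# (the priority only takes effect if the cluster has priority scheduling enabled)
./bin/yarn-session.sh -d \
  -D yarn.application-attempts=4 \
  -D yarn.application.priority=10
```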

Some YARN clusters use firewalls for controlling the network traffic between the cluster and the rest of the network. In those setups, Flink jobs can only be submitted to a YARN session from within the cluster’s network (behind the firewall). If this is not feasible for production use, Flink allows to configure a port range for its REST endpoint, used for the client-cluster communication. With this range configured, users can also submit jobs to Flink crossing the firewall.

The configuration parameter for specifying the REST endpoint port is the following:

  • rest.bind-port
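
For example, in conf/flink-conf.yaml the REST endpoint can be pinned to a range the firewall allows (the range below is an arbitrary example):

```yaml
# conf/flink-conf.yaml
# Bind the REST endpoint to a port range opened in the firewall
rest.bind-port: 50100-50200
```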


The YARN client needs to access the Hadoop configuration to connect to the YARN resource manager and HDFS. It determines the Hadoop configuration using the following strategy:

  • Test if YARN_CONF_DIR, HADOOP_CONF_DIR or HADOOP_CONF_PATH are set (in that order). If one of these variables is set, it is used to read the configuration.
  • If the above strategy fails (this should not be the case in a correct YARN setup), the client is using the
    HADOOP_HOME environment variable. If it is set, the client tries to access HADOOP_HOME/etc/hadoop (Hadoop 2) and HADOOP_HOME/conf (Hadoop 1).
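
In practice that means exporting one of these variables before starting the client. A sketch, where the path is a placeholder (a real setup would point at the cluster's actual Hadoop conf dir, e.g. /etc/hadoop/conf):

```shell
# Point the Flink YARN client at the Hadoop/YARN configuration.
# /tmp/demo-hadoop-conf is a placeholder directory created just for illustration.
export HADOOP_CONF_DIR=/tmp/demo-hadoop-conf
mkdir -p "$HADOOP_CONF_DIR"
# ./bin/yarn-session.sh -d   # would now read core-site.xml / yarn-site.xml from there
echo "$HADOOP_CONF_DIR"
```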

When starting a new Flink YARN session, the client first checks if the requested resources (memory and vcores for the ApplicationMaster) are available. After that, it uploads a jar that contains Flink and the configuration to HDFS (step 1).

The next step of the client is to request (step 2) a YARN container to start the ApplicationMaster (step 3). Since the client registered the configuration and jar-file as a resource for the container, the NodeManager of YARN running on that particular machine will take care of preparing the container (e.g. downloading the files). Once that has finished, the ApplicationMaster (AM) is started.

The JobManager and AM are running in the same container. Once they successfully started, the AM knows the address of the JobManager (its own host). It is generating a new Flink configuration file for the TaskManagers (so that they can connect to the JobManager). The file is also uploaded to HDFS. Additionally, the AM container is also serving Flink’s web interface. All ports the YARN code is allocating are ephemeral ports. This allows users to execute multiple Flink YARN sessions in parallel.

After that, the AM starts allocating the containers for Flink’s TaskManagers, which will download the jar file and the modified configuration from the HDFS. Once these steps are completed, Flink is set up and ready to accept Jobs.



posted @ 2020-05-17 16:14  moon~light