Getting Started with Flink (Part 2)

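This continues the previous post, Getting Started with Flink (Part 1): WordCount. Flink has three run modes: Standalone, YARN, and K8S. I won't go over all of them here; this post is mainly about Flink on YARN.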

A quick, no-frills walkthrough, step by step:

1. With the HDFS and YARN clusters both up and running, start a Flink yarn-session

hadoop@hadoop1:/opt/flink-1.10.1/bin$ ./yarn-session.sh -n 2 -s 2 -jm 1024 -tm 1024 -nm test -d

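Here is a brief rundown of the options used above:

-n(--container):   Number of TaskManagers.
-s(--slots):     Number of slots per TaskManager. By default one slot corresponds to one core, and the default number of slots per TaskManager is 1. Sometimes it is worth running a few extra TaskManagers for redundancy.
-jm:          JobManager memory (in MB).
-tm:          Memory per TaskManager (in MB).
-nm:          YARN application name (the name shown in the YARN UI).
-d:           Run in the background.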
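
To make the mapping between options and values easier to read, here is a minimal sketch that names each value before assembling the command. The variable names and the session_cmd helper are illustrative assumptions, not part of Flink; the values are the ones used in this post. The script only prints the command so it can be reviewed first.

```shell
#!/usr/bin/env bash
# Sketch: name each yarn-session.sh option before launching.
# Variable names and session_cmd are illustrative, not part of Flink.
TASK_MANAGERS=2      # -n  : number of TaskManagers
SLOTS_PER_TM=2       # -s  : slots per TaskManager
JM_MEMORY_MB=1024    # -jm : JobManager memory in MB
TM_MEMORY_MB=1024    # -tm : memory per TaskManager in MB
APP_NAME=test        # -nm : application name shown in the YARN UI

# Assemble the command line; -d runs the session in the background.
session_cmd() {
  echo "./yarn-session.sh -n ${TASK_MANAGERS} -s ${SLOTS_PER_TM}" \
       "-jm ${JM_MEMORY_MB} -tm ${TM_MEMORY_MB} -nm ${APP_NAME} -d"
}

# Print the command for review; run it from the Flink bin directory once it looks right.
session_cmd
```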

 

2. Submit the Flink job. This is no different from Standalone mode: you just launch it. Using the command line as an example:

hadoop@hadoop1:/opt/flink-1.10.1$ ./bin/flink run -c WordCount_Streaming \
-p 2 \
flinktestlearn-1.0-SNAPSHOT-jar-with-dependencies.jar \
--host localhost --port 8765

2020-09-02 16:17:21,853 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - Found Yarn properties file under /tmp/.yarn-properties-hadoop.
2020-09-02 16:17:21,853 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - Found Yarn properties file under /tmp/.yarn-properties-hadoop.
2020-09-02 16:17:22,478 WARN org.apache.flink.yarn.cli.FlinkYarnSessionCli - The configuration directory ('/opt/flink-1.10.1/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2020-09-02 16:17:22,478 WARN org.apache.flink.yarn.cli.FlinkYarnSessionCli - The configuration directory ('/opt/flink-1.10.1/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2020-09-02 16:17:26,304 INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoop1/192.168.6.21:8032
2020-09-02 16:17:26,507 INFO org.apache.flink.yarn.YarnClusterDescriptor - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2020-09-02 16:17:26,517 WARN org.apache.flink.yarn.YarnClusterDescriptor - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set.The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
2020-09-02 16:17:26,608 INFO org.apache.flink.yarn.YarnClusterDescriptor - Found Web Interface hadoop2:39620 of application 'application_1596791033658_0003'.
Job has been submitted with JobID 909ef10669c8ed6fea50fabd947481ac
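
If you would rather not pick the application ID out of the log by eye, a small filter can do it. This is a sketch that assumes the ID always has the form application_<digits>_<digits>, as in the log above; extract_app_id is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: pull the first YARN application ID out of text read from stdin.
# extract_app_id is a hypothetical helper, not a Flink or Hadoop command.
extract_app_id() {
  grep -oE 'application_[0-9]+_[0-9]+' | head -n 1
}

# Sample line copied from the submission log above.
sample="2020-09-02 16:17:26,608 INFO org.apache.flink.yarn.YarnClusterDescriptor - Found Web Interface hadoop2:39620 of application 'application_1596791033658_0003'."

echo "$sample" | extract_app_id
```

In practice the input would be the saved output of the flink run command rather than a hard-coded sample line.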

 

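If nothing goes wrong, the log output looks like the above. The YARN application ID is the application_1596791033658_0003 in the "Found Web Interface" line. Since this is a WordCount, there has to be somewhere to see the results, so read on:

Open ResourceManager-host:8088, find your application ID, and click into it, as shown in the figure below.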

 

 

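Next, click the Attempt ID. You will see two container IDs, as in the figure below: one is the JobManager and the other is the TaskManager. The TaskManager is where the data is actually written out, so find its standard output log and the results are there.

That is the simplest WordCount demo with Flink on YARN. As for problems along the way, of course there were some. Read on: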
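
As a command-line alternative to clicking through the web UI, the container logs can usually be fetched with the yarn logs command. The sketch below only builds the command string; logs_cmd is a hypothetical helper. It assumes log aggregation is enabled on the cluster, and on some Hadoop versions the logs are only available after the application has finished.

```shell
#!/usr/bin/env bash
# Sketch: build the `yarn logs` command for one application.
# logs_cmd is a hypothetical helper, not part of Hadoop.
logs_cmd() {
  local app_id="$1"
  echo "yarn logs -applicationId ${app_id}"
}

# Application ID taken from the submission log earlier in this post.
logs_cmd application_1596791033658_0003

# To actually fetch the logs on a cluster node (assumes log aggregation is on):
#   yarn logs -applicationId application_1596791033658_0003 | less
```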


 

The following problem came up when starting the YARN session:

1. Error: A JNI error has occurred, please check your installation and try again

hadoop@hadoop1:/opt/flink-1.10.1/bin$ ./yarn-session.sh -n 2 -s 2 -jm 1024 -tm 1024 -nm test -d
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/yarn/exceptions/YarnException
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
    at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
    at java.lang.Class.getMethod0(Class.java:3018)
    at java.lang.Class.getMethod(Class.java:1784)
    at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
    at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.yarn.exceptions.YarnException
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 7 more
hadoop@hadoop1:/opt/flink-1.10.1/bin$ 

 

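Click here (extraction code pb4y) to download the package, drop it into $FLINK_HOME/lib, and restart the yarn-session; that fixes it. As for why, I have not looked into it yet.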
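
A likely explanation, going by the stack trace: the class org.apache.hadoop.yarn.exceptions.YarnException could not be found, which suggests that the Hadoop jars were not on Flink's classpath, and the downloaded package presumably supplies them. An alternative described in the Flink documentation is to expose the cluster's own Hadoop jars through the HADOOP_CLASSPATH environment variable. The submission log earlier also warns that neither HADOOP_CONF_DIR nor YARN_CONF_DIR is set. The sketch below checks both variables; missing_hadoop_env is a hypothetical helper, and the /etc/hadoop/conf path is an assumption to be adjusted to your installation.

```shell
#!/usr/bin/env bash
# Sketch: list the Hadoop-related variables that are empty or unset.
# missing_hadoop_env is a hypothetical helper, not part of Flink or Hadoop.
missing_hadoop_env() {
  [ -n "${HADOOP_CLASSPATH:-}" ] || echo "HADOOP_CLASSPATH"
  [ -n "${HADOOP_CONF_DIR:-}" ] || echo "HADOOP_CONF_DIR"
}

# Typical settings before running yarn-session.sh
# (the conf path is an assumption; adjust it to your installation):
#   export HADOOP_CLASSPATH="$(hadoop classpath)"
#   export HADOOP_CONF_DIR=/etc/hadoop/conf

# Print the names of any variables that still need to be set.
missing_hadoop_env
```

If the function prints nothing, both variables are set in the current shell.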

posted @ 2020-09-02 16:43  wen1995