Flink on YARN Mode
I. Installing Flink
1. Download
1.1. Download the Flink package
Official archive: https://archive.apache.org/dist/flink/
This guide installs flink-1.8.0-bin-scala_2.11.tgz. Because this Flink release does not yet ship a shaded dependency for Hadoop 2.9, we pair it with the stable, compatible Hadoop 2.6 shaded jar: flink-shaded-hadoop-2-uber-2.6.5-7.0.jar.
Download path for flink-1.8.0-bin-scala_2.11.tgz:
https://archive.apache.org/dist/flink/flink-1.8.0/flink-1.8.0-bin-scala_2.11.tgz
1.2. Download the Hadoop dependency jar
Download flink-shaded-hadoop-2-uber-2.6.5-7.0.jar from the official site
and copy it into the ${flink_home}/lib/ directory.
Download page: https://flink.apache.org/downloads.html
2. Extract
Extract and install under /opt/module:
$ cd /opt/module
$ tar -zxvf flink-1.8.0-bin-scala_2.11.tgz
3. Configure environment variables
$ vim /etc/profile
export FLINK_HOME=/opt/module/flink-1.8.0
export PATH=$FLINK_HOME/bin:$PATH
$ source /etc/profile
$ flink   # if this runs successfully, Flink is installed correctly
Note: for a non-root user, put these exports in ~/.bash_profile instead.
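The note above can be folded into one sketch: pick the profile file based on the current user, then add the two exports (install path taken from step 2; bash is assumed):

```shell
# Pick the profile file to edit: /etc/profile for root, ~/.bash_profile otherwise
if [ "$(id -u)" -eq 0 ]; then
    PROFILE=/etc/profile
else
    PROFILE="$HOME/.bash_profile"
fi
# The two exports from step 3; after appending them to "$PROFILE",
# run `source "$PROFILE"` so the current shell picks them up
export FLINK_HOME=/opt/module/flink-1.8.0
export PATH="$FLINK_HOME/bin:$PATH"
```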
4. Add the Hadoop dependency jar
Copy flink-shaded-hadoop-2-uber-2.6.5-7.0.jar into the /opt/module/flink-1.8.0/lib directory.
5. Modify yarn-site.xml
$ sudo vim /opt/module/hadoop-2.7.6/etc/hadoop/yarn-site.xml
5.1. Configure the maximum number of ApplicationMaster restart attempts
<property>
<name>yarn.resourcemanager.am.max-attempts</name>
<value>4</value>
<description>The maximum number of application master execution attempts</description>
</property>
6. Modify flink-conf.yaml
$ sudo vim /opt/module/flink-1.8.0/conf/flink-conf.yaml
# JobManager / TaskManager settings
jobmanager.rpc.address: localhost
jobmanager.rpc.port: 6123
jobmanager.heap.mb: 256
taskmanager.heap.mb: 512
taskmanager.numberOfTaskSlots: 1
# Whether the TaskManager's managed memory should be preallocated at startup.
# Default is no preallocation, so an idle Flink cluster does not hold on to cluster resources.
taskmanager.memory.preallocate: false
parallelism.default: 1
jobmanager.web.port: 33069   # default port is 8081
# Maximum number of failed containers the YARN ApplicationMaster accepts before the YARN session fails.
# Must be set; note the space after ":"
yarn.maximum-failed-containers: 99999
# Akka settings
# Must be set; note the space after ":"
akka.watch.heartbeat.interval: 5 s
akka.watch.heartbeat.pause: 20 s
akka.ask.timeout: 60 s
akka.framesize: 20971520b
# High-availability settings
# Uncomment to enable
high-availability: zookeeper
## Fill in according to your installed ZooKeeper quorum
#high-availability.zookeeper.quorum: 10.141.61.226:2181,10.141.53.244:2181,10.141.18.219:2181
high-availability.zookeeper.quorum: localhost:2181
high-availability.zookeeper.path.root: /data/flink/flink-on-yarn
## HDFS directory where HA metadata is stored; adjust for your HDFS setup. hdfs://mycluster/ is taken from the fs.defaultFS value.
high-availability.zookeeper.storageDir: hdfs://mycluster/flink/recovery
## Number of ApplicationMaster attempts; may not exceed yarn.resourcemanager.am.max-attempts in yarn-site.xml. Note the ": " syntax.
yarn.application-attempts: 10
# Checkpoint settings (fault tolerance)
## Supported state backends: memory, fs, rocksdb
state.backend: rocksdb
## HDFS directory for checkpoint data; adjust for your HDFS setup
state.backend.fs.checkpointdir: hdfs://mycluster/flink/checkpoint
## HDFS directory for externalized checkpoint metadata
state.checkpoints.dir: hdfs://mycluster/flink/savepoint
# Memory settings
env.java.opts: -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+AlwaysPreTouch -server -XX:+HeapDumpOnOutOfMemoryError
yarn.heap-cutoff-ratio: 0.2
taskmanager.memory.off-heap: true
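Several notes above warn that every flink-conf.yaml key needs a space after the ":". A minimal sketch of a format check before restarting the cluster (check_conf is a hypothetical helper, not part of Flink):

```shell
# Returns 0 if every non-comment, non-blank line of the given file is
# "key: value" with a space after the colon, 1 otherwise
check_conf() {
    if grep -vE '^[[:space:]]*(#|$)' "$1" | grep -qvE '^[A-Za-z0-9._-]+: '; then
        return 1
    fi
    return 0
}
```

Run against the file above, this would flag a line written as key=value or key:value (no space), which Flink would not read as intended.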
7. Starting the services Flink on YARN depends on
Start the ZooKeeper cluster:
$ /opt/module/zookeeper-3.4.6/bin/zkServer.sh start
## To restart ZooKeeper instead, use:
$ /opt/module/zookeeper-3.4.6/bin/zkServer.sh restart
Start the HDFS cluster:
$ start-dfs.sh
Verify that Hadoop started successfully via the NameNode web UIs:
http://182.61.*.60:50070/
http://106.12.*.89:50070/
Start the YARN cluster:
$ start-yarn.sh
Check the ResourceManager HA service states:
[admin@145 sbin]$ yarn rmadmin -getServiceState rm1
standby
[admin@145 sbin]$ yarn rmadmin -getServiceState rm2
active
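With ResourceManager HA, only one RM is active at a time. A small sketch that picks the active id out of output shaped like the transcript above (active_rm is a hypothetical helper):

```shell
# stdin: lines of "<rm-id> <state>", e.g. built by looping
#   yarn rmadmin -getServiceState <rm-id>
# prints the id whose state is "active"
active_rm() {
    awk '$2 == "active" { print $1 }'
}
```

Usage against a live cluster might look like: `for id in rm1 rm2; do echo "$id $(yarn rmadmin -getServiceState $id)"; done | active_rm`.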
http://106.12.241.145:33069
$ sudo vim /opt/module/hadoop-2.7.6/etc/hadoop/yarn-site.xml
<!-- External web UI address of the ResourceManager; browse to it to view cluster information. -->
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>node145:33069</value>
</property>
II. Submitting a Job in yarn-session Mode
1. Create a session
Create a yarn-session:
$ yarn-session.sh -nm test -n 2 -tm 1024 -s 2
# -nm  name of the YARN application
# -n   number of TaskManagers (containers)
# -tm  memory per TaskManager, in MB
# -s   number of slots per TaskManager
The yarn-session metadata is written to /tmp/.yarn-properties-hadoop.
This mechanism is awkward (you otherwise have to look up the application id by hand).
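Rather than looking up the application id by hand, it can be read back out of the properties file just mentioned. A sketch, assuming the file stores the id under an applicationID= key (an assumption about that file's format):

```shell
# Print the YARN application id recorded in a .yarn-properties-<user> file
session_app_id() {
    # assumption: the file contains a line "applicationID=application_..."
    grep '^applicationID=' "$1" | cut -d= -f2
}
# e.g. session_app_id /tmp/.yarn-properties-hadoop
```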
Issue 1:
The configuration directory ('/opt/module/flink-1.8.0/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
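One way to resolve issue 1, following the error message's own suggestion: move the Logback files out of the conf directory so only the Log4j ones remain (a sketch; renaming the files would work equally well):

```shell
# Move any logback*.xml out of the given conf directory into a side folder
fix_logging_conflict() {
    dir=$1
    mkdir -p "$dir/logback-disabled"
    for f in "$dir"/logback*.xml; do
        if [ -e "$f" ]; then
            mv "$f" "$dir/logback-disabled/"
        fi
    done
}
# e.g. fix_logging_conflict /opt/module/flink-1.8.0/conf
```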
https://www.cnblogs.com/frankdeng/p/9047698.html
https://blog.csdn.net/magic_kid_2010/article/details/97004746?depth_1-utm_source=distribute.pc_relevant.none-task&utm_source=distribute.pc_relevant.none-task
ZooKeeper installation tutorial:
https://www.cnblogs.com/linjiqin/p/8407084.html