PyFlink: Setting Up a Basic Environment and Running Jobs
Flink #pyflink #deployment
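Setting up a local development environment
Prerequisites
- Java 8 or Java 11 is installed and working
- Python or Miniconda (recommended) is available locally; the steps below use conda to manage virtual environments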
java -version
openjdk version "11" 2018-09-25
OpenJDK Runtime Environment 18.9 (build 11+28)
OpenJDK 64-Bit Server VM 18.9 (build 11+28, mixed mode)
- Python environment
Install Python on the host, or use a conda virtual environment
python --version
Python 3.10.8
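The Java prerequisite above can also be checked from a script. The following is a minimal sketch, not part of the original setup: the helper names `parse_java_major` and `java_is_supported` are hypothetical, and the supported versions (8 and 11) are taken from the prerequisites listed above.

```python
import re
import subprocess


def parse_java_major(version_output: str) -> int:
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError(f"unrecognized java -version output: {version_output!r}")
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x, for example "1.8.0_292".
    if major == 1 and match.group(2) is not None:
        major = int(match.group(2))
    return major


def java_is_supported(version_output: str, supported=(8, 11)) -> bool:
    """Assumption: only the versions named in the prerequisites are accepted."""
    return parse_java_major(version_output) in supported


def local_java_is_supported() -> bool:
    """Run `java -version` locally; the JVM prints the version to stderr."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return java_is_supported(result.stderr)
```

Calling `local_java_is_supported()` before installing apache-flink gives an early failure if the wrong JDK is first on the PATH.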
Installing apache-flink
# Activate conda
source ~/Documents/install/miniconda/bin/activate
# Create the pyflink virtual environment
conda create --name py310_pyflink171_venv -y -q python=3.10.8
conda activate py310_pyflink171_venv
pip install --upgrade pip -i https://mirrors.aliyun.com/pypi/simple
pip install apache-flink==1.17.1 --no-cache-dir -i https://mirrors.aliyun.com/pypi/simple --use-pep517
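After the pip install above, it can be useful to confirm that the pinned version actually landed in the active environment. This is a small sketch using only the standard library; the function names `installed_version` and `check_pin` are hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pin(package: str, expected: str) -> bool:
    """True only when the package is installed at exactly the expected version."""
    return installed_version(package) == expected
```

Inside the py310_pyflink171_venv environment, `check_pin("apache-flink", "1.17.1")` should return True if the installation succeeded.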
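Verifying the environment
Open https://nightlies.apache.org/flink/flink-docs-release-1.16/api/python/examples/table/word_count.html and copy the word_count code into a local file named word_count.py. (The link points to the 1.16 documentation, while the steps above install 1.17.1.)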
python word_count.py
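Console output like the following indicates success. The WARNING lines about illegal reflective access are emitted when running on Java 11 and do not indicate a failure: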
Executing word_count example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.flink.api.java.ClosureCleaner (file:/Users/faron/Documents/others/envs/py368_pyflink1161_test_venv/lib/python3.6/site-packages/pyflink/lib/flink-dist-1.16.1.jar) to field java.lang.String.value
WARNING: Please consider reporting this to the maintainers of org.apache.flink.api.java.ClosureCleaner
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
+I[To, 1]
+I[be,, 1]
+I[or, 1]
+I[not, 1]
+I[to, 1]
+I[be,--that, 1]
+I[is, 1]
+I[the, 1]
+I[question:--, 1]
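Each `+I[...]` line in the output above is a changelog row, where `+I` marks an insert. To check the result programmatically rather than by eye, the rows can be parsed as in the sketch below. The names `ChangelogRow`, `parse_row` and `word_counts` are hypothetical, and the pattern assumes the two-column (word, count) layout shown above.

```python
import re
from typing import Dict, Iterable, NamedTuple, Optional


class ChangelogRow(NamedTuple):
    kind: str   # +I insert, -U update-before, +U update-after, -D delete
    word: str
    count: int


# The word itself may contain commas (for example "be,"), so the greedy group
# captures everything up to the last ", <number>]".
ROW_PATTERN = re.compile(r"^(\+I|-U|\+U|-D)\[(.*), (-?\d+)\]$")


def parse_row(line: str) -> Optional[ChangelogRow]:
    """Parse one printed row; return None for warnings and other log lines."""
    match = ROW_PATTERN.match(line.strip())
    if match is None:
        return None
    return ChangelogRow(match.group(1), match.group(2), int(match.group(3)))


def word_counts(lines: Iterable[str]) -> Dict[str, int]:
    """Collect the counts from insert and update-after rows."""
    counts: Dict[str, int] = {}
    for line in lines:
        row = parse_row(line)
        if row is not None and row.kind in ("+I", "+U"):
            counts[row.word] = row.count
    return counts
```

Feeding the captured console output to `word_counts` ignores the JVM warnings and returns a dictionary that can be compared against the expected words.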
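Submitting jobs to a server
All commands below specify a zipped virtual environment. If the servers in the Flink cluster already have Python and apache-flink installed, the zipped virtual environment does not need to be specified.
- Packaging the runtime environment
# Go to the envs directory under the Miniconda installation path, or to the directory where the virtual environment is installed
# Package the py310_pyflink171_venv virtual environment
cd ~/Documents/install/miniconda/envs
zip -r py310_pyflink171_venv.zip py310_pyflink171_venv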
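The same packaging step can be scripted. The sketch below uses `shutil.make_archive` from the standard library and keeps the environment directory as the top-level entry of the archive, which is the layout the submit commands rely on. The function name `zip_virtualenv` is hypothetical, and this is an alternative to the `zip -r` command above rather than the method the steps here were tested with.

```python
import shutil
from pathlib import Path


def zip_virtualenv(env_dir, output_dir) -> Path:
    """Zip env_dir so that the archive contains <env name>/... at its top level."""
    env_dir = Path(env_dir).resolve()
    output_dir = Path(output_dir).resolve()
    output_dir.mkdir(parents=True, exist_ok=True)
    archive = shutil.make_archive(
        base_name=str(output_dir / env_dir.name),  # ".zip" is appended automatically
        format="zip",
        root_dir=str(env_dir.parent),  # paths in the archive are relative to this
        base_dir=env_dir.name,         # only this directory is archived
    )
    return Path(archive)
```

For example, `zip_virtualenv("~/Documents/install/miniconda/envs/py310_pyflink171_venv", "/workplace")` would need the `~` expanded first (for instance with `Path.expanduser`), since this sketch does not expand it.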
Submitting to a JobManager
- Submitting a single file
./flink run \
  --jobmanager localhost:8081 \
  -pyarch file:///workplace/py310_pyflink171_venv.zip \
  -pyexec py310_pyflink171_venv.zip/py310_pyflink171_venv/bin/python3 \
  -pyclientexec py310_pyflink171_venv.zip/py310_pyflink171_venv/bin/python3 \
  -py /workplace/src/word_count.py
- Submitting a directory and specifying the entry module
./flink run \
  --jobmanager localhost:8081 \
  -pyarch file:///workplace/py310_pyflink171_venv.zip \
  -pyexec py310_pyflink171_venv.zip/py310_pyflink171_venv/bin/python3 \
  -pyclientexec py310_pyflink171_venv.zip/py310_pyflink171_venv/bin/python3 \
  -pyfs /workplace/src \
  -pym word_count
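The two submit commands above differ only in how the entry point is given (`-py` for a single file, `-pyfs` plus `-pym` for a directory with an entry module). When submissions are scripted, the argument list can be assembled as in the sketch below. The function `build_flink_run_args` and its parameters are hypothetical; only flags that appear in the commands above are used.

```python
from typing import List, Optional


def build_flink_run_args(
    archive_uri: str,
    env_name: str,
    jobmanager: Optional[str] = None,
    script: Optional[str] = None,
    pyfs: Optional[str] = None,
    module: Optional[str] = None,
    python_rel: str = "bin/python3",
) -> List[str]:
    """Build the `flink run` argument list for a zipped virtual environment."""
    if (script is None) == (module is None):
        raise ValueError("pass exactly one of script or module")
    # Interpreter path is <archive file name>/<top-level directory>/<python>.
    archive_name = archive_uri.rstrip("/").rsplit("/", 1)[-1]
    interpreter = f"{archive_name}/{env_name}/{python_rel}"
    args = ["./flink", "run"]
    if jobmanager is not None:
        args += ["--jobmanager", jobmanager]
    args += ["-pyarch", archive_uri, "-pyexec", interpreter, "-pyclientexec", interpreter]
    if script is not None:
        args += ["-py", script]
    else:
        if pyfs is not None:
            args += ["-pyfs", pyfs]
        args += ["-pym", module]
    return args
```

The resulting list can be passed to `subprocess.run(args, check=True)` from Flink's bin directory.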
Submitting to a YARN cluster
- Submitting and running
  - Python virtual environment on the local filesystem
./flink run -m yarn-cluster \
  -pyarch file:///workplace/py310_pyflink171_venv.zip \
  -pyexec py310_pyflink171_venv.zip/py310_pyflink171_venv/bin/python3 \
  -pyclientexec py310_pyflink171_venv.zip/py310_pyflink171_venv/bin/python3 \
  -py word_count.py
  - Python virtual environment on HDFS
./flink run -m yarn-cluster \
  -pyarch hdfs://dae-ns/py_env/py310_pyflink171_venv.zip \
  -pyexec py310_pyflink171_venv.zip/py310_pyflink171_venv/bin/python3 \
  -pyclientexec py310_pyflink171_venv.zip/py310_pyflink171_venv/bin/python3 \
  -py word_count.py
  - With a directory (src)
./bin/flink run-application -t yarn-application \
  -Dyarn.application.name=wordcount \
  -Dyarn.ship-files=/workplace/src \
  -pyarch shipfiles/py310_pyflink171_venv.zip \
  -pyclientexec py310_pyflink171_venv.zip/py310_pyflink171_venv/bin/python3 \
  -pyexec py310_pyflink171_venv.zip/py310_pyflink171_venv/bin/python3 \
  -pyfs src \
  -pym word_count
In application mode, the relative paths passed to -pyarch and -pyfs refer to the directory shipped with -Dyarn.ship-files. Here that directory is src, so the shipfiles/ prefix on -pyarch probably needs to be src/, with the zip placed inside /workplace/src.
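Notes
- When packaging a virtual environment, create it with conda or with virtualenv --always-copy. Both copy files into the environment instead of symlinking them, so the packaged environment is more complete.
- Path to the submitted virtual environment: py310_pyflink171_venv.zip/py310_pyflink171_venv. Note that this path has two levels: the archive file name, followed by the top-level directory inside the archive.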
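The two-level path in the notes above is easy to get wrong, so it can help to check an archive before submitting it. The sketch below derives the value for `-pyexec` from the archive name and confirms that the interpreter exists under the expected top-level directory. The function names are hypothetical, and `bin/python3` is assumed as the interpreter location, as in the commands above.

```python
import zipfile
from pathlib import Path


def interpreter_path(archive_path, env_name: str, python_rel: str = "bin/python3") -> str:
    """Return the two-level path: <archive file name>/<env directory>/<python>."""
    return f"{Path(archive_path).name}/{env_name}/{python_rel}"


def archive_contains_interpreter(
    archive_path, env_name: str, python_rel: str = "bin/python3"
) -> bool:
    """Check that the zip has the interpreter under its top-level env directory."""
    with zipfile.ZipFile(archive_path) as zf:
        return f"{env_name}/{python_rel}" in zf.namelist()
```

If `archive_contains_interpreter` returns False, the environment was most likely zipped from inside its own directory, which drops the top-level folder and makes the `-pyexec` path invalid.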
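References
Flink learning site: deployment modes for PyFlink jobs
Flink documentation: job submission
Flink official documentation: Python installation
Flink official documentation: Python word_count example
Managing Python virtual environments with Miniconda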
