Submitting Spark Jobs to YARN with a Python Script

Background: a script manages a set of Spark jobs. Any job that is already in the RUNNING state is skipped and not submitted again.

I. Key points

The script itself matters less than the techniques it relies on.

1. List the YARN applications that are in the RUNNING state

yarn application -list -appStates RUNNING
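
The output of this command can be filtered for application names in Python. The following is a minimal sketch, assuming each application row is tab-separated with nine columns (name in the second, state in the sixth), which is the layout the complete script later in this post relies on. The function name `parse_running_names` and the sample text are hypothetical and only for illustration.

```python
def parse_running_names(output):
    """Return the names of applications whose state column is RUNNING."""
    names = set()
    for line in output.splitlines():
        columns = line.split('\t')
        # The summary line lacks nine columns and the header row's state
        # column reads 'State', so both are skipped by this check.
        if len(columns) == 9 and columns[5].strip() == 'RUNNING':
            names.add(columns[1].strip())
    return names


# Hypothetical sample output, for illustration only.
sample = (
    "Total number of applications (application-types: [] and states: [RUNNING]):1\n"
    "Application-Id\tApplication-Name\tApplication-Type\tUser\tQueue\tState\tFinal-State\tProgress\tTracking-URL\n"
    "application_0000000000000_0001\tdemo_job\tSPARK\tdemo_user\tdefault\tRUNNING\tUNDEFINED\t10%\thttp://example.invalid:4040\n"
)
print(parse_running_names(sample))  # → {'demo_job'}
```

Returning a set makes the later "is this job already running?" check a simple membership test.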

2. Run Linux commands from Python

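import os
# 'll' is usually a shell alias and is not available here, so use 'ls -l'
os.system('ls -l /')               # prints the output, returns the exit status
print(os.popen('ls -l /').read())  # returns the output as a string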
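
`os.system` only returns the exit status, while `os.popen` gives a pipe to read the output from. The standard-library `subprocess` module can provide both in one call. Below is a minimal sketch of that alternative; `run_command` is a hypothetical helper name, and the demo command is simply the current Python interpreter printing a word.

```python
import subprocess
import sys


def run_command(args):
    """Run a command given as an argument list; return (exit_code, stdout)."""
    result = subprocess.run(args, capture_output=True, text=True)
    return result.returncode, result.stdout


# Demo: run the current interpreter so the example does not depend on any
# particular shell command being installed.
code, out = run_command([sys.executable, "-c", "print('hello')"])
print(code, out.strip())
```

To list the root directory, the call would be `run_command(['ls', '-l', '/'])`. Passing the command as a list avoids going through a shell, so aliases and quoting rules do not come into play.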

II. Complete script

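import os

# Each line of bash.txt has the form "<job name>:<submit command>"
name_bash_dict = {}
with open(r'./bash.txt', 'r') as file:
    for line in file:
        # Split on the first colon only; the command itself may contain ':'
        words = line.strip().split(':', 1)
        if len(words) == 2:
            name_bash_dict[words[0]] = words[1]

# Read the list of applications that are currently RUNNING on YARN
running_job_lines = os.popen("yarn application -list -appStates RUNNING")
line_num = 0
for line in running_job_lines.readlines():
    line_num += 1
    # Application rows start at the third line, after the summary and header
    if line_num >= 3:
        column = line.split('\t')
        if len(column) == 9 and column[5].strip() == 'RUNNING':
            jobName = column[1].strip()
            # Skip jobs that are already running; ignore names not in the file
            name_bash_dict.pop(jobName, None)

# Submit every job that is not running yet
for v in name_bash_dict.values():
    os.system(v)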
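
The script expects each line of `bash.txt` to hold a job name and its submit command separated by a colon. Here is a minimal sketch of that parsing step, assuming this `name:command` layout. The function name `parse_job_lines`, the job names, and the commands in the sample are all hypothetical.

```python
def parse_job_lines(lines):
    """Map job name -> submit command from 'name:command' lines."""
    jobs = {}
    for line in lines:
        line = line.strip()
        if not line or ':' not in line:
            continue  # skip blank or malformed lines
        # Split on the first colon only, so commands may contain ':' themselves.
        name, command = line.split(':', 1)
        jobs[name.strip()] = command.strip()
    return jobs


# Hypothetical file content, for illustration only.
sample_lines = [
    "job_a:spark-submit --master yarn --name job_a hdfs://example/app_a.py\n",
    "\n",
    "job_b:sh /path/to/submit_job_b.sh\n",
]
jobs = parse_job_lines(sample_lines)
print(jobs["job_a"])
```

The job name in the file has to match the application name that YARN reports, otherwise a running job will not be recognized and will be submitted a second time.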
posted @ 2020-11-30 00:29  yangyh11