The differences between APScheduler's two schedulers, and pitfalls to watch for when using them

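Abstract

This article covers the most basic use of APScheduler, running a job every few seconds; explains how BackgroundScheduler and BlockingScheduler differ; shows how to get a job running as soon as start() is called; describes what happens, and what to do, when a job takes longer to run than its scheduling interval; and shows that every job is dispatched on a thread.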

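Basic interval scheduling

APScheduler is a task-scheduling framework for Python that provides crontab-like functionality and is fairly easy to use. It can schedule jobs at fixed intervals, at a specific date and time, or with crontab-style expressions, and it can persist jobs or run them as a daemon.

Here is a minimal example: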

from apscheduler.schedulers.blocking import BlockingScheduler

def job():
    print('job 3s')

if __name__ == '__main__':
    sched = BlockingScheduler(timezone='MST')
    # Run job() every 3 seconds.
    sched.add_job(job, 'interval', id='3_second_job', seconds=3)
    # start() blocks the current thread until the scheduler is shut down.
    sched.start()

This schedules job() to run every 3 seconds, so the program prints 'job 3s' every 3 seconds. Changing the seconds argument of add_job() changes the interval.

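BlockingScheduler vs. BackgroundScheduler

APScheduler ships several scheduler types; BlockingScheduler and BackgroundScheduler are the two most commonly used. The main difference is that BlockingScheduler blocks the thread that calls start(), while BackgroundScheduler does not. Which one to choose depends on the situation:

BlockingScheduler: start() blocks the calling thread. Use it when the scheduler is the only thing running in your application, as in the example above.

BackgroundScheduler: start() returns without blocking the main thread. Use it when you are not running any of the other supported frameworks (APScheduler has dedicated schedulers for asyncio, gevent, Tornado, Twisted and Qt) and you want the scheduler to run in the background inside your application.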
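The difference between the two start() behaviours can be modelled with the standard library alone. The sketch below is not APScheduler code: TinyScheduler and its methods are hypothetical names, and it only illustrates the idea that a blocking start runs the scheduling loop in the calling thread, while a background start hands the same loop to a separate thread and returns at once.

```python
import threading


class TinyScheduler:
    """Hypothetical stand-in for a scheduler; not part of APScheduler."""

    def __init__(self, func, interval, max_runs):
        self.func = func
        self.interval = interval      # seconds between runs
        self.max_runs = max_runs      # stop after this many runs so the demo ends
        self._stop = threading.Event()

    def _loop(self):
        # The scheduling loop: wait one interval, then run the job.
        for _ in range(self.max_runs):
            if self._stop.wait(self.interval):
                break                 # shutdown() was called
            self.func()

    def start_blocking(self):
        # Blocking style: the caller's thread runs the loop itself.
        self._loop()

    def start_background(self):
        # Background style: the loop runs in its own thread; return at once.
        worker = threading.Thread(target=self._loop, daemon=True)
        worker.start()
        return worker

    def shutdown(self):
        self._stop.set()


def demo():
    events = []

    blocking = TinyScheduler(lambda: events.append('job'), 0.1, max_runs=2)
    blocking.start_blocking()
    events.append('main after blocking start')    # reached only after the loop ends

    background = TinyScheduler(lambda: events.append('job'), 0.1, max_runs=2)
    worker = background.start_background()
    events.append('main after background start')  # reached before any job runs
    worker.join()
    return events
```

Calling demo() should return a list in which both 'job' entries of the blocking run come before 'main after blocking start', while 'main after background start' comes before the two 'job' entries of the background run.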

The two examples below show the difference more concretely.

A working BlockingScheduler example

from apscheduler.schedulers.blocking import BlockingScheduler
import time

def job():
    print('job 3s')

if __name__ == '__main__':
    sched = BlockingScheduler(timezone='MST')
    sched.add_job(job, 'interval', id='3_second_job', seconds=3)
    sched.start()

    # Never reached: start() above blocks the main thread.
    while True:
        print('main 1s')
        time.sleep(1)

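Running this program produces the following output:

job 3s
job 3s
job 3s
job 3s

BlockingScheduler's start() blocks the current thread, so the while loop in the main program is never reached.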

A working BackgroundScheduler example

from apscheduler.schedulers.background import BackgroundScheduler
import time

def job():
    print('job 3s')

if __name__ == '__main__':
    sched = BackgroundScheduler(timezone='MST')
    sched.add_job(job, 'interval', id='3_second_job', seconds=3)
    # Returns immediately; jobs run in the background.
    sched.start()

    while True:
        print('main 1s')
        time.sleep(1)

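BackgroundScheduler's start() does not block the current thread, so the main program goes on to run its while loop, and the output is:

main 1s
main 1s
main 1s
job 3s
main 1s
main 1s
main 1s
job 3s

The output also shows that job() does not run immediately after start(). Its first run is scheduled 3 seconds later.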
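The delay before the first run follows from how an interval schedule is laid out. As a simplified model (plain datetime arithmetic, not APScheduler's trigger code; interval_fire_times is a hypothetical helper and the start time is arbitrary), the k-th run is due at the start time plus k intervals, with k starting at 1:

```python
from datetime import datetime, timedelta


def interval_fire_times(start, seconds, count):
    """Return the first `count` due times of an interval schedule.

    Simplified model: the first run is due one full interval after `start`,
    never at `start` itself.
    """
    step = timedelta(seconds=seconds)
    return [start + step * k for k in range(1, count + 1)]


start = datetime(2018, 5, 7, 2, 44, 20)
for due in interval_fire_times(start, seconds=3, count=3):
    print(due.time())
# prints 02:44:23, 02:44:26 and 02:44:29, one per line
```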

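How to make the job run as soon as start() is called

How can job() be made to run right away when the scheduler starts?

The interval trigger on its own always waits one full interval before the first run. APScheduler 3.x's add_job() does accept a next_run_time argument that overrides the first run time, but the simplest approach is to call job() once yourself before starting the scheduler, as follows: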

from apscheduler.schedulers.background import BackgroundScheduler
import time

def job():
    print('job 3s')

if __name__ == '__main__':
    # Run the job once by hand before the scheduler takes over.
    job()
    sched = BackgroundScheduler(timezone='MST')
    sched.add_job(job, 'interval', id='3_second_job', seconds=3)
    sched.start()

    while True:
        print('main 1s')
        time.sleep(1)

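This produces the following output:

job 3s
main 1s
main 1s
main 1s
job 3s
main 1s
main 1s
main 1s

Strictly speaking this does not make the job run right after start(), since the first run happens before the scheduler starts, but it does run the job at the very beginning instead of waiting for the first scheduled time.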

What happens if a job runs too long

What happens if job() takes 5 seconds to run but the scheduler is configured to call it every 3 seconds? Consider this example:

from apscheduler.schedulers.background import BackgroundScheduler
import time

def job():
    print('job 3s')
    # Simulate a job that takes longer (5 s) than its interval (3 s).
    time.sleep(5)

if __name__ == '__main__':
    sched = BackgroundScheduler(timezone='MST')
    sched.add_job(job, 'interval', id='3_second_job', seconds=3)
    sched.start()

    while True:
        print('main 1s')
        time.sleep(1)

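Running this program produces the following output:

main 1s
main 1s
main 1s
job 3s
main 1s
main 1s
main 1s
Execution of job "job (trigger: interval[0:00:03], next run at: 2018-05-07 02:44:29 MST)" skipped: maximum number of running instances reached (1)
main 1s
main 1s
main 1s
job 3s
main 1s

When the next 3-second tick arrives while job() is still running, the scheduler does not start a second job thread. It skips that run and schedules job() again at the following tick, 3 seconds later.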

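To let several instances of job() run at the same time, set max_instances in the scheduler's job_defaults (add_job() also accepts max_instances for a single job). The example below allows 2 concurrent instances of job():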

from apscheduler.schedulers.background import BackgroundScheduler
import time

def job():
    print('job 3s')
    time.sleep(5)

if __name__ == '__main__':
    # Allow up to 2 instances of the same job to run concurrently.
    job_defaults = {'max_instances': 2}
    sched = BackgroundScheduler(timezone='MST', job_defaults=job_defaults)
    sched.add_job(job, 'interval', id='3_second_job', seconds=3)
    sched.start()

    while True:
        print('main 1s')
        time.sleep(1)

Running the program produces the following output:

main 1s
main 1s
main 1s
job 3s
main 1s
main 1s
main 1s
job 3s
main 1s
main 1s
main 1s
job 3s
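Both outputs in this section can be reproduced on paper with a small timeline model. The sketch below is a simplification, not APScheduler's implementation (simulate is a hypothetical name, and times are whole seconds after start()): a run that falls due is skipped when max_instances earlier runs are still in progress.

```python
def simulate(interval, duration, max_instances, ticks):
    """Model which scheduled runs start and which are skipped.

    Times are in seconds after start(). A run due at time t is skipped when
    `max_instances` earlier runs are still in progress at t.
    """
    running_until = []   # end times of runs that are still in progress
    timeline = []
    for k in range(1, ticks + 1):
        t = k * interval
        # Forget runs that have already finished by time t.
        running_until = [end for end in running_until if end > t]
        if len(running_until) < max_instances:
            running_until.append(t + duration)
            timeline.append((t, 'run'))
        else:
            timeline.append((t, 'skipped'))
    return timeline


print(simulate(interval=3, duration=5, max_instances=1, ticks=4))
# [(3, 'run'), (6, 'skipped'), (9, 'run'), (12, 'skipped')]
print(simulate(interval=3, duration=5, max_instances=2, ticks=4))
# [(3, 'run'), (6, 'run'), (9, 'run'), (12, 'run')]
```

With a 5-second job on a 3-second interval, at most two runs ever overlap, which is why max_instances=2 is enough to avoid skips in this example.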

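How each job is dispatched

The examples above show that the scheduler works by calling the job() function on a timer.

Is job() run in a separate process, or in a thread?

To find out, we wrote the following program: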

from apscheduler.schedulers.background import BackgroundScheduler
import os
import threading
import time

def job():
    # Report which thread and which process this run executes in.
    print('job thread_id-{0}, process_id-{1}'.format(threading.get_ident(), os.getpid()))
    # Sleep long enough that successive runs overlap.
    time.sleep(50)

if __name__ == '__main__':
    job_defaults = {'max_instances': 20}
    sched = BackgroundScheduler(timezone='MST', job_defaults=job_defaults)
    sched.add_job(job, 'interval', id='3_second_job', seconds=3)
    sched.start()

    while True:
        print('main 1s')
        time.sleep(1)

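Running the program produces the following output:

main 1s
main 1s
main 1s
job thread_id-10644, process_id-8872
main 1s
main 1s
main 1s
job thread_id-3024, process_id-8872
main 1s
main 1s
main 1s
job thread_id-6728, process_id-8872
main 1s
main 1s
main 1s
job thread_id-11716, process_id-8872

Every run of job() reports the same process ID but a different thread ID, so job() is dispatched on threads, not in separate processes.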
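The same pattern, one process ID and several thread IDs, is what any thread pool produces, which fits APScheduler 3.x running jobs on a thread-pool executor by default. The sketch below uses only the standard library and is not APScheduler's code; run_overlapping_jobs is a hypothetical name, and the barrier is there only to force the jobs to overlap so that each one needs its own thread.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def run_overlapping_jobs(n):
    """Run n overlapping jobs on a thread pool; return their (thread id, pid)."""
    barrier = threading.Barrier(n)   # keeps all n jobs alive at the same time

    def job():
        ids = (threading.get_ident(), os.getpid())
        barrier.wait(timeout=10)     # wait until every job has started
        return ids

    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(job) for _ in range(n)]
        return [f.result() for f in futures]


results = run_overlapping_jobs(4)
thread_ids = {tid for tid, _ in results}
process_ids = {pid for _, pid in results}
print(len(thread_ids), len(process_ids))   # prints: 4 1
```

Because the jobs share one process, they also share memory, so any state that job() modifies needs the usual thread-safety care.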

---------------------

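Author: ybdesire

Source: CSDN

Original: https://blog.csdn.net/ybdesire/article/details/82228840

Copyright notice: this is the blogger's original article; please include a link to the original post when reposting.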


Posted @ 2019-08-02 17:40 by Littlefish-
