celery — asynchronous and scheduled tasks, python


01 Introduction

Celery is a distributed system written in Python for processing large volumes of task messages. It focuses on real-time task processing and also supports task scheduling. It is typically used to implement asynchronous tasks (async task) and scheduled tasks (crontab).

Note, first, that Celery is not itself a task queue; it is a tool for managing distributed task queues. The protocol it uses is language-agnostic, and it runs as an independent tool.

Typical use cases:

  1. Scheduled tasks: run at a fixed time, e.g. every day at a given hour
  2. Turning synchronous work asynchronous: sending email, pushing notifications, and so on

Celery can be divided into four parts:

brokers: the middleman; it receives messages (i.e. tasks) from task producers and stores them in a queue. Redis is a common choice

backend: the store where task results are kept

workers: the worker processes; they monitor the message queue in real time, fetch the tasks placed on it, and execute them

tasks: the tasks themselves, covering both asynchronous and scheduled tasks. Asynchronous tasks are usually triggered from business logic and sent to the task queue, while scheduled tasks are pushed onto the queue periodically by the Celery Beat process

02 Installation

  • Install the Redis database

https://www.cnblogs.com/tianzhh/articles/13646848.html

  • Install the Python Redis client
pip3 install redis
  • Install Celery
pip3 install celery
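To verify the installation, a quick sketch (assumes a local Redis with password 123456, matching the connection URLs used below):

import celery
import redis

print(celery.__version__)  # confirms celery is importable
r = redis.Redis(host="127.0.0.1", port=6379, password="123456", db=0)
print(r.ping())  # True if the broker is reachable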

03 Usage

01 Asynchronous tasks

  • Create and send an asynchronous task
# tasks.py

from celery import Celery

broker = "redis://:123456@127.0.0.1:6379/0"
backend = "redis://:123456@127.0.0.1:6379/0"

app = Celery("tasks", broker=broker, backend=backend)  # the first argument should match the module name (tasks)

@app.task  # register the function as a Celery task
def add(x, y):
    return x + y
  

Note:

The first argument to Celery() is the app's main name; it should match the module's file name (here, tasks).

When a function carries multiple decorators, @app.task must be the outermost one.
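For example, a minimal sketch of stacking decorators (the log_call decorator here is made up for illustration):

import functools
from tasks import app

def log_call(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        print(f"calling {fn.__name__} with {args}")
        return fn(*args, **kwargs)
    return wrapper

@app.task   # must be outermost, so Celery registers the fully wrapped function
@log_call
def sub(x, y):
    return x - y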

  • Start a worker to execute tasks
celery -A tasks worker --loglevel=info

Note:

-A points to the module in which the Celery app was created.

-Q makes the worker consume only the named queues, so that different queues carrying different tasks can be handled independently; if it is not set, the worker consumes tasks from all of its configured queues.
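For example, a sketch of queue-specific routing (the queue name "emails" is made up):

# send this particular call to the "emails" queue only
add.apply_async(args=(1, 6), queue="emails")

and then start a worker bound to that queue:

celery -A tasks worker -Q emails --loglevel=info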

  • Call the task
from tasks import add

res = add.delay(1, 6)

print(res.id)
while True:  # polls in a busy loop; fine for a demo, wasteful in production
    if res.ready():  # True once the task has finished
        print(res.result)  # asynchronous: None while the task is unfinished or has produced no result
        print(res.get())   # blocks until the result is available
        break

Note:

delay() packages the function call (task name plus arguments) into a message and sends it to the broker, from which a worker picks it up.
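In fact delay() is just a convenience shortcut for apply_async(); these two calls are equivalent:

add.delay(1, 6)
add.apply_async(args=(1, 6))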

  • Inspect the return value
from celery.result import AsyncResult
from pro_celery import tasks


async_obj = AsyncResult(id="9a2c2c5e-764f-49f0-a70a-4144708517e5", app=tasks.app)

if async_obj.successful():
    result = async_obj.get()
    print(result)
    # async_obj.forget()  # delete the stored result
elif async_obj.failed():
    print('task failed')
elif async_obj.status == 'PENDING':
    print('task is waiting to be executed')
elif async_obj.status == 'RETRY':
    print('task raised an error and is being retried')
elif async_obj.status == 'STARTED':
    print('task has started executing')
  • Multiple tasks

cele.py

from celery import Celery


broker = "redis://:123456@127.0.0.1:6379/0"
backend = "redis://:123456@127.0.0.1:6379/0"
app = Celery("cele", broker=broker, backend=backend, include=(
    "task1",
    "task2",
))

Note:

If the tasks do not live in this file, list the modules that define them in include.

Task one: task1.py

import time
from cele import app


@app.task
def add(x, y):
    time.sleep(x)
    return x + y

Task two: task2.py

import time
from cele import app


@app.task
def multi(x, y):
    time.sleep(x)
    return x * y

Start the workers

celery -A cele worker --loglevel=info

Note:

-A is followed by the module that holds the Celery app (here, cele).

Call the tasks

import task1, task2

t = task2.multi.delay(5, 5)
t1 = task1.add.delay(1, 5)
print(t.get())
print(t1.get())

Note:

get() is rarely used in practice: it blocks, which turns the call back into a synchronous one.
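When several results are needed without blocking on each one, Celery's canvas primitives help; a minimal sketch using group (reusing the task1/task2 modules above):

from celery import group
import task1, task2

# .s() builds a signature (a lazy call) instead of invoking the task immediately
job = group(task1.add.s(1, 5), task2.multi.s(5, 5))
result = job.apply_async()
print(result.ready())  # False until every task in the group has finished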

02 Scheduled tasks

  • Plain approach

Run at a specific time, e.g. at 2020-09-15 15:54:12:

import task2
from datetime import datetime


tim = datetime(2020, 9, 15, 15, 54, 12)
v2 = datetime.utcfromtimestamp(tim.timestamp())  # convert local time to UTC
print(v2)

# math_add is assumed to be defined as a task in task2.py
task2.math_add.apply_async(args=(3, 6), eta=v2)  # asynchronous

Note:

eta is interpreted as UTC time by default.

apply_async() executes asynchronously.

Run after a delay: execute once, 10 seconds from now

import task2
from datetime import datetime, timedelta

ctime = datetime.now()
utc_ctime = datetime.utcfromtimestamp(ctime.timestamp())

delta_time = timedelta(seconds=10)  # days, weeks, hours, minutes also work
utc_time = utc_ctime + delta_time

print(utc_time)
t = task2.math_add.apply_async(args=(3, 6), eta=utc_time)

Note:

In both cases eta takes a datetime object, expressed in UTC.
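When all that is needed is "run N seconds from now", the countdown argument avoids the manual UTC arithmetic; a sketch against the same task2.math_add:

# equivalent to the eta calculation above, expressed directly in seconds
task2.math_add.apply_async(args=(3, 6), countdown=10)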

  • Crontab-style schedules

Configure the schedule on the Celery app:

from celery import Celery
from celery.schedules import crontab


broker = "redis://:123456@127.0.0.1:6379/0"
backend = "redis://:123456@127.0.0.1:6379/0"

app = Celery("celery_crontab", broker=broker, backend=backend, include="tasks")

# app.conf.timezone = 'Asia/Shanghai'
# app.conf.enable_utc = False
app.conf.beat_schedule = {
    "crontab_task": {
        "task": "tasks.write_number",
        "schedule": crontab(minute=1),  # at minute 1 of every hour
        "args": (19, 3)
    },
    "each10s_task": {
        "task": "tasks.write_number",
        "schedule": 10,  # every 10 seconds
        "args": (10, 10)
    }
}

Note:

A beat process must be started; at the configured times it submits the tasks to the queue for the workers.

Create the beat

celery -A tasks beat

Start the worker

celery -A tasks worker --loglevel=info
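For reference, a few common crontab() patterns (these mirror Unix cron semantics):

from celery.schedules import crontab

crontab()                                  # every minute
crontab(minute=0, hour=3)                  # 03:00 every day
crontab(minute=0, hour='*/2')              # on the hour, every two hours
crontab(minute=30, hour=8, day_of_week=1)  # 08:30 every Monday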

04 Celery configuration file

Basic configuration

# Note: since Celery 4 these uppercase names have lowercase replacements,
# e.g. BROKER_URL becomes broker_url (see the mapping table below)
BROKER_URL = 'amqp://username:passwd@host:port/vhost'
# where task results are stored
CELERY_RESULT_BACKEND = 'redis://username:passwd@host:port/db'
# task serialization format
CELERY_TASK_SERIALIZER = 'msgpack'
# result serialization format
CELERY_RESULT_SERIALIZER = 'msgpack'
# how long task results are kept before expiring
CELERY_TASK_RESULT_EXPIRES = 60 * 20
# content types the worker accepts
CELERY_ACCEPT_CONTENT = ["msgpack"]
# acknowledge messages after the task has run rather than before; has a small performance cost
CELERY_ACKS_LATE = True
# message compression: zlib or bzip2; by default messages are sent uncompressed
CELERY_MESSAGE_COMPRESSION = 'zlib'
# hard time limit for a task
CELERYD_TASK_TIME_LIMIT = 5  # a task must finish within 5s, otherwise the worker process running it is killed and replaced
# worker concurrency; defaults to the number of CPU cores, the same value as the -c command-line option
CELERYD_CONCURRENCY = 4
# how many tasks each worker prefetches from the broker at a time
CELERYD_PREFETCH_MULTIPLIER = 4
# recycle a worker process after it has executed this many tasks; unlimited by default
CELERYD_MAX_TASKS_PER_CHILD = 40
# default queue name; messages that match no other queue land here, and with no routing configured everything goes to the default queue
CELERY_DEFAULT_QUEUE = "default"
# detailed queue definitions
CELERY_QUEUES = {
    "default": {  # the default queue named above
        "exchange": "default",
        "exchange_type": "direct",
        "routing_key": "default"
    },
    "topicqueue": {  # a topic queue: any routing key starting with "topic." is routed here
        "routing_key": "topic.#",
        "exchange": "topic_exchange",
        "exchange_type": "topic",
    },
    "task_eeg": {  # a fanout exchange
        "exchange": "tasks",
        "exchange_type": "fanout",
        "binding_key": "tasks",
    },
}
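With queues like these defined, an individual call can be routed explicitly; a sketch (the routing key "topic.test" is made up to match the pattern above):

# lands in "topicqueue" because the routing key matches "topic.#"
add.apply_async(args=(1, 6), exchange="topic_exchange", routing_key="topic.test")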

Since Celery 4.0 the configuration keys are lowercase; the replacements for 4.0 and later are:

CELERY_ACCEPT_CONTENT	accept_content
CELERY_ENABLE_UTC	enable_utc
CELERY_IMPORTS	imports
CELERY_INCLUDE	include
CELERY_TIMEZONE	timezone
CELERYBEAT_MAX_LOOP_INTERVAL	beat_max_loop_interval
CELERYBEAT_SCHEDULE	beat_schedule
CELERYBEAT_SCHEDULER	beat_scheduler
CELERYBEAT_SCHEDULE_FILENAME	beat_schedule_filename
CELERYBEAT_SYNC_EVERY	beat_sync_every
BROKER_URL	broker_url
BROKER_TRANSPORT	broker_transport
BROKER_TRANSPORT_OPTIONS	broker_transport_options
BROKER_CONNECTION_TIMEOUT	broker_connection_timeout
BROKER_CONNECTION_RETRY	broker_connection_retry
BROKER_CONNECTION_MAX_RETRIES	broker_connection_max_retries
BROKER_FAILOVER_STRATEGY	broker_failover_strategy
BROKER_HEARTBEAT	broker_heartbeat
BROKER_LOGIN_METHOD	broker_login_method
BROKER_POOL_LIMIT	broker_pool_limit
BROKER_USE_SSL	broker_use_ssl
CELERY_CACHE_BACKEND	cache_backend
CELERY_CACHE_BACKEND_OPTIONS	cache_backend_options
CASSANDRA_COLUMN_FAMILY	cassandra_table
CASSANDRA_ENTRY_TTL	cassandra_entry_ttl
CASSANDRA_KEYSPACE	cassandra_keyspace
CASSANDRA_PORT	cassandra_port
CASSANDRA_READ_CONSISTENCY	cassandra_read_consistency
CASSANDRA_SERVERS	cassandra_servers
CASSANDRA_WRITE_CONSISTENCY	cassandra_write_consistency
CASSANDRA_OPTIONS	cassandra_options
CELERY_COUCHBASE_BACKEND_SETTINGS	couchbase_backend_settings
CELERY_MONGODB_BACKEND_SETTINGS	mongodb_backend_settings
CELERY_EVENT_QUEUE_EXPIRES	event_queue_expires
CELERY_EVENT_QUEUE_TTL	event_queue_ttl
CELERY_EVENT_QUEUE_PREFIX	event_queue_prefix
CELERY_EVENT_SERIALIZER	event_serializer
CELERY_REDIS_DB	redis_db
CELERY_REDIS_HOST	redis_host
CELERY_REDIS_MAX_CONNECTIONS	redis_max_connections
CELERY_REDIS_PASSWORD	redis_password
CELERY_REDIS_PORT	redis_port
CELERY_RESULT_BACKEND	result_backend
CELERY_MAX_CACHED_RESULTS	result_cache_max
CELERY_MESSAGE_COMPRESSION	result_compression
CELERY_RESULT_EXCHANGE	result_exchange
CELERY_RESULT_EXCHANGE_TYPE	result_exchange_type
CELERY_TASK_RESULT_EXPIRES	result_expires
CELERY_RESULT_PERSISTENT	result_persistent
CELERY_RESULT_SERIALIZER	result_serializer
CELERY_RESULT_DBURI	deprecated; use result_backend instead
CELERY_RESULT_ENGINE_OPTIONS	database_engine_options
CELERY_RESULT_DB_SHORT_LIVED_SESSIONS	database_short_lived_sessions
CELERY_RESULT_DB_TABLE_NAMES	database_db_names
CELERY_SECURITY_CERTIFICATE	security_certificate
CELERY_SECURITY_CERT_STORE	security_cert_store
CELERY_SECURITY_KEY	security_key
CELERY_ACKS_LATE	task_acks_late
CELERY_TASK_ALWAYS_EAGER	task_always_eager
CELERY_TASK_ANNOTATIONS	task_annotations
CELERY_TASK_COMPRESSION	task_compression
CELERY_TASK_CREATE_MISSING_QUEUES	task_create_missing_queues
CELERY_TASK_DEFAULT_DELIVERY_MODE	task_default_delivery_mode
CELERY_TASK_DEFAULT_EXCHANGE	task_default_exchange
CELERY_TASK_DEFAULT_EXCHANGE_TYPE	task_default_exchange_type
CELERY_TASK_DEFAULT_QUEUE	task_default_queue
CELERY_TASK_DEFAULT_RATE_LIMIT	task_default_rate_limit
CELERY_TASK_DEFAULT_ROUTING_KEY	task_default_routing_key
CELERY_TASK_EAGER_PROPAGATES	task_eager_propagates
CELERY_TASK_IGNORE_RESULT	task_ignore_result
CELERY_TASK_PUBLISH_RETRY	task_publish_retry
CELERY_TASK_PUBLISH_RETRY_POLICY	task_publish_retry_policy
CELERY_QUEUES	task_queues
CELERY_ROUTES	task_routes
CELERY_TASK_SEND_SENT_EVENT	task_send_sent_event
CELERY_TASK_SERIALIZER	task_serializer
CELERYD_TASK_SOFT_TIME_LIMIT	task_soft_time_limit
CELERYD_TASK_TIME_LIMIT	task_time_limit
CELERY_TRACK_STARTED	task_track_started
CELERYD_AGENT	worker_agent
CELERYD_AUTOSCALER	worker_autoscaler
CELERYD_CONCURRENCY	worker_concurrency
CELERYD_CONSUMER	worker_consumer
CELERY_WORKER_DIRECT	worker_direct
CELERY_DISABLE_RATE_LIMITS	worker_disable_rate_limits
CELERY_ENABLE_REMOTE_CONTROL	worker_enable_remote_control
CELERYD_HIJACK_ROOT_LOGGER	worker_hijack_root_logger
CELERYD_LOG_COLOR	worker_log_color
CELERYD_LOG_FORMAT	worker_log_format
CELERYD_WORKER_LOST_WAIT	worker_lost_wait
CELERYD_MAX_TASKS_PER_CHILD	worker_max_tasks_per_child
CELERYD_POOL	worker_pool
CELERYD_POOL_PUTLOCKS	worker_pool_putlocks
CELERYD_POOL_RESTARTS	worker_pool_restarts
CELERYD_PREFETCH_MULTIPLIER	worker_prefetch_multiplier
CELERYD_REDIRECT_STDOUTS	worker_redirect_stdouts
CELERYD_REDIRECT_STDOUTS_LEVEL	worker_redirect_stdouts_level
CELERYD_SEND_EVENTS	worker_send_task_events
CELERYD_STATE_DB	worker_state_db
CELERYD_TASK_LOG_FORMAT	worker_task_log_format
CELERYD_TIMER	worker_timer
CELERYD_TIMER_PRECISION	worker_timer_precision

Load settings from a configuration file

from celery import Celery

broker = "redis://:123456@127.0.0.1:6379/0"
backend = "redis://:123456@127.0.0.1:6379/0"

app = Celery("celery_crontab", broker=broker, backend=backend, include="tasks")

# app.conf.timezone = 'Asia/Shanghai'
# app.conf.enable_utc = False

app.config_from_object("celery_conf")  # load settings from celery_conf.py
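A minimal sketch of what celery_conf.py might contain, using the lowercase Celery 4+ names from the table above (the values are examples, not requirements):

# celery_conf.py
broker_url = "redis://:123456@127.0.0.1:6379/0"
result_backend = "redis://:123456@127.0.0.1:6379/0"
task_serializer = "json"
result_serializer = "json"
accept_content = ["json"]
timezone = "Asia/Shanghai"
enable_utc = False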

05 Using Celery in Django

Create a celery_disk folder under the app

Create celery_conf.py

import djcelery
djcelery.setup_loader()
CELERY_IMPORTS = (
    'app01.tasks',
)
# helps prevent deadlocks in some situations
CELERYD_FORCE_EXECV = True
# number of concurrent worker processes
CELERYD_CONCURRENCY = 4
# acknowledge after execution, so tasks can be re-run if a worker dies
CELERY_ACKS_LATE = True
# recycle each worker process after 100 tasks, to guard against memory leaks
CELERYD_MAX_TASKS_PER_CHILD = 100
# task time limit
CELERYD_TASK_TIME_LIMIT = 12 * 30

Create tasks.py

from celery import task  # old-style decorator used by django-celery (Celery 3.x)

@task
def add(a, b):
    with open('a.text', 'a', encoding='utf-8') as f:
        f.write('a')
    print(a + b)

View function: views.py

from django.shortcuts import render
from django.http import HttpResponse
from app01.tasks import add
from datetime import datetime, timedelta

def test(request):
    # result = add.delay(2, 3)
    ctime = datetime.now()
    # celery uses UTC time by default
    utc_ctime = datetime.utcfromtimestamp(ctime.timestamp())
    time_delay = timedelta(seconds=5)
    task_time = utc_ctime + time_delay
    result = add.apply_async(args=[4, 3], eta=task_time)
    print(result.id)
    return HttpResponse('ok')

settings.py

INSTALLED_APPS = [
    ...
    'djcelery',
    'app01'
]

from djagocele import celeryconfig
BROKER_BACKEND = 'redis'
BROKER_URL = 'redis://127.0.0.1:6379/1'
CELERY_RESULT_BACKEND = 'redis://127.0.0.1:6379/2'
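With django-celery installed, the worker can then be started through manage.py (assuming djcelery's management commands are available):

python manage.py celery worker --loglevel=info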