[Translation] Introduction to gevent

  gevent is a coroutine-based Python networking library.

  Its features include:

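  • Fast event loop based on libev (epoll on Linux, kqueue on FreeBSD)
  • Lightweight execution units based on greenlet
  • An API that reuses concepts from the Python standard library
  • Cooperative socket and ssl modules
  • Ability to use standard-library and third-party modules written for standard blocking sockets (through gevent.monkey)
  • DNS queries performed through a thread pool (the default) or c-ares (enabled by setting the environment variable GEVENT_RESOLVER=ares)
  • TCP/UDP/HTTP servers
  • Subprocess support (through gevent.subprocess)
  • Thread pools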

  Installation

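  gevent runs on Python 2.5 and newer. It requires:

  • greenlet, which can be installed with pip install greenlet.
  • libev, which can be installed with the system package manager (for example, apt-get install libev; the exact package name varies by distribution).
  • the ssl package, for ssl to work on Python older than 2.6.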

  

 Example

  The following example shows how to run tasks concurrently:

   >>> import gevent
   >>> from gevent import socket
   >>> urls = ['www.google.com', 'www.example.com', 'www.python.org']
   >>> jobs = [gevent.spawn(socket.gethostbyname, url) for url in urls]
   >>> gevent.joinall(jobs, timeout=2)
   >>> [job.value for job in jobs]
   ['74.125.79.106', '208.77.188.166', '82.94.164.162']


After the jobs have been spawned, gevent.joinall() waits for them to complete, but for no longer than 2 seconds. The results are then collected by reading the gevent.Greenlet.value property. The gevent.socket.gethostbyname() function has the same interface as the standard socket.gethostbyname(), but it does not block the whole interpreter and thus lets the other greenlets proceed with their requests unhindered.
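The timeout behaviour of gevent.joinall() can be tried without network access. The following is a minimal sketch, not part of the original text: the task function and its delays are made up for illustration, and gevent.sleep() stands in for a blocking network call.

```python
import gevent

def task(name, delay):
    # gevent.sleep() yields to the event loop, like a cooperative socket call.
    gevent.sleep(delay)
    return name

# Two quick jobs and one that cannot finish within the timeout.
specs = [("a", 0.01), ("b", 0.02), ("slow", 5)]
jobs = [gevent.spawn(task, name, delay) for name, delay in specs]

# Wait for the jobs, but for no longer than half a second.
gevent.joinall(jobs, timeout=0.5)

# A greenlet that has not finished yet still has value None.
results = [job.value for job in jobs]

# Clean up the job that is still sleeping.
jobs[2].kill()
```

A job that misses the deadline is not cancelled by joinall(); it keeps running until it finishes or is killed explicitly.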


Monkey patching

The example above used gevent.socket for socket operations. If the standard socket module had been used, the example would have taken about 3 times longer to complete because the DNS requests would be sequential. Using the standard socket module inside greenlets makes gevent rather pointless, so what about modules and packages that are built on top of socket?

That’s what monkey patching is for. The functions in gevent.monkey carefully replace functions and classes in the standard socket module with their cooperative counterparts. That way even the modules that are unaware of gevent can benefit from running in a multi-greenlet environment.

 

>>> from gevent import monkey; monkey.patch_socket()
>>> import urllib2 # it's usable from multiple greenlets now


See examples/concurrent_download.py.
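One way to observe the effect of patching is to compare the class exposed by the standard socket module with gevent's cooperative one. This is a small sketch under the assumption that patching happens before anything else has imported socket. Note that urllib2 in the snippet above is a Python 2 module; on Python 3 the equivalent is urllib.request.

```python
from gevent import monkey

# Patch as early as possible, before other modules import socket.
monkey.patch_socket()

import socket          # the standard module, now patched
import gevent.socket   # gevent's cooperative implementation

# After patching, the standard name refers to gevent's cooperative class,
# so libraries built on top of socket pick it up without changes.
patched = socket.socket is gevent.socket.socket
```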

Event loop

Unlike other network libraries, and in a similar fashion to eventlet, gevent starts the event loop implicitly in a dedicated greenlet. There’s no reactor that you must call a run() or dispatch() function on. When a function from gevent’s API wants to block, it obtains the Hub instance (a greenlet that runs the event loop) and switches to it. If there’s no Hub instance yet, one is created on the fly.
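The Hub can be inspected directly. The sketch below uses gevent.get_hub(), which returns the Hub of the current thread and creates it if needed. The variable names are illustrative only.

```python
import gevent
from greenlet import greenlet

# No explicit "run the reactor" call is needed: asking for the Hub
# creates it on the fly if it does not exist yet.
hub = gevent.get_hub()

# The Hub is itself a greenlet, and there is one per thread.
hub_is_greenlet = isinstance(hub, greenlet)
same_hub = gevent.get_hub() is hub

# Blocking calls such as gevent.sleep() switch to the Hub and come back
# once the event loop decides the caller may continue.
gevent.sleep(0.01)
```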

The event loop provided by libev uses the fastest polling mechanism available on the system by default. It is possible to command libev to use a particular polling mechanism by setting the LIBEV_FLAGS environment variable. Possible values include LIBEV_FLAGS=1 for the select backend, LIBEV_FLAGS=2 for the poll backend, LIBEV_FLAGS=4 for the epoll backend and LIBEV_FLAGS=8 for the kqueue backend. Please read the libev documentation for more information.

The libev API is available under the gevent.core module. Note that the callbacks supplied to the libev API are run in the Hub greenlet and thus cannot use the synchronous gevent API. It is possible to use the asynchronous API there, like spawn() and Event.set().

Cooperative multitasking

The greenlets all run in the same OS thread and are scheduled cooperatively. This means that until a particular greenlet gives up control (by calling a blocking function that will switch to the Hub), other greenlets won’t get a chance to run. This is typically not an issue for an I/O-bound app, but one should be aware of it when doing something CPU-intensive, or when calling blocking I/O functions that bypass the libev event loop.
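The effect of cooperative scheduling can be made visible with two workers that either yield or do not. This is an illustrative sketch: the worker and run functions are hypothetical, and gevent.sleep(0) is used as an explicit way to give up control.

```python
import gevent

def worker(name, log, cooperative):
    for step in range(3):
        log.append((name, step))
        if cooperative:
            # Yield to the Hub so that other greenlets get a turn.
            gevent.sleep(0)

def run(cooperative):
    log = []
    jobs = [gevent.spawn(worker, name, log, cooperative) for name in ("a", "b")]
    gevent.joinall(jobs)
    return [name for name, _ in log]

# Without yielding, each worker runs to completion before the next starts.
greedy_order = run(cooperative=False)

# With yielding, the workers take turns.
polite_order = run(cooperative=True)
```

A CPU-heavy loop behaves like the non-yielding worker: nothing else runs until it finishes, so long computations may need an occasional gevent.sleep(0) or a thread pool.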

Synchronizing access to objects shared across the greenlets is unnecessary in most cases, thus Lock and Semaphore classes, although present, aren’t used very often. Other abstractions from threading and multiprocessing remain useful in the cooperative world:

  • Event allows one to wake up a number of greenlets that are calling the Event.wait() method.
  • AsyncResult is similar to Event but allows passing a value or an exception to the waiters.
  • Queue and JoinableQueue
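The three abstractions listed above can be combined in a short sketch. The function names (waiter, producer, consumer) and the values passed around are made up for illustration; Event and AsyncResult come from gevent.event and Queue from gevent.queue.

```python
import gevent
from gevent.event import AsyncResult, Event
from gevent.queue import Queue

ready = Event()
answer = AsyncResult()
inbox = Queue()
woken = []

def waiter(name):
    ready.wait()             # blocks only this greenlet
    woken.append(name)

def producer():
    for item in range(3):
        inbox.put(item)      # hand items to the consumer
    answer.set("done")       # pass a value to whoever calls answer.get()

def consumer():
    return [inbox.get() for _ in range(3)]

waiters = [gevent.spawn(waiter, name) for name in ("w1", "w2")]
consumer_job = gevent.spawn(consumer)
gevent.spawn(producer)

gevent.sleep(0)              # let the greenlets start and block
ready.set()                  # wakes up every greenlet waiting on the event
gevent.joinall(waiters + [consumer_job], timeout=1)

final_answer = answer.get(timeout=1)
```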

 

Lightweight pseudothreads

The greenlets are spawned by creating a Greenlet instance and calling its start method. (The spawn() function is a shortcut that does exactly that.) The start method schedules a switch to the greenlet that will happen as soon as the current greenlet gives up control. If there is more than one active event, they will be executed one by one, in an undefined order.
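The difference between scheduling and running can be seen in a small sketch. The add function and the log list are illustrative only.

```python
import gevent
from gevent import Greenlet

def add(a, b):
    return a + b

log = []

# Explicit form: create the Greenlet, then schedule it with start().
marker = Greenlet(log.append, "ran")
marker.start()
log_before_yield = list(log)   # start() only schedules; nothing has run yet

# Shortcut form: spawn() creates and starts in one call.
g1 = Greenlet(add, 2, 3)
g1.start()
g2 = gevent.spawn(add, 10, 20)

# joinall() blocks the current greenlet, so the scheduled ones get to run.
gevent.joinall([marker, g1, g2])
log_after_join = list(log)
values = (g1.value, g2.value)
```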

If there is an error during execution, it won’t escape the greenlet’s boundaries. An unhandled error results in a stack trace being printed, complemented by the failed function’s signature and arguments:

>>> gevent.spawn(lambda : 1/0)
>>> gevent.sleep(1)
Traceback (most recent call last):
 ...
ZeroDivisionError: integer division or modulo by zero
<Greenlet at 0x7f2ec3a4e490: <function <lambda...>> failed with ZeroDivisionError

The traceback is asynchronously printed to sys.stderr when the greenlet dies.

Greenlet instances have a number of useful methods:

  • join – waits until the greenlet exits;
  • kill – interrupts greenlet’s execution;
  • get – returns the value returned by the greenlet or re-raises the exception that killed it.
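The three methods can be exercised together. In this sketch the functions ok and boom are hypothetical; running it prints a traceback for boom to sys.stderr, as described above, while the main program continues.

```python
import gevent

def ok():
    return 42

def boom():
    return 1 / 0

good = gevent.spawn(ok)
bad = gevent.spawn(boom)

# join: wait until the greenlets exit; the error stays inside "bad".
good.join()
bad.join()

# get: return the result, or re-raise the exception that ended the greenlet.
good_value = good.get()
try:
    bad.get()
    reraised = None
except ZeroDivisionError as exc:
    reraised = exc

# kill: interrupt a greenlet that is blocked.
sleeper = gevent.spawn(gevent.sleep, 10)
gevent.sleep(0)          # let the sleeper start and block
sleeper.kill()
sleeper_dead = sleeper.dead
```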

It is possible to customize the string printed after the traceback by subclassing the Greenlet class and redefining its __str__ method.

 

To subclass a Greenlet, override its _run() method and call Greenlet.__init__(self) in __init__:

import gevent
from gevent import Greenlet

class MyNoopGreenlet(Greenlet):

    def __init__(self, seconds):
        Greenlet.__init__(self)
        self.seconds = seconds

    def _run(self):
        gevent.sleep(self.seconds)

    def __str__(self):
        return 'MyNoopGreenlet(%s)' % self.seconds

Greenlets can be killed asynchronously. Killing will resume the sleeping greenlet, but instead of continuing execution, a GreenletExit will be raised.

>>> g = MyNoopGreenlet(4)
>>> g.start()
>>> g.kill()
>>> g.dead
True

The GreenletExit exception and its subclasses are handled differently than other exceptions. Raising GreenletExit is not considered an exceptional situation, so the traceback is not printed. The GreenletExit is returned by get as if it were returned by the greenlet, not raised.
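The special handling of GreenletExit can be checked as follows. This is a sketch: the greenlet is given a chance to start before it is killed, so that the exception is raised inside the running greenlet.

```python
import gevent

g = gevent.spawn(gevent.sleep, 10)
gevent.sleep(0)        # let the greenlet start and block in sleep()

g.kill()               # raises GreenletExit inside the greenlet

# get() hands the GreenletExit back as a return value instead of raising it,
# and the greenlet counts as having finished successfully.
outcome = g.get()
finished_ok = g.successful()
```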

The kill method can accept a custom exception to be raised:

>>> g = MyNoopGreenlet.spawn(5) # spawn() creates a Greenlet and starts it
>>> g.kill(Exception("A time to kill"))
Traceback (most recent call last):
 ...
Exception: A time to kill
MyNoopGreenlet(5) failed with Exception

The kill method can also accept a timeout argument specifying the number of seconds to wait for the greenlet to exit. Note that kill cannot guarantee that the target greenlet will not ignore the exception, so it is a good idea to always pass a timeout to kill.
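The following sketch shows why the timeout matters. The slow_to_die function is hypothetical: it catches GreenletExit and keeps going for a while, standing in for a greenlet that does not exit promptly.

```python
import gevent

def slow_to_die():
    try:
        gevent.sleep(10)
    except gevent.GreenletExit:
        # Hypothetical slow cleanup: the greenlet does not exit right away.
        gevent.sleep(0.2)

g = gevent.spawn(slow_to_die)
gevent.sleep(0)                 # let the greenlet start and block

# Without a timeout this call would wait for the whole cleanup;
# with one, it gives up waiting after about 0.05 seconds.
g.kill(timeout=0.05)
alive_after_kill = not g.dead

g.join()                        # the greenlet exits once its cleanup is done
dead_after_join = g.dead
```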

Timeouts

Many functions in the gevent API are synchronous, blocking the current greenlet until the operation is done. For example, kill waits until the target greenlet is dead before returning [1]. Many of those functions can be made asynchronous by passing the argument block=False.

Furthermore, many of the synchronous functions accept a timeout argument, which specifies a limit on how long the function can block (examples: Event.wait(), Greenlet.join(), Greenlet.kill(), AsyncResult.get(), and many more).

The socket and SSLObject instances can also have a timeout, set by the settimeout method.

When these are not enough, the Timeout class can be used to add timeouts to arbitrary sections of (yielding) code.
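A minimal sketch of the Timeout class used as a context manager, with gevent.sleep() standing in for any yielding code. The variable names are illustrative.

```python
import gevent
from gevent import Timeout

# Default behaviour: the Timeout itself is raised inside the block.
timed_out = False
try:
    with Timeout(0.05):
        gevent.sleep(1)
except Timeout:
    timed_out = True

# Passing False as the second argument makes the timeout silent:
# the block is abandoned and execution continues after it.
reached_end = False
with Timeout(0.05, False):
    gevent.sleep(1)
    reached_end = True
```

The timeout can only fire while the code inside the block yields to the Hub; a loop that never yields will not be interrupted.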

Further reading

To limit concurrency, use the Pool class (see example: dns_mass_resolve.py).
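As a rough illustration of limiting concurrency, the sketch below runs six tasks through a Pool of size 2 and records how many were active at once. The square function and the state dictionary are made up for this example; gevent.sleep() stands in for a network request.

```python
import gevent
from gevent.pool import Pool

state = {"active": 0, "peak": 0}

def square(n):
    state["active"] += 1
    state["peak"] = max(state["peak"], state["active"])
    gevent.sleep(0.01)          # stand-in for a network request
    state["active"] -= 1
    return n * n

# At most two greenlets from this pool run at the same time.
pool = Pool(2)
squares = pool.map(square, range(6))
```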

Gevent comes with TCP/SSL/HTTP/WSGI servers. See Implementing servers.

 

 

Posted 2014-12-31 16:23 by Ontseason