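Day 21: Python Multithreading and Coroutines

Multithreading in C++: https://www.runoob.com/cplusplus/cpp-multithreading.html

To do two things at the same time, you can use multiple threads or multiple processes.

The notes below summarize how to use threads and coroutines in Python.

Creating a thread

The thread-related module is threading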

# Import the threading module
import threading

# Return the current thread
t = threading.current_thread()
t

output:
<_MainThread(MainThread, started 25769803792)>

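By default a program runs in a single thread of a single process; this thread is called the main thread (MainThread).
A thread object exposes a number of methods and attributes: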

print(type(t))
print(t.name)        # thread name
print(t.ident)       # thread identifier
print(t.is_alive())  # whether the thread is still running

output:
<class 'threading._MainThread'>
MainThread
25769803792
True
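
As a small sketch of the point above, assuming CPython and nothing beyond the standard threading module, the following compares the current thread with the interpreter's main thread and lists the threads that are alive at the moment. The variable names are only illustrative.

```python
import threading

current = threading.current_thread()   # the thread running this code
main = threading.main_thread()         # the interpreter's main thread

# In a plain script both names refer to the same Thread object
print(current is main)

# enumerate() returns every Thread object that is currently alive
alive_names = [th.name for th in threading.enumerate()]
print(alive_names)
```

In a plain script the list usually contains only MainThread; inside a notebook or an IDE there may be extra helper threads.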

That was the default main thread. Here is how to create a thread of your own:

my_thread = threading.Thread()
# Create a named thread; the name is passed via the keyword argument name
my_thread = threading.Thread(name="my_thread")

def print_i(i):
    print('printing i: %d' % (i,))
# What the thread should do is passed as a callable via the target argument
my_thread = threading.Thread(target=print_i, args=(777,))

# Start the thread
my_thread.start()

# Inspect its attributes
print(type(my_thread))
print(my_thread.name)
print(my_thread.ident)
print(my_thread.is_alive())

output:
printing i: 777
<class 'threading.Thread'>
Thread-13
25777080560
False

As you can see:

the thread's ident is assigned automatically
a thread like this dies as soon as it has finished the job we gave it
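
A minimal sketch of that life cycle, assuming a hypothetical work() function that is not part of the original example: is_alive() is False before start(), and False again once join() has waited for the target function to return.

```python
import threading

results = []

def work(n):
    # The thread's whole job: compute one value and store it
    results.append(n * 2)

worker = threading.Thread(target=work, args=(21,), name="worker")
before_start = worker.is_alive()   # not started yet
worker.start()
worker.join()                      # block until work() has returned
after_join = worker.is_alive()     # the target function has finished

print(before_start, after_join, results)
```

Calling join() is what makes the final check reliable; without it, the main thread may look at the worker while it is still running.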

Now create a thread that runs a while(true) loop:

import time
my_thread = threading.Thread()
# Create a named thread; the name is passed via the keyword argument name
my_thread = threading.Thread(name="my_thread")

def print_i(i):
    while True:
        print('printing i: %d' % (i,))
        print(my_thread.name)
        print(my_thread.ident)
        print(my_thread.is_alive())
        time.sleep(2)
# What the thread should do is passed as a callable via the target argument
my_thread = threading.Thread(target=print_i, args=(777,))  # args supplies the argument i of print_i, as a tuple

# Start the thread
my_thread.start()

# Inspect its attributes
print(type(my_thread))
print(my_thread.name)
print(my_thread.ident)
print(my_thread.is_alive())

output:
printing i: 777
Thread-19
25777221568
True
<class 'threading.Thread'>
Thread-19
25777221568
True
printing i: 777
Thread-19
25777221568
True
printing i: 777
Thread-19
25777221568
True
printing i: 777
Thread-19
25777221568
True
printing i: 777
Thread-19
25777221568
True
printing i: 777
Thread-19
25777221568
True

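A thread is alive as long as it has not finished.
While my_thread is inside sleep(), execution switches to the main thread, which runs its remaining statements; once sleep(2) returns, my_thread carries on with its loop.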
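
The hand-over described above can be made visible with a small sketch. It assumes a threading.Event (not used in the original example) so that the main thread only proceeds once the worker is about to sleep; the names events and started are illustrative.

```python
import threading
import time

events = []
started = threading.Event()

def worker():
    events.append("worker: before sleep")
    started.set()        # tell the main thread we are about to sleep
    time.sleep(0.5)      # while sleeping, this thread lets other threads run
    events.append("worker: after sleep")

t = threading.Thread(target=worker)
t.start()
started.wait()           # wait until the worker has reached its sleep
events.append("main: runs while worker sleeps")
t.join()

print(events)
```

The main thread's entry lands between the worker's two entries, which is the same interleaving seen in the output above.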

The example above leads to the idea of time slices.

Taking turns to get CPU time slices

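import time
import threading
def print_time():
    for _ in range(7):
        time.sleep(0.5)
        print('current thread %s, finished printing at: %s\n' % (threading.current_thread().name, time.time()))

threads = [threading.Thread(name='t%d' % (i,), target=print_time) for i in range(3)]
# Create three threads that each print 7 times
[t.start() for t in threads]

output:
[None, None, None]
current thread t0, finished printing at: 1625925366.0244334

current thread t2, finished printing at: 1625925366.025522

current thread t1, finished printing at: 1625925366.026201

current thread t2, finished printing at: 1625925366.5365055
current thread t1, finished printing at: 1625925366.536608
current thread t0, finished printing at: 1625925366.5366366



current thread t1, finished printing at: 1625925367.0371497
current thread t2, finished printing at: 1625925367.0373096


current thread t0, finished printing at: 1625925367.0376115

current thread t2, finished printing at: 1625925367.5508695
current thread t1, finished printing at: 1625925367.550959
current thread t0, finished printing at: 1625925367.5510285



current thread t0, finished printing at: 1625925368.0537887
current thread t1, finished printing at: 1625925368.0540538
current thread t2, finished printing at: 1625925368.054124



current thread t0, finished printing at: 1625925368.5652854
current thread t2, finished printing at: 1625925368.5653586
current thread t1, finished printing at: 1625925368.5654306



current thread t0, finished printing at: 1625925369.0662143
current thread t1, finished printing at: 1625925369.066377
current thread t2, finished printing at: 1625925369.0664334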

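The operating system's scheduler decides which thread runs next, so the three threads t0, t1 and t2 take turns getting CPU time slices, and the order within each round is not fixed.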
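
One way to see that the threads really overlap is to time them. The sketch below is a scaled-down variant of the example above (shorter sleeps, a hypothetical nap() function, and join() added so the main thread can measure the total); exact timings will vary by machine.

```python
import threading
import time

def nap(times, delay):
    for _ in range(times):
        time.sleep(delay)   # a sleeping thread does not occupy the CPU

start = time.perf_counter()
threads = [threading.Thread(target=nap, args=(3, 0.1)) for _ in range(3)]
for th in threads:
    th.start()
for th in threads:
    th.join()
elapsed = time.perf_counter() - start

serial_estimate = 3 * 3 * 0.1   # cost of the same sleeps run one after another
print('elapsed %.2fs, serial estimate %.2fs' % (elapsed, serial_estimate))
```

The elapsed time should be close to that of a single thread, because the three threads sleep at the same time.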

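Competing for a global variable

A global variable is shared by every live thread in the current process. When several threads share one global variable, they race for it: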

import threading
import time
a = 7
def add1():
    global a
    time.sleep(5)
    a += 1

    print('%s  adds a to 1: %d\n' % (threading.current_thread().name, a))

threads = [threading.Thread(name='myt%d' % (i,), target=add1) for i in range(10)]
[t.start() for t in threads]

output:
[None, None, None, None, None, None, None, None, None, None]
myt0  adds a to 1: 8
myt3  adds a to 1: 9

myt8  adds a to 1: 10

myt6  adds a to 1: 11

myt2  adds a to 1: 12

myt1  adds a to 1: 13


myt4  adds a to 1: 14

myt7  adds a to 1: 15
myt5  adds a to 1: 16
myt9  adds a to 1: 17

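In a multithreaded program, any code that reads and modifies a global variable without protection is not thread safe.

That said, some races expose a problem only with very low probability.

The problem shows up when one thread has modified the global variable a and another thread is still working with the value from before the modification.

However, a = a + 1 takes so little time that the window in which another thread can cut in between the read and the write is tiny. In almost every run each thread sees the latest value, so the problem rarely shows.

In real code, though, nobody uses threads just to run a = a + 1. More likely, the work done in each thread takes a while.

So knowing how to write thread-safe code matters a great deal.

Here is what it looks like when the problem does show up: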

import threading
import time
a = 7
def add1():
    global a
    temp = a+1
    time.sleep(1)
    a = temp
    print('%s  adds a to 1: %d\n' % (threading.current_thread().name, a))

threads = [threading.Thread(name='myt%d' % (i,), target=add1) for i in range(10)]
[t.start() for t in threads]

output:
[None, None, None, None, None, None, None, None, None, None]
myt8  adds a to 1: 8

myt4  adds a to 1: 8

myt0  adds a to 1: 8

myt9  adds a to 1: 8

myt1  adds a to 1: 8
myt7  adds a to 1: 8

myt6  adds a to 1: 8

myt5  adds a to 1: 8
myt2  adds a to 1: 8



myt3  adds a to 1: 8

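After all 10 threads have run, a holds the result of just one thread's work. Why?

In the first thread there is a 1 s sleep() before a = temp.

While that thread is sleeping, the CPU is handed to the other threads straight away.

All ten threads together need far less than 1 s to run the statements before sleep(), so every thread has already computed temp = 7 + 1 by the time the first one wakes up.

Each thread then writes 8 back to a, which gives the result above.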
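
The same lost update can be reproduced without relying on timing. The sketch below swaps the sleep() for a threading.Barrier, which is an assumption of this illustration and not part of the original example: no thread may write until all ten have read.

```python
import threading

counter = 7
barrier = threading.Barrier(10)

def add_one():
    global counter
    temp = counter + 1    # every thread reads the original value 7
    barrier.wait()        # nobody writes until all 10 threads have read
    counter = temp        # all 10 threads write the same value, 8

threads = [threading.Thread(target=add_one) for _ in range(10)]
for th in threads:
    th.start()
for th in threads:
    th.join()

print(counter)  # 8, not 17
```

Because the barrier forces the bad interleaving on every run, it is a convenient way to test whether a fix actually works.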

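As in C++, a lock makes the read-modify-write sequence behave like one atomic operation and avoids the problem. Python provides a locking mechanism as well.

Locking (atomic operations)

Acquire the lock with locka.acquire() and release it with locka.release().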

import threading
import time
locka = threading.Lock()  # acquire with locka.acquire(), release with locka.release()
a = 7
def add1():
    global a
    locka.acquire()  # acquire before entering try, so finally only releases a lock that is held
    try:
        temp = a+1
        time.sleep(1)
        a = temp
    finally:
        locka.release()  # release the lock
    print('%s  adds a to 1: %d\n' % (threading.current_thread().name, a))

threads = [threading.Thread(name='myt%d' % (i,), target=add1) for i in range(10)]
[t.start() for t in threads]

output:
[None, None, None, None, None, None, None, None, None, None]
myt0  adds a to 1: 8

myt1  adds a to 1: 9

myt2  adds a to 1: 10

myt3  adds a to 1: 11

myt4  adds a to 1: 12

myt5  adds a to 1: 13

myt6  adds a to 1: 14

myt7  adds a to 1: 15

myt8  adds a to 1: 16

myt9  adds a to 1: 17
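
For comparison, here is a sketch of the same example using the lock as a context manager. A threading.Lock supports the with statement, which acquires on entry and releases on exit even if the body raises. The sleep is shortened and join() is added so the final value can be read; both are changes made for this illustration.

```python
import threading
import time

lock = threading.Lock()
a = 7

def add1():
    global a
    with lock:              # acquired here, released on leaving the block
        temp = a + 1
        time.sleep(0.01)    # shortened from the 1 s used above
        a = temp

threads = [threading.Thread(target=add1) for _ in range(10)]
for th in threads:
    th.start()
for th in threads:
    th.join()

print(a)  # 17
```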

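In this example the lock does its job.
But the threads now run the locked section one after another, so multithreading gains nothing here and only adds overhead.
With a single lock in the program, try...finally guarantees that the lock is released, so no deadlock occurs. With several locks, deadlock becomes easy to run into, so learning how to avoid it is a must.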
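
One common way to avoid deadlock with several locks is to always acquire them in the same global order. The sketch below is only an illustration under that assumption; the accounts, the transfer() function and the alphabetical ordering rule are all hypothetical.

```python
import threading

locks = {"x": threading.Lock(), "y": threading.Lock()}
balance = {"x": 100, "y": 100}

def transfer(src, dst, amount):
    # Fixed global order: always lock the alphabetically smaller account first,
    # whichever direction the transfer goes
    first, second = sorted((src, dst))
    with locks[first], locks[second]:
        balance[src] -= amount
        balance[dst] += amount

def many(src, dst):
    for _ in range(1000):
        transfer(src, dst, 1)

t1 = threading.Thread(target=many, args=("x", "y"))
t2 = threading.Thread(target=many, args=("y", "x"))
t1.start()
t2.start()
t1.join()
t2.join()

print(balance)
```

If each thread instead locked its own src first and dst second, t1 could hold x while waiting for y and t2 could hold y while waiting for x, and neither would ever proceed.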

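Efficient coroutines

Phew, coroutines: a new term and a new concept.

Suppose the following happens within a single thread:

function A is suspended partway through and passes some data to function B;
B receives the data and starts running, and after a while sends some data back to A;
the two keep alternating like this...

This calling pattern is called a coroutine (different functions in the same thread taking turns and cooperating to complete a task).

A coroutine switch happens between functions within one thread, not between threads,
so it is cheaper than a thread switch. Python's asynchronous operations are built on this coroutine mechanism.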
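
As a rough sketch of what that looks like with the standard asyncio module (not used elsewhere in this post), two coroutines below take turns inside one thread. The names worker and log are illustrative; await asyncio.sleep(0) is simply a way to hand control back to the event loop.

```python
import asyncio

log = []

async def worker(name, steps):
    for i in range(steps):
        log.append('%s%d' % (name, i))
        await asyncio.sleep(0)   # give the other coroutine a turn

async def main():
    # Run both coroutines concurrently in the same thread
    await asyncio.gather(worker('A', 3), worker('B', 3))

asyncio.run(main())
print(log)
```

No lock is needed around log: a coroutine can only be switched out at an await, so each append runs without interruption.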

# Run the following functions in the main thread

def A():
    a_list = ['1', '2', '3']
    for to_b in a_list:
        from_b = yield to_b
        print('receive %s from B' % (from_b,))
        print('do some complex process for A during 200ms ')
        
def B(a):
    from_a = a.send(None)
    print('response %s from A ' % (from_a,))
    print('B is analyzing data from A')
    b_list = ['x', 'y', 'z']
    try:
        for to_a in b_list:
            from_a = a.send(to_a)
            print('response %s from A ' % (from_a,))
            print('B is analyzing data from A')
    except StopIteration:
        print('---from a done---')
    finally:
        a.close()
        
# Call
a = A()
B(a)

output:
response 1 from A 
B is analyzing data from A
receive x from B
do some complex process for A during 200ms 
response 2 from A 
B is analyzing data from A
receive y from B
do some complex process for A during 200ms 
response 3 from A 
B is analyzing data from A
receive z from B
do some complex process for A during 200ms 
---from a done---

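Walking through the execution:

1. a.send(None) starts function A, which runs up to yield to_b, hands to_b to function B, and is suspended;
2. from_a is the to_b value that A yielded in the previous step, and B goes on to analyze it;
3. when B reaches a.send(to_a), it sends the processed to_a value to A;
4. from_b receives the value sent by B; A processes it (the example only prints a message about 200 ms of work), then yields the next to_b to B and is suspended again;
5. steps 2, 3 and 4 repeat;
6. when A has nothing left to yield, it returns, a.send() raises StopIteration in B, and the program ends.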

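Multithreading is a programming model in which threads compete for time slices. Reads and writes of global variables are controlled by acquiring and releasing locks, which makes deadlock easy to run into.
Coroutines need no locks, so they cannot deadlock. And because coroutines cooperate, they can efficiently handle tasks that the threaded model could only handle with several threads.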

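P.S. A function that contains yield defines a generator function:

Calling next() fetches the values one after another. yield works like return in that it suspends the flow and hands back the current value, but the function's state is kept,

so execution can keep resuming until a return in the generator function makes the generator object raise a StopIteration exception.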
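
A short sketch of that behavior, using a hypothetical numbers() generator: the value given to return ends up in the value attribute of the StopIteration exception.

```python
def numbers():
    yield 1
    yield 2
    return 'done'      # becomes the value attribute of StopIteration

gen = numbers()
values = [next(gen), next(gen)]   # the two yielded values

try:
    next(gen)          # nothing left to yield: the generator returns
except StopIteration as stop:
    final = stop.value

print(values, final)
```

A for loop over the generator catches StopIteration itself, which is why the exception is rarely seen in everyday code.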

https://www.cnblogs.com/PiaYie/p/14977379.html#_label0

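Generators have four main methods:

next(): runs the function until the next yield and returns the yielded value
send(value): sends a value into the generator; next() is equivalent to send(None)
close(): terminates the generator
throw(exc[, exc_value[, exc_tb]]): raises an exception at the yield where the generator is paused; close() amounts to raising a GeneratorExit exception there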
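
The four methods can be tried out on one small generator. This is only a sketch; the counter() function and the choice of ValueError as a reset signal are made up for illustration.

```python
def counter():
    total = 0
    while True:
        try:
            step = yield total
        except ValueError:
            total = 0                 # throw(ValueError) resets the counter
            continue
        total += 1 if step is None else step

gen = counter()
first = next(gen)                     # runs to the first yield
second = gen.send(5)                  # step = 5
third = next(gen)                     # same as send(None): step = None
reset = gen.throw(ValueError("reset"))  # raised at the paused yield, handled inside
gen.close()                           # raises GeneratorExit at the paused yield

try:
    next(gen)
    finished = False
except StopIteration:
    finished = True                   # a closed generator yields nothing more

print(first, second, third, reset, finished)
```

Note that the first call has to be next() or send(None): a value can only be sent in once the generator is paused at a yield.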

posted @ 2021-07-10 23:03 PiaYie
