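Python Study Notes After the Basics, Part 5: the GIL (Global Interpreter Lock), Mutex Locks, and Recursive Locks
In this section
1.The CPython GIL (global interpreter lock)
2.The user-added mutex threading.Lock()
3.The user-added recursive lock threading.RLock()
1.The CPython GIL (global interpreter lock)
Before formally introducing the GIL, here is the official explanation: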
In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython’s memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.) Note that Python's GIL is only really an issue for CPython, the reference implementation. Jython and IronPython don't have a GIL. As a Python developer, you don't generally come across the GIL unless you're writing a C extension. C extension writers need to release the GIL when their extensions do blocking I/O, so that other threads in the Python process get a chance to run.
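The core meaning of this official explanation: no matter how many threads are started, and whether or not the CPU is multi-core, CPython lets only one thread execute Python bytecode at any given moment.
CPython's thread library directly wraps the operating system's native threads, but the CPython interpreter as a whole is a single process in which only the thread that currently holds the GIL runs, while the other threads wait.
As a result, even on a multi-core CPU, multiple threads merely take turns through time-slicing (context switching).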
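As a small illustration of the time-slicing described above, the sketch below reads the interval at which CPython asks the running thread to give up the GIL, and starts a few native threads. It assumes Python 3.2 or later, where sys.getswitchinterval() exists; the names interval, seen and workers are made up for this example.

```python
import sys
import threading

# Interval (in seconds) after which CPython asks the running thread
# to release the GIL so that another thread can be scheduled.
interval = sys.getswitchinterval()
print("GIL switch interval:", interval)

# Each threading.Thread wraps a native OS thread, yet only the thread
# holding the GIL executes Python bytecode at any given moment.
seen = []
workers = [threading.Thread(target=lambda: seen.append(threading.get_ident()))
           for _ in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print("threads that ran:", len(seen))
```

The interval can be changed with sys.setswitchinterval(), but doing so only changes how often threads take turns; it does not let two threads run bytecode at the same time.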
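2.The user-added mutex threading.Lock()
The previous section briefly summarized the GIL: it is a lock that the CPython interpreter applies to threads automatically. This section covers when you need to add a lock yourself in your own code.
Look at this code: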
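# -*- coding: utf-8 -*-
# Author: 'Yang'
import threading
import time


def run(n):
    '''run: sleep 0.8 seconds, then add 1 to the global counter.'''
    global count      # global variable that the child threads modify
    time.sleep(0.8)   # sleeping does not occupy the CPU
    count += 1


count = 0     # global variable, defined in the main thread
t_objs = []   # empty list that stores the thread instances
for i in range(500):
    t = threading.Thread(target=run, args=("t-%s" % i,))   # child thread
    t.start()           # start the child thread
    t_objs.append(t)    # do not join here, so later threads can start without being blocked

# Loop over the thread instances and wait for all of them to finish
for t in t_objs:
    t.join()

print("all threads finished".center(50, "*"))
print("count:", count)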
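Read naively, this code should print 500 for count at the end. Actually running it, however, gives the following:
Four runs under Python 2.7 (Win7); the recorded output labels the counter num: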
***************all threads finished***************
('num:', 482)
>>> ================================ RESTART ================================
>>>
***************all threads finished***************
('num:', 486)
>>> ================================ RESTART ================================
>>>
***************all threads finished***************
('num:', 500)
>>> ================================ RESTART ================================
>>>
***************all threads finished***************
('num:', 493)
>>>
Four runs under Python 3.4 (Win7):
***************all threads finished***************
num: 500
>>> ================================ RESTART ================================
>>>
***************all threads finished***************
num: 500
>>> ================================ RESTART ================================
>>>
***************all threads finished***************
num: 500
>>> ================================ RESTART ================================
>>>
***************all threads finished***************
num: 500
>>>
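Note: on Python 3.4, repeated attempts all printed the correct result;
on Python 2.7, repeated runs produced different results.
Python 3.4 does not add a lock for you. Since Python 3.2 the GIL is handed over on a time interval rather than after a fixed number of bytecode instructions, which makes this race much harder to trigger but does not remove it. Setting that aside, what causes the behavior seen on Python 2.7?
Once the mechanism is understood, it is not hard to see: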

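Suppose there are two threads, py threading 1 and py threading 2, and both want to add 1 to count. count += 1 is not a single atomic step: a thread reads count, computes the new value, and writes it back, and the interpreter may switch threads in between. So both threads may well pick up the initial value count=0. Thread 1 computes 1, thread 2 also computes 1, and after both write their result back, count is 1 instead of 2. So what can be done?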
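To make the "read, compute, write back" steps visible, the sketch below disassembles a function that does count += 1 using the standard dis module. The function name increment is made up for this example, and the exact instruction names vary between Python versions, so treat the printed list as illustrative.

```python
import dis

count = 0


def increment():
    global count
    count += 1


# Collect the names of the bytecode instructions behind `count += 1`.
opnames = [ins.opname for ins in dis.get_instructions(increment)]
print(opnames)
```

The list contains a separate load of the global (LOAD_GLOBAL) and a later store (STORE_GLOBAL), with the addition in between. A thread switch between the load and the store is what lets one update overwrite another.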
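Following the same idea as the GIL, the fix is simple: whenever a thread needs to modify shared data, it can put a lock on that data so that nobody else modifies it halfway through. Other threads that want to modify the data must then wait until you have finished and released the lock before they can access it.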
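# -*- coding: utf-8 -*-
# Author: 'Yang'
import threading
import time


def run(n):
    '''run: take the lock, sleep 0.8 seconds, add 1 to the global counter, release the lock.'''
    lock.acquire()    # acquire the user lock
    global count      # global variable that the child threads modify
    time.sleep(0.8)   # sleeping does not occupy the CPU
    count += 1
    lock.release()    # release the user lock


lock = threading.Lock()   # create a user lock
count = 0     # global variable, defined in the main thread
t_objs = []   # empty list that stores the thread instances
for i in range(500):
    t = threading.Thread(target=run, args=("t-%s" % i,))   # child thread
    t.start()           # start the child thread
    t_objs.append(t)    # do not join here, so later threads can start without being blocked

# Loop over the thread instances and wait for all of them to finish
for t in t_objs:
    t.join()

print("all threads finished".center(50, "*"))
print("count:", count)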
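Running this code several times under Python 2.7 (Win7) always gives the correct result 500, but each run takes 500*0.8=400 seconds, because time.sleep(0.8) is also inside the locked region and the threads therefore sleep one after another.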
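The cost of sleeping while holding the lock can be measured on a smaller scale. The sketch below is an assumption-laden miniature of the situation above: DELAY and N_THREADS are shortened stand-ins for the 0.8 seconds and 500 threads in the text, and the helper names (sleep_inside_lock, sleep_outside_lock, timed) are invented for this example.

```python
import threading
import time

DELAY = 0.05      # shortened stand-in for the 0.8 s sleep in the text
N_THREADS = 5     # shortened stand-in for the 500 threads in the text

lock = threading.Lock()


def sleep_inside_lock():
    with lock:
        time.sleep(DELAY)   # other threads wait here, so the sleeps run one by one


def sleep_outside_lock():
    time.sleep(DELAY)       # the sleeps of all threads overlap
    with lock:
        pass                # only this short critical section is serialized


def timed(target):
    """Run N_THREADS threads on target and return the elapsed wall time."""
    threads = [threading.Thread(target=target) for _ in range(N_THREADS)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


inside = timed(sleep_inside_lock)
outside = timed(sleep_outside_lock)
print("sleep inside lock:  %.2fs" % inside)
print("sleep outside lock: %.2fs" % outside)
```

With the sleep inside the lock, the elapsed time is roughly N_THREADS * DELAY; with the sleep outside, it stays close to a single DELAY.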
Improvement:
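# -*- coding: utf-8 -*-
# Author: 'Yang'
import threading
import time


def run(n):
    '''run: sleep 0.8 seconds, then add 1 to the global counter while holding the lock.'''
    global count      # global variable that the child threads modify
    time.sleep(0.8)   # sleeping does not occupy the CPU
    lock.acquire()    # acquire the user lock
    count += 1
    lock.release()    # release the user lock


lock = threading.Lock()   # create a user lock
count = 0     # global variable, defined in the main thread
t_objs = []   # empty list that stores the thread instances
for i in range(500):
    t = threading.Thread(target=run, args=("t-%s" % i,))   # child thread
    t.start()           # start the child thread
    t_objs.append(t)    # do not join here, so later threads can start without being blocked

# Loop over the thread instances and wait for all of them to finish
for t in t_objs:
    t.join()

print("all threads finished".center(50, "*"))
print("count:", count)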
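The acquire()/release() pair around count += 1 can also be written with a with statement, since threading.Lock supports the context manager protocol. The sketch below is a variation on the code above, not the original author's version: it drops the sleep and uses a loop (the helper add_many and the numbers 8 and 10000 are chosen for this example) so that many increments compete for the lock.

```python
import threading

lock = threading.Lock()
count = 0


def add_many(times):
    global count
    for _ in range(times):
        with lock:          # acquire() on entry, release() on exit, even on error
            count += 1


threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("count:", count)   # 80000
```

The with form guarantees the lock is released if the protected code raises, which a bare acquire()/release() pair does not.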
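3.The user-added recursive lock threading.RLock()
Recursive lock: in essence, a big lock that contains sub-locks. A thread that already holds a threading.RLock() can acquire it again without blocking, and must release it the same number of times.
First, look at this code: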
# -*- coding: utf-8 -*-
# Author: 'Yang'
'''run3 takes the big lock; run1 and run2 are siblings, and each takes the lock as well.
After run3 acquires the lock it runs run1, which acquires the lock again and does its work;
once run1 releases the lock, run2 runs, acquires the lock, does its work and releases it.
'''
import threading


def run1():
    print("grab the first part data")
    lock.acquire()
    global num1
    num1 += 1
    lock.release()
    return num1


def run2():
    print("grab the second part data")
    lock.acquire()
    global num2
    num2 += 1
    lock.release()
    return num2


def run3():
    lock.acquire()
    res1 = run1()   # run3 holds the lock, and run1 takes the lock again
    print("between run1 and run2".center(50, "-"))
    res2 = run2()   # run3 holds the lock, and run2 takes the lock again
    lock.release()
    print(res1, res2)


if __name__ == "__main__":
    num1, num2 = 0, 0   # create two variables
    lock = threading.Lock()
    for i in range(10):
        t = threading.Thread(target=run3)   # every thread runs run3
        t.start()

    while threading.active_count() != 1:
        print(threading.active_count())
    else:
        print("all threads done".center(50, "*"))   # only one thread left: all child threads have finished
        print(num1, num2)
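The intent of this code is to start 10 threads that each run run3, which in turn executes run1 and run2, without the results of run1 and run2 interfering with each other.
Actually running it, however, ends in an endless loop (it keeps printing 11 and never exits). run3 already holds lock when run1 tries to acquire the same threading.Lock(), and a plain Lock cannot be acquired a second time by the thread that holds it. The first thread therefore blocks forever, the other nine threads wait behind it, and the main thread keeps printing threading.active_count(), which stays at 11.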
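The non-reentrant behavior of a plain Lock can be checked safely with a non-blocking acquire, which returns False instead of hanging. This is a minimal sketch for Python 3; the names plain and second_try are made up for this example.

```python
import threading

plain = threading.Lock()
plain.acquire()

# A second blocking acquire from the same thread would wait forever.
# Asking without blocking shows that the lock refuses the request.
second_try = plain.acquire(blocking=False)
print("Lock re-acquired by the same thread:", second_try)   # False

plain.release()
```

In the deadlocking program, run1 makes the blocking version of this second request, so it never returns.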
How should the code be changed?
Only one line needs to change: use a recursive lock instead.
# -*- coding: utf-8 -*-
# Author: 'Yang'
'''run3 takes the big lock; run1 and run2 are siblings, and each takes the lock as well.
After run3 acquires the lock it runs run1, which acquires the lock again and does its work;
once run1 releases the lock, run2 runs, acquires the lock, does its work and releases it.
'''
import threading


def run1():
    print("grab the first part data")
    lock.acquire()
    global num1
    num1 += 1
    lock.release()
    return num1


def run2():
    print("grab the second part data")
    lock.acquire()
    global num2
    num2 += 1
    lock.release()
    return num2


def run3():
    lock.acquire()
    res1 = run1()   # run3 holds the lock, and run1 takes the lock again
    print("between run1 and run2".center(50, "-"))
    res2 = run2()   # run3 holds the lock, and run2 takes the lock again
    lock.release()
    print(res1, res2)


if __name__ == "__main__":
    num1, num2 = 0, 0   # create two variables
    lock = threading.RLock()   # recursive lock
    for i in range(10):
        t = threading.Thread(target=run3)   # every thread runs run3
        t.start()

    while threading.active_count() != 1:
        print(threading.active_count())
    else:
        print("all threads done".center(50, "*"))   # only one thread left: all child threads have finished
        print(num1, num2)
Output:
grab the first part data
--------------between run1 and run2---------------
grab the second part data
1 1
grab the first part data
--------------between run1 and run2---------------
grab the second part data
2 2
grab the first part data
--------------between run1 and run2---------------
grab the second part data
3 3
grab the first part data
--------------between run1 and run2---------------
grab the second part data
4 4
grab the first part data
--------------between run1 and run2---------------
grab the second part data
5 5
grab the first part data
--------------between run1 and run2---------------
grab the second part data
6 6
grab the first part data
--------------between run1 and run2---------------
grab the second part data
7 7
grab the first part data
--------------between run1 and run2---------------
grab the second part data
8 8
grab the first part data
--------------between run1 and run2---------------
grab the second part data
9 9
grab the first part data
2
--------------between run1 and run2---------------
grab the second part data
2
2
2
2
2
2
2
2
2
2
10 10
2
*****************all threads done*****************
10 10
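The nested acquisition that RLock allows can be reduced to a smaller sketch. The version below is a variation under stated assumptions rather than the original program: it uses with blocks instead of acquire()/release(), waits with join() instead of polling threading.active_count(), and the names rlock, inner, outer and results are invented for this example.

```python
import threading

rlock = threading.RLock()
results = []


def inner():
    with rlock:              # the calling thread already owns rlock, so this succeeds
        return threading.current_thread().name


def outer():
    with rlock:              # first acquisition by this thread
        results.append(inner())


threads = [threading.Thread(target=outer, name="w%d" % i) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()                 # wait for completion without busy-polling
print(len(results))          # 10
```

Waiting with join() also avoids the stream of active_count() values that the polling loop mixes into the output above.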