Python's multithreaded concurrency degrades to near-serial execution on CPU-bound tasks

Why

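  1. Because of the GIL (global interpreter lock) in CPython, only one thread can execute Python bytecode at any given moment,
    so no matter how many cores the CPU has, multiple threads cannot run Python bytecode in parallel.
  2. A thread releases the GIL while it is blocked on I/O, so multithreading still gives reasonable concurrency for I/O-bound tasks.
  3. A thread running pure-Python CPU work never blocks, so it gives up the GIL only when the interpreter forces a switch
    (every 5 ms by default, see sys.getswitchinterval()). The threads take turns instead of running in parallel, and the total
    time is close to running the tasks serially. In the measurements below, the threaded runs take about 12 to 13% longer than the serial estimate.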
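
Point 2 can be seen in isolation by letting several threads block at the same time. The sketch below is only an illustration and is not one of the benchmark scripts in this post: it uses time.sleep as a stand-in for blocking I/O (sleep releases the GIL while waiting), and the names blocking_task and run_threads are made up for the example.

```python
import sys
import threading
import time


def blocking_task(seconds):
    # time.sleep releases the GIL while waiting, like a blocking I/O call would.
    time.sleep(seconds)


def run_threads(n_threads, seconds):
    """Start n_threads sleeping threads and return the wall-clock time for all of them."""
    threads = [threading.Thread(target=blocking_task, args=(seconds,))
               for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_threads(4, 0.2)
    # The four waits overlap, so this should be close to 0.2 s rather than 0.8 s.
    print(f"4 threads x 0.2 s blocking wait: {elapsed:.2f} s")
    # Interval after which a running thread is asked to give up the GIL.
    print(f"GIL switch interval: {sys.getswitchinterval()} s")
```

Replacing the sleep with a pure-Python loop removes the overlap, which is what the four scripts in this post measure.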

Thread pool version - run output

(base) sgj@sgj-laptop:~/my-ostep/chap26/cpu-busy$ python thread_pool.py
Seconds per round: 1.0589
Total rounds: 55
Total real time:66.0324
Total theory time: 1.0589 * 55 = 58.2401
Thread pool version
#!/usr/bin/env python

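import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import as_completed


def spin(duration):
    # Busy loop: the 1.18 * 10**7 factor makes spin(1) take about one second on my machine
    i = 0
    while i < 1.18 * duration * 10**7:
        i += 1
    return duration

def main():
    # For CPU-bound tasks, Python's multithreaded concurrency degrades to near-serial execution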
    t1 = time.time()
    spin(1)
    t2 = time.time()
    one = t2 - t1
    print(f"Seconds per round: {one:.4f}")

    durations = list(range(1, 11))
    with ThreadPoolExecutor() as e:
        start = time.time()
        futures = [e.submit(spin, x) for x in durations]
        gen = as_completed(futures)
        for f in gen:
            pass
        end = time.time()
        total_rounds = sum(durations)
        print(f"Total rounds: {total_rounds}")
        print(f"Total real time:{(end - start):.4f}")
        print(f"Total theory time: {one:.4f} * {total_rounds} = {(one * total_rounds):.4f}")

if __name__ == '__main__':
    main()
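
The loop over as_completed in the script above only waits for the futures and throws them away. If the return values matter, or if a worker might raise, each future can be asked for its result. This is a small sketch and not the author's script: the scale parameter and the collect function are hypothetical, added only so the demo finishes quickly.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def spin(duration, scale=10**4):
    # Scaled-down busy loop; `scale` is a hypothetical knob to keep the demo fast.
    i = 0
    while i < duration * scale:
        i += 1
    return duration


def collect(durations):
    """Run spin for every duration in a thread pool and gather the return values."""
    results = []
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(spin, d) for d in durations]
        for future in as_completed(futures):
            # result() returns spin's return value, or re-raises its exception.
            results.append(future.result())
    return results


if __name__ == "__main__":
    # Completion order is not guaranteed, so sort before printing.
    print(sorted(collect(range(1, 6))))
```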

Plain threading version - run output

(base) sgj@sgj-laptop:~/my-ostep/chap26$ python threads_threading.py
Seconds per round: 1.0434
Total real time:64.4664
Total rounds: 55
Total theory time: 1.0434 * 55 = 57.3856
Plain threading version
#!/usr/bin/env python

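import time
import threading


def spin(duration):
    # Busy loop: the 1.18 * 10**7 factor makes spin(1) take about one second on my machine
    i = 0
    while i < 1.18 * duration * 10**7:
        i += 1
    return duration

def main():
    # For CPU-bound tasks, Python's multithreaded concurrency degrades to near-serial execution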
    t1 = time.time()
    spin(1)
    t2 = time.time()
    one = t2 - t1
    print(f"Seconds per round: {one:.4f}")

    durations = list(range(1, 11))
    lst = list()
    for duration in durations:
        t = threading.Thread(target=spin, args=(duration,))
        lst.append(t)

    start = time.time()
    for t in lst:
        t.start()

    for t in lst:
        t.join()
    end = time.time()

    total_rounds = sum(durations)
    print(f"Total real time:{(end - start):.4f}")
    print('Total rounds:', sum(durations))
    print(f"Total theory time: {one:.4f} * {total_rounds} = {(one * total_rounds):.4f}")

if __name__ == '__main__':
    main()

Process pool version - run output

(base) sgj@sgj-laptop:~/my-ostep/chap26/cpu-busy$ python process_pool.py
Seconds per round: 1.1104
Total real time:22.4188
Total rounds: 55
Total theory time: 1.1104 * 55 = 61.0702
Process pool version
#!/usr/bin/env python

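import time
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures import as_completed


def spin(duration):
    # Busy loop: the 1.18 * 10**7 factor makes spin(1) take about one second on my machine
    i = 0
    while i < 1.18 * duration * 10**7:
        i += 1
    return duration

def main():
    # Python's multiprocessing model can run in parallel on a multi-core CPU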
    t1 = time.time()
    spin(1)
    t2 = time.time()
    one = t2 - t1
    print(f"Seconds per round: {one:.4f}")

    durations = list(range(1, 11))
    with ProcessPoolExecutor() as e:
        start = time.time()
        futures = [e.submit(spin, duration) for duration in durations]
        gen = as_completed(futures)
        for f in gen:
            pass
        end = time.time()
        total_rounds = sum(durations)
        print(f"Total real time:{(end - start):.4f}")
        print(f"Total rounds: {total_rounds}")
        print(f"Total theory time: {one:.4f} * {total_rounds} = {(one * total_rounds):.4f}")

if __name__ == '__main__':
    main()
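
Both pool scripts create their executor without max_workers, so the number of workers depends on the machine. The sketch below restates the defaults as documented for Python 3.8 and later (3.13 counts the CPUs available to the process instead of all CPUs); the two helper functions are made up for illustration and are not part of concurrent.futures.

```python
import os


def default_thread_workers(cpus):
    # Documented ThreadPoolExecutor default for Python 3.8+: min(32, cpu_count + 4).
    return min(32, cpus + 4)


def default_process_workers(cpus):
    # Documented ProcessPoolExecutor default: the number of processors
    # (Windows additionally caps this at 61; ignored in this sketch).
    return cpus


if __name__ == "__main__":
    for cpus in (8, os.cpu_count() or 1):
        print(f"{cpus} CPUs -> thread pool: {default_thread_workers(cpus)} workers, "
              f"process pool: {default_process_workers(cpus)} workers")
```

On an 8-core machine this gives 12 thread workers, so all 10 tasks start at once, but only 8 process workers, so two of the ten tasks wait for a free worker. The plain multiprocessing script, by contrast, starts all 10 processes immediately.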

Plain multiprocessing version - run output

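Python's multiprocessing model is not constrained by the GIL, because each process has its own interpreter and its own GIL. My code creates 10 processes but my machine has only 8 cores, so the total run time is 20.6785 seconds.
If my machine had 10 cores, I would expect the total run time to be around 15 seconds; the ideal floor is the longest task, spin(10), at about 10.6 seconds.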
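
As a rough sanity check on such estimates, independent tasks on a given number of cores can never finish faster than the longest single task, nor faster than the total work split evenly across the cores. The sketch below computes that floor under the assumption of identical cores with no scheduling overhead or frequency scaling; the function name ideal_makespan is made up for illustration.

```python
def ideal_makespan(task_seconds, cores):
    """Lower bound on wall-clock time for independent tasks on `cores` identical cores."""
    longest = max(task_seconds)
    evenly_split = sum(task_seconds) / cores
    return max(longest, evenly_split)


if __name__ == "__main__":
    per_round = 1.0555  # "Seconds per round" from the run output below
    tasks = [per_round * d for d in range(1, 11)]
    for cores in (8, 10):
        print(f"{cores} cores: at least {ideal_makespan(tasks, cores):.1f} s")
```

For both 8 and 10 cores the floor is set by the longest task, about 10.6 s. The measured 20.6785 s is roughly twice that, so the result is a bound and not a prediction.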

(base) sgj@sgj-laptop:~/my-ostep/chap26$ python process_multiprocessing.py
Seconds per round: 1.0555
Total real time:20.6785
Total rounds: 55
Total theory time: 1.0555 * 55 = 58.0537
Plain multiprocessing version
#!/usr/bin/env python

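import time
from multiprocessing import Process

def spin(duration):
    # Busy loop: the 1.18 * 10**7 factor makes spin(1) take about one second on my machine
    i = 0
    while i < 1.18 * duration * 10**7:
        i += 1
    return duration

def main():
    # Python's multiprocessing model can run in parallel on a multi-core CPU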
    t1 = time.time()
    spin(1)
    t2 = time.time()
    one = t2 - t1
    print(f"Seconds per round: {one:.4f}")

    durations = list(range(1, 11))
    lst = list()
    for duration in durations:
        p = Process(target=spin, args=(duration,))
        lst.append(p)

    start = time.time()
    for p in lst:
        p.start()

    for p in lst:
        p.join()
    end = time.time()

    total_rounds = sum(durations)
    print(f"Total real time:{(end - start):.4f}")
    print('Total rounds:', sum(durations))
    print(f"Total theory time: {one:.4f} * {total_rounds} = {(one * total_rounds):.4f}")

if __name__ == '__main__':
    main()
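
To compare the four runs side by side, the serial estimate ("Total theory time") can be divided by the measured time. The numbers below are copied from the run outputs in this post; the speedup helper is just an illustration.

```python
# (real seconds, serial estimate in seconds) copied from the four runs in this post
runs = {
    "thread pool": (66.0324, 58.2401),
    "plain threads": (64.4664, 57.3856),
    "process pool": (22.4188, 61.0702),
    "plain processes": (20.6785, 58.0537),
}


def speedup(real, serial_estimate):
    # Values above 1 mean faster than running the tasks one after another.
    return serial_estimate / real


if __name__ == "__main__":
    for name, (real, estimate) in runs.items():
        print(f"{name:16s} speedup = {speedup(real, estimate):.2f}x")
```

Both threaded versions come out slightly below 1 (about 0.88x and 0.89x), while the process-based versions reach about 2.72x and 2.81x.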
posted @ 2025-08-14 23:22  Guanjie255