Python Timing Modules

Timing modules

A brief note on some of Python's timing tools.

1. The timeit module

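The timeit module times the execution of small code snippets. It can be used in two ways: invoked directly from the command line, or imported and called like any other module.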

Command-line mode

yscl@yscl:~$ python3 -m timeit '"-".join(str(i) for i in range(100))'
10000 loops, best of 3: 19.2 usec per loop
yscl@yscl:~$ python3 -m timeit '"-".join([str(i) for i in range(100)])'
100000 loops, best of 3: 17 usec per loop
yscl@yscl:~$ python3 -m timeit '"-".join(map(str, range(100)))'
100000 loops, best of 3: 14 usec per loop

All three statements perform the same string join; in these runs the version that calls map took the least time.

Interactive mode

>>> import timeit
>>> timeit.timeit('"-".join(str(n) for n in range(100))', number=10000)
0.2559542650014919
>>> timeit.timeit('"-".join([str(n) for n in range(100)])', number=10000)
0.20574105799823883
>>> timeit.timeit('"-".join(map(str, range(100)))', number=10000)
0.1546621529996628

The command-line mode also offers several options. Only the three most common are listed here; see the official timeit documentation for the rest.

Options for the basic python -m timeit invocation:

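-n N, --number=N: how many times to execute the statement

-r N, --repeat=N: how many times to repeat the timer (a single measurement is easily skewed by other programs)

-s S, --setup=S: a statement executed once as initialization before timing (commonly used to import your own module so that your own functions can be timed)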
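For comparison, the three options above have counterparts in the Python API: -s corresponds to setup, -n to number, and -r to repeat. The following is a minimal sketch; the setup and statement strings are illustrative assumptions, not taken from the original post.

```python
import timeit

# Rough Python-API counterpart of:
#   python3 -m timeit -n 1000 -r 3 -s 'data = list(range(100))' '"-".join(map(str, data))'
# The setup and statement strings are illustrative only.
setup = "data = list(range(100))"      # -s: executed once per repetition, not timed
stmt = '"-".join(map(str, data))'      # the statement being timed

results = timeit.repeat(stmt, setup=setup, repeat=3, number=1000)  # -r 3, -n 1000
print(results)  # three floats, each the total seconds for 1000 executions
```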

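Another commonly used function in timeit is repeat. It is used the same way as timeit.timeit, with one extra parameter for the number of timer repetitions. Its definition is: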

timeit.repeat(stmt='pass', setup='pass', timer=<default timer>, repeat=5, number=1000000, globals=None)

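Parameters:
stmt: the code to time
setup: initialization code run before the timed code
timer: the timer function, time.perf_counter() by default
repeat: how many times to repeat the timer
number: how many times the code is executed per timer run
globals: the namespace in which the code is executed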

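Returns: a list containing the measured time of each repetition. Because other processes usually interfere with a run, the minimum is generally the value to look at.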
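As a small illustration of reading that list, the sketch below takes the minimum and converts it to a per-loop time, similar to what the command line prints. The statement and the repeat and number values are chosen only for illustration.

```python
import timeit

number = 10000
# Each entry is the total time in seconds for `number` executions.
times = timeit.repeat('"-".join(map(str, range(100)))', repeat=3, number=number)

best = min(times)                      # the run least disturbed by other processes
usec_per_loop = best / number * 1e6    # microseconds per single execution
print("{} loops, best of {}: {:.3g} usec per loop".format(number, len(times), usec_per_loop))
```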

import timeit

# Build a list in reverse order
code_append = """
lst = []
for i in range(100):
    lst.append(i)
lst = lst[::-1]
"""

code_insert = """
lst = []
for i in range(100):
    lst.insert(0, i)
"""

print(timeit.repeat(code_append, number=10000))
print(timeit.repeat(code_insert, number=10000))

[0.0996780889981892, 0.08902960799605353, 0.0855226429994218]
[0.18702545399719384, 0.19733670599816833, 0.16741226500016637]

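In this run, appending and then reversing once is roughly twice as fast as repeatedly calling insert(0, ...). Only three results appear because this was run on Python 3.5; since Python 3.7 the default number of repetitions is 5.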

Here is another version that uses the setup parameter to time a user-defined function:

>>> import timeit
>>> timeit.timeit('func()', setup='from test_timeit import func')
6.636295215001155
>>> timeit.repeat('func()', setup='from test_timeit import func')
[6.601007812001626, 6.5381397309975, 6.733538911998039]
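Since Python 3.5, the globals parameter offers an alternative to importing the function in setup. The sketch below is only an illustration: func is a hypothetical stand-in, because the body of test_timeit.func is not shown in this post.

```python
import timeit

def func():
    # Hypothetical stand-in; the original test_timeit.func is not shown.
    return sum(range(100))

# The dict passed as globals is the namespace the timed statement runs in,
# so no "from ... import func" setup string is needed.
# Passing globals=globals() would expose the whole current module instead.
elapsed = timeit.timeit('func()', globals={'func': func}, number=10000)
print(elapsed)  # total seconds for 10000 calls
```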

2. Profilers

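Python provides two modules for profiling programs: cProfile and profile. cProfile is a C extension with reasonable overhead, which makes it suitable for profiling long-running programs. profile is implemented in pure Python and adds noticeably more overhead. Both are used in the same way.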

Basic usage

import cProfile
import re
# The outer string is raw so that \d is not treated as an escape sequence
cProfile.run(r're.compile(r"\d+[abcde]")', sort='cumulative')

Running the code in the string produces the following output:

         160 function calls (157 primitive calls) in 0.000 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 re.py:222(compile)
        1    0.000    0.000    0.000    0.000 re.py:278(_compile)
        1    0.000    0.000    0.000    0.000 sre_compile.py:531(compile)
        1    0.000    0.000    0.000    0.000 sre_parse.py:819(parse)
      2/1    0.000    0.000    0.000    0.000 sre_compile.py:64(_compile)

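The first line reports that 160 function calls were monitored, 157 of them primitive (that is, not induced by recursion). The second line says the output is sorted by cumulative time, largest first (when no sort key is given, the output is sorted by standard name). The column headings mean: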

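ncalls: the number of calls. A value such as 2/1 means the function recursed: the first number is the total number of calls and the second is the number of primitive (non-recursive) calls
tottime: the total time spent in the function itself, excluding time spent in calls to sub-functions
percall: the average time per call (tottime / ncalls)
cumtime: the cumulative time spent in the function, including sub-function calls and recursive calls
percall: the average time per primitive call (cumtime / primitive calls)
filename:lineno(function): the file name, the line number where the function is defined, and the function name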
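To see the total/primitive form of ncalls on a small case, the sketch below profiles a recursive function. The function fact is a hypothetical example, not part of the original post; the report is captured in a string so it can be inspected.

```python
import cProfile
import io
import pstats

def fact(n):
    # Deliberately recursive so the profiler records nested calls.
    return 1 if n <= 1 else n * fact(n - 1)

profiler = cProfile.Profile()
profiler.runcall(fact, 5)      # one primitive call plus four recursive ones

buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).print_stats()
report = buffer.getvalue()
print(report)                  # the row for fact shows ncalls as 5/1
```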

The profilers can also save the statistics to a file: just pass a file name to the run function. The pstats module can then read the file and display the data in various formats.

Here is an example that profiles a program of my own. It simply counts the twin prime pairs up to N (pairs of primes that differ by 2).

# prime_pair.py

import math

def is_prime(num):

    if num == 1:
        return False
    elif num == 2 or num == 3:
        return True
    elif num % 2 == 0:
        return False
    n = int(math.sqrt(num))
    for i in range(3, n + 1, 2):
        if not(num % i):
            return False
    return True

def main():
    # num = int(input())
    num = 100000
    count = 0
    n = 3
    while n < num - 1:
        if is_prime(n):
            while is_prime(n + 2) and n + 2 <= num:
                count += 1
                n += 2
            n += 2
        n += 2
    print(count)

if __name__ == '__main__':
    main()

import cProfile
cProfile.run('from prime_pair import main; main()', sort='cumulative', filename='restats')

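Running the code above produces a file named restats, which can then be analyzed. Note that when filename is given, the statistics are written to the file instead of being printed, so the sort argument has no effect here.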

import pstats

p = pstats.Stats('restats')
p.strip_dirs().sort_stats('cumtime').print_stats(10)

1000215 function calls (1000209 primitive calls) in 3.044 seconds

   Ordered by: cumulative time
   List reduced from 93 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      2/1    0.001    0.000    3.044    3.044 {built-in method builtins.exec}
        1    0.000    0.000    3.044    3.044 <string>:1(<module>)
        1    0.158    0.158    3.044    3.044 prime_pair.py:19(main)
   499998    2.785    0.000    2.886    0.000 prime_pair.py:4(is_prime)
   499997    0.101    0.000    0.101    0.000 {built-in method math.sqrt}
      2/1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:966(_find_and_load)
      2/1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:939(_find_and_load_unlocked)
      2/1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:659(_load_unlocked)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap_external>:659(exec_module)
        2    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:879(_find_spec)

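In the code above, we first create a Stats object. strip_dirs() removes the leading path information from file names, sort_stats('cumtime') sorts the entries by the given key, and print_stats(10) prints only the first 10 entries. The results show that almost all of the program's time is spent in the primality test, so improving that algorithm would speed up the whole program. For the full list of pstats.Stats methods, see the official documentation.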
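As one possible direction for that optimization, the sketch below replaces per-number trial division with a sieve of Eratosthenes. This is a swapped-in technique rather than the author's code, and the name count_twin_primes is hypothetical. It counts pairs (p, p + 2) with p + 2 no greater than the limit, as main() does.

```python
def count_twin_primes(limit):
    """Count pairs (p, p + 2) where both are prime and p + 2 <= limit."""
    if limit < 5:
        return 0
    # Sieve of Eratosthenes: is_prime[i] ends up True exactly when i is prime.
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    i = 2
    while i * i <= limit:
        if is_prime[i]:
            for multiple in range(i * i, limit + 1, i):
                is_prime[multiple] = False
        i += 1
    # p runs up to limit - 2 so that p + 2 stays within the sieve.
    return sum(1 for p in range(3, limit - 1) if is_prime[p] and is_prime[p + 2])

print(count_twin_primes(100000))  # 1224, the same count prime_pair.py prints
```

Profiling this version the same way would show whether the sieve actually removes the hotspot on a given machine.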

One last note: the profilers can also be run directly from the command line.

python -m cProfile -s 'ncalls' prime_pair.py

1224
         149995 function calls in 0.206 seconds

   Ordered by: call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    49998    0.158    0.000    0.189    0.000 prime_pair.py:4(is_prime)
    49997    0.007    0.000    0.007    0.000 {math.sqrt}
    49997    0.024    0.000    0.024    0.000 {range}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.206    0.206 prime_pair.py:1(<module>)
        1    0.017    0.017    0.206    0.206 prime_pair.py:19(main)

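The command line has one more option worth knowing, -o, which names a file to write the statistics to. -s selects the sort order and only applies when -o is not given.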

References:

Official documentation: The Python Profilers

Official documentation: timeit

posted @ 2018-12-03 13:02 yscl