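At work I needed to process a large data set and found that one doubly nested loop was running very slowly. I looked for simple, convenient ways to measure how fast a piece of code runs and tried out two:
1) A decorator that measures a function's execution time
2) The standard-library module cProfile
A sample follows: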
# -*- coding: utf-8 -*-
# Note: this sample targets Python 2 (it uses the built-in cmp()).
import time
from functools import wraps
import cProfile


def fn_timer(function):
    # Decorator that prints the wall-clock time of each call.
    @wraps(function)
    def function_timer(*args, **kwargs):
        t0 = time.time()
        result = function(*args, **kwargs)
        t1 = time.time()
        print("Total time running %s: %s seconds" %
              (function.__name__, str(t1 - t0)))
        return result
    return function_timer


# string_match1..3 are timed with the decorator.
@fn_timer
def string_match1(a, b, n):
    for i in range(n):
        if a is b:  # identity test, not a content comparison
            c = 'true'
        else:
            c = 'false'


@fn_timer
def string_match2(a, b, n):
    for i in range(n):
        if a == b:
            c = 'true'
        else:
            c = 'false'


@fn_timer
def string_match3(a, b, n):
    for i in range(n):
        if cmp(a, b) == 0:  # cmp() returns 0 when the strings are equal
            c = 'true'
        else:
            c = 'false'


# string_match4..6 are the same loops, undecorated, for cProfile.
def string_match4(a, b, n):
    for i in range(n):
        if a is b:
            c = 'true'
        else:
            c = 'false'


def string_match5(a, b, n):
    for i in range(n):
        if a == b:
            c = 'true'
        else:
            c = 'false'


def string_match6(a, b, n):
    for i in range(n):
        if cmp(a, b) == 0:
            c = 'true'
        else:
            c = 'false'


if __name__ == "__main__":
    n = 10000000
    a = 'abcdef'
    b = 'bcdefg'
    string_match1(a, b, n)
    string_match2(a, b, n)
    string_match3(a, b, n)
    cProfile.run('string_match4(a, b, n)')
    cProfile.run('string_match5(a, b, n)')
    cProfile.run('string_match6(a, b, n)')
Running the code above produces:
Total time running string_match1: 0.505000114441 seconds
Total time running string_match2: 0.487000226974 seconds
Total time running string_match3: 1.15999984741 seconds
         4 function calls in 0.444 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.444    0.444 <string>:1(<module>)
        1    0.368    0.368    0.444    0.444 test1.py:43(string_match4)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.075    0.075    0.075    0.075 {range}

         4 function calls in 0.481 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.481    0.481 <string>:1(<module>)
        1    0.422    0.422    0.481    0.481 test1.py:49(string_match5)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.059    0.059    0.059    0.059 {range}

         10000004 function calls in 1.816 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    1.816    1.816 <string>:1(<module>)
        1    1.166    1.166    1.816    1.816 test1.py:56(string_match6)
 10000000    0.585    0.000    0.585    0.000 {cmp}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.065    0.065    0.065    0.065 {range}
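1) Of the three simple comparisons, a is b and a == b take about the same time (roughly 0.44 to 0.51 seconds for 10,000,000 iterations), while cmp() is much slower (1.16 to 1.82 seconds). The profile shows 10,000,000 calls to cmp() accounting for 0.585 seconds on their own. In my project, changing a single line to drop cmp() cut the execution time to roughly 1/3. Note that a is b tests object identity rather than equality, so it is not a safe replacement for a == b when comparing string contents.
2) cProfile shows the time spent in each function called internally, which makes it easy to track down bottlenecks in a large program.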
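
To illustrate point 2, here is a minimal sketch of profiling a call that has a sub-function and sorting the report by cumulative time with the standard pstats module. It is written for Python 3, unlike the sample above. The names inner_compare, string_match and profile_report are hypothetical helpers made up for this example and are not part of the original program.

```python
import cProfile
import io
import pstats


def inner_compare(a, b):
    # Hypothetical helper standing in for a sub-function of a larger program.
    return a == b


def string_match(a, b, n):
    # Same loop shape as the samples above, but calling a helper so the
    # profile contains more than one interesting entry.
    matches = 0
    for _ in range(n):
        if inner_compare(a, b):
            matches += 1
    return matches


def profile_report(func, *args, **kwargs):
    # Profile a single call and return (result, report text).
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and print at most the 5 most expensive entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    result, report = profile_report(string_match, "abcdef", "bcdefg", 100000)
    print(report)
```

Sorting by cumulative time puts the functions whose call trees cost the most at the top, which is usually the first place to look in a large program. The ncalls column then shows whether a cheap function is expensive only because it is called many times, as with cmp() above.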