python3.6中format函数_Python字符串格式:’%’比’format’函数效率更高?

我想比较不同以在Python中用不同的变量构建一个字符串:

>用于连接(称为’加’)

>使用%

>使用“”.join(列表)

>使用格式化功能

>使用“{0< attribute>}”.格式(对象)

我比较了3种类型的情景

>带有2个变量的字符串

>带有4个变量的字符串

>带有4个变量的字符串,每个变量使用两次

我每次测量100万次操作并且平均执行超过6次测量.我想出了以下时间:

在每个场景中,我得出以下结论

>连接似乎是最快的方法之一

>使用%格式化比使用格式化功能格式化要快得多

我认为格式比%好很多(例如在this question),%几乎已被弃用.

因此,我有几个问题:

>%真的比格式快吗?

>如果是这样,那为什么?

>为什么“{} {}”.format(var1,var2)比“{0.attribute1} {0.attribute2}”.格式(对象)更有效?

作为参考,我使用以下代码来测量不同的时序.

import time

def timing(f, n, show, *args):

if show: print f.__name__ + ":\t",

r = range(n/10)

t1 = time.clock()

for i in r:

f(*args); f(*args); f(*args); f(*args); f(*args); f(*args); f(*args); f(*args); f(*args); f(*args)

t2 = time.clock()

timing = round(t2-t1, 3)

if show: print timing

return timing

#Class

class values(object):

def __init__(self, a, b, c="", d=""):

self.a = a

self.b = b

self.c = c

self.d = d

def test_plus(a, b):

return a + "-" + b

def test_percent(a, b):

return "%s-%s" % (a, b)

def test_join(a, b):

return ''.join([a, '-', b])

def test_format(a, b):

return "{}-{}".format(a, b)

def test_formatC(val):

return "{0.a}-{0.b}".format(val)

def test_plus_long(a, b, c, d):

return a + "-" + b + "-" + c + "-" + d

def test_percent_long(a, b, c, d):

return "%s-%s-%s-%s" % (a, b, c, d)

def test_join_long(a, b, c, d):

return ''.join([a, '-', b, '-', c, '-', d])

def test_format_long(a, b, c, d):

return "{0}-{1}-{2}-{3}".format(a, b, c, d)

def test_formatC_long(val):

return "{0.a}-{0.b}-{0.c}-{0.d}".format(val)

def test_plus_long2(a, b, c, d):

return a + "-" + b + "-" + c + "-" + d + "-" + a + "-" + b + "-" + c + "-" + d

def test_percent_long2(a, b, c, d):

return "%s-%s-%s-%s-%s-%s-%s-%s" % (a, b, c, d, a, b, c, d)

def test_join_long2(a, b, c, d):

return ''.join([a, '-', b, '-', c, '-', d, '-', a, '-', b, '-', c, '-', d])

def test_format_long2(a, b, c, d):

return "{0}-{1}-{2}-{3}-{0}-{1}-{2}-{3}".format(a, b, c, d)

def test_formatC_long2(val):

return "{0.a}-{0.b}-{0.c}-{0.d}-{0.a}-{0.b}-{0.c}-{0.d}".format(val)

def test_plus_superlong(lst):

string = ""

for i in lst:

string += str(i)

return string

def test_join_superlong(lst):

return "".join([str(i) for i in lst])

def mean(numbers):

return float(sum(numbers)) / max(len(numbers), 1)

nb_times = int(1e6)

n = xrange(5)

lst_numbers = xrange(1000)

from collections import defaultdict

metrics = defaultdict(list)

list_functions = [

test_plus, test_percent, test_join, test_format, test_formatC,

test_plus_long, test_percent_long, test_join_long, test_format_long, test_formatC_long,

test_plus_long2, test_percent_long2, test_join_long2, test_format_long2, test_formatC_long2,

# test_plus_superlong, test_join_superlong,

]

val = values("123", "456", "789", "0ab")

for i in n:

for f in list_functions:

print ".",

name = f.__name__

if "formatC" in name:

t = timing(f, nb_times, False, val)

elif '_long' in name:

t = timing(f, nb_times, False, "123", "456", "789", "0ab")

elif '_superlong' in name:

t = timing(f, nb_times, False, lst_numbers)

else:

t = timing(f, nb_times, False, "123", "456")

metrics[name].append(t)

#Get Average

print "\n===AVERAGE OF TIMINGS==="

for f in list_functions:

name = f.__name__

timings = metrics[name]

print "{:>20}:\t{:0.5f}".format(name, mean(timings))

解决方法:

>是的,%字符串格式化比.format方法快

>最有可能(这可能有更好的解释),因为%是一个语法符号(因此快速执行),而.format涉及至少一个额外的方法调用

>因为属性值访问还涉及额外的方法调用,即. __getattr__

我使用各种格式化方法的timeit进行了稍微好一点的分析(在Python 3.6.0上),其结果如下(漂亮地用BeautifulTable打印) –

+-----------------+-------+-------+-------+-------+-------+--------+

| Type \ num_vars | 1 | 2 | 5 | 10 | 50 | 250 |

+-----------------+-------+-------+-------+-------+-------+--------+

| f_str_str | 0.306 | 0.064 | 0.106 | 0.183 | 0.737 | 3.422 |

+-----------------+-------+-------+-------+-------+-------+--------+

| f_str_int | 0.295 | 0.174 | 0.385 | 0.686 | 3.378 | 16.399 |

+-----------------+-------+-------+-------+-------+-------+--------+

| concat_str | 0.012 | 0.053 | 0.156 | 0.31 | 1.707 | 16.762 |

+-----------------+-------+-------+-------+-------+-------+--------+

| pct_s_str | 0.056 | 0.178 | 0.275 | 0.469 | 1.872 | 9.191 |

+-----------------+-------+-------+-------+-------+-------+--------+

| pct_s_int | 0.128 | 0.208 | 0.343 | 0.605 | 2.483 | 13.24 |

+-----------------+-------+-------+-------+-------+-------+--------+

| dot_format_str | 0.418 | 0.217 | 0.343 | 0.58 | 2.241 | 11.163 |

+-----------------+-------+-------+-------+-------+-------+--------+

| dot_format_int | 0.416 | 0.277 | 0.476 | 0.811 | 3.378 | 17.829 |

+-----------------+-------+-------+-------+-------+-------+--------+

| dot_format2_str | 0.433 | 0.242 | 0.416 | 0.675 | 3.152 | 16.783 |

+-----------------+-------+-------+-------+-------+-------+--------+

| dot_format2_int | 0.428 | 0.298 | 0.541 | 0.933 | 4.444 | 24.767 |

+-----------------+-------+-------+-------+-------+-------+--------+

尾随_str& _int表示对各个值类型执行的操作.

请注意,单个变量的concat_str结果基本上只是字符串本身,所以不应该考虑它.

我得到的结果 –

from timeit import timeit

from beautifultable import BeautifulTable # pip install beautifultable

times = {}

for num_vars in (1, 2, 5, 10, 50, 250):

f_str = "f'{" + '}{'.join([f'x{i}' for i in range(num_vars)]) + "}'"

# "f'{x0}{x1}"

concat = '+'.join([f'x{i}' for i in range(num_vars)])

# 'x0+x1'

pct_s = '"' + '%s'*num_vars + '" % (' + ','.join([f'x{i}' for i in range(num_vars)]) + ')'

# '"%s%s" % (x0,x1)'

dot_format = '"' + '{}'*num_vars + '".format(' + ','.join([f'x{i}' for i in range(num_vars)]) + ')'

# '"{}{}".format(x0,x1)'

dot_format2 = '"{' + '}{'.join([f'{i}' for i in range(num_vars)]) + '}".format(' + ','.join([f'x{i}' for i in range(num_vars)]) + ')'

# '"{0}{1}".format(x0,x1)'

vars = ','.join([f'x{i}' for i in range(num_vars)])

vals_str = tuple(map(str, range(num_vars)))

setup_str = f'{vars} = {vals_str}'

# "x0,x1 = ('0', '1')"

vals_int = tuple(range(num_vars))

setup_int = f'{vars} = {vals_int}'

# 'x0,x1 = (0, 1)'

times[num_vars] = {

'f_str_str': timeit(f_str, setup_str),

'f_str_int': timeit(f_str, setup_int),

'concat_str': timeit(concat, setup_str),

# 'concat_int': timeit(concat, setup_int), # this will be summation, not concat

'pct_s_str': timeit(pct_s, setup_str),

'pct_s_int': timeit(pct_s, setup_int),

'dot_format_str': timeit(dot_format, setup_str),

'dot_format_int': timeit(dot_format, setup_int),

'dot_format2_str': timeit(dot_format2, setup_str),

'dot_format2_int': timeit(dot_format2, setup_int),

}

table = BeautifulTable()

table.column_headers = ['Type \ num_vars'] + list(map(str, times.keys()))

# Order is preserved, so I didn't worry much

for key in ('f_str_str', 'f_str_int', 'concat_str', 'pct_s_str', 'pct_s_int', 'dot_format_str', 'dot_format_int', 'dot_format2_str', 'dot_format2_int'):

table.append_row([key] + [times[num_vars][key] for num_vars in (1, 2, 5, 10, 50, 250)])

print(table)

因为timeit的一些最大参数(255)限制,我无法超越num_vars = 250.

tl; dr – Python字符串格式化性能:f-strings最快,更优雅,但有时​​(由于某些implementation restrictions和仅为Py3.6),您可能必须使用其他格式化选项.

标签:python,performance

来源: https://codeday.me/bug/20191004/1854154.html

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值