Which is the preferred way to concatenate a string in Python?

This article is translated from: Which is the preferred way to concatenate a string in Python?

Since Python's strings can't be changed, I was wondering how to concatenate strings more efficiently.

I can write it like this:

s += stringfromelsewhere

or like this:

s = []
s.append(somestring)

later

s = ''.join(s)

While writing this question, I found a good article on the topic.

http://www.skymind.com/~ocrow/python_string/

But it covers Python 2.x, so the question is: did something change in Python 3?


Reply #1

Reference: https://stackoom.com/question/p3vj/在Python中连接字符串的首选方法是什么


Reply #2

The recommended method is still to use append and join.


Reply #3

If you are concatenating a lot of values, then neither. Appending to a list is expensive. You can use StringIO for that, especially if you are building the string up over a lot of operations.

try:
    from cStringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3

buf = StringIO()

buf.write('foo')
buf.write('foo')
buf.write('foo')

buf.getvalue()
# 'foofoofoo'

If you already have a complete list returned to you from some other operation, then just use ''.join(aList).
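As a small illustration of that last point, here is a sketch in which read_fields is a made-up helper standing in for "some other operation" that hands back a list of strings:

```python
# Hypothetical helper standing in for "some other operation" that returns a list.
def read_fields():
    return ["2024", "01", "15"]


a_list = read_fields()

# The list already exists, so one join call is all that is needed.
result = "".join(a_list)
print(result)
```

No intermediate buffer is needed here, because the pieces are already collected.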

From the Python FAQ: What is the most efficient way to concatenate many strings together?

str and bytes objects are immutable, therefore concatenating many strings together is inefficient as each concatenation creates a new object. In the general case, the total runtime cost is quadratic in the total string length.

To accumulate many str objects, the recommended idiom is to place them into a list and call str.join() at the end:

chunks = []
for s in my_strings:
    chunks.append(s)
result = ''.join(chunks)

(another reasonably efficient idiom is to use io.StringIO)

To accumulate many bytes objects, the recommended idiom is to extend a bytearray object using in-place concatenation (the += operator):

result = bytearray()
for b in my_bytes_objects:
    result += b
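A runnable version of the bytearray idiom might look like the sketch below. The sample chunks are invented for illustration, and the final conversion to bytes is an extra step not mentioned in the FAQ text quoted here:

```python
# Hypothetical chunks, e.g. as they might arrive from a socket or a file.
my_bytes_objects = [b"abc", b"def", b"ghi"]

result = bytearray()
for b in my_bytes_objects:
    result += b  # extends the bytearray in place

# Convert to an immutable bytes object once accumulation is done.
data = bytes(result)
print(data)

# bytes.join gives the same result when all chunks are already in a list.
assert data == b"".join(my_bytes_objects)
```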

Edit: I was silly and had the results pasted backwards, making it look like appending to a list was faster than cStringIO. I have also added tests for bytearray/str concat, as well as a second round of tests using a larger list with larger strings. (python 2.7.3)

ipython test example for large lists of strings

try:
    from cStringIO import StringIO
except ImportError:
    from io import StringIO

source = ['foo']*1000

%%timeit buf = StringIO()
for i in source:
    buf.write(i)
final = buf.getvalue()
# 1000 loops, best of 3: 1.27 ms per loop

%%timeit out = []
for i in source:
    out.append(i)
final = ''.join(out)
# 1000 loops, best of 3: 9.89 ms per loop

%%timeit out = bytearray()
for i in source:
    out += i
# 10000 loops, best of 3: 98.5 µs per loop

%%timeit out = ""
for i in source:
    out += i
# 10000 loops, best of 3: 161 µs per loop

## Repeat the tests with a larger list, containing
## strings that are bigger than the small string caching 
## done by Python
source = ['foo']*1000

# cStringIO
# 10 loops, best of 3: 19.2 ms per loop

# list append and join
# 100 loops, best of 3: 144 ms per loop

# bytearray() +=
# 100 loops, best of 3: 3.8 ms per loop

# str() +=
# 100 loops, best of 3: 5.11 ms per loop
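For readers on Python 3 without IPython, the first round of tests above can be approximated with the standard timeit module. This is a rough sketch, not the original benchmark: the function names are made up here, the run count of 200 is arbitrary, and the bytearray variant encodes each piece because a Python 3 bytearray does not accept str.

```python
import io
import timeit

source = ["foo"] * 1000  # same shape as the first test above


def with_stringio(parts):
    buf = io.StringIO()
    for p in parts:
        buf.write(p)
    return buf.getvalue()


def with_list_join(parts):
    out = []
    for p in parts:
        out.append(p)
    return "".join(out)


def with_plus_equals(parts):
    out = ""
    for p in parts:
        out += p
    return out


def with_bytearray(parts):
    # On Python 3 a bytearray only accepts bytes, so encode each piece.
    out = bytearray()
    for p in parts:
        out += p.encode("utf-8")
    return out.decode("utf-8")


candidates = [with_stringio, with_list_join, with_plus_equals, with_bytearray]

# Every approach should build the same string before speed is compared.
expected = "foo" * 1000
for func in candidates:
    assert func(source) == expected

for func in candidates:
    seconds = timeit.timeit(lambda: func(source), number=200)
    print(f"{func.__name__:>18}: {seconds:.4f} s for 200 runs")
```

The numbers will differ by interpreter version and machine, so they should not be expected to match the Python 2.7.3 figures quoted above.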

Reply #4

While somewhat dated, Code Like a Pythonista: Idiomatic Python recommends join() over +. So does PythonSpeedPerformanceTips in its section on string concatenation, with the following disclaimer:

The accuracy of this section is disputed with respect to later versions of Python. In CPython 2.5, string concatenation is fairly fast, although this may not apply likewise to other Python implementations. See ConcatenationTestCode for a discussion.


Reply #5

The best way of appending a string to a string variable is to use + or +=. This is because it's readable and fast. They are also just as fast as each other; which one you choose is a matter of taste, though the latter is the most common. Here are timings with the timeit module:

a = a + b:
0.11338996887207031
a += b:
0.11040496826171875
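The answer does not show the exact statements that were timed, so the following is only a guess at how such a comparison could be set up with timeit. The setup string and the run count are assumptions:

```python
import timeit

# Assumed setup: the original answer does not show how a and b were defined.
setup = "a = ''; b = 'x'"
runs = 20_000

# Each statement runs `runs` times, so a grows by one character per run.
t_plus = timeit.timeit("a = a + b", setup=setup, number=runs)
t_iadd = timeit.timeit("a += b", setup=setup, number=runs)

print(f"a = a + b: {t_plus:.4f} s")
print(f"a += b:    {t_iadd:.4f} s")
```

Absolute values depend on the machine and interpreter; only the relative difference between the two lines is of interest.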

However, those who recommend having lists and appending to them and then joining those lists, do so because appending a string to a list is presumably very fast compared to extending a string. And this can be true, in some cases. Here, for example, is one million appends of a one-character string, first to a string, then to a list:

a += b:
0.10780501365661621
a.append(b):
0.1123361587524414

OK, turns out that even when the resulting string is a million characters long, appending to the string was still faster.

Now let's try appending a thousand-character string a hundred thousand times:

a += b:
0.41823482513427734
a.append(b):
0.010656118392944336

The end string, therefore, ends up being about 100MB long. That was pretty slow; appending to a list was much faster. But that timing doesn't include the final a.join(). So how long would that take?

a.join(a):
0.43739795684814453

Oops. Turns out even in this case, append/join is slower.
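To repeat this kind of measurement yourself, a sketch along the following lines could be used. It is not the code behind the timings above: the helper names are invented, and the repeat count is scaled down from 100,000 to 10,000 so that it finishes quickly and builds a string of about 10MB instead of 100MB.

```python
import timeit

piece = "x" * 1000   # a thousand-character string, as in the text
count = 10_000       # scaled down from 100,000 to keep the run short (assumption)


def grow_string():
    a = ""
    for _ in range(count):
        a += piece
    return a


def grow_list():
    a = []
    for _ in range(count):
        a.append(piece)
    return a


parts = grow_list()

t_concat = timeit.timeit(grow_string, number=1)
t_append = timeit.timeit(grow_list, number=1)
t_join = timeit.timeit(lambda: "".join(parts), number=1)

# Both routes must produce the same final string.
assert grow_string() == "".join(parts)

print(f"a += b:         {t_concat:.4f} s")
print(f"a.append(b):    {t_append:.4f} s")
print(f"''.join(a):     {t_join:.4f} s")
print(f"append + join:  {t_append + t_join:.4f} s")
```

The fair comparison is between the first line and the last one, since a list of pieces is not yet a string.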

So where does this recommendation come from? Python 2?

a += b:
0.165287017822
a.append(b):
0.0132720470428
a.join(a):
0.114929914474

Well, append/join is marginally faster there if you are using extremely long strings (which you usually aren't; why would you have a string that's 100MB in memory?).

But the real clincher is Python 2.3, where I won't even show you the timings, because it's so slow that it hasn't finished yet. These tests suddenly take minutes. Except for the append/join, which is just as fast as under later Pythons.

Yup. String concatenation was very slow in Python back in the stone age. But on 2.4 it isn't anymore (or at least on Python 2.4.7), so the recommendation to use append/join became outdated in 2008, when Python 2.3 stopped being updated, and you should have stopped using it. :-)

(Update: When I did the testing more carefully, it turned out that using + and += is faster for two strings on Python 2.3 as well. The recommendation to use ''.join() must be a misunderstanding.)

However, this is CPython. Other implementations may have other concerns. And this is just yet another reason why premature optimization is the root of all evil. Don't use a technique that's supposedly "faster" unless you first measure it.

Therefore the "best" way to do string concatenation is to use + or +=. And if that turns out to be slow for you, which is pretty unlikely, then do something else.

So why do I use a lot of append/join in my code? Because sometimes it's actually clearer. Especially when whatever you concatenate should be separated by spaces or commas or newlines.
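The readability point about separators can be seen in a short sketch. The word list is made up for the example:

```python
# Hypothetical pieces that need a separator between them.
words = ["spam", "eggs", "ham"]

# join puts the separator only between items, never at the ends.
csv_line = ", ".join(words)
text_block = "\n".join(words)

# The += version needs extra bookkeeping to avoid a stray separator.
manual = ""
for i, word in enumerate(words):
    if i:
        manual += ", "
    manual += word

print(csv_line)
print(manual == csv_line)
```

Both versions produce the same string, but the join call states the intent in one line.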


Reply #6

Using in-place string concatenation with '+' is THE WORST method of concatenation in terms of stability and cross-implementation behavior, as it does not support all values. The PEP8 standard discourages this and encourages the use of format(), join() and append() for long-term use.
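One possible reading of "does not support all values" is that + only accepts str operands, while format() and join() can be combined with a conversion step. The sketch below illustrates that reading only; the variable names are invented and it makes no claim about what PEP 8 says:

```python
count = 3
label = "items"

# "+" only works when both operands are str; mixing in an int raises TypeError.
try:
    message = "Found " + count + " " + label
except TypeError:
    message = None

# format() converts its arguments, and join() works after an explicit str().
formatted = "Found {} {}".format(count, label)
joined = " ".join(str(v) for v in ["Found", count, label])

print(message, formatted, joined, sep=" | ")
```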
