String concatenation in Go

_冬木

Category: Golang

Article link: https://blog.csdn.net/u013115610/article/details/80324949

1. Concatenating with +

Inspecting the generated assembly shows that + actually calls the concatstrings function in runtime/string.go. Its source is as follows:

// concatstrings implements a Go string concatenation x+y+z+...
// The operands are passed in the slice a.
// If buf != nil, the compiler has determined that the result does not
// escape the calling function, so the string data can be stored in buf
// if small enough.
func concatstrings(buf *tmpBuf, a []string) string {
    idx := 0
    l := 0
    count := 0
    for i, x := range a {
        n := len(x)
        if n == 0 {
            continue
        }
        // Overflow check: if the total byte count exceeds the maximum int,
        // l+n wraps around to a negative value and is smaller than l.
        if l+n < l {
            throw("string concatenation too long")
        }
        l += n
        count++
        idx = i
    }
    if count == 0 {
        return ""
    }

    // If there is just one string and either it is not on the stack
    // or our result does not escape the calling frame (buf != nil),
    // then we can return that string directly.
    if count == 1 && (buf != nil || !stringDataOnStack(a[idx])) {
        return a[idx]
    }
    // The required size l was computed above; allocate once and
    // copy every operand into place.
    s, b := rawstringtmp(buf, l)
    for _, x := range a {
        copy(b, x)
        b = b[len(x):]
    }
    return s
}

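The source shows that a single concatenation of the form x+y+z+... performs about the same as strings.Join and the other approaches below: the total length is computed first, one buffer is allocated, and every operand is copied into it. The cost appears when + is used in a loop, because each iteration is a separate concatenation that needs its own memory allocation. In other words, + is fast for a one-off concatenation but degrades when it is applied repeatedly. Note also that concatstrings returns the string directly, with no final []byte-to-string conversion, whereas the strings.Join implementation quoted below pays for that conversion at the end, so for a single concatenation + is faster than strings.Join.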
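
The difference between a single expression and a loop can be sketched as follows. The function names concatOnce and concatLoop are illustrative assumptions, not part of the runtime; the point is only that the first form is one concatenation while the second performs one concatenation per iteration.

```go
package main

import "fmt"

// concatOnce joins its operands in a single expression, so the
// runtime can size the result once and copy each operand once.
func concatOnce(a, b, c string) string {
	return a + b + c
}

// concatLoop appends inside a loop. Every iteration is a separate
// concatenation that builds a new string and copies the bytes
// accumulated so far.
func concatLoop(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

func main() {
	fmt.Println(concatOnce("a", "b", "c"))
	fmt.Println(concatLoop([]string{"a", "b", "c"}))
}
```

Both calls produce the same string; they differ only in how much allocation and copying happens along the way.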

2. strings.Join

// Join concatenates the elements of a to create a single string. The separator string
// sep is placed between elements in the resulting string.
func Join(a []string, sep string) string {
    switch len(a) {
    case 0:
        return ""
    case 1:
        return a[0]
    case 2:
        // Special case for common small values.
        // Remove if golang.org/issue/6714 is fixed
        return a[0] + sep + a[1]
    case 3:
        // Special case for common small values.
        // Remove if golang.org/issue/6714 is fixed
        return a[0] + sep + a[1] + sep + a[2]
    }
    n := len(sep) * (len(a) - 1)
    for i := 0; i < len(a); i++ {
        n += len(a[i])
    }

    b := make([]byte, n)
    bp := copy(b, a[0])
    for _, s := range a[1:] {
        bp += copy(b[bp:], sep)
        bp += copy(b[bp:], s)
    }
    return string(b) // note: this conversion has a cost
}

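Its advantage over + in a loop is that it allocates all the space it needs in one step, whereas + allocates again on every iteration. A single call performs well, but each call to Join allocates again, so when a result is built up over many calls it is slower than bytes.Buffer, described next.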
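
A minimal usage sketch, assuming all the pieces are already available in a slice; the helper name joinCSV and the comma separator are illustrative assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// joinCSV joins the fields with a comma in a single call, so the
// result is sized once from the lengths of all fields.
func joinCSV(fields []string) string {
	return strings.Join(fields, ",")
}

func main() {
	fmt.Println(joinCSV([]string{"id", "name", "email"}))
}
```

This fits the case where every piece is known before the call. If pieces arrive over time, collecting them into a slice first and joining once keeps it to a single call.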

3. bytes.Buffer

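The implementation in bytes/buffer.go includes a small-buffer optimization and lets you preallocate memory. When capacity runs out, the buffer grows to roughly twice its size. Getting the final string still costs a []byte-to-string conversion. bytes.Buffer is therefore fastest when you initialize it once (ideally computing the total length up front and preallocating in one step), append strings many times, and read the string result once at the end. It is also the most flexible option.
bytes.Buffer does better across many appends because it roughly doubles its capacity whenever it runs out of room, which reduces the number of later allocations.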
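
The "initialize once, append many times, read once" pattern might look like the sketch below. The function name buildWithBuffer is an illustrative assumption; Grow, WriteString, and String are the standard bytes.Buffer methods.

```go
package main

import (
	"bytes"
	"fmt"
)

// buildWithBuffer computes the total length up front, reserves that
// much capacity once, appends every part, and converts to a string
// a single time at the end.
func buildWithBuffer(parts []string) string {
	total := 0
	for _, p := range parts {
		total += len(p)
	}

	var buf bytes.Buffer
	buf.Grow(total) // reserve capacity so the appends below do not reallocate
	for _, p := range parts {
		buf.WriteString(p)
	}
	return buf.String() // the one []byte-to-string conversion
}

func main() {
	fmt.Println(buildWithBuffer([]string{"go", "-", "lang"}))
}
```

When the total length is not known in advance, the Grow call can simply be omitted and the buffer will expand on demand.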

4. fmt.Sprintf
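
A small sketch of building a string with fmt.Sprintf. The function name describe and the format string are illustrative assumptions; the arguments are passed as interface values and formatted according to the verbs.

```go
package main

import "fmt"

// describe combines values of different types into one string.
// Each argument is boxed into an interface value before formatting.
func describe(name string, count int, ratio float64) string {
	return fmt.Sprintf("%s: %d items (%.1f%%)", name, count, ratio)
}

func main() {
	fmt.Println(describe("cache", 3, 42.5)) // prints "cache: 3 items (42.5%)"
}
```

This is convenient when the pieces are not all strings, at the price of the extra argument handling.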

In summary:
For a few small strings, + is fine.
For many small strings, use strings.Join.
For many large strings, use bytes.Buffer.

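To summarize once more:
The + operator: the assembly shows it is implemented in runtime/string.go, mainly by concatstrings, which has a short-string optimization and does not go through []byte, so there is no conversion to string. A single use of + is therefore the fastest. It is the least flexible.
bytes.Buffer: implemented in bytes/buffer.go, with a small-buffer optimization, support for preallocation, and roughly 2x growth when capacity runs out. Getting the final string costs a []byte-to-string conversion. It is fastest when initialized once (ideally preallocated to the total length), appended to many times, and read once at the end. It is the most flexible.
strings.Join: implemented in strings/strings.go, with an optimization for joining a small number of strings. It allocates once and pays for a []byte-to-string conversion, so a single call can match the best case of bytes.Buffer, but it is less flexible.
fmt.Sprintf: implemented in fmt/print.go. Passing arguments as a ...interface{} has a conversion cost, each piece is added to a []byte with append, the logic is relatively complex, and the final result needs a []byte-to-string conversion. fmt.Sprintf is therefore generally slower than bytes.Buffer and strings.Join. Its flexibility is about the same as that of strings.Join.
Conclusion
Single-call performance: + operator > strings.Join >= bytes.Buffer > fmt.Sprintf
Flexibility: bytes.Buffer > fmt.Sprintf >= strings.Join > + operator
Used correctly, bytes.Buffer should be the fastest when strings are concatenated many times.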
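
One way to check these rankings on your own machine is to count allocations for each approach. The sketch below is an assumption-laden illustration, not a rigorous benchmark: the helper names (withPlus, withJoin, withBuffer, withSprintf), the 100-element input, and the use of testing.AllocsPerRun outside a test are all choices made for this example, and the numbers depend on the Go version.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
	"testing"
)

// sink keeps results alive so the work is not optimized away.
var sink string

// withPlus appends with + inside a loop.
func withPlus(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// withJoin joins everything in one call with an empty separator.
func withJoin(parts []string) string {
	return strings.Join(parts, "")
}

// withBuffer preallocates a bytes.Buffer and appends each part.
func withBuffer(parts []string) string {
	n := 0
	for _, p := range parts {
		n += len(p)
	}
	var buf bytes.Buffer
	buf.Grow(n)
	for _, p := range parts {
		buf.WriteString(p)
	}
	return buf.String()
}

// withSprintf rebuilds the string with fmt.Sprintf on every step.
func withSprintf(parts []string) string {
	s := ""
	for _, p := range parts {
		s = fmt.Sprintf("%s%s", s, p)
	}
	return s
}

func main() {
	parts := make([]string, 100)
	for i := range parts {
		parts[i] = "item"
	}

	methods := []struct {
		name string
		f    func([]string) string
	}{
		{"+ in loop", withPlus},
		{"strings.Join", withJoin},
		{"bytes.Buffer", withBuffer},
		{"fmt.Sprintf", withSprintf},
	}
	for _, m := range methods {
		// Average number of heap allocations per call.
		allocs := testing.AllocsPerRun(100, func() { sink = m.f(parts) })
		fmt.Printf("%-13s allocs per call: %.0f\n", m.name, allocs)
	}
}
```

For timing rather than allocation counts, the same four functions can be wrapped in ordinary Benchmark functions and run with go test -bench.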