String Concatenation: Performance and Internals

The source code and data sets are available on GitHub: high-performance-go

1. Efficient String Concatenation

In Go, a string is immutable, so concatenating strings actually creates a new string object. Code that concatenates strings heavily can pay a serious performance penalty.

1.1 Common Ways to Concatenate

To keep the compiler from optimizing the concatenation away, we first implement a function that generates a random string of length n.

const letterBytes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// randomString returns a string of n random ASCII letters (uses math/rand).
func randomString(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = letterBytes[rand.Intn(len(letterBytes))]
	}
	return string(b)
}
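We then use this function to generate a string str and concatenate str n times. The examples below cover the common ways to do this in Go.

Using +:

// plusConcat appends str to s n times with the + operator.
func plusConcat(n int, str string) string {
	s := ""
	for i := 0; i < n; i++ {
		s += str
	}
	return s
}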

Using fmt.Sprintf:

// sprintfConcat rebuilds s through fmt.Sprintf on every iteration.
func sprintfConcat(n int, str string) string {
	s := ""
	for i := 0; i < n; i++ {
		s = fmt.Sprintf("%s%s", s, str)
	}
	return s
}

Using strings.Builder:

// builderConcat writes str into a strings.Builder n times.
func builderConcat(n int, str string) string {
	var builder strings.Builder
	for i := 0; i < n; i++ {
		builder.WriteString(str)
	}
	return builder.String()
}

Using bytes.Buffer:

// bufferConcat writes s into a bytes.Buffer n times.
func bufferConcat(n int, s string) string {
	buf := new(bytes.Buffer)
	for i := 0; i < n; i++ {
		buf.WriteString(s)
	}
	return buf.String()
}

Using []byte:

// byteConcat appends str to a byte slice that starts with zero capacity.
func byteConcat(n int, str string) string {
	buf := make([]byte, 0)
	for i := 0; i < n; i++ {
		buf = append(buf, str...)
	}
	return string(buf)
}
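
Using []byte with preallocated capacity:

// preByteConcat appends str to a byte slice whose capacity is reserved up front.
func preByteConcat(n int, str string) string {
	buf := make([]byte, 0, n*len(str))
	for i := 0; i < n; i++ {
		buf = append(buf, str...)
	}
	return string(buf)
}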

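In make([]byte, 0, n*len(str)), the second argument is the length and the third is the capacity (cap). When the slice is created, memory for cap bytes is allocated up front.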
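
To make the difference between length and capacity concrete, here is a minimal sketch. The helper name lenCapAfterAppends is made up for this illustration and is not part of the original benchmark code. It preallocates room for n copies of str, appends them, and returns the final length and capacity, which should match because no append has to grow the slice.

```go
package main

import "fmt"

// lenCapAfterAppends is a hypothetical helper for illustration only.
// It reserves n*len(str) bytes, appends str n times, and returns the
// resulting length and capacity of the slice.
func lenCapAfterAppends(n int, str string) (int, int) {
	buf := make([]byte, 0, n*len(str)) // length 0, capacity n*len(str)
	for i := 0; i < n; i++ {
		buf = append(buf, str...) // always fits in the reserved capacity
	}
	return len(buf), cap(buf)
}

func main() {
	l, c := lenCapAfterAppends(3, "abcd")
	fmt.Println(l, c)
}
```

Because the capacity is reserved once, append never needs to move the data to a larger backing array during the loop.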

1.2 Benchmark Comparison

Each benchmark case generates a string of length 10 and concatenates it 10,000 times.

// benchmark runs f, concatenating a random 10-byte string 10,000 times per iteration.
func benchmark(b *testing.B, f func(int, string) string) {
	var str = randomString(10)
	for i := 0; i < b.N; i++ {
		f(10000, str)
	}
}

func BenchmarkPlusConcat(b *testing.B)    { benchmark(b, plusConcat) }
func BenchmarkSprintfConcat(b *testing.B) { benchmark(b, sprintfConcat) }
func BenchmarkBuilderConcat(b *testing.B) { benchmark(b, builderConcat) }
func BenchmarkBufferConcat(b *testing.B)  { benchmark(b, bufferConcat) }
func BenchmarkByteConcat(b *testing.B)    { benchmark(b, byteConcat) }
func BenchmarkPreByteConcat(b *testing.B) { benchmark(b, preByteConcat) }

Run the benchmarks:

$ go test -bench="Concat$" -benchmem .
goos: darwin
goarch: amd64
pkg: example
BenchmarkPlusConcat-8           19    56 ms/op     530 MB/op    10026 allocs/op
BenchmarkSprintfConcat-8        10    112 ms/op    835 MB/op    37435 allocs/op
BenchmarkBuilderConcat-8      8901    0.13 ms/op   0.5 MB/op       23 allocs/op
BenchmarkBufferConcat-8       8130    0.14 ms/op   0.4 MB/op       13 allocs/op
BenchmarkByteConcat-8         8984    0.12 ms/op   0.6 MB/op       24 allocs/op
BenchmarkPreByteConcat-8     17379    0.07 ms/op   0.2 MB/op        2 allocs/op
PASS
ok example 8.627s
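
The results show that + and fmt.Sprintf are by far the least efficient: they are hundreds of times slower than the other approaches and use roughly 1,000 times more memory. That said, fmt.Sprintf is normally used to format strings, not to concatenate them.

strings.Builder, bytes.Buffer, and []byte perform similarly and use similar amounts of memory. The fastest and leanest variant is preByteConcat: because the memory is preallocated, no reallocation or copying of the accumulated data is needed during the loop.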

1.3 Recommendation

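Weighing ease of use against performance, strings.Builder is generally the recommended way to concatenate strings.
This is how the Go documentation describes strings.Builder: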

A Builder is used to efficiently build a string using Write methods. It minimizes memory copying.

strings.Builder can also preallocate memory through its Grow method:

// builderConcat reserves the full size with Grow before writing.
func builderConcat(n int, str string) string {
	var builder strings.Builder
	builder.Grow(n * len(str))
	for i := 0; i < n; i++ {
		builder.WriteString(str)
	}
	return builder.String()
}

The benchmark results for the version optimized with Grow are:

BenchmarkBuilderConcat-8     16855    0.07 ms/op   0.1 MB/op        1 allocs/op
BenchmarkPreByteConcat-8     17379    0.07 ms/op   0.2 MB/op        2 allocs/op
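
Compared with the preallocated []byte version, this one skips the final conversion from []byte to string, so it performs one fewer allocation and uses half the memory.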
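
As a rough check of what Grow does, the sketch below records the builder's capacity right after Grow and again after all writes. The helper name capsWithGrow is hypothetical and introduced only for this illustration. The capacity after Grow should be at least n*len(str), and it should not change during the loop.

```go
package main

import (
	"fmt"
	"strings"
)

// capsWithGrow is a hypothetical helper for illustration only.
// It returns the builder capacity immediately after Grow and the
// capacity after str has been written n times.
func capsWithGrow(n int, str string) (before, after int) {
	var builder strings.Builder
	builder.Grow(n * len(str)) // reserve space for the whole result
	before = builder.Cap()
	for i := 0; i < n; i++ {
		builder.WriteString(str) // fits in the reserved space
	}
	return before, builder.Cap()
}

func main() {
	before, after := capsWithGrow(10000, "0123456789")
	fmt.Println(before >= 100000, before == after)
}
```

The exact capacity reported after Grow may be larger than requested, so the sketch only compares it against the lower bound.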

2. The Principles Behind the Performance

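2.1 Comparing strings.Builder and +

The gap in speed and memory use between strings.Builder and + is so large because the two allocate memory differently.

A string in Go is immutable, so concatenating two strings with + produces a new string, which needs a new block of memory as large as both operands combined. Appending a third string allocates yet another block sized for all three, and so on. Assuming each string is 10 bytes and it is appended 10,000 times, the total memory requested is: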

10 + 2 * 10 + 3 * 10 + ... + 10000 * 10 bytes ≈ 500 MB
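
The arithmetic above can be reproduced with a few lines of Go. This is only a sketch of the cost model described in the text: it assumes every + allocates a fresh string holding the entire result so far, and the function name plusConcatBytes is made up for the illustration.

```go
package main

import "fmt"

// plusConcatBytes is a hypothetical helper for illustration only.
// It sums the sizes of all intermediate strings created when a string of
// `size` bytes is appended n times with +, assuming each step allocates
// a new string that holds the whole result so far.
func plusConcatBytes(n, size int) int {
	total := 0
	for i := 1; i <= n; i++ {
		total += i * size // the i-th result is i*size bytes long
	}
	return total
}

func main() {
	fmt.Println(plusConcatBytes(10000, 10)) // 500050000 bytes, about 500 MB
}
```

The sum is quadratic in the number of appends, which is why the cost of + grows so quickly.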
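
strings.Builder, bytes.Buffer, and []byte slices, by contrast, request memory in progressively larger chunks: when the current capacity runs out, a bigger block is allocated so that several later writes fit without another allocation. We can print builder.Cap() to watch how strings.Builder requests memory while concatenating.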

// TestBuilderConcat prints the builder capacity every time it changes.
func TestBuilderConcat(t *testing.T) {
	var str = randomString(10)
	var builder strings.Builder
	lastCap := 0 // renamed from cap to avoid shadowing the builtin
	for i := 0; i < 10000; i++ {
		if builder.Cap() != lastCap {
			fmt.Print(builder.Cap(), " ")
			lastCap = builder.Cap()
		}
		builder.WriteString(str)
	}
}

The output is:

$ go test -run="TestBuilderConcat" . -v
=== RUN TestBuilderConcat
16 32 64 128 256 512 1024 2048 2688 3456 4864 6144 8192 10240 13568 18432 24576 32768 40960 57344 73728 98304 122880 --- PASS: TestBuilderConcat (0.00s)
PASS
ok example 0.007s
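
As the output shows, the capacity doubles up to 2048 and then grows in smaller relative steps. In total about 0.52 MB is requested, roughly one thousandth of what + needs:

16 + 32 + 64 + ... + 122880 bytes ≈ 0.52 MB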
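
The total can be checked by summing the capacities printed by TestBuilderConcat above. The sketch below simply hard-codes that observed sequence, so it reflects the run shown in this article and not a general rule about how capacities grow. The names observedCaps and totalRequested are made up for the illustration.

```go
package main

import "fmt"

// observedCaps are the capacities printed by TestBuilderConcat in the
// run shown in the article (copied verbatim, not computed).
var observedCaps = []int{
	16, 32, 64, 128, 256, 512, 1024, 2048,
	2688, 3456, 4864, 6144, 8192, 10240, 13568, 18432,
	24576, 32768, 40960, 57344, 73728, 98304, 122880,
}

// totalRequested is a hypothetical helper that adds up every capacity,
// i.e. the total bytes requested across all reallocations.
func totalRequested(caps []int) int {
	total := 0
	for _, c := range caps {
		total += c
	}
	return total
}

func main() {
	fmt.Println(totalRequested(observedCaps)) // 522224 bytes, about 0.52 MB
}
```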

2.2 Comparing strings.Builder and bytes.Buffer

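Both strings.Builder and bytes.Buffer are backed by a []byte, yet strings.Builder is slightly faster than bytes.Buffer in the benchmark above. One important difference is that bytes.Buffer allocates a new block of memory for the resulting string when it is converted, whereas strings.Builder returns its underlying []byte directly, reinterpreted as a string.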
  • bytes.Buffer

// To build strings more efficiently, see the strings.Builder type.
func (b *Buffer) String() string {
	if b == nil {
		// Special case, useful in debugging.
		return "<nil>"
	}
	return string(b.buf[b.off:])
}

  • strings.Builder

// String returns the accumulated string.
func (b *Builder) String() string {
	return *(*string)(unsafe.Pointer(&b.buf))
}

The comment on the bytes.Buffer String method also points this out explicitly:

To build strings more efficiently, see the strings.Builder type.

Last updated: 2023-08-16