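A few words first
Many performance problems in programs are caused by locks.
This post uses a small test program to compare the performance of two pieces of code: one that reads shared data without a lock and one that reads it with a lock.
The code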
package lock_test

import (
	"fmt"
	"sync"
	"testing"
)

// cache is the shared map that every reader goroutine looks up.
var cache map[string]string

const NUM_OF_READER int = 40 // number of concurrent reader goroutines
const READ_TIMES = 100000    // lookups performed by each reader

func init() {
	cache = make(map[string]string)
	cache["a"] = "aa"
	cache["b"] = "bb"
}

// lockFreeAccess reads the map from many goroutines without any lock.
// This is safe only because nothing writes to the map after init.
func lockFreeAccess() {
	var wg sync.WaitGroup
	wg.Add(NUM_OF_READER)
	for i := 0; i < NUM_OF_READER; i++ {
		go func() {
			for j := 0; j < READ_TIMES; j++ {
				// err is the "found" flag of the lookup, not an error value.
				_, err := cache["a"]
				if !err {
					fmt.Println("Nothing")
				}
			}
			wg.Done()
		}()
	}
	wg.Wait()
}

// lockAccess performs the same reads, but wraps each lookup in a read lock.
func lockAccess() {
	var wg sync.WaitGroup
	wg.Add(NUM_OF_READER)
	m := new(sync.RWMutex)
	for i := 0; i < NUM_OF_READER; i++ {
		go func() {
			for j := 0; j < READ_TIMES; j++ {
				m.RLock()
				_, err := cache["a"]
				if !err {
					fmt.Println("Nothing")
				}
				m.RUnlock()
			}
			wg.Done()
		}()
	}
	wg.Wait()
}

func BenchmarkLockFree(b *testing.B) {
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		lockFreeAccess()
	}
}

func BenchmarkLock(b *testing.B) {
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		lockAccess()
	}
}
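The init function initializes a map. lockFreeAccess reads this map repeatedly without taking any lock. lockAccess creates a read-write lock (sync.RWMutex), but while reading it only ever takes the read lock.
The code ends with two benchmarks, BenchmarkLockFree and BenchmarkLock, which measure lockFreeAccess and lockAccess respectively.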
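To make the difference between the read lock and the write lock of sync.RWMutex concrete, here is a minimal sketch. The safeCache type and its Get and Set methods are hypothetical names used for illustration and are not part of the test above. It assumes a cache that is also written to while readers are running, which is the situation where the lock is actually needed.

```go
package main

import (
	"fmt"
	"sync"
)

// safeCache is a hypothetical wrapper (not from the original test) that
// guards a map with a sync.RWMutex.
type safeCache struct {
	mu   sync.RWMutex
	data map[string]string
}

func newSafeCache() *safeCache {
	return &safeCache{data: make(map[string]string)}
}

// Get takes the read lock, so any number of readers may hold it at once.
func (c *safeCache) Get(key string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.data[key]
	return v, ok
}

// Set takes the write lock, which excludes all readers and other writers.
func (c *safeCache) Set(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = value
}

func main() {
	c := newSafeCache()
	c.Set("a", "aa")

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				if _, ok := c.Get("a"); !ok {
					fmt.Println("Nothing")
				}
			}
		}()
	}
	// Writing while readers are running is safe here only because of the lock.
	c.Set("b", "bb")
	wg.Wait()

	v, ok := c.Get("b")
	fmt.Println(v, ok) // prints "bb true"
}
```

In the benchmark there is never a writer, so the read lock protects nothing and only adds cost. That is exactly what the measurement is meant to expose.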
Run the two benchmarks:
$ go test -v -bench=.
goos: windows
goarch: amd64
pkg: lock_test
BenchmarkLockFree-4 99 13283820 ns/op
BenchmarkLock-4 6 186524700 ns/op
PASS
ok lock_test 2.940s
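The results show that the version without the lock is about an order of magnitude faster than the version with it: roughly 13.3 ms versus 186.5 ms per operation, or about 14 times faster.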
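The reported ns/op figures can be broken down a little further. The sketch below is plain arithmetic on the numbers from the first run above. The perRead helper is a hypothetical name, and the result is amortized wall-clock time per lookup across all goroutines, not CPU time per lookup.

```go
package main

import "fmt"

const (
	numReaders = 40     // NUM_OF_READER in the test
	readTimes  = 100000 // READ_TIMES in the test
)

// perRead converts the ns/op of one benchmark iteration into the amortized
// wall-clock nanoseconds spent per map lookup (40 * 100000 lookups per op).
func perRead(nsPerOp float64) float64 {
	return nsPerOp / float64(numReaders*readTimes)
}

func main() {
	lockFree := 13283820.0 // BenchmarkLockFree-4, ns/op
	locked := 186524700.0  // BenchmarkLock-4, ns/op

	fmt.Printf("slowdown: %.1fx\n", locked/lockFree)           // slowdown: 14.0x
	fmt.Printf("lock-free: %.1f ns/read\n", perRead(lockFree)) // lock-free: 3.3 ns/read
	fmt.Printf("locked: %.1f ns/read\n", perRead(locked))      // locked: 46.6 ns/read
}
```

So on this machine each read-locked lookup costs tens of nanoseconds more than a bare lookup. The exact numbers will differ on other hardware and Go versions.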
Use the pprof tool to find out why:
$ go test -bench=. -cpuprofile=cpu.prof
goos: windows
goarch: amd64
pkg: lock_test
BenchmarkLockFree-4 79 13845084 ns/op
BenchmarkLock-4 6 178110983 ns/op
PASS
ok lock_test 2.854s
$ go tool pprof cpu.prof
Type: cpu
Time: Feb 26, 2020 at 6:15pm (CST)
Duration: 2.52s, Total samples = 9.19s (364.49%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top -cum
Showing nodes accounting for 9.19s, 100% of 9.19s total
flat flat% sum% cum cum%
0.15s 1.63% 1.63% 4.96s 53.97% lock_test_test.lockAccess.func1
0.95s 10.34% 11.97% 4.23s 46.03% lock_test_test.lockFreeAccess.func1
3.28s 35.69% 47.66% 3.75s 40.81% runtime.mapaccess2_faststr
2.26s 24.59% 72.25% 2.26s 24.59% sync.(*RWMutex).RLock
2.08s 22.63% 94.89% 2.08s 22.63% sync.(*RWMutex).RUnlock
0.35s 3.81% 98.69% 0.35s 3.81% runtime.add
0.03s 0.33% 99.02% 0.18s 1.96% runtime.(*bmap).keys
0.09s 0.98% 100% 0.09s 0.98% runtime.isEmpty
(pprof) list lockAccess
Total: 9.19s
ROUTINE ======================== lock_test_test.lockAccess.func1 in D:\project_root\gogogo\hello_test\src\lock_test\lock_test.go
150ms 4.96s (flat, cum) 53.97% of Total
. . 41: var wg sync.WaitGroup
. . 42: wg.Add(NUM_OF_READER)
. . 43: m := new(sync.RWMutex)
. . 44: for i := 0; i < NUM_OF_READER; i++ {
. . 45: go func() {
50ms 50ms 46: for j := 0; j < READ_TIMES; j++ {
. . 47:
10ms 2.27s 48: m.RLock()
50ms 520ms 49: _, err := cache["a"]
40ms 40ms 50: if !err {
. . 51: fmt.Println("Nothing")
. . 52: }
. 2.08s 53: m.RUnlock()
. . 54: }
. . 55: wg.Done()
. . 56: }()
. . 57: }
. . 58: wg.Wait()
(pprof)
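When running the benchmarks, the -cpuprofile option was used to generate the cpu.prof file.
The "go tool pprof cpu.prof" command was then used to analyze cpu.prof.
The output of the top and list commands shows that most of the time is spent in the m.RLock() and m.RUnlock() calls at lines 48 and 53 of the listing: 2.27s and 2.08s respectively, out of 4.96s cumulative for lockAccess.func1, compared with 520ms for the map lookup itself. Even with no writer present, every RLock and RUnlock atomically updates state shared by all readers, so 40 goroutines calling them in a tight loop contend with one another.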
Note:
This post is a set of study notes from the 极客时间 (Geek Time) video course 《Go语言从入门到实战》.
Course link:
https://time.geekbang.org/course/intro/160?code=NHxez89MnqwIfa%2FvqTiTIaYof1kxYhaEs6o2kf3ZxhU%3D&utm_term=SPoster
Reference code for this post:
https://github.com/geektime-geekbang/go_learning/blob/master/code/ch48/lock/lock_test.go