Golang 1.17 source code analysis: map (003)
Golang 1.17 study notes 003
Source file: runtime/map.go
Data structures:
// A header for a Go map.
type hmap struct {
// Note: the format of the hmap is also encoded in cmd/compile/internal/reflectdata/reflect.go.
// Make sure this stays in sync with the compiler's definition.
count int // # live cells == size of map. Must be first (used by len() builtin); current number of elements
flags uint8
B uint8 // log_2 of # of buckets (can hold up to loadFactor * 2^B items)
noverflow uint16 // approximate number of overflow buckets
hash0 uint32 // hash seed
buckets unsafe.Pointer // pointer to the current array of hash buckets
oldbuckets unsafe.Pointer // during growth, pointer to the old hash buckets; nil otherwise
nevacuate uintptr // number of the next old bucket to be evacuated
extra *mapextra // overflow bucket bookkeeping for hash collisions
}
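Since `B` stores log2 of the bucket count, the runtime selects a bucket with the low-order B bits of the hash (the top byte goes into `tophash`, see below). A minimal standalone sketch of that selection; the helper names follow runtime/map.go, but this is an illustration, not the runtime code:

```go
package main

import "fmt"

// bucketShift returns 2^B, the number of buckets for a given hmap.B.
func bucketShift(B uint8) uintptr { return uintptr(1) << (B & 63) }

// bucketMask returns 2^B - 1; ANDing it with a hash value yields the
// bucket index, i.e. the low-order B bits of the hash pick the bucket.
func bucketMask(B uint8) uintptr { return bucketShift(B) - 1 }

func main() {
	const B = 3                 // 2^3 = 8 buckets
	hash := uintptr(0xDEADBEEF) // stand-in for a real hash value
	fmt.Println(bucketShift(B))       // 8
	fmt.Println(hash & bucketMask(B)) // 0xDEADBEEF & 7 = 7
}
```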
// The mapextra struct
type mapextra struct {
// If both key and elem do not contain pointers and are inline, then we mark bucket
// type as containing no pointers. This avoids scanning such maps.
// However, bmap.overflow is a pointer. In order to keep overflow buckets
// alive, we store pointers to all overflow buckets in hmap.extra.overflow and hmap.extra.oldoverflow.
// overflow and oldoverflow are only used if key and elem do not contain pointers.
// overflow contains overflow buckets for hmap.buckets.
// oldoverflow contains overflow buckets for hmap.oldbuckets.
// The indirection allows to store a pointer to the slice in hiter.
overflow *[]*bmap
oldoverflow *[]*bmap
// nextOverflow holds a pointer to a free overflow bucket.
nextOverflow *bmap
}
Hash bucket data structure:
// Maximum number of key/elem pairs a bucket can hold.
const bucketCntBits = 3
const bucketCnt = 1 << bucketCntBits // bucketCnt => 8
// A bucket for a Go map.
type bmap struct {
// tophash generally contains the top byte of the hash value
// for each key in this bucket. If tophash[0] < minTopHash,
// tophash[0] is a bucket evacuation state instead.
tophash [bucketCnt]uint8
}
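The tophash byte lets a lookup skip non-matching slots without comparing full keys. A sketch of how the runtime derives it, keeping the top byte of the hash and bumping it past the reserved evacuation markers; I use math/bits.UintSize for the word size here (an assumption that uintptr matches the native word, which holds on all supported platforms):

```go
package main

import (
	"fmt"
	"math/bits"
)

const minTopHash = 5 // tophash values below this encode evacuation states

// tophash mirrors the runtime helper: take the top byte of the hash,
// and shift it above the reserved marker range so real entries never
// collide with evacuation states.
func tophash(hash uintptr) uint8 {
	top := uint8(hash >> (bits.UintSize - 8)) // top byte of the word-sized hash
	if top < minTopHash {
		top += minTopHash
	}
	return top
}

func main() {
	fmt.Println(tophash(0))           // 0 < minTopHash, bumped to 5
	fmt.Println(tophash(^uintptr(0))) // top byte 0xFF stays 255
}
```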
Load factor: the load factor measures how crowded a hash table is
- too small: collisions are rare, but space utilization is low
- too large: collisions are frequent, so lookups and inserts slow down
// Maximum average load of a bucket that triggers growth is 6.5.
// Represent as loadFactorNum/loadFactorDen, to allow integer math.
loadFactorNum = 13
loadFactorDen = 2
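The two constants encode 6.5 as 13/2 so the check stays in integer math. The runtime's overLoadFactor check can be sketched as:

```go
package main

import "fmt"

const (
	bucketCnt     = 8  // key/elem slots per bucket
	loadFactorNum = 13 // 13/2 = 6.5 average entries per bucket
	loadFactorDen = 2
)

// bucketShift returns 2^B, the number of buckets.
func bucketShift(B uint8) uintptr { return uintptr(1) << (B & 63) }

// overLoadFactor reports whether count items spread over 2^B buckets
// exceed the 6.5 threshold. Maps holding at most one bucket's worth of
// items (count <= 8) never trigger growth.
func overLoadFactor(count int, B uint8) bool {
	return count > bucketCnt && uintptr(count) > loadFactorNum*(bucketShift(B)/loadFactorDen)
}

func main() {
	// With B = 3 there are 8 buckets, so the threshold is 6.5 * 8 = 52.
	fmt.Println(overLoadFactor(52, 3)) // false: exactly at the threshold
	fmt.Println(overLoadFactor(53, 3)) // true: one past it, mapassign would grow
}
```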
Growing:
When the load factor is exceeded, or there are too many overflow buckets, the map grows. The core functions are hashGrow, growWork and evacuate:
mapassign triggers => hashGrow: moves the current buckets to oldbuckets and allocates a new bucket array for incoming writes
mapassign and mapdelete trigger => growWork: calls evacuate to migrate (evacuate) old buckets incrementally
func mapassign(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {
// 触发扩容的核心条件
if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) {
hashGrow(t, h)
goto again // Growing the table invalidates everything, so try again
}
}
func hashGrow(t *maptype, h *hmap) {
// If we've hit the load factor, get bigger.
// Otherwise, there are too many overflow buckets,
// so keep the same number of buckets and "grow" laterally.
bigger := uint8(1)
if !overLoadFactor(h.count+1, h.B) {
bigger = 0
h.flags |= sameSizeGrow // mark a same-size grow by setting the flag bit with a bitwise OR
}
oldbuckets := h.buckets
newbuckets, nextOverflow := makeBucketArray(t, h.B+bigger, nil)
flags := h.flags &^ (iterator | oldIterator)
if h.flags&iterator != 0 {
flags |= oldIterator
}
// commit the grow (atomic wrt gc)
h.B += bigger
h.flags = flags
h.oldbuckets = oldbuckets
h.buckets = newbuckets
h.nevacuate = 0
h.noverflow = 0
if h.extra != nil && h.extra.overflow != nil {
// Promote current overflow buckets to the old generation.
if h.extra.oldoverflow != nil {
throw("oldoverflow is not nil")
}
h.extra.oldoverflow = h.extra.overflow
h.extra.overflow = nil
}
if nextOverflow != nil {
if h.extra == nil {
h.extra = new(mapextra)
}
h.extra.nextOverflow = nextOverflow
}
// the actual copying of the hash table data is done incrementally
// by growWork() and evacuate().
}
Same-size growth:
- If LoadFactor (the load factor) is not exceeded but noverflow is high -> same-size grow
- "Too many overflow buckets" means: (B <= 15 && noverflow >= 2^B) || (B > 15 && noverflow >= 2^15)
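The check behind that rule can be sketched as follows; the threshold is capped at 2^15 because noverflow is only an approximate uint16 counter:

```go
package main

import "fmt"

// tooManyOverflowBuckets reports whether the map has accumulated roughly
// as many overflow buckets as regular buckets (2^B), capped at 2^15.
// If so, a same-size grow compacts entries into fresh buckets.
func tooManyOverflowBuckets(noverflow uint16, B uint8) bool {
	if B > 15 {
		B = 15
	}
	// Threshold: noverflow >= 2^min(B, 15).
	return noverflow >= uint16(1)<<(B&15)
}

func main() {
	fmt.Println(tooManyOverflowBuckets(7, 3))      // false: 7 < 2^3
	fmt.Println(tooManyOverflowBuckets(8, 3))      // true: 8 >= 2^3
	fmt.Println(tooManyOverflowBuckets(32768, 20)) // true: threshold capped at 2^15
}
```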
Collision resolution strategies:
- Open addressing: on a hash collision, the entry is written into a following slot. On lookup, probing starts at the home slot; if the key there does not match, the scan continues until the key is found or an empty slot is reached. Go does not use this.
- Chaining: when a bucket overflows, a new bucket is allocated and linked behind it. This is what Go uses.
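A toy version of the chaining scheme, with hypothetical types that are not the runtime's: one fixed-size bucket of eight slots plus a linked overflow bucket, echoing bmap and its overflow pointer (the real map also hashes keys across many buckets and uses tophash; this sketch shows only the chaining):

```go
package main

import "fmt"

const slotsPerBucket = 8 // same slot capacity as the runtime's bmap

// bucket is a simplified stand-in for bmap: fixed key/value slots plus a
// pointer to an overflow bucket that is chained on when this one fills up.
type bucket struct {
	keys     [slotsPerBucket]string
	vals     [slotsPerBucket]int
	used     int
	overflow *bucket
}

// put overwrites the key if it exists anywhere in the chain; otherwise it
// finds a free slot, growing the chain when every bucket is full.
func (b *bucket) put(k string, v int) {
	for cur := b; cur != nil; cur = cur.overflow {
		for i := 0; i < cur.used; i++ {
			if cur.keys[i] == k {
				cur.vals[i] = v
				return
			}
		}
	}
	cur := b
	for cur.used == slotsPerBucket {
		if cur.overflow == nil {
			cur.overflow = &bucket{} // link a fresh overflow bucket
		}
		cur = cur.overflow
	}
	cur.keys[cur.used], cur.vals[cur.used] = k, v
	cur.used++
}

// get scans the bucket and every overflow bucket chained behind it.
func (b *bucket) get(k string) (int, bool) {
	for cur := b; cur != nil; cur = cur.overflow {
		for i := 0; i < cur.used; i++ {
			if cur.keys[i] == k {
				return cur.vals[i], true
			}
		}
	}
	return 0, false
}

func main() {
	b := &bucket{}
	for i := 0; i < 9; i++ { // a ninth entry forces an overflow bucket
		b.put(fmt.Sprintf("k%d", i), i)
	}
	v, ok := b.get("k8")
	fmt.Println(v, ok, b.overflow != nil) // 8 true true
}
```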