How Go Maps Are Implemented

elecjun

Article link: https://blog.csdn.net/elecjun/article/details/123260811

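Reading a map in order

Iterating over a Go map yields its entries in no defined order, and two iterations over the same map can produce different sequences. To get ordered output you have to keep the order yourself: for example, collect the keys into a slice, sort the slice, and then read the map by walking the slice.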
https://studygolang.com/articles/27496?fr=sidebar
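
The slice-based approach described above can be sketched as follows. This is a minimal illustration; the helper name sortedKeys and the sample data are made up for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys is a hypothetical helper: it copies the map's keys into a
// slice and sorts them, so callers can visit the map in a stable order.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	m := map[string]int{"banana": 2, "apple": 1, "cherry": 3}

	// Range over the sorted slice, not over the map itself.
	for _, k := range sortedKeys(m) {
		fmt.Println(k, m[k])
	}
}
```

The map itself stays unordered; only the traversal is ordered, and the keys must be sorted again if the map changes.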

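Map implementation

Go implements map as a hash table and resolves collisions by separate chaining. In other words, a map is an array of buckets, and each bucket can be linked to a chain of overflow buckets. (This describes the classic runtime map; Go 1.24 replaced it with a Swiss-table-based design.)
(Figure: diagram of the implementation)
The hmap structure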

// A header for a Go map.
type hmap struct {
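    count     int    // number of elements in the map
    flags     uint8  // state flags
    B         uint8  // log2 of the bucket count; the map can hold up to 6.5 * 2^B elements (6.5 is the load factor)
    noverflow uint16 // approximate number of overflow buckets
    hash0     uint32 // hash seed

    buckets    unsafe.Pointer // address of the bucket array
    oldbuckets unsafe.Pointer // address of the old bucket array, used while growing
    nevacuate  uintptr        // evacuation progress; buckets below nevacuate have already been moved

    extra *mapextra // optional fields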
}

The mapextra structure

type mapextra struct {
	// If both key and elem do not contain pointers and are inline, then we mark bucket
	// type as containing no pointers. This avoids scanning such maps.
	// However, bmap.overflow is a pointer. In order to keep overflow buckets
	// alive, we store pointers to all overflow buckets in hmap.extra.overflow and hmap.extra.oldoverflow.
	// overflow and oldoverflow are only used if key and elem do not contain pointers.
	// overflow contains overflow buckets for hmap.buckets.
	// oldoverflow contains overflow buckets for hmap.oldbuckets.
	// The indirection allows to store a pointer to the slice in hiter.
	overflow    *[]*bmap
	oldoverflow *[]*bmap

	// nextOverflow holds a pointer to a free overflow bucket.
	nextOverflow *bmap
}

The bmap structure

// A bucket for a Go map.
type bmap struct {
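    // The top 8 bits of the hash of each entry in this bucket.
    // If tophash[0] < minTopHash, tophash[0] holds the bucket's evacuation state instead.
    tophash [bucketCnt]uint8
    // Followed by 8 keys and then 8 values, which are not visible in this declaration.
    // Packing all keys together and then all values together makes the code a bit more
    // complicated than alternating key/elem/key/elem/..., but it eliminates padding.
    // Followed by the address of the next overflow bucket, used when hash collisions fill this one.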
}

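The struct above is not the layout actually used at run time. At compile time the compiler generates a concrete bucket type for each map type, roughly like this:
type bmap struct {
    tophash  [8]uint8
    keys     [8]keytype
    values   [8]valuetype
    overflow uintptr  // pointer to the next overflow bucket
}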

Initialization

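1. Create an hmap struct.
2. Generate the hash seed hash0 and store it in the hmap (it is used when hashing keys).
3. Compute a suitable B from the requested capacity.
4. Create the buckets according to B and store them in the buckets array.
- When B < 4, the number of buckets created is 2^B (regular buckets only).
- When B >= 4, the number of buckets created is 2^B + 2^(B-4) (regular buckets plus preallocated overflow buckets).
Each bmap holds 8 key/value pairs. When a bucket is full, an overflow bucket is used, and the current bucket's overflow pointer points to it.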
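
Steps 3 and 4 can be sketched as below. The code is modeled on the runtime's logic but simplified: chooseB and bucketsToAllocate are illustrative names, uint64 stands in for uintptr, and the rounding the real allocator applies to the bucket array size is ignored.

```go
package main

import "fmt"

const (
	bucketCnt     = 8  // entries per bucket
	loadFactorNum = 13 // load factor 6.5 expressed as 13/2
	loadFactorDen = 2
)

// overLoadFactor reports whether count entries spread over 2^B buckets
// would exceed the 6.5 load factor.
func overLoadFactor(count int, B uint8) bool {
	return count > bucketCnt &&
		uint64(count) > loadFactorNum*((uint64(1)<<B)/loadFactorDen)
}

// chooseB returns the smallest B whose 2^B buckets can hold hint entries
// without exceeding the load factor.
func chooseB(hint int) uint8 {
	B := uint8(0)
	for overLoadFactor(hint, B) {
		B++
	}
	return B
}

// bucketsToAllocate applies the rule from the list above: 2^B regular
// buckets, plus 2^(B-4) preallocated overflow buckets once B >= 4.
func bucketsToAllocate(B uint8) (regular, overflow uint64) {
	regular = 1 << B
	if B >= 4 {
		overflow = 1 << (B - 4)
	}
	return regular, overflow
}

func main() {
	for _, hint := range []int{8, 9, 100, 1000} {
		B := chooseB(hint)
		regular, overflow := bucketsToAllocate(B)
		fmt.Printf("hint=%d B=%d regular=%d overflow=%d\n", hint, B, regular, overflow)
	}
}
```

With a hint of 100, for example, the loop stops at B = 4, giving 16 regular buckets and 1 preallocated overflow bucket.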

Hash computation rules

A hash value is computed from the key and hash0. On 64-bit platforms the hash is 64 bits wide.

hash := t.hasher(key, uintptr(h.hash0))

tophash: take the top 8 bits of the hash

top := uint8(hash >> (sys.PtrSize*8 - 8))

Compute the bucket index: bucket := hash & ((1<<B) - 1), which is the same as hash % 2^B

bucket := hash & bucketMask(h.B)

// bucketShift returns 1<<b, optimized for code generation.
func bucketShift(b uint8) uintptr {
	// Masking the shift amount allows overflow checks to be elided.
	return uintptr(1) << (b & (sys.PtrSize*8 - 1))
}

// bucketMask returns 1<<b - 1, optimized for code generation.
func bucketMask(b uint8) uintptr {
	return bucketShift(b) - 1
}
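
As a worked example of the two calculations above, the sketch below derives tophash and the bucket index from a made-up 64-bit hash. The functions topHash and bucketIndex are illustrative, uint64 stands in for uintptr, and the adjustment for values below minTopHash (5 in the runtime source) reflects the reserved state markers mentioned in the bmap comments.

```go
package main

import "fmt"

// minTopHash: tophash values below this are reserved as bucket state
// markers, so computed values are shifted up past them.
const minTopHash = 5

// topHash returns the top 8 bits of a 64-bit hash, adjusted so that it
// never collides with the reserved marker values.
func topHash(hash uint64) uint8 {
	top := uint8(hash >> 56)
	if top < minTopHash {
		top += minTopHash
	}
	return top
}

// bucketIndex selects a bucket using the low B bits of the hash, which
// is equivalent to hash % 2^B.
func bucketIndex(hash uint64, B uint8) uint64 {
	return hash & (uint64(1)<<B - 1)
}

func main() {
	const hash uint64 = 0xAB000000000000F6 // made-up hash value
	const B uint8 = 4                      // 2^4 = 16 buckets

	fmt.Printf("top=%d bucket=%d\n", topHash(hash), bucketIndex(hash, B))
	// prints "top=171 bucket=6"
}
```

The low bits pick the bucket and the high bits are stored in tophash, so a lookup can reject most non-matching slots in a bucket by comparing a single byte before comparing full keys.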

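Growth

When a map grows: each time a new key is inserted, the runtime checks the following 2 conditions, and growth is triggered if either one holds:

1. The load factor exceeds the threshold (defined as 6.5 in the source). In this case the number of buckets is doubled.

The load factor is loadFactor := count / (2^B), where count is the number of elements in the map and 2^B is the number of buckets.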

// Maximum average load of a bucket that triggers growth is 6.5.
// Represent as loadFactorNum/loadFactorDen, to allow integer math.
loadFactorNum = 13
loadFactorDen = 2
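
The sketch below shows how the 13/2 representation lets the load-factor check run in integer math. It is modeled on the runtime's check but simplified (uint64 instead of uintptr), and the sample counts are chosen only to sit on either side of the threshold.

```go
package main

import "fmt"

const (
	bucketCnt     = 8
	loadFactorNum = 13
	loadFactorDen = 2
)

// overLoadFactor reports whether count entries in 2^B buckets exceed the
// 6.5 load factor, without using floating point:
// count > 13 * (2^B / 2).
func overLoadFactor(count int, B uint8) bool {
	return count > bucketCnt &&
		uint64(count) > loadFactorNum*((uint64(1)<<B)/loadFactorDen)
}

func main() {
	const B = 3 // 8 buckets, so the threshold is 6.5 * 8 = 52 entries
	for _, count := range []int{52, 53} {
		loadFactor := float64(count) / float64(uint64(1)<<B)
		fmt.Printf("count=%d loadFactor=%.3f grow=%v\n",
			count, loadFactor, overLoadFactor(count, B))
	}
}
```

With 8 buckets, 52 entries sit exactly at a load factor of 6.5 and do not trigger growth, while 53 entries do.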

2. There are too many overflow buckets. This happens in two cases: