Google hash table: the sparse hashtable


夕阳那边

Article link: https://blog.csdn.net/iihtd/article/details/51192640


// A sparse hashtable is a particular implementation of
// a hashtable: one that is meant to minimize memory use.
// It does this by using a *sparse table* (cf sparsetable.h),
// which uses between 1 and 2 bits to store empty buckets
// (we may need another bit for hashtables that support deletion).

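The sparse hashtable is very memory-efficient. Google implements it on top of a sparse table: an empty element is represented by a single bit (or by two bits).

Let's now focus on the sparse table.

Its member variables are as follows: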

  static const int kBlockSize = 32;  // number of slots per block

  int size_;
  std::vector<std::vector<T> > elements_;  // one inner vector per block
  std::vector<uint32> masks_;              // one 32-bit occupancy mask per block

As you can see, the elements are stored in a two-dimensional array (a vector of vectors).

Take the sparse table's set method as an example:

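  void set(int index, const T& elem) {
    const int offset = index / kBlockSize;  // which block the index falls in
    const int pos = index % kBlockSize;     // position inside that block
    if (elements_[offset].size() == 0) {
      // First write to this block: allocate its kBlockSize slots.
      elements_[offset].resize(kBlockSize);
    }
    elements_[offset][pos] = elem;
    masks_[offset] |= (1U << pos);  // mark the slot as occupied
  }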

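In other words, every run of 32 consecutive indices (all indices with the same value of index / kBlockSize) falls into the same block, that is, the same inner vector.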
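
The index-to-block mapping can be looked at in isolation. The sketch below assumes a helper named `locate` and a small `Slot` struct; both are hypothetical names introduced here for illustration, and only `kBlockSize` and the division/modulo come from the code above.

```cpp
#include <cassert>

static constexpr int kBlockSize = 32;  // same block size as in the table above

// Hypothetical helper type: where an index lands inside the table.
struct Slot {
  int offset;  // which block (index / kBlockSize)
  int pos;     // which slot inside that block (index % kBlockSize)
};

// Hypothetical helper: split an index into block number and in-block position.
inline Slot locate(int index) {
  assert(index >= 0);
  return Slot{index / kBlockSize, index % kBlockSize};
}
```

For example, `locate(64)` and `locate(95)` both land in block 2 (positions 0 and 31), while `locate(96)` is the first slot of block 3.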

If a block of 32 indices holds no elements at all, its inner vector is never resized, so no element storage is allocated for that block.
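
A minimal sketch of this on-demand allocation follows. The class name `LazyBlockTable`, its constructor, and the helpers `allocated_blocks()` and `total_blocks()` are assumptions added for illustration (the post does not show how the table is constructed), and `std::uint32_t` stands in for the `uint32` typedef; `set()` follows the code above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Sketch only: constructor and the *_blocks() helpers are assumptions.
template <typename T>
class LazyBlockTable {
 public:
  static constexpr int kBlockSize = 32;

  // Assumed constructor: one empty inner vector and one zero mask per block.
  explicit LazyBlockTable(int size)
      : size_(size),
        elements_(num_blocks(size)),
        masks_(num_blocks(size), 0u) {}

  void set(int index, const T& elem) {
    assert(index >= 0 && index < size_);
    const int offset = index / kBlockSize;
    const int pos = index % kBlockSize;
    if (elements_[offset].empty()) {
      elements_[offset].resize(kBlockSize);  // first write to this block
    }
    elements_[offset][pos] = elem;
    masks_[offset] |= (1U << pos);
  }

  // Hypothetical helper: how many blocks currently own element storage.
  int allocated_blocks() const {
    int count = 0;
    for (const std::vector<T>& block : elements_) {
      if (!block.empty()) ++count;
    }
    return count;
  }

  // Hypothetical helper: total number of blocks, allocated or not.
  int total_blocks() const { return static_cast<int>(elements_.size()); }

 private:
  static int num_blocks(int size) {
    return (size + kBlockSize - 1) / kBlockSize;  // round up
  }

  int size_;
  std::vector<std::vector<T> > elements_;
  std::vector<std::uint32_t> masks_;
};
```

With a table of 100 slots (4 blocks), setting indices 5 and 7 allocates only block 0; setting index 70 then allocates block 2 as well, and blocks 1 and 3 stay as empty vectors.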

Next, let's see how to check whether a given index holds an element:

  bool test(int index) const {
    const int offset = index / kBlockSize;
    const int pos = index % kBlockSize;
    // The slot is occupied if bit `pos` of the block's mask is set.
    return ((masks_[offset] & (1U << pos)) != 0);
  }

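As the code above shows, set ORs 1U << pos into the mask of the given block (offset). In other words, a mask bit with value 1 means that the slot at the corresponding pos holds an element.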
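
The bit operations on a single mask can be sketched as free functions. The names `mark`, `is_marked`, and `count_marked` are hypothetical; the first two mirror what set and test do to `masks_[offset]`, and the third is an extra illustration of what the mask makes cheap to compute.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: mark slot `pos` (0..31) as occupied.
// Other bits of the mask are left unchanged.
inline std::uint32_t mark(std::uint32_t mask, int pos) {
  assert(pos >= 0 && pos < 32);
  return mask | (1U << pos);
}

// Hypothetical helper: true if slot `pos` (0..31) is marked as occupied.
inline bool is_marked(std::uint32_t mask, int pos) {
  assert(pos >= 0 && pos < 32);
  return (mask & (1U << pos)) != 0;
}

// Hypothetical helper: number of occupied slots in the block,
// obtained by counting the 1 bits of the mask.
inline int count_marked(std::uint32_t mask) {
  int count = 0;
  while (mask != 0) {
    mask &= mask - 1;  // clear the lowest set bit
    ++count;
  }
  return count;
}
```

Starting from an all-zero mask, marking positions 3 and 31 yields `0x80000008`; marking position 3 a second time leaves the mask unchanged.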

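This is why the comment quoted at the start says "which uses between 1 and 2 bits to store empty buckets": an empty bucket is recorded by a single mask bit, and in this code no element storage is spent on it as long as its whole block is still unallocated.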
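
A rough way to check this against the simplified table shown in this post is to count the bookkeeping kept for a block that holds no elements. The mask contributes exactly 1 bit per bucket; each block also keeps its inner vector object, whose size depends on the standard library. The function names below are hypothetical, and the estimate assumes an empty `std::vector` owns no heap memory, which is typical but not something this post establishes.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: bits of bookkeeping for one block with no elements,
// i.e. its 32-bit mask plus the (empty) inner vector object itself.
template <typename T>
constexpr std::size_t empty_block_overhead_bits() {
  return CHAR_BIT * (sizeof(std::uint32_t) + sizeof(std::vector<T>));
}

// Hypothetical helper: the same overhead averaged over the block's 32 buckets.
template <typename T>
constexpr double overhead_bits_per_empty_bucket() {
  return static_cast<double>(empty_block_overhead_bits<T>()) / 32.0;
}

// For comparison: a plain dense array spends a full T on every bucket.
template <typename T>
constexpr std::size_t dense_bits_per_bucket() {
  return CHAR_BIT * sizeof(T);
}
```

For instance, if `sizeof(std::vector<T>)` is 24 bytes, an empty block costs (32 + 192) / 32 = 7 bits per bucket in this simplified layout, compared with `8 * sizeof(T)` bits per bucket in a dense array of 8-bit bytes.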

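In summary, the sparse hashtable saves space through this data structure; the rest of its logic is much the same as in the dense hashtable.

My own understanding is limited, so please do point out anything I got wrong.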
