Data Structures ---- Bloom Filter

liru_1996

Category: Data Structures

Article link: https://blog.csdn.net/liru_1996/article/details/80597881

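Basic Concepts
To decide whether an element belongs to a set, the usual approach is to store every element and then compare against them. Linked lists, trees, and similar data structures all work this way, but as the set grows they need more and more storage and lookups get slower. A hash table takes a different route: a hash function maps each element to a position. A Bloom filter borrows that idea but maps each element to a single bit in a bit array, so checking whether that bit is 1 tells us whether the element may be in the set, without storing the element itself.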

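The problem with hashing is collisions: two different elements can map to the same bit. The remedy here is to use several hash functions. If any one of them says the element is not in the set (its bit is 0), the element is definitely not there. If all of them say it is, they could still all be wrong, but intuitively the probability of that is fairly low.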
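
The "fairly low" intuition can be made a bit more concrete with the standard approximation for a Bloom filter's false-positive rate. This is only a rough sketch: it assumes the $k$ hash functions behave like independent, uniformly random functions over $m$ bits (an idealization that real string hashes only approximate) and that $n$ elements have been inserted. The symbols $m$, $k$, $n$, and $p$ are introduced here for illustration.

$$
p \approx \left(1 - e^{-kn/m}\right)^{k}
$$

With the sizes used in this post's code ($m = 10000$ bits, $k = 2$ hash functions), inserting $n = 1000$ strings gives $p \approx (1 - e^{-0.2})^2 \approx 0.033$, roughly a 3.3% chance that a string never inserted is reported as present. At $n = 5000$ the estimate rises to $(1 - e^{-1})^2 \approx 0.40$, so the bit array should be sized with the expected number of elements in mind.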

Code implementation:
bloom_filter.h

#pragma once
#include "bitmap.h"
#include <stddef.h>
#include <stdint.h>

// Hash function type used by the Bloom filter:
// converts a string into an index
typedef uint64_t (*BloomHash)(const char*);
#define BloomHashCount 2

typedef struct BloomFilter{
    Bitmap bm; 
    BloomHash bloom_hash[BloomHashCount];
}BloomFilter;

void BloomFilterInit(BloomFilter* bf);

void BloomFilterDestroy(BloomFilter* bf);

void BloomFilterInsert(BloomFilter* bf,const char* str);

int BloomFilterIsExist(BloomFilter* bf,const char* str);

bloom_filter.c

#include "bloom_filter.h"                                                                              
#include "hash_func.h"
#include <stdlib.h>

#define BitmapMaxSize 10000

void BloomFilterInit(BloomFilter* bf){
    if(bf == NULL){
        return;
    }
    BitmapInit(&bf->bm,BitmapMaxSize);
    bf->bloom_hash[0] = SDBMHash;
    bf->bloom_hash[1] = BKDRHash;
    return;
}

void BloomFilterDestroy(BloomFilter* bf){
    if(bf == NULL){
        return;
    }
    bf->bloom_hash[0] = NULL;
    bf->bloom_hash[1] = NULL;
    BitmapDestroy(&bf->bm);
    return;
}

void BloomFilterInsert(BloomFilter* bf,const char* str){
    if(bf == NULL || str == NULL){
        return;
    }
    size_t i = 0;
    for(;i < BloomHashCount;++i){
        size_t hash = bf->bloom_hash[i](str) % BitmapMaxSize;
        BitmapSet(&bf->bm,hash);
    }
    return;
}                                                                                                      

int BloomFilterIsExist(BloomFilter* bf,const char* str){
    if(bf == NULL || str == NULL){
        return 0;
    }
    size_t i = 0;
    for(;i < BloomHashCount;++i){
        uint64_t hash = bf->bloom_hash[i](str) % BitmapMaxSize;
        int ret = BitmapTest(&bf->bm,hash);
        if(ret == 0){
            return 0;
        }
    }
    return 1;
}
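
The code above depends on bitmap.h, which this post does not show. The following is a minimal sketch of what such a bitmap might look like. The struct layout and the exact signatures of BitmapInit, BitmapDestroy, BitmapSet, and BitmapTest are assumptions inferred from how they are called above, not the author's actual implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the bitmap.h used by the Bloom filter;
 * the original header is not shown in the post. */
typedef struct Bitmap {
    uint64_t* data;   /* bit storage, 64 bits per word */
    size_t capacity;  /* number of usable bits */
} Bitmap;

/* Allocate a zeroed bitmap that can hold `capacity` bits.
 * On allocation failure the bitmap is left empty (capacity 0). */
void BitmapInit(Bitmap* bm, size_t capacity) {
    assert(bm != NULL);
    size_t words = (capacity + 63) / 64;
    bm->data = calloc(words, sizeof(uint64_t));
    bm->capacity = (bm->data != NULL) ? capacity : 0;
}

/* Release the storage and reset the bitmap to an empty state. */
void BitmapDestroy(Bitmap* bm) {
    assert(bm != NULL);
    free(bm->data);
    bm->data = NULL;
    bm->capacity = 0;
}

/* Set the bit at `index` to 1; out-of-range indexes are ignored. */
void BitmapSet(Bitmap* bm, size_t index) {
    assert(bm != NULL);
    if (index >= bm->capacity) {
        return;
    }
    bm->data[index / 64] |= (uint64_t)1 << (index % 64);
}

/* Return 1 if the bit at `index` is set, 0 otherwise
 * (out-of-range indexes also return 0). */
int BitmapTest(const Bitmap* bm, size_t index) {
    assert(bm != NULL);
    if (index >= bm->capacity) {
        return 0;
    }
    return (int)((bm->data[index / 64] >> (index % 64)) & 1u);
}
```

Packing 64 bits into each word keeps the 10000-bit filter at about 1.25 KB. Ignoring out-of-range indexes is just one possible policy; an implementation could equally treat them as a programming error.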

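Note: the two hash functions used here (SDBMHash and BKDRHash) can be implemented however you like; there is no fixed way to write them.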
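
For readers who want something concrete to start from, here is one possible sketch of the two hash functions, following the widely circulated SDBM and BKDR string-hash formulas. The post's own hash_func.h is not shown, so the bodies below (including the BKDR seed of 131) are assumptions; only the function names and the `uint64_t (*)(const char*)` signature come from the code above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* SDBM string hash: hash = ch + hash * 65599, written with shifts.
 * Unsigned overflow wraps around, which is well defined in C. */
uint64_t SDBMHash(const char* str) {
    assert(str != NULL);
    uint64_t hash = 0;
    for (; *str != '\0'; ++str) {
        hash = (unsigned char)*str + (hash << 6) + (hash << 16) - hash;
    }
    return hash;
}

/* BKDR string hash: hash = hash * seed + ch.
 * 131 is a commonly used seed; other values such as 31 or 1313 also appear. */
uint64_t BKDRHash(const char* str) {
    assert(str != NULL);
    const uint64_t seed = 131;
    uint64_t hash = 0;
    for (; *str != '\0'; ++str) {
        hash = hash * seed + (unsigned char)*str;
    }
    return hash;
}
```

Both functions match the BloomHash pointer type, so they can be assigned to bloom_hash[0] and bloom_hash[1] as BloomFilterInit does. Casting each character to unsigned char keeps the result the same regardless of whether plain char is signed on the target platform.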
