LintCode 556: Standard Bloom Filter (System Design problem)

Author: 纸上得来终觉浅绝知此事要躬行

Category: System Design. Tags: LintCode

Article link: https://blog.csdn.net/roufoo/article/details/82077512

Standard Bloom Filter

Implement a standard bloom filter. Support the following methods:

StandardBloomFilter(k) The constructor; you need to create k hash functions.
add(string) Add a string to the bloom filter.
contains(string) Check whether a string exists in the bloom filter.
Example
Example1

Input:
StandardBloomFilter(3)
add("lint")
add("code")
contains("lint")
contains("world")
Output: [true,false]
Example2

Input:
StandardBloomFilter(10)
add("hello")
contains("hell")
contains("helloa")
contains("hello")
contains("hell")
contains("helloa")
contains("hello")
Output: [false,false,true,false,false,true]

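Solution 1: adapted from an answer found online.
Notes:

1) The main idea of add() in a Bloom filter is to use several hash functions to map the input to several positions in a bit string and set all of those bits to 1. The main idea of contains() is to map the input to bit positions with the same hash functions: if all of those bits are 1 it returns true; if one or more of them is 0 it returns false. To sum up the Bloom filter in one sentence: all bits set does not guarantee the element is present (false positives are possible), but any clear bit guarantees it is absent.
2) The cap and seed of each hash function can be chosen fairly freely, provided cap does not exceed the size of the bit string. A larger seed generally spreads the mapped positions more evenly.
3) The STL bitset is handy here. bitset<200000> declares a bit string of 200000 bits, and calling set(100) on it sets the bit at index 100 to 1.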
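The "all set may be true, any clear is definitely false" rule from note 1 is easiest to see on a deliberately tiny filter. The sketch below is an illustration only, not the solution to the problem: the 8-bit size and the names TinyFilter, toyHashA, toyHashB and tinyFilterDemo are assumptions made up for this example, and the two toy hashes are chosen so that anagrams collide on purpose.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>

// Illustration only: an 8-bit filter with two toy hash functions.
// TinyFilter, toyHashA and toyHashB are hypothetical names for this sketch.
struct TinyFilter {
    std::bitset<8> bits;

    static unsigned byteSum(const std::string &s) {
        unsigned sum = 0;
        for (unsigned char c : s) sum += c;
        return sum;
    }
    // Both toy hashes depend only on the byte sum and the length,
    // so two anagrams always map to the same bits.
    static std::size_t toyHashA(const std::string &s) {
        return byteSum(s) % 8;
    }
    static std::size_t toyHashB(const std::string &s) {
        return (byteSum(s) / 8 + s.size()) % 8;
    }

    void add(const std::string &s) {
        bits.set(toyHashA(s));
        bits.set(toyHashB(s));
    }
    bool contains(const std::string &s) const {
        return bits.test(toyHashA(s)) && bits.test(toyHashB(s));
    }
};

// Usage: "ba" was never added, yet it is reported as present.
inline void tinyFilterDemo() {
    TinyFilter f;
    f.add("ab");                // byte sum 195: sets bits 3 and 2
    assert(f.contains("ab"));   // an added element is always found
    assert(f.contains("ba"));   // false positive: same bits as "ab"
    assert(!f.contains("cd"));  // byte sum 199: bit 7 is clear, so absent
}
```

With a realistic bit count and better-mixed hash functions, such collisions become rare, but they can never be ruled out completely.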

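#include <bitset>
#include <string>
#include <vector>

using namespace std;

// One hash function, parameterized by a modulus (cap) and a multiplier (seed).
class HashClass {
public:
    HashClass(int c, int s) : cap(c), seed(s) {}

    // Returns a bit index in [0, cap).
    int hashFunc(const string &value) const {
        long long ret = 0;  // wide enough that seed * ret cannot overflow
        for (size_t i = 0; i < value.size(); ++i) {
            // Cast through unsigned char so non-ASCII bytes cannot make ret negative.
            // Note: "+=" makes the effective multiplier (seed + 1).
            ret += seed * ret + static_cast<unsigned char>(value[i]);
            ret %= cap;
        }
        return static_cast<int>(ret);
    }

private:
    int cap, seed;
};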

class StandardBloomFilter {
public:
    /*
    * @param k: An integer
    */
    StandardBloomFilter(int k) {
        this->k = k;
        for (int i = 0; i < k; ++i) {
            // Each function gets its own cap and seed. cap = 100000 + i stays
            // within the 200000-bit array for k up to 100000.
            hashVec.push_back(HashClass(100000 + i, 2 * i + 3));
        }
    }

    /*
     * @param word: A string
     * @return: nothing
     */
    void add(string &word) {
        // Set the bit chosen by each of the k hash functions.
        for (int i = 0; i < k; ++i) {
            bits.set(hashVec[i].hashFunc(word));
        }
    }

    /*
     * @param word: A string
     * @return: True if contains word
     */
    bool contains(string &word) {
        // A single clear bit proves the word was never added.
        for (int i = 0; i < k; ++i) {
            if (!bits[hashVec[i].hashFunc(word)])
                return false;
        }
        return true;
    }

private:
    int k;
    vector<HashClass> hashVec;  // stored by value, so nothing to delete
    bitset<200000> bits;
};
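How many strings this filter can hold before contains() starts returning too many false positives can be roughly estimated with the standard approximation (1 - e^(-kn/m))^k, where m is the number of bits, k the number of hash functions and n the number of inserted strings. The helper below is a sketch under assumptions, not part of the original solution: the name estimateFalsePositiveRate is hypothetical, and the formula assumes each hash function spreads uniformly over all m bits, which the simple polynomial hash above only approximates. Since hash function i only reaches indices below 100000 + i, using m = 100000 is closer to this solution than m = 200000.

```cpp
#include <cassert>
#include <cmath>

// Approximate false-positive probability of a Bloom filter with
// m bits, k hash functions and n inserted elements: (1 - e^(-k*n/m))^k.
// Assumption: every hash function is uniform over all m bits.
// estimateFalsePositiveRate is a hypothetical helper name for this sketch.
inline double estimateFalsePositiveRate(double m, int k, double n) {
    assert(m > 0 && k > 0 && n >= 0);
    return std::pow(1.0 - std::exp(-k * n / m), k);
}
```

For example, estimateFalsePositiveRate(100000, 3, 1000) gives a value on the order of 10^-5, and the estimate grows quickly as n approaches m.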
