基于redis实现的BloomFilter

最新推荐文章于 2024-07-29 09:28:04 发布

Hilite。

最新推荐文章于 2024-07-29 09:28:04 发布

阅读量3.5k

点赞数 2

分类专栏： redis

本文链接：https://blog.csdn.net/weixin_36507118/article/details/100667442

版权

redis 专栏收录该内容

3 篇文章 0 订阅

订阅专栏

众所周知，google的guava框架实现了BloomFilter，guava的BloomFilter和redis的bitMap都是基于位图算法的，所以redis也可以实现BloomFilter，并且相对于BloomFilter，redis的数据存在三方redis服务器上的，并不像guava的BloomFilter是存在本地的，这对于内存损耗及分布式系统来说显然是不适合的，所以今天分享一个基于redis实现的BloomFilter（看之前最好了解一下redis的bitmap----https://blog.csdn.net/u012888052/article/details/80380143）。

直接上代码：

@ConfigurationProperties("bloom.filter")
@Component
public class RedisBloomFilter {

    //预计插入量
    private long expectedInsertions;

    //可接受的错误率
    private double fpp;

    @Autowired
    private RedisTemplate redisTemplate;


    //bit数组长度
    private long numBits;
    //hash函数数量
    private int numHashFunctions ;

    public long getExpectedInsertions() {
        return expectedInsertions;
    }

    public void setExpectedInsertions(long expectedInsertions) {
        this.expectedInsertions = expectedInsertions;
    }

    public void setFpp(double fpp) {
        this.fpp = fpp;
    }

    public double getFpp() {
        return fpp;
    }

    @PostConstruct
    public void init(){
        this.numBits = optimalNumOfBits(expectedInsertions, fpp);
        this.numHashFunctions = optimalNumOfHashFunctions(expectedInsertions, numBits);
    }

    //计算hash函数个数
    private int optimalNumOfHashFunctions(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    //计算bit数组长度
    private long optimalNumOfBits(long n, double p) {
        if (p == 0) {
            p = Double.MIN_VALUE;
        }
        return (long) (-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    /**
     * 判断keys是否存在于集合
     */
    public boolean isExist(String key) {
        long[] indexs = getIndexs(key);
        List list = redisTemplate.executePipelined(new RedisCallback<Object>() {

            @Nullable
            @Override
            public Object doInRedis(RedisConnection redisConnection) throws DataAccessException {
                redisConnection.openPipeline();
                for (long index : indexs) {
                    redisConnection.getBit("bf:hilite".getBytes(), index);
                }
                redisConnection.close();
                return null;
            }
        });
        return !list.contains(false);
    }

    /**
     * 将key存入redis bitmap
     */
    public void put(String key) {
        long[] indexs = getIndexs(key);
        redisTemplate.executePipelined(new RedisCallback<Object>() {

            @Nullable
            @Override
            public Object doInRedis(RedisConnection redisConnection) throws DataAccessException {
                redisConnection.openPipeline();
                for (long index : indexs) {
                    redisConnection.setBit("bf:hilite".getBytes(),index,true);
                }
                redisConnection.close();
                return null;
            }
        });
    }

    /**
     * 根据key获取bitmap下标
     */
    private long[] getIndexs(String key) {
        long hash1 = hash(key);
        long hash2 = hash1 >>> 16;
        long[] result = new long[numHashFunctions];
        for (int i = 0; i < numHashFunctions; i++) {
            long combinedHash = hash1 + i * hash2;
            if (combinedHash < 0) {
                combinedHash = ~combinedHash;
            }
            result[i] = combinedHash % numBits;
        }
        return result;
    }

    /**
     * 获取一个hash值
     */
    private long hash(String key) {
        Charset charset = Charset.forName("UTF-8");
        return Hashing.murmur3_128().hashObject(key, Funnels.stringFunnel(charset)).asLong();
    }
}

yml配置：

bloom:
    filter:
      expectedInsertions: 1000
      fpp: 0.001F

大致流程为：

1：根据预计插入量及可接受错误率计算出bit数组长度及hash函数数量。

2：将key值hash后，根据hash函数的数量，计算出这个key的不同的下标数组，用于匹配key值。

3：遍历key值的下标，将相同的值（bf:hilite）根据下标的值，转存为对应下标值长度的二进制数存入bitmap。

4：判断时，将key按相同方式转换为下标数组，通过getBit（）方法判断是否存在。

看代码也可以知道hash函数数量numHashFunctions与预计插入量expectedInsertions无关，与可接受的错误率fpp成反比，bit数组长度numBits与预计插入量expectedInsertions成正比，与可接受的错误率fpp成反比。

Hilite。

关注

2
点赞
踩
8

收藏

觉得还不错? 一键收藏
0
评论
复制链接

分享到 QQ

分享到新浪微博

扫一扫

专栏目录