http://blog.csdn.net/jiaomeng/article/details/1495500中这么介绍布隆过滤器
Bloom Filter是一种空间效率很高的随机数据结构,它利用位数组很简洁地表示一个集合,并能判断一个元素是否属于这个集合。Bloom Filter的这种高效是有一定代价的:在判断一个元素是否属于某个集合时,有可能会把不属于这个集合的元素误认为属于这个集合(false positive)。因此,Bloom Filter不适合那些“零错误”的应用场合。而在能容忍低错误率的应用场合下,Bloom Filter通过极少的错误换取了存储空间的极大节省。
http://www.360doc.com/content/14/0914/13/7703112_409376352.shtml中讨论了使用布隆过滤器实现java缓存的架构。开源爬虫webmagic,也实现了一个基于它的url队列的构件。u
guava中这样介绍它,A Bloom filter offers an approximate containment test with one-sided error: if it claims that an element is contained in it, this might be in error, but if it claims that an element is not contained in it, then this is definitely true.下面的这个例子使用的是guava-18.0.jar。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
|
package
com;
import
java.nio.charset.Charset;
import
com.google.common.hash.BloomFilter;
import
com.google.common.hash.Funnels;
public
class
Bloom {
public
static
void
main(String[] args) {
BloomFilter<String> b = BloomFilter.create(Funnels.stringFunnel(Charset.forName(
"utf-8"
)),
1000
,
0.000001
);
b.put(
"121"
);
b.put(
"122"
);
b.put(
"123"
);
System.out.println(b.mightContain(
"12321"
));
BloomFilter<String> b1 = BloomFilter.create(Funnels.stringFunnel(Charset.forName(
"utf-8"
)),
1000
,
0.000001
);
b1.put(
"aba"
);
b1.put(
"abb"
);
b1.put(
"abc"
);
b1.putAll(b);
System.out.println(b1.mightContain(
"123"
));
}
}
|