Bloom filter

Advantages:
  1. A bloom filter does not store the strings themselves, so it needs very little memory; memory use does not balloon as strings get longer.
  2. A bloom filter is fast: lookup is k times O(1), where k is the number of hash functions consulted, typically between 1 and 20.
  3. Once the size of a bloom filter is fixed, its memory footprint does not grow as the business data grows.
Disadvantages:
  1. Elements cannot be deleted
  2. It cannot be resized
  3. There is a false positive rate (very low)

Requirements analysis
  1. Improve Logger search
    Bloom filter advantages:
    Determines membership in a set: definitely not in the set, or probably in the set
    Can add and look up in constant time and constant space
    Constant-time results for queries with 0 matches
    Dramatically speeds up needle-in-a-haystack searches (searches for rare values)
    Worst case: we can't rule out any data chunks

Measured results
Original search throughput: 2 million events/second.
With bloom filters: billions of events/second.
Data file: 1 GB. Data chunk: on the order of MB. Events per data chunk: 100K to millions.
Chunks per FilterChunk: 4054 by default.

Prototype example
Using Bleep + ESM to send CEF events to an L7500 appliance (32 CPUs, 64 GB RAM)
55 columns are bloom filtered, including 37 for full-text indexing
All bloom filters are "scalable" to maintain our targeted false positive rate.
Two "tiers":
Master bloom filter covers all events
  Initial capacity: 1M elements
  False positive rate: 1 in 1,000
  Total size: 97 MB
Chunk bloom filters cover 10,000 data chunks
  Initial capacity: 85K elements
  False positive rate: 1 in 100
  Total size: 5.6 MB per 10,000 data chunks
At this data volume everything can be kept entirely in memory.


Core data structures
  1. BasicBloomFilter: a heavily modified Orestes bloom filter
    (1) Given n = expected data size and p = false positive rate,
    compute m = the size of the bit map, then k = the number of hash functions.
    (2) Given a value, use linear combinations of hashes to compute the hash values, i.e. the indices set to 1 in the BitSet; these are stored in an int array inside PrecomputedHash.
    Paper: http://www.eecs.harvard.edu/~kirsch/pubs/bbbf/rsa.pdf
    The concrete implementation uses MurmurHash to obtain hash1 and hash2,
    then hash = (hash1 + i * hash2) % m, for i from 0 to k-1.
  2. BasicPrecomputedHash: stores the key and related information such as the hash method, m, and k. Used during search so that terms are hashed once but checked against many bloom filters.
  3. ScalableBloomFilter: solves the problem of ever-growing data volume.
    Start with one bloom filter. When it reaches your specified number of elements:
    Don't add any more elements to the existing bloom filter
    Add another bloom filter "under the hood" that can handle twice as many elements
    When querying, look for a match against any underlying filter
    The underlying implementation is a linked list of BasicBloomFilters.
    Naturally the hash used is a ScalablePrecomputedHash, which is basically a linked list of BasicPrecomputedHashes.
    Tradeoffs
    No need to choose an expected number of elements, only an "initial capacity"
    Doing a union makes scalable bloom filters pretty useless
  4. FilterChunk:
    Each FilterChunk logically contains 4054 ScalableBloomFilters by default: one ScalableBloomFilter per column, per event range, per storage group. One FilterChunk covers 4054 data chunks.
  5. BloomFilterManager: the top-level class that handles event ingestion and search. Both MasterBloomFilterManager and ColumnChunkBloomFilterManager extend this class.
  6. FilterChunkStore: handles storage (writing to disk, flush(), ...) and retrieval of FilterChunks.
    There are different subclasses for the master bloom filter and the data range bloom filter.
  7. FilterChunkFactory: makes new FilterChunk objects for the BloomFilterManager.

  8. Implementation differences between the master bloom filter and ColumnChunkBloomFilter
    MasterBloomFilterManager has a single very large capacity limit, and flush()es to disk after every fixed number of updates.
    ColumnChunkBloomFilter uses CHUNKS_PER_FILTERCHUNK, 4054 by default.
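The sizing formulas and the Kirsch-Mitzenmacher double-hashing scheme described in item 1 can be sketched as follows. This is a minimal illustration, not the real BasicBloomFilter: the class name is made up, and a cheap derived Java hash stands in for the MurmurHash pair (hash1, hash2) used by the actual implementation.

```java
import java.util.BitSet;

// Illustrative sketch of a BasicBloomFilter-style class.
class BasicBloomSketch {
    final int m;       // bit map size
    final int k;       // number of hash functions
    final BitSet bits;

    // Standard sizing: m = -n*ln(p)/(ln 2)^2, k = (m/n)*ln 2
    BasicBloomSketch(int n, double p) {
        this.m = (int) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
        this.k = Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
        this.bits = new BitSet(m);
    }

    // Kirsch-Mitzenmacher: index_i = (hash1 + i*hash2) mod m, i in [0, k)
    int[] indices(String value) {
        int h1 = value.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9; // stand-in for the second murmur hash
        if (h2 == 0) h2 = 1;
        int[] idx = new int[k];
        for (int i = 0; i < k; i++) {
            idx[i] = Math.floorMod(h1 + i * h2, m); // non-negative index into the BitSet
        }
        return idx;
    }

    void add(String value) {
        for (int i : indices(value)) bits.set(i);
    }

    // True means "probably in the set"; false means "definitely not".
    boolean contains(String value) {
        for (int i : indices(value)) if (!bits.get(i)) return false;
        return true;
    }
}
```

Note that only two base hashes are computed per value; the k probe positions are cheap linear combinations, which is what makes k in the 1-20 range affordable.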

Basic bloom filter implementation
MySQL query -> BloomFilterCondition -> the member int array in PrecomputedHash holding the hashed BitSet indices -> BasicBloomFilter.contains(PrecomputedHash h), which calls contains(int[] hash); finally the BitSet member of BasicBloomFilter is consulted via get() for each index to check for a hit.
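The hash-once-check-many idea behind BasicPrecomputedHash can be sketched as follows. The class and method names here are hypothetical: the point is that the expensive string hashing happens once per search term, while the per-filter index derivation (which depends on each filter's m and k) is only cheap modular arithmetic.

```java
// Illustrative sketch: hash a search term once, then derive the probe
// indices for any bloom filter's (m, k) on demand.
class PrecomputedHashSketch {
    final int h1;
    final int h2;

    PrecomputedHashSketch(String term) {
        h1 = term.hashCode();                          // expensive part: done once per term
        int t = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9;
        h2 = (t == 0) ? 1 : t;
    }

    // Cheap per-filter step: same linear-combination scheme as the filter itself.
    int[] indicesFor(int m, int k) {
        int[] idx = new int[k];
        for (int i = 0; i < k; i++) idx[i] = Math.floorMod(h1 + i * h2, m);
        return idx;
    }
}
```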

BloomFilter creation flow

Flow: FilterChunkFactory -> FilterChunk (empty) -> BloomFilterManager -> FilterChunk (full) -> FilterChunkStore

Event ingestion flow
Overview
1. Events arrive at ESM or Logger
2. Batches of events are constructed
3. Events are indexed and ready to persist
4. Event indexes are "unpacked" and field values are added to:

a. the global summary (for auto-complete)
b. the bloom filter tracked columns
Call hierarchy
1. ColumnChunkBloomFilterManager.addChunkDataFromSummaryProcessor
2. BloomFilterManager.addChunkDataFromSummaryProcessor:

FilterChunkStore.get() locates the FilterChunk by event range, chunk ID, and storage group ID.
If no FilterChunk covering the current chunk ID is found, FilterChunkFactory.create() is called.

3. FilterChunk.addChunkDataFromSummaryProcessor: finds the ScalableBloomFilter for the given column
4. ScalableBloomFilter.add(): calls add() on the last bloom filter in the linked list; if that last filter is full, a new one is created at the end of the list and add() is called on it

5. Event data is persisted to Logger storage
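The ScalableBloomFilter.add() growth rule above (append to the last filter in the linked list; when it fills up, chain a new filter with double the capacity) can be sketched as follows. The Filter and FilterFactory interfaces are illustrative stand-ins for BasicBloomFilter and its construction.

```java
import java.util.LinkedList;

// Illustrative sketch of the scalable bloom filter growth rule.
class ScalableBloomSketch {
    interface Filter {
        void add(String v);
        boolean mightContain(String v);
        boolean isFull();
    }
    interface FilterFactory { Filter create(int capacity); }

    private final LinkedList<Filter> filters = new LinkedList<>();
    private final FilterFactory factory;
    private int nextCapacity;

    ScalableBloomSketch(FilterFactory factory, int initialCapacity) {
        this.factory = factory;
        this.nextCapacity = initialCapacity;
        grow();
    }

    private void grow() {
        filters.addLast(factory.create(nextCapacity));
        nextCapacity *= 2;  // each new filter handles twice as many elements
    }

    void add(String v) {
        if (filters.getLast().isFull()) grow();  // never add to a full filter
        filters.getLast().add(v);
    }

    // A match against ANY underlying filter counts as "probably present".
    boolean mightContain(String v) {
        for (Filter f : filters) if (f.mightContain(v)) return true;
        return false;
    }

    int filterCount() { return filters.size(); }
}
```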

Master call hierarchy
BloomFilterManager.updateFilterChunk
FilterChunkStore.update
SingleFilterChunkDiskStore.update checks the chunk update counter; if it has reached the threshold, flush() is called.
Call hierarchy of flush(fileId):

startWrite(fileId)
doWrite(fileId)
finishWrite(fileId)

Event ingestion "hook"
1. In the Logger product: ROSChunkPostProcessor
In the ESM product: Logger code – SummaryProcessor
2. Chunk metadata and an array of columns/values are passed to the BloomFilterManagers to add to the bloom filter data structures.
3. Ultimately this data goes to FilterChunk.addChunkData
Both BloomFilterManagers keep an under-construction FilterChunk in memory.
4. Bloom filter data is persisted at intervals determined by the chunks ingested, NOT directly by time.
This is why we say a single data range bloom filter chunk covers about 1-2 hours.
5. When chunk data is added, the range metadata in the FilterChunk is updated:

EndTime span
MRT span
Event ID span
Chunk ID span

When the chunk ID span exceeds the limit, it is time to persist this FilterChunk
6. Persisting...

  • Master bloom filter
    Managed by SingleFilterChunkDiskStore
    Separate data files in /opt/arcsight/logger/data/indexes
    Files on disk:
    Arcsight_Master_Log: transaction log for file write operations; two-byte records: status + fileId
    Arcsight_Master_Data_n: data file
    Need to guard against an incomplete write of bloom filter data

The flow is as follows:
1. Check which data file is next to write (e.g. file 1 is next)
2. Add a record to the transaction log: (TRANSACTION_START, file 1)
3. Serialize, compress, and write the master bloom filter data to the file.
4. Add a record to the transaction log: (TRANSACTION_SUCCESS, file 1)
5. Update the counter so we know file 0 is the next data file to be written.
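The transaction-log protocol above can be sketched as follows. The record layout (two bytes: status + fileId) follows the description; the class and method names are illustrative, not the actual Logger code, and in-memory streams stand in for the files.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Illustrative sketch of the master bloom filter flush protocol.
class MasterFlushSketch {
    static final byte TRANSACTION_START = 1;
    static final byte TRANSACTION_SUCCESS = 2;

    // One log record is two bytes: status + fileId.
    static void logRecord(DataOutputStream log, byte status, byte fileId) {
        try {
            log.writeByte(status);
            log.writeByte(fileId);
            log.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static void flush(DataOutputStream log, byte fileId, byte[] filterBytes, OutputStream dataFile) {
        logRecord(log, TRANSACTION_START, fileId);    // step 2: mark write in progress
        try {
            dataFile.write(filterBytes);              // step 3: serialized, compressed filter data
            dataFile.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        logRecord(log, TRANSACTION_SUCCESS, fileId);  // step 4: mark write complete
        // step 5: the caller now alternates to the other data file
    }

    // On restart: a trailing START with no matching SUCCESS means the last
    // write was incomplete, so that data file must not be trusted.
    static boolean lastWriteComplete(byte[] logBytes) {
        if (logBytes.length < 2) return true;
        return logBytes[logBytes.length - 2] == TRANSACTION_SUCCESS;
    }
}
```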

The write operation is implemented with java.nio: buffer.put() writes the data into the buffer, buffer.flip() switches the buffer from write mode to read mode, and finally fileChannel.write(buffer) drains the buffer's contents into the file channel.
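A minimal sketch of that put/flip/write sequence (the class and helper names are illustrative):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class NioWriteSketch {
    // put() fills the buffer, flip() switches it from write mode to read
    // mode, and channel.write() drains it into the file.
    static void writeBytes(Path path, byte[] data) {
        ByteBuffer buffer = ByteBuffer.allocate(data.length);
        buffer.put(data);   // write mode: copy data into the buffer
        buffer.flip();      // read mode: limit = position, position = 0
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            while (buffer.hasRemaining()) {
                ch.write(buffer);   // drain the buffer into the channel
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Small helper for a round-trip check.
    static byte[] readBytes(Path path) {
        try {
            return Files.readAllBytes(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```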
Below is a good analogy.

The original article (http://www.iteye.com/magazines/132-Java-NIO) covers the three most important concepts:
Channel
Buffer
Selector

Channel corresponds to the streams of old; Buffer is nothing new; Selector exists because NIO can run in an asynchronous, non-blocking mode.
The old streams were always blocking: once one thread was operating on a stream, everything else was blocked. It is like a water pipe with no tap: when you reach out to collect water, whether the water has arrived or not, all you can do is stand there waiting at the pipe (the stream).
NIO's Channel adds a tap (a valve) to each pipe. At any given moment you can still only collect water from one pipe, but with a rotation strategy, as long as the flow is not too heavy, the water from every pipe gets handled properly. The key addition is a water attendant, the Selector, who coordinates everything: he watches which pipe has water, and once enough has been collected from the current pipe, he switches, temporarily closing the current tap and trying another one to see whether it has water.
When other people need water, they do not stand at the tap themselves; they hand a bucket to the attendant in advance. That bucket is the Buffer. They may still have to wait, but they wait at home and can do other things; when the bucket is full, the attendant notifies them.
This closely mirrors the fine-grained division of labor in modern society. It is an economical way to get concurrency out of existing resources, rather than reaching for parallel processing at every turn, which is the simplest approach but also the most wasteful of resources.

  • Data range bloom filter
    1. FilterChunkStoreImpl serializes and compresses the FilterChunk
    2. Metadata and the compressed bytes are stuffed into a BloomFilterChunk
    3. The BloomFilterChunk is handed to StoreManager for persistence
    4. The BloomFilterChunk is appended to a Logger data file by StoreFile
    5. PostgreSQL is updated with metadata by StoreManager calling BloomFilterChunk.persist()
    6. A new under-construction FilterChunk is created and kept in memory for the next incoming events
    7. ESM event archive support: FilterChunkStoreImpl is registered as an EventArchiveObserver. For each storage group it keeps a HashMap<Long, FilterChunk>, where the Long key is the date. At archive time the corresponding FilterChunk is persisted and then removed from the HashMap. Note that one day can have many FilterChunks; the HashMap only holds the latest one, since each earlier one was persisted at the moment its successor was added.
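The archive bookkeeping just described (one map per storage group keyed by date, holding only the latest under-construction FilterChunk, with older chunks persisted as they are replaced) can be sketched as follows. The Persister interface and the use of String for a chunk are illustrative simplifications.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the EventArchiveObserver bookkeeping.
class ArchiveObserverSketch {
    interface Persister { void persist(String chunk); }

    private final Map<Long, String> latestChunkByDay = new HashMap<>();
    private final Persister persister;

    ArchiveObserverSketch(Persister persister) { this.persister = persister; }

    // A newer chunk for the same day replaces the old one, which is
    // persisted immediately; the map only ever holds the latest.
    void onNewChunk(long day, String chunk) {
        String previous = latestChunkByDay.put(day, chunk);
        if (previous != null) persister.persist(previous);
    }

    // At archive time: persist the latest chunk, then drop it from the map.
    void onArchive(long day) {
        String chunk = latestChunkByDay.remove(day);
        if (chunk != null) persister.persist(chunk);
    }
}
```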

Crash recovery design choices
Data under construction in memory never gets persisted if a crash or shutdown occurs.
BloomFilterRebuilder uses the last chunk ID in the master bloom filter and the last chunk ID in PostgreSQL to determine this gap, then reloads the data between the two chunk IDs and persists it again.
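A minimal sketch of the gap computation (class, method, and parameter names are hypothetical; both orderings of the two IDs are handled since which side is ahead depends on when the crash happened):

```java
// Illustrative sketch: the chunk IDs strictly between the last ID seen by
// the master bloom filter and the last ID recorded in PostgreSQL form the
// gap that must be reloaded and re-persisted after a crash.
class BloomFilterRebuilderSketch {
    // Returns an inclusive [start, end] range of chunk IDs to reload,
    // or an empty array when there is no gap.
    static long[] gapToReload(long lastChunkIdInMasterFilter, long lastChunkIdInPostgres) {
        long lo = Math.min(lastChunkIdInMasterFilter, lastChunkIdInPostgres);
        long hi = Math.max(lastChunkIdInMasterFilter, lastChunkIdInPostgres);
        return lo == hi ? new long[0] : new long[]{lo + 1, hi};
    }
}
```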


Search flow

Goal: get the chunk metadata for chunks between time x and time y, but don't get chunk IDs between a and b, between c and d, etc.

overview

1. Some component issues a SQL query against the event table
(Logger search, active channel, report)
2. The MySQL storage engine submits the query to the Logger servers
It includes the time range and the query conditions (WHERE clause)
CDistributedSearchGetMetadataInitCommand
SQL query -> JDBC call -> the storage engine sends an RPC request to the Logger server
3. The master bloom filter is checked
MasterBloomFilterManager.shouldContinueSearch()
4. A rejected chunk list is generated from the data range bloom filters
BloomFilterManager.getChunkRangesRejectedByBloomFilter()
5. The relevant data chunks are retrieved and returned to the storage engine

Data range bloom filter search
1. Conditions are parsed into a tree (BloomFilterQuery): AND, OR, IN, TRUE
2. Check whether the underlying data must always be searched
3. Get the FilterChunks in the query time range; FilterChunkStoreImpl fetches them by chunk ID
4. Check each FilterChunk against the BloomFilterQuery
If a FilterChunk can be ruled out, its event count is added to the events-scanned total.
5. Update the events-scanned count in HypotheticalQuerySizeManager
6. Merge adjacent rejected chunk ranges to simplify the query
7. Rejected chunk ranges are passed by ReadStoreMgr to ColumnChunkIterator
8. ColumnChunkIterator constructs a long-winded PostgreSQL query that avoids the rejected chunk ranges
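Step 6, merging adjacent rejected chunk ranges, can be sketched as follows. Representing a range as an inclusive [start, end] pair of chunk IDs is an assumption for illustration; the goal is that the final query excludes a few large ranges instead of many small ones.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch of merging adjacent/overlapping rejected chunk ranges.
class ChunkRangeMergeSketch {
    static List<long[]> merge(List<long[]> ranges) {
        List<long[]> sorted = new ArrayList<>(ranges);
        sorted.sort(Comparator.comparingLong((long[] r) -> r[0]));
        List<long[]> out = new ArrayList<>();
        for (long[] r : sorted) {
            if (!out.isEmpty() && r[0] <= out.get(out.size() - 1)[1] + 1) {
                // adjacent or overlapping: extend the previous range
                long[] last = out.get(out.size() - 1);
                last[1] = Math.max(last[1], r[1]);
            } else {
                out.add(new long[]{r[0], r[1]});
            }
        }
        return out;
    }
}
```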


Unfinished
  1. Multithreading
    One thread for updating, multiple threads for reading.
    Both use a ThreadPoolExecutor, but there is no synchronization, so each thread's operations are not atomic.

blocking queue
mysql plug-in storage engine
