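Kick-out loops during insertion. Standard cuckoo hashing cannot predict which item has an empty alternative bucket, so it has to decide whom to evict by BFS or by a random walk. This wastes time and can even end in an endless loop. A good strategy should find a solution fast if such a solution exists.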
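Multiple bucket checks per lookup. Looking up an item requires visiting all of its candidate buckets, which hurts lookup performance, especially when the table is large and has to be kept in external memory. The goal is to narrow down beforehand the subset of buckets that may contain the item, and to optimize the access pattern.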
Stash overhead. Keep the stash on-chip (?) to reduce its impact on performance. When the stash itself is full, the items stored in it retry insertion into the main table until some space is freed. A small stash of size 4 is regarded as enough to achieve a rather high load (for example 95% in [24]) with high probability; see Cuckoo hashing with a stash (CHS) [22]. Open question: that load is already very high, so how is it improved further?
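To make the first problem concrete, here is a minimal sketch of standard two-choice cuckoo insertion with a random-walk kick-out. It is an illustration, not the code of any cited paper: the class name `CuckooTable`, the toy hash built from Python's built-in `hash`, and the `max_kicks` cutoff are all assumptions. Three keys that share the same two candidate buckets can never all fit, so the walk just bounces between them until the cutoff stops it.

```python
import random


class CuckooTable:
    """Toy two-choice cuckoo table (hypothetical names, for illustration only)."""

    def __init__(self, size=16, max_kicks=50, seed=0):
        self.size = size
        self.max_kicks = max_kicks
        # Two sub-tables laid out back to back, one bucket per hash function.
        self.slots = [None] * (2 * size)
        self.rng = random.Random(seed)

    def _candidates(self, key):
        # Assumed toy hash functions derived from the built-in hash().
        h = hash(key)
        return [h % self.size, self.size + (h // self.size) % self.size]

    def lookup(self, key):
        # Every candidate bucket has to be read.
        return any(self.slots[b] == key for b in self._candidates(key))

    def insert(self, key):
        """Return None on success, or the item left homeless after max_kicks."""
        for _ in range(self.max_kicks):
            cands = self._candidates(key)
            for b in cands:
                if self.slots[b] is None:
                    self.slots[b] = key
                    return None
            # No empty candidate: evict a random occupant and retry with it.
            b = self.rng.choice(cands)
            key, self.slots[b] = self.slots[b], key
        return key


table = CuckooTable()
table.insert(0)
table.insert(256)             # same two candidate buckets as key 0
homeless = table.insert(512)  # a third key for the same two buckets
print("gave up on:", homeless)
```

Without the `max_kicks` cutoff the last call would never return, which is the endless loop described above; the kicks spent before giving up are the wasted time.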
Multi-bucket lookup
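When the target platform has to handle a very large table with only a small, fast on-chip memory (for example the ASIC/FPGA/SoC-based packet processing devices), having to check multiple locations on every lookup becomes a major problem.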
Multiple copies of an item: place the item into all of its available candidate buckets at once, instead of picking one of them at random. The redundancy makes it clear which entry is the best one to replace on a collision, which speeds up insertion and also avoids endless loops. Keeping copies in all the available candidate buckets maintains flexibility and avoids entering a sub-optimal situation early: the optimal placement emerges naturally later on, when the other occupied buckets are given away on request to new items that turn out to be the better owners of these buckets in an overall optimal arrangement.
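During lookup, all buckets holding the same item should carry the same counter value, and this property can be used to rule out impossible cases. For example, if one of the candidate buckets has a counter value of 0, the item is definitely not in the table. Furthermore, because an item can always overwrite a redundant copy to settle down, if a lookup fails while any candidate bucket has a counter value larger than 1, we know the item was never inserted and can skip checking the stash. These observations let us avoid accessing some buckets and avoid unnecessary checks of the stash.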
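The multi-copy insertion and the counter-based pruning described above can be sketched as follows. This is a simplified reading of these notes, not the paper's reference implementation. Everything named here is an assumption: the class `McCuckooSketch`, the use of d = 3 sub-tables, the BLAKE2-based hash functions, the random-walk fallback, the unbounded stash, and the absence of deletion. The rule of reading only g - v + 1 buckets from a group of g candidates that share counter value v is a pigeonhole argument added for this sketch.

```python
import hashlib
import random


class McCuckooSketch:
    """Multi-copy cuckoo table with per-bucket copy counters (illustrative)."""

    def __init__(self, buckets_per_table=64, d=3, max_kicks=32, seed=0):
        self.m, self.d, self.max_kicks = buckets_per_table, d, max_kicks
        self.keys = [None] * (d * self.m)   # main table (slow, off-chip)
        self.counters = [0] * (d * self.m)  # copy counters (fast, on-chip)
        self.stash = []                     # unbounded here for simplicity
        self.offchip_reads = 0              # main-table reads done by lookup
        self.rng = random.Random(seed)

    def _candidates(self, key):
        # One candidate bucket per sub-table, so candidates never coincide.
        out = []
        for i in range(self.d):
            h = hashlib.blake2b(repr(key).encode(), digest_size=8,
                                salt=bytes([i])).digest()
            out.append(i * self.m + int.from_bytes(h, "big") % self.m)
        return out

    def _release_copy(self, b):
        # Bucket b is being taken over: its old owner loses one copy.
        victim = self.keys[b]
        for c in self._candidates(victim):
            if c != b and self.keys[c] == victim:
                self.counters[c] -= 1

    def insert(self, key):
        for attempt in range(self.max_kicks + 1):
            cands = self._candidates(key)
            empty = [b for b in cands if self.counters[b] == 0]
            if empty:  # put a copy into every empty candidate bucket
                for b in empty:
                    self.keys[b], self.counters[b] = key, len(empty)
                return
            b = max(cands, key=lambda c: self.counters[c])
            if self.counters[b] > 1:  # overwrite a redundant copy
                self._release_copy(b)
                self.keys[b], self.counters[b] = key, 1
                return
            if attempt == self.max_kicks:
                break
            # All candidates hold sole copies: kick one out and retry with it.
            b = self.rng.choice(cands)
            key, self.keys[b] = self.keys[b], key  # counter stays at 1
        self.stash.append(key)

    def lookup(self, key):
        cands = self._candidates(key)
        cnts = [self.counters[b] for b in cands]
        if 0 in cnts:  # an inserted item never leaves a candidate empty
            return False
        for v in set(cnts):
            group = [b for b in cands if self.counters[b] == v]
            if len(group) < v:  # v copies cannot fit in fewer than v buckets
                continue
            # Pigeonhole: any g - v + 1 buckets of the group include a copy.
            for b in group[: len(group) - v + 1]:
                self.offchip_reads += 1
                if self.keys[b] == key:
                    return True
        if max(cnts) > 1:  # item would have overwritten a redundant copy
            return False
        return key in self.stash


table = McCuckooSketch()
for k in range(150):
    table.insert(k)
print(all(table.lookup(k) for k in range(150)), table.offchip_reads)
```

Only `counters` is consulted before deciding which buckets of `keys` to read, which is why the counter array is the part worth keeping in fast memory. The early `return False` on a zero counter costs no main-table read at all.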
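McCuckoo is particularly suitable when the main table can only be kept in slower secondary storage; the three problems above can then be solved in a unified way. To get the most out of the counters, they should be kept in on-chip embedded memory, as a compact on-chip counter array. The operating logic is also simple.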