A survey of some algorithms for inverted indexes

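The paper below studies intersection algorithms specifically for the sorted sets that make up a search engine's inverted (posting) lists. Its divide-and-conquer approach resembles quicksort, and it is well worth reading: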

www.cs.ucr.edu/~stelo/cpm/cpm04/25_Baeza-yates.pdf
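
To make the quicksort analogy concrete, here is a minimal sketch of that style of intersection: take the median of the shorter list, binary-search for it in the longer list, and recurse on the two halves. The function names are hypothetical, the inputs are assumed to be sorted lists of distinct integers, and this is an illustration of the idea rather than the paper's exact procedure.

```python
from bisect import bisect_left


def intersect_sorted(a, b):
    """Intersect two sorted lists of distinct values, divide-and-conquer style."""
    out = []
    _intersect(a, 0, len(a), b, 0, len(b), out)
    return out


def _intersect(a, alo, ahi, b, blo, bhi, out):
    # Nothing to intersect if either range is empty.
    if alo >= ahi or blo >= bhi:
        return
    # Always take the pivot from the shorter range (swap roles if needed).
    if ahi - alo > bhi - blo:
        a, alo, ahi, b, blo, bhi = b, blo, bhi, a, alo, ahi
    mid = (alo + ahi) // 2
    pivot = a[mid]
    # Binary search for the pivot inside the other range.
    pos = bisect_left(b, pivot, blo, bhi)
    found = pos < bhi and b[pos] == pivot
    # Elements smaller than the pivot can only match on the left side.
    _intersect(a, alo, mid, b, blo, pos, out)
    if found:
        out.append(pivot)
    # Elements larger than the pivot can only match on the right side.
    _intersect(a, mid + 1, ahi, b, pos + 1 if found else pos, bhi, out)


if __name__ == "__main__":
    print(intersect_sorted([1, 3, 5, 7, 9], [3, 4, 5, 6, 7, 8]))
```

Because the left half is processed before the pivot and the right half after it, the result comes out in sorted order without a final sort.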
 

An algorithm for merging sorted sequences:

https://github.com/rklaehn/rklaehn.github.io/blob/master/_posts/2016-01-05-binarymerge.md

 

Collected resources: https://github.com/TechConf/CodeMash2016/blob/master/Great%20Galloping%20Cuckoos-%20Algorithms%20Faster%20than%20log(n)/index.html

Key information:

 ## Comparisons of Set Intersections
  
 <small>Excerpted from [Faster Adaptive Set Intersections for Text Searching](http://www.cs.toronto.edu/~tl/papers/wea06.pdf)</small>
  
 Algorithm | # of comparisons
 -----------|----------------:
 Sequential | 119479075
 Adaptive | 83326341
 Small Adaptive | 68706234
 Interpolation Sequential | 55275738
 Interpolation Adaptive | 58558408
 Interpolation Small Adaptive | 44525318
 Extrapolation Small Adaptive | 50018852
 Extrapolate Many Small Adaptive | 44087712
 Extrapolate Ahead Small Adaptive | 43930174
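
Comparison counts like those above can be collected by instrumenting the intersection routines. The sketch below is only an illustration: it compares a plain sequential merge with a simple two-list galloping (exponential search) intersection, not the paper's algorithms, and it will not reproduce the paper's numbers. The names `ComparisonCounter`, `sequential_intersect`, `gallop`, and `galloping_intersect` are hypothetical, and each three-way element comparison is counted once, which is an assumption about the counting convention.

```python
class ComparisonCounter:
    """Mutable counter shared by the routines below."""

    def __init__(self):
        self.n = 0


def sequential_intersect(a, b, counter):
    """Linear merge-style intersection of two sorted lists."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        counter.n += 1  # one three-way comparison of a[i] and b[j]
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def gallop(large, x, lo, counter):
    """Return the smallest p >= lo with large[p] >= x (len(large) if none)."""
    n = len(large)
    step = 1
    hi = lo
    # Exponential phase: double the step until we overshoot x or the end.
    while hi < n:
        counter.n += 1
        if large[hi] >= x:
            break
        lo = hi + 1
        hi += step
        step *= 2
    hi = min(hi, n)
    # Binary phase: the answer lies in [lo, hi].
    while lo < hi:
        mid = (lo + hi) // 2
        counter.n += 1
        if large[mid] < x:
            lo = mid + 1
        else:
            hi = mid
    return lo


def galloping_intersect(small, large, counter):
    """Intersect by galloping through `large` for each element of `small`."""
    out = []
    pos = 0
    for x in small:
        pos = gallop(large, x, pos, counter)
        if pos == len(large):
            break
        counter.n += 1  # equality check at the landing position
        if large[pos] == x:
            out.append(x)
            pos += 1
    return out


if __name__ == "__main__":
    small, large = [5, 500, 900], list(range(1000))
    seq, gal = ComparisonCounter(), ComparisonCounter()
    sequential_intersect(small, large, seq)
    galloping_intersect(small, large, gal)
    print("sequential:", seq.n, "galloping:", gal.n)
```

On inputs where one list is much shorter than the other, the galloping variant skips long runs of the larger list, which is where adaptive methods save comparisons relative to a sequential scan.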

 

## Resources: Sets
 - [A Fast Set Intersection Algorithm for Sorted Sequences](http://www.cs.ucr.edu/~stelo/cpm/cpm04/25_Baeza-yates.pdf)
 - [Experimental Analysis of a Fast Intersection Algorithm for Sorted Sequences](https://cs.uwaterloo.ca/~ajsaling/papers/paper-spire.pdf)
 - [Experimental Comparison of Set Intersection Algorithms for Inverted Indexing](http://ceur-ws.org/Vol-1003/58.pdf)
 - [Fast Set Intersection in Memory](http://research.microsoft.com/pubs/142850/p255-dingkoenig.pdf)
 - [Faster Adaptive Set Intersections for Text Searching](http://www.cs.toronto.edu/~tl/papers/wea06.pdf)
 - [Faster Set Intersection with SIMD instructions by Reducing Branch Mispredictions](http://www.vldb.org/pvldb/vol8/p293-inoue.pdf)
 - [SIMD Compression and the Intersection of Sorted Integers](http://arxiv.org/abs/1401.6399)

 

https://github.com/lemire/SIMDCompressionAndIntersection 

A C++ library to compress and intersect sorted lists of integers using SIMD instructions
Some of the materials mentioned there:

Documentation

This work has also inspired other work such as...

 
Frequently mentioned:
https://github.com/Randl/CS/tree/master/Hwang-Lin
 
 

Reposted from: https://www.cnblogs.com/bonelee/p/6591201.html
