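Key points of the paper:
- Proposes several algorithms for answering exact reverse skyline queries (RSQ) on multidimensional datasets
- Points out the drawback of RSSA: it visits the same nodes multiple times, which incurs redundant I/O and CPU cost
- Uses an index-reuse mechanism to answer the query in a single traversal of the R-tree, significantly reducing I/O cost
- A global pruning heuristic
- Outperforms RSSA by several orders of magnitude
State of the art
Variants of the skyline query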
In addition to conventional skyline operator, numerous skyline
query variants have also been studied in the literature. Examples
include constrained skyline query (Chen, Cui, & Lu, 2011; Dellis,
Vlachou, Vladimirskiy, Seeger, & Theodoridis, 2006; Papadias
et al., 2005), dynamic skyline query (Papadias et al., 2005;
Sacharidis, Bouros, & Sellis, 2008), k-Skyband query (Papadias
et al., 2005), SkyCluster query (Huang, Xiang, Zhang, & Liu, 2011),
subspace skyline query (Pei et al., 2006; Tao, Xiao, & Pei, 2011),
metric skyline query (Chen & Lian, 2009; Fuhry, Jin, & Zhang, 2009),
probabilistic skyline query (Pei, Jiang, Lin, & Yuan, 2007; Zhang,
Lin, Zhang, Wang, & Yu, 2009), representative skyline query (Lin,
Yuan, Zhang, & Zhang, 2007; Tao, Ding, Lin, & Pei, 2009), stochastic
skyline query (Lin, Zhang, Zhang, & Cheema, 2011), parallel skyline
query (Gao, Chen, Chen, & Chen, 2006; Kohler, Yang, & Zhou, 2011;
Vlachou, Doulkeridis, & Kotidis, 2008; Wu et al., 2006), and skyline
retrieval in distributed environments (such as P2P systems and web
information systems) (Hose & Vlachou, 2012), to name but a few.
Illustration of BBRS and RSSA
Core part
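The paper presents two algorithms that obtain the result with only a single traversal of the R-tree, which greatly reduces the I/O cost.
The first, FRRS, mainly keeps the pruned points so that the R-tree does not have to be traversed a second time. However, it is very impractical for large datasets, so it is of limited value.
The second algorithm, GSRS, is the useful one.
It mainly reduces the amount of data that has to be examined during a window query: only the global skyline and the global 1-skyline need to be checked. The rest is similar to RSSA.
It defines a new notion, the global 1-skyline: the points that are dominated by only one global skyline point. Lemma 3.1 proves that this window-query method is correct.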
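The two sets can be illustrated with a brute-force sketch. It is not the paper's R-tree based procedure. It assumes the usual definition of global dominance with respect to a query point q (same side of q in every dimension, no farther from q in any dimension, strictly closer in at least one), and it reads "global 1-skyline" as the points with exactly one global dominator, which is then necessarily a global skyline point. The paper's exact definition may differ; all function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def globally_dominates(a: Point, b: Point, q: Point) -> bool:
    """Return True if a globally dominates b w.r.t. q (assumed definition)."""
    # Both points must lie strictly on the same side of q in every dimension.
    if not all((x - c) * (y - c) > 0 for x, y, c in zip(a, b, q)):
        return False
    # a is no farther from q in any dimension and strictly closer in one.
    no_farther = all(abs(x - c) <= abs(y - c) for x, y, c in zip(a, b, q))
    closer = any(abs(x - c) < abs(y - c) for x, y, c in zip(a, b, q))
    return no_farther and closer


def dominators(p: Point, points: List[Point], q: Point) -> List[Point]:
    """All points in the dataset that globally dominate p."""
    return [o for o in points if globally_dominates(o, p, q)]


def global_skyline(points: List[Point], q: Point) -> List[Point]:
    """Points that no other point globally dominates."""
    return [p for p in points if not dominators(p, points, q)]


def global_1_skyline(points: List[Point], q: Point) -> List[Point]:
    """Points with exactly one global dominator (assumed reading of the note)."""
    return [p for p in points if len(dominators(p, points, q)) == 1]


# Toy data, made up for illustration only.
points = [(8, 8), (9, 9), (10, 10), (2, 3), (6, 1)]
q = (5, 5)
print(global_skyline(points, q))    # [(8, 8), (2, 3), (6, 1)]
print(global_1_skyline(points, q))  # [(9, 9)]
```

In this toy dataset (10, 10) is left out of both sets because it is dominated by both (8, 8) and (9, 9).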
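GSRS algorithm flow:
Step 1: compute the global skyline and the global 1-skyline. Step 2: prune using the dynamic skyline. Step 3: as in RSSA, points that can be decided exactly are still classified as reverse skyline points or not directly from DADR and DDR; points that cannot be decided exactly go through a window query (performed the GSRS way).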
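The window-query step can be sketched in brute-force form as follows. This is only an illustration under assumptions, not the GSRS implementation: it skips the R-tree, the dynamic-skyline pruning, and the DADR/DDR shortcuts, and it assumes distinct points. A candidate p is taken to be a reverse skyline point when no other point o satisfies |o_i - p_i| <= |q_i - p_i| in every dimension with strict inequality in at least one. The names `window_hit` and `reverse_skyline` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def window_hit(p: Point, q: Point, search_set: List[Point]) -> bool:
    """True if some point other than p is at least as close to p as q is
    in every dimension and strictly closer in at least one."""
    for o in search_set:
        if o == p:  # assumes distinct points, so this only skips p itself
            continue
        no_farther = all(abs(x - c) <= abs(y - c) for x, y, c in zip(o, q, p))
        closer = any(abs(x - c) < abs(y - c) for x, y, c in zip(o, q, p))
        if no_farther and closer:
            return True
    return False


def reverse_skyline(candidates: List[Point], search_set: List[Point],
                    q: Point) -> List[Point]:
    """Keep the candidates whose window contains no disqualifying point."""
    return [p for p in candidates if not window_hit(p, q, search_set)]


# Toy data, made up for illustration only.
points = [(8, 8), (9, 9), (10, 10), (2, 3), (6, 1)]
q = (5, 5)

# Baseline: every point is a candidate and is checked against the full dataset.
baseline = reverse_skyline(points, points, q)

# Reduced variant: candidates are the global skyline, and windows are checked
# only against global skyline plus global 1-skyline (sets computed by hand
# for this toy data under the assumed definitions).
gs = [(8, 8), (2, 3), (6, 1)]
g1s = [(9, 9)]
reduced = reverse_skyline(gs, gs + g1s, q)

print(baseline)  # [(2, 3), (6, 1)]
print(reduced)   # [(2, 3), (6, 1)]
```

On this example the reduced search set gives the same answer as the full scan: (8, 8) is rejected because the global 1-skyline point (9, 9) falls inside its window, so (10, 10) never has to be examined.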