Vol. 34, No. 10
October 2011
CHINESE JOURNAL OF COMPUTERS (计算机学报)
Result Aggregation for Keyword Search over Hierarchical Data
HU Hao  HE Zhen-Ying
(School of Computer Science and Technology, Fudan University, Shanghai 200433)
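Abstract Thanks to advantages such as ease of use, keyword search over databases lets users obtain the
information they need without writing complex SQL statements. However, most existing search methods
focus on using join operations to obtain joining trees of tuples that contain all the keywords, and
neglect the integration of information in the search results, which to some extent impairs users'
judgment of those results. This paper proposes and implements an improved keyword search system
framework that, guided by an attribute with a hierarchical structure, aggregates the resulting joining
trees of tuples; by finding the lowest-hierarchy minimum coverage aggregates, it returns more closely
related tuples to the user as more relevant search results. The paper also presents a basic aggregation
algorithm and an improvement to it that reduces the system's response time. In addition, to improve the
user experience, the paper defines the problem of summarizing search results and gives an algorithm for
it, so that users can understand the results as fully as possible. Experimental data show that the
proposed methods accomplish the aggregation and summarization of search results effectively, with high
efficiency and low computational cost.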
Keywords keyword search; aggregation; hierarchical structure; summarization algorithm
CLC number TP311  DOI: 10.3724/SP.J.1016.2011.01986
Aggregate Keyword Queries on Hierarchy Relational Databases
HU Hao HE Zhen-Ying
(School of Computer Science, Fudan University, Shanghai 200433)
Abstract Keyword search (KWS) is well accepted as a proven, user-friendly way to retrieve
information, and it has recently been applied successfully to relational databases. The technique
allows users to find pieces of information without having to compose complicated SQL queries.
However, almost all existing approaches focus on finding joined tuples that match a set of
keywords and return the results as joining networks of tuples. To return more relevant
information to the user, this paper formulates an extended version of existing systems that answers
aggregate keyword queries over hierarchical relational databases, in which the values of a specific
attribute are organized in a hierarchical structure. This version retrieves information in the form of
MaxLMC (Max-Lowest hierarchy Minimum Coverage aggregate, which consists of tuples that are more
similar and closer to each other) under the guidance of the above hierarchical structure. A naïve
algorithm is proposed to obtain MaxLMC, and an enhancement of it is designed to reduce the system's
response time. Meanwhile, recognizing that the number of returned answers might be extremely
large in practice, we define and study the problem of effective exploration of large sets of a