
Bit-sliced signature files for very large text databases on a parallel machine architecture

George Panagopoulos

Christos Faloutsos

Conference paper

First Online: 03 June 2005

Part of the Lecture Notes in Computer Science book series (LNCS, volume 779)

Abstract

Free text retrieval is an important problem that can benefit significantly from a parallel architecture. Signature methods have been proposed to answer text retrieval queries on parallel machines [Sta88, LF92], under the assumption that main memory is large enough to hold the entire signature file. We propose the use of a Parallel Bit-Sliced Signature File method on a SIMD machine architecture for the case where the signature file exceeds the available memory. We propose that not all the bit slices need to be examined; instead, we use a partial-fetch slice-swapping algorithm. With this method, performance degrades gracefully as the database size grows. We provide formulae for the optimal number of signature slices to fetch and match against the query signature. Arithmetic examples show that our method can handle a 128 GB database with a 2-second response time on a machine with the characteristics of the Connection Machine.
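To make the idea of matching only some of the bit slices concrete, here is a minimal Python sketch of a bit-sliced signature file using superimposed coding. It is an illustration under stated assumptions, not the authors' algorithm: the signature width `F`, the bits per word `M`, the SHA-256-based `word_bits` hash, and the `max_slices` parameter are all hypothetical, and the slices are held in memory as Python integers rather than swapped in from disk on a SIMD machine. The abstract's formulae for the optimal number of slices are not reproduced here.

```python
import hashlib

F = 64  # signature width in bits (assumed value, for illustration only)
M = 3   # bit positions derived per word (assumed value)


def word_bits(word):
    """Return the bit positions (at most M) that a word sets in an F-bit signature."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    # Use M two-byte chunks of the digest as positions; collisions are allowed.
    return {int.from_bytes(digest[2 * i:2 * i + 2], "big") % F for i in range(M)}


def build_slices(docs):
    """Build the bit-sliced layout.

    slices[j] is an integer whose bit d is set iff the signature of
    document d has bit j set.
    """
    slices = [0] * F
    for d, text in enumerate(docs):
        for w in text.split():
            for j in word_bits(w):
                slices[j] |= 1 << d
    return slices


def candidates(slices, n_docs, query_words, max_slices=None):
    """Return ids of documents that may contain all query words.

    Only slices where the query signature has a 1 are examined. If
    max_slices is given, only that many of them are ANDed together
    (a partial fetch), trading extra false drops for fewer slice reads.
    """
    positions = sorted(set().union(*(word_bits(w) for w in query_words)))
    if max_slices is not None:
        positions = positions[:max_slices]
    result = (1 << n_docs) - 1  # start with every document as a candidate
    for j in positions:
        result &= slices[j]
    return [d for d in range(n_docs) if (result >> d) & 1]


docs = [
    "parallel text retrieval",
    "signature files on disk",
    "parallel signature files",
]
slices = build_slices(docs)
full = candidates(slices, len(docs), ["parallel", "signature"])
partial = candidates(slices, len(docs), ["parallel", "signature"], max_slices=2)
```

Two properties hold in this sketch regardless of the hash values: a document that really contains every query word is never dismissed, and fetching fewer slices can only enlarge the candidate set, so `partial` is always a superset of `full`. Candidates still have to be checked against the actual text to discard false drops.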

Keywords: Retrieval Algorithm, Database Size, Conjunctive Query, Text Retrieval, Virtual Processor

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

This research was sponsored partially by the Institute for Advanced Computer Studies (UMIACS), by the National Science Foundation under the grants IRI-8719458, IRI-8958546 and IRI-9205273, by a donation by EMPRESS Software Inc., and by a donation by Thinking Machines Inc.

References

[CF84]

Stavros Christodoulakis and Christos Faloutsos. Design Considerations for a Message File Server. IEEE Transactions on Software Engineering, 10(2):201–210, March 1984.

[Fal90]

Christos Faloutsos. Signature-Based Text Retrieval Methods: A Survey. IEEE Data Engineering, pages 25–32, March 1990.

[FC87]

Christos Faloutsos and Stavros Christodoulakis. Description and Performance Analysis of Signature File Methods for Office Filing. ACM Transactions on Office Information Systems, 5(3):237–257, July 1987.

[FC88]

Christos Faloutsos and Raphael Chan. Fast Text Access Methods for Optical Disks: Designs and Performance Comparison. In Proceedings of the 14th International Conference on Very Large Databases, pages 280–293, Long Beach, California, August 1988.

[FJ91]

Christos Faloutsos and H. V. Jagadish. Hybrid Index Organizations for Text Databases. Technical Report UMIACS-TR-91-33 and CS-TR-2621, Department of Computer Science, University of Maryland, March 1991.

[Has81]

R. Haskin. Special-Purpose Processors for Text Retrieval. Database Engineering, 4(1):16–29, September 1981.

[LF92]

Zheng Lin and Christos Faloutsos. Frame Sliced Signature Files. IEEE Transactions on Knowledge and Data Engineering, 4(3):158–180, June 1992. Also available as UMD CS-TR-2146 and UMIACS-TR-88-88.

[Lin92]

Zheng Lin. CAT: An Execution Model for Concurrent Full Text Search. In PDIS, 1992.

[LL89]

D. L. Lee and C. W. Leng. Partitioned Signature File: Designs and Performance Evaluation. ACM Transactions on Office Information Systems, 7(2):158–180, April 1989.

[Pan92]

George Panagopoulos. Bit-Sliced Signature Files for Very Large Databases on a Parallel Machine Architecture. Technical Report CSC-809, Department of Computer Science, University of Maryland, April 1992.

[SD83]

Ron Sacks-Davis. Two Level Superimposed Coding Scheme for Partial Match Retrieval. Information Systems, 8(4):273–280, 1983.

[SM83]

G. Salton and M. J. McGill. Introduction to Modern Information Retrieval. McGraw-Hill, 1983.

[Sta88]

Craig Stanfill. Parallel Computing for Information Retrieval: Recent Developments. Technical Report DR88-1, Thinking Machines Corporation, Cambridge, Mass., January 1988.

[Sti60]

Simon Stiassny. Mathematical Analysis of Various Superimposed Coding Methods. American Documentation, 11(2):155–169, February 1960.

[Sto87]

Harold S. Stone. Parallel Querying of Large Databases: A Case Study. IEEE Computer, 20(10):11–21, October 1987.

[TC83]

D. Tsichritzis and S. Christodoulakis. Message Files. ACM Transactions on Office Information Systems, 1(1):88–98, January 1983.

[Thi89]

Thinking Machines Corporation, Cambridge, Mass. Parallel Instruction Set, Version 5.2, October 1989.

[Zip49]

G. K. Zipf. Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology. Addison-Wesley, Cambridge, MA, 1949.

Copyright information

© Springer-Verlag Berlin Heidelberg 1994

Authors and Affiliations

George Panagopoulos (1)

Christos Faloutsos (1)

1. Department of Computer Science and Institute for Systems Research (ISR), University of Maryland, College Park
