Improved techniques for processing queries in full-text systems
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 10th annual international ACM SIGIR conference on Research and development in information retrieval
New Orleans, Louisiana, United States
Pages: 306 - 315  
Year of Publication: 1987
ISBN:0-89791-232-2
Authors
Y. Choueka  Inst. for Information Retrieval and Computational Linguistics (IRCOL) -- The Responsa Project and Department of Mathematics and Computer Science, Bar-Ilan University, Ramat Gan, Israel; on sabbatical leave at Bell Communications Research, Morristown, New Jersey, USA
A. Fraenkel  Department of Applied Mathematics, The Weizmann Institute of Science, Rehovot 76100, Israel
S. Klein  Department of Applied Mathematics, The Weizmann Institute of Science, Rehovot 76100, Israel
E. Segal  Inst. for Information Retrieval and Computational Linguistics (IRCOL) -- The Responsa Project
Sponsor
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/42005.42039

ABSTRACT

In static full-text retrieval systems, which accommodate metrical as well as Boolean operators, the traditional approach to query processing uses a “concordance”, from which large sets of coordinates are retrieved and then merged and/or collated. Alternatively, in a system with l documents, the concordance can be replaced by a set of bit-maps of fixed length l, one constructed for each distinct word of the database and serving as an occurrence map. We propose to combine the concordance and bit-map approaches, and show how this can speed up the processing of queries: fast ANDing and ORing of the maps in a preprocessing stage leads to large I/O savings in collating the coordinates of keywords needed to satisfy the metrical and Boolean constraints. Moreover, the bit-maps give partial information on the distribution of the coordinates of the keywords, which can be used when queries must be processed in stages because of their complexity and the sizes of the sets of coordinates involved. The new techniques are partially implemented at the Responsa Retrieval Project.
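
The combination described in the abstract can be illustrated with a small in-memory sketch. It is not the authors' implementation: the function names (build_index, within), the use of Python integers as bit-maps, and the (document, word position) form of a coordinate are all assumptions made for illustration. The sketch ANDs the two occurrence maps first and collates concordance coordinates only for documents that survive that filter. In a disk-based system the saving would come from coordinates that are never read; here the filter simply skips them.

```python
from collections import defaultdict


def build_index(docs):
    """Build a toy concordance and per-word bit-maps (hypothetical helper).

    concordance: word -> list of (doc_id, word_position) coordinates
    bitmaps:     word -> int whose bit d is set if the word occurs in doc d
    """
    concordance = defaultdict(list)
    bitmaps = defaultdict(int)
    for doc_id, text in enumerate(docs):
        for pos, word in enumerate(text.lower().split()):
            concordance[word].append((doc_id, pos))
            bitmaps[word] |= 1 << doc_id
    return concordance, bitmaps


def within(concordance, bitmaps, a, b, max_dist):
    """Return ids of documents where a and b occur at most max_dist words apart."""
    # Preprocessing stage: AND the occurrence maps to find candidate documents.
    candidates = bitmaps.get(a, 0) & bitmaps.get(b, 0)
    if candidates == 0:
        return []  # no coordinates need to be collated at all

    # Collation stage: look only at coordinates inside candidate documents.
    positions_a = defaultdict(list)
    for doc_id, pos in concordance[a]:
        if (candidates >> doc_id) & 1:
            positions_a[doc_id].append(pos)

    hits = set()
    for doc_id, pos in concordance[b]:
        if (candidates >> doc_id) & 1:
            if any(abs(pos - q) <= max_dist for q in positions_a[doc_id]):
                hits.add(doc_id)
    return sorted(hits)


docs = [
    "the quick brown fox",
    "quick thinking saves the fox",
    "brown bears sleep",
]
concordance, bitmaps = build_index(docs)
print(within(concordance, bitmaps, "quick", "fox", 2))
```

A Boolean OR of keywords could be handled the same way by ORing the maps before the collation stage. The metrical test itself still needs the coordinates, which is why the bit-maps supplement the concordance here rather than replace it.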


