Computing Reviews, the leading online review service for computing literature.

Fourier Domain Scoring: A Novel Document Ranking Method
Park L., Ramamohanarao K., Palaniswami M. IEEE Transactions on Knowledge and Data Engineering 16(5): 529-539, 2004. Type: Article

Date Reviewed: Oct 5 2004

This paper presents a remarkable improvement on the use of singular value decomposition (SVD) for latent semantic indexing (LSI) of a large corpus of documents. SVD is already almost too good to be true, and perhaps even an example of Arthur C. Clarke’s Third Law: “any sufficiently advanced technology is indistinguishable from magic.” Importing some magical Fourier technology from differential equations for Fourier domain scoring (FDS) makes it yet more mysterious, and more effective as well.

The authors propose applying their new technology to search and retrieval on the World Wide Web (WWW), which now comprises more than a billion documents. It has been said that the problem with the WWW is that too many Unix gurus were involved in its development, and not enough librarians. Today’s best search engines can scarcely find one-third of these documents, even if you have a pretty good idea of what you are searching for. LSI, plus this novel FDS document ranking system, will be of great assistance if you know what the document is about, even if you do not know its title, author, or provenance. It is now possible to meet the challenge the White Knight gave Alice: “you didn’t ask for the song, you asked for the name of the song.”

The problem the authors set out to solve is that, in vector space models, once documents are converted into document vectors, the positions of the terms, which represent the flow of the document, are lost, and thus spatial information is no longer available to the searcher. FDS retains this spatial information and uses it to rank documents. Where other vector space similarity measures (for example, the cosine of the angle between two document vectors) store only the frequency count of a term per document, FDS stores a term signal, which describes how the term is spread throughout the document. This information is exploited by computing and comparing the magnitude and phase of the spectrum across the term signals for different documents.

The paper is well written, gives the mathematical basis for the method in sufficient detail, and presents the results of two experiments on large document databases. This is a very nice piece of work.
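The term-signal idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of four bins, and the two toy documents are assumptions made for illustration, a plain discrete Fourier transform stands in for an FFT, and the paper's weighting and scoring steps are not reproduced. The sketch only shows that two documents with the same term count can have different term signals, and that the difference appears in the phase of the spectrum.

```python
import cmath
import math


def term_signal(tokens, term, bins):
    """Count occurrences of `term` in each of `bins` equal-width segments.

    Hypothetical helper: the bin count is an illustrative assumption.
    """
    signal = [0] * bins
    for position, token in enumerate(tokens):
        if token == term:
            signal[position * bins // len(tokens)] += 1
    return signal


def spectrum(signal):
    """Plain DFT of a term signal, returned as (magnitude, phase) pairs."""
    n = len(signal)
    pairs = []
    for k in range(n):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        pairs.append((abs(coeff), cmath.phase(coeff)))
    return pairs


# Two toy documents: same words, same counts, different term positions.
doc_a = "fourier fourier methods rank documents by term position".split()
doc_b = "methods rank documents by term position fourier fourier".split()

sig_a = term_signal(doc_a, "fourier", bins=4)  # term occurs at the start
sig_b = term_signal(doc_b, "fourier", bins=4)  # term occurs at the end

spec_a = spectrum(sig_a)
spec_b = spectrum(sig_b)

# A plain frequency count cannot tell the documents apart ...
print(sum(sig_a) == sum(sig_b))
# ... but the term signals and the phases of their spectra can.
print(sig_a, sig_b)
print([round(phase, 3) for _, phase in spec_a])
print([round(phase, 3) for _, phase in spec_b])
```

In this toy case the magnitudes of the two spectra coincide, because the term occurs in a single burst in each document, while the phases differ according to where the burst falls. A ranking method that compares both magnitude and phase can therefore separate documents that a frequency count treats as identical.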

Reviewer: P. C. Patton	Review #: CR130230

Information Search And Retrieval (H.3.3)

Fast Fourier Transforms (FFT) (G.1.2 ...)

Selection Process (H.3.3 ...)

Signal Processing (I.5.4 ...)

Similarity Measures (I.5.3 ...)

Text Analysis (I.2.7 ...)
