Computing Reviews
Fourier Domain Scoring: A Novel Document Ranking Method
Park L., Ramamohanarao K., Palaniswami M. IEEE Transactions on Knowledge and Data Engineering 16(5): 529-539, 2004. Type: Article
Date Reviewed: Oct 5 2004

This paper presents a remarkable improvement on the use of singular value decomposition (SVD) for the latent semantic indexing of a large corpus of documents. SVD is already almost too good to be true, and perhaps even an example of Arthur C. Clarke’s Third Law: “any sufficiently advanced technology is indistinguishable from magic.” Importing some magical Fourier technology from differential equations for Fourier domain scoring (FDS) makes it yet more mysterious, and more effective as well.

The authors propose applying their new technology to search and retrieval on the World Wide Web (WWW), which now comprises more than a billion documents. It has been said that the problem with the WWW is that too many Unix gurus were involved in its development, and not enough librarians. Today’s best search engines can scarcely find one-third of these documents, even if you have a pretty good idea of what you are searching for. Latent semantic indexing (LSI), plus this novel FDS document ranking system, will be of great assistance if you know what the document is about, even if you do not know its title, author, or provenance. It is now possible to meet the challenge the White Knight gave Alice: “you didn’t ask for the song, you asked for the name of the song.”

The problem the authors set out to solve with vector space models is that, once documents are converted into document vectors, the position of the terms, which represents the flow of the document, is lost, and thus spatial information is no longer available to the searcher. FDS is able to retain document spatial information, and use it to rank documents. The difference between FDS and other vector space similarity measures (for example, cosine of the angle between two document vectors) is that, rather than storing only the frequency count of a term per document, FDS stores a term signal, which tells the searcher how the term is spread throughout the document. This information is provided to the searcher by computing and comparing the magnitude and phase of the spectrum across the term signals for different documents.
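
The idea of a term signal can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`term_signal`, `dft`, `spectrum`), the equal-width binning, and the two toy documents are all assumptions made here for illustration, and the way the paper combines magnitudes and phases into a final document score is not reproduced. The sketch only shows that two documents with the same term count can yield different spectra.

```python
import cmath
from typing import List, Tuple


def term_signal(tokens: List[str], term: str, bins: int = 4) -> List[int]:
    """Count occurrences of `term` in each of `bins` equal-width document segments.

    Hypothetical helper: equal-width binning is an assumption of this sketch.
    """
    signal = [0] * bins
    n = len(tokens)
    for pos, tok in enumerate(tokens):
        if tok == term:
            signal[pos * bins // n] += 1
    return signal


def dft(signal: List[int]) -> List[complex]:
    """Naive discrete Fourier transform (O(n^2)), adequate for short signals."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def spectrum(signal: List[int]) -> List[Tuple[float, float]]:
    """Return a (magnitude, phase) pair for each frequency component."""
    return [(abs(c), cmath.phase(c)) for c in dft(signal)]


# Two toy documents with the same frequency count for "fourier" (two each),
# but with the term placed at the edges in one and in the middle in the other.
doc_a = "fourier x x x x x x fourier".split()
doc_b = "x x x fourier fourier x x x".split()

sig_a = term_signal(doc_a, "fourier")  # [1, 0, 0, 1]
sig_b = term_signal(doc_b, "fourier")  # [0, 1, 1, 0]

for name, sig in (("doc_a", sig_a), ("doc_b", sig_b)):
    print(name, sig, spectrum(sig))
```

A plain frequency count treats the two toy documents identically, since each contains the term twice. Their term signals differ, and in this example the first nonzero frequency component has the same magnitude in both but a different phase, which is the kind of positional information the review describes FDS as retaining.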

The paper is well written, gives the mathematical basis for the method in sufficient detail, and presents the results of two experiments on large document databases. This is a very nice piece of work.

Reviewer: P. C. Patton
Review #: CR130230
Information Search And Retrieval (H.3.3)
Fast Fourier Transforms (FFT) (G.1.2 ...)
Selection Process (H.3.3 ...)
Signal Processing (I.5.4 ...)
Similarity Measures (I.5.3 ...)
Text Analysis (I.2.7 ...)