This substantial paper will be very useful for researchers working in automated information retrieval (IR), but not for a general audience. It describes, in great detail, techniques for both monolingual IR in English and Japanese, and Japanese-English cross-language IR (Japanese queries, English documents).
The paper reports on retrieval experiments in the context of NTCIR-4, a Japanese retrieval testing program run by the National Institute of Informatics (NII), much like the Text Retrieval Conference (TREC) run by the National Institute of Standards and Technology (NIST) in the US. It describes retrieval systems developed in a collaboration between Justsystem Corporation (JSC) and Clairvoyance Corporation (CC).
The system uses natural language processing (NLP) techniques, including noun-phrase detection with language-specific extensions, and rich translation resources. The paper explores issues of noun-phrase weighting, translation weighting, pseudo-relevance feedback, and term-weight merging. The experiments are carefully set up, exploring the interactions of variables through analysis of variance (ANOVA) and reporting statistical significance. A particularly welcome feature is the error analysis, which uses a typology of errors to gain insight into the contribution of various system components to the end result. The results are presented in many detailed tables.
The system, testing procedures, and results are all well explained. There are no earth-shattering results here, but that is true of most papers reporting on IR experiments: too many variables influence retrieval performance, results are often specific to a given context, and grand generalizations are hard to come by. What sets this paper apart is the clear framework used for testing various configurations of system components, and the carefully worked out testing methodology, especially the typology of errors used in the failure analysis.