Computing Reviews

Social question answering: textual, user, and network features for best answer prediction
Molino P., Aiello L., Lops P. ACM Transactions on Information Systems 35(1): 1-40, 2016. Type: Article
Date Reviewed: 11/11/16

This paper combines a good review of the main techniques used in community question-answering (CQA) systems with an extensive and systematic study that compares the effect of using different sets of features for the task of predicting the best answers for questions in a large-scale, general-purpose CQA system.

As the authors say, the task of matching questions to their best answers has been tackled for more than a decade. However, the availability of large corpora and new developments such as the rise of distributional semantic methods give many opportunities for new results in this field. This paper takes an application-oriented approach, using a large dataset from Yahoo Answers to provide an extensive study on features and algorithms best suited for finding the best answers to a question.

It compares the results obtained by multiple combinations of features, belonging to five distinct families (text quality, linguistic similarity, distributional semantics, user characteristics, and network structure) in a supervised learning framework, to the results obtained by four baseline methods from previous works. The authors’ use of distributional semantic features is novel for this problem. Using this approach, they are able to outperform state-of-the-art methods by up to 26 percent.
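To make the scale of such a comparison concrete, the following is a minimal sketch of how combinations of the five feature families could be enumerated before training one model per combination. It is an illustration only: the review does not say which combinations the authors actually evaluated, and the identifiers `FAMILIES` and `family_combinations` are hypothetical names chosen for this example.

```python
from itertools import combinations

# The five feature families named in the review; the snake_case
# identifiers are assumptions made for this illustration.
FAMILIES = (
    "text_quality",
    "linguistic_similarity",
    "distributional_semantics",
    "user_characteristics",
    "network_structure",
)


def family_combinations(families=FAMILIES):
    """Yield every non-empty subset of feature families, smallest first."""
    for size in range(1, len(families) + 1):
        for subset in combinations(families, size):
            yield subset


if __name__ == "__main__":
    # Each subset would correspond to one supervised model to train and score.
    for subset in family_combinations():
        print(" + ".join(subset))
```

An exhaustive grid over five families is small enough to run in full, which is one reason a systematic study of this kind is feasible; a larger number of families would call for a greedy or sampled search instead.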

The work also draws interesting conclusions about the impact of different features employed, noting that textual ones were the most important. In this family, text quality and distributional semantics were generally best in comparison to linguistic similarity features. This is an interesting finding, as the latter are more expensive to compute than the first two.

I believe the work achieves the goal of helping to show how far we can currently go on the task of best answer prediction, including the application of recent developments in distributional semantic methods for the task. I also believe that the same kind of extensive and systematic study can be done for other linguistic tasks. Therefore, the work described in the paper has the potential to help many new developments in the field.

Reviewer: Sergio Queiroz. Review #: CR144919 (1702-0161)
