Computing Reviews

On the value of outlier elimination on software effort estimation research
Seo Y., Bae D. Empirical Software Engineering 18(4): 659-698, 2013. Type: Article
Date Reviewed: 01/15/14

To a large extent, the success or failure of software development projects can be predicted by the accuracy of the effort estimation. One of the crucial factors that affect estimation accuracy is the presence of outliers, data points that appear to be inconsistent with the rest of the dataset. Elimination of outliers can improve data quality. The authors of this paper performed a systematic analysis using a general experimental procedure to evaluate the extent to which elimination of outliers leads to higher accuracy of the effort estimation.

Five outlier elimination methods were used, including least trimmed squares and k-means clustering. Two of the most popular software effort estimation methods were used: least squares regression and estimation by analogy. Empirical experiments were conducted with five industrial datasets. Accuracy was evaluated with several criteria, including the mean magnitude of relative error and the median magnitude of relative error.
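For readers unfamiliar with the two accuracy criteria named above, the following minimal Python sketch shows how they are commonly computed. It assumes the usual definition of the magnitude of relative error for a project, |actual - predicted| / actual. The function names and the effort values are hypothetical, invented for illustration only, and are not taken from the paper.

```python
from statistics import mean, median


def magnitudes_of_relative_error(actual, predicted):
    """Return |actual_i - predicted_i| / actual_i for each project."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return [abs(a - p) / a for a, p in zip(actual, predicted)]


def mmre(actual, predicted):
    """Mean magnitude of relative error."""
    return mean(magnitudes_of_relative_error(actual, predicted))


def mdmre(actual, predicted):
    """Median magnitude of relative error."""
    return median(magnitudes_of_relative_error(actual, predicted))


# Hypothetical effort values (person-hours), not data from the paper.
actual_effort = [100.0, 200.0, 400.0, 50.0]
predicted_effort = [110.0, 150.0, 400.0, 100.0]

print("MMRE: ", mmre(actual_effort, predicted_effort))
print("MdMRE:", mdmre(actual_effort, predicted_effort))
```

In this made-up example, the single badly estimated project pulls the mean well above the median. This illustrates the general point that the mean is more sensitive to extreme values than the median.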

The experimental results are not consistent. For several of the datasets, elimination of the outliers did not increase accuracy, and in some cases, actually decreased it. Improvements were observed on the Stock_NDV dataset only. The authors plan to continue their investigation with more focus on the types of outliers.

The intended audience includes academics and practitioners working on software effort estimation methods.

Readers interested in this topic can find additional information in other publications [1,2].


1) Trendowicz, A.; Münch, J.; Jeffery, R. Software engineering techniques. Springer, 2011.


2) Wen, J.; Li, S.; Lin, Z.; Hu, Y.; Huang, C. Systematic literature review of machine learning based software development effort estimation models. Information and Software Technology 54, 1 (2012), 41–59.

Reviewer:  Alexei Botchkarev Review #: CR141894 (1404-0280)
