The impact of parameter tuning on software effort estimation using learning machines

Liyan SONG, Leandro L. MINKU, Xin YAO

Research output: Book Chapters | Papers in Conference Proceedings › Conference paper (refereed) › Research › peer-review

73 Citations (Scopus)

Abstract

Background: The use of machine learning approaches for software effort estimation (SEE) has been studied for more than a decade. Most studies compared different learning machines on a number of data sets. However, most learning machines have more than one parameter that needs to be tuned, and it is unknown to what extent parameter settings may affect their performance in SEE. Many studies appear to assume implicitly that parameter settings do not change the outcomes significantly.

Aims: To investigate to what extent parameter settings affect the performance of learning machines in SEE, and which learning machines are more sensitive to their parameters.

Method: Considering an online learning scenario where learning machines are updated with new projects as they become available, systematic experiments were performed using five learning machines under several different parameter settings on three data sets.

Results: While some learning machines, such as bagging using regression trees, were not very sensitive to parameter settings, others, such as multilayer perceptrons, were affected dramatically. Combining learning machines into bagging ensembles helped make them more robust to different parameter settings. The average performance of k-NN across projects was not much affected by different parameter settings, but the settings that obtained the best average performance across time steps were less consistently the best at individual time steps than in the other approaches.

Conclusions: Learning machines that are more/less sensitive to different parameter settings were identified. The different sensitivities obtained by different learning machines show that sensitivity to parameters should be considered as one of the criteria for evaluating SEE approaches. A good learning machine for SEE is not only one that achieves superior performance, but also one that is either less dependent on its parameter settings or for which good parameter settings are easy to choose.
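The abstract describes a prequential (online) evaluation in which each learning machine is retrained as new projects arrive and its error is measured under several parameter settings. The sketch below is a minimal illustration of that idea only, not the authors' experimental code: it assumes scikit-learn is available, uses synthetic placeholder data in place of real effort data sets, and uses hypothetical parameter grids for k-NN and bagging with regression trees. Parameter sensitivity is read from the spread of errors across settings.

# Illustrative sketch (not the paper's code): how sensitive a learning machine
# is to its parameter settings in an online SEE scenario, where projects arrive
# one at a time and the model is retrained on all projects seen so far before
# predicting the effort of the next one.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n_projects = 60
X = rng.uniform(1, 10, size=(n_projects, 3))                    # placeholder project features
y = 5 * X[:, 0] + 2 * X[:, 1] + rng.normal(0, 2, n_projects)    # placeholder effort values

def online_mae(make_model, X, y, warmup=10):
    """Prequential evaluation: predict each new project from the past projects only."""
    errors = []
    for t in range(warmup, len(y)):
        model = make_model()
        model.fit(X[:t], y[:t])
        errors.append(abs(model.predict(X[t:t + 1])[0] - y[t]))
    return float(np.mean(errors))

# Small, hypothetical grids of parameter settings per learning machine.
settings = {
    "k-NN": [lambda k=k: KNeighborsRegressor(n_neighbors=k) for k in (1, 3, 5, 9)],
    "Bagging+RT": [lambda n=n: BaggingRegressor(DecisionTreeRegressor(),
                                                n_estimators=n, random_state=0)
                   for n in (10, 25, 50)],
}

for name, makers in settings.items():
    maes = [online_mae(m, X, y) for m in makers]
    # A wide spread of MAE across settings indicates high parameter sensitivity.
    print(f"{name}: best MAE={min(maes):.2f}, worst MAE={max(maes):.2f}, "
          f"spread={max(maes) - min(maes):.2f}")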
Original language: English
Title of host publication: PROMISE '13: Proceedings of the 9th International Conference on Predictive Models in Software Engineering
Publisher: Association for Computing Machinery
Number of pages: 10
ISBN (Print): 9781450320160
DOIs
Publication status: Published - 9 Oct 2013
Externally published: Yes

Bibliographical note

Acknowledgements: Liyan Song would like to thank Fengzhen Tang for her useful comments and discussion.

Funding

This work was supported by EPSRC grant EP/J017515/1. Liyan Song was supported by a School-funded PhD studentship. Xin Yao was supported by a Royal Society Wolfson Research Merit Award.

Keywords

  • Ensembles
  • Machine learning
  • Online learning
  • Sensitivity to parameters
  • Software effort estimation
