Abstract
Background: The use of machine learning approaches for software effort estimation (SEE) has been studied for more than a decade. Most studies compare different learning machines on a number of data sets. However, most learning machines have more than one parameter that needs to be tuned, and it is unknown to what extent parameter settings affect their performance in SEE. Many works seem to make the implicit assumption that parameter settings would not change the outcomes significantly. Aims: To investigate to what extent parameter settings affect the performance of learning machines in SEE, and which learning machines are more sensitive to their parameters. Method: Considering an online learning scenario where learning machines are updated with new projects as they become available, systematic experiments were performed using five learning machines under several different parameter settings on three data sets. Results: While some learning machines, such as bagging using regression trees, were not very sensitive to parameter settings, others, such as multilayer perceptrons, were affected dramatically. Combining learning machines into bagging ensembles helped make them more robust against different parameter settings. The average performance of k-NN across projects was not much affected by different parameter settings, but the parameter settings that obtained the best average performance across time steps were less consistently the best throughout the time steps than in the other approaches. Conclusions: Learning machines that are more/less sensitive to different parameter settings were identified. The differing sensitivity of different learning machines shows that sensitivity to parameters should be considered one of the criteria for evaluating SEE approaches. A good learning machine for SEE is not only one that achieves superior performance, but also one that is either less dependent on parameter settings or for which good parameter choices are easy to make.
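As a rough illustration of the kind of experiment the abstract describes (not the paper's actual code, data sets, learners, or evaluation measure), the sketch below evaluates a few scikit-learn regressors in an online fashion, re-training on all projects seen so far and predicting the next one, under several parameter settings per learner. The learner choices, parameter grids, synthetic data, and the use of mean absolute error are all illustrative assumptions.

```python
# Minimal parameter-sensitivity sketch for an online SEE-style evaluation.
# All learners, parameter grids and data here are illustrative assumptions.
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.random((100, 5))                       # stand-in for project features
y = X.sum(axis=1) + rng.normal(0, 0.1, 100)    # stand-in for effort values

# Candidate learners, each under a few parameter settings (illustrative only).
settings = {
    "bagging+RT": [BaggingRegressor(DecisionTreeRegressor(max_depth=d),
                                    n_estimators=n, random_state=0)
                   for d in (3, None) for n in (10, 50)],
    "MLP": [MLPRegressor(hidden_layer_sizes=(h,), max_iter=2000, random_state=0)
            for h in (5, 20, 50)],
    "k-NN": [KNeighborsRegressor(n_neighbors=k) for k in (1, 3, 5)],
}

def online_mae(model, X, y, warmup=20):
    """Train on all projects seen so far, predict the next one, and
    return the mean absolute error across time steps."""
    errors = []
    for t in range(warmup, len(y)):
        model.fit(X[:t], y[:t])
        errors.append(abs(model.predict(X[t:t + 1])[0] - y[t]))
    return np.mean(errors)

for name, models in settings.items():
    maes = [online_mae(m, X, y) for m in models]
    # The spread between the best and worst setting gives a rough view of
    # how sensitive each learner is to its parameters.
    print(f"{name}: best MAE={min(maes):.3f}, worst MAE={max(maes):.3f}")
```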
| Original language | English |
| --- | --- |
| Title of host publication | PROMISE '13: Proceedings of the 9th International Conference on Predictive Models in Software Engineering |
| Publisher | Association for Computing Machinery |
| Number of pages | 10 |
| ISBN (Print) | 9781450320160 |
| DOIs | |
| Publication status | Published - 9 Oct 2013 |
| Externally published | Yes |
Bibliographical note
Acknowledgements: Liyan Song would like to thank Fengzhen Tang for her useful comments and discussion.
Funding
This work was supported by EPSRC grant EP/J017515/1. Liyan Song was supported by a School-funded PhD studentship. Xin Yao was supported by a Royal Society Wolfson Research Merit Award.
Keywords
- Ensembles
- Machine learning
- Online learning
- Sensitivity to parameters
- Software effort estimation