How to make best use of cross-company data in software effort estimation?

Leandro L. MINKU, Xin YAO

Research output: Book Chapters | Papers in Conference Proceedings › Conference paper (refereed) › Research › peer-review

54 Citations (Scopus)

Abstract

Previous work using Cross-Company (CC) data for Within-Company (WC) Software Effort Estimation (SEE) tries to use CC data or models directly to provide predictions in the WC context. Consequently, these data or models are only helpful when they match the WC context well. When they do not, a fair amount of WC training data, which are usually expensive to acquire, is still necessary to achieve good performance. We investigate how to make best use of CC data, so that we can reduce the amount of WC data needed while maintaining or improving performance in comparison to WC SEE models. This is done by proposing a new framework that learns the relationship between CC and WC projects explicitly, allowing CC models to be mapped to the WC context. Such mapped models can be useful even when the CC models themselves do not match the WC context directly. Our study shows that a new approach instantiating this framework is able not only to use substantially less WC data than a corresponding WC model, but also to achieve similar or better performance. This approach can also be used to provide insight into the behaviour of a company in comparison to others. © 2014 ACM.
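
As a rough illustration of what mapping a CC model to the WC context could look like, the sketch below assumes the simplest possible relationship: WC effort is approximately a scalar factor times a CC model's prediction, and that factor is learned online from a few WC projects. The linear form, the names `cc_model`, `update_factor` and `map_to_wc`, the learning rate, and the toy data are all assumptions made for illustration; the abstract does not specify the form of the learned relationship.

```python
def cc_model(size_kloc):
    """Hypothetical cross-company model: effort in person-hours from size."""
    return 100.0 * size_kloc


def update_factor(factor, cc_prediction, wc_actual, lr=0.5):
    """Move the mapping factor towards the ratio seen on one WC project.

    This is exponential smoothing (an assumed update rule): lr=1 trusts
    only the newest project, lr=0 never updates the factor.
    """
    if cc_prediction <= 0:
        raise ValueError("CC prediction must be positive")
    observed_ratio = wc_actual / cc_prediction
    return lr * observed_ratio + (1.0 - lr) * factor


def map_to_wc(factor, cc_prediction):
    """Map a CC prediction into the WC context using the learned factor."""
    return factor * cc_prediction


# Toy WC projects as (size in KLOC, actual effort); made-up values in which
# the company needs about half the effort the CC model predicts.
wc_projects = [(10.0, 500.0), (20.0, 1000.0), (5.0, 250.0)]

factor = 1.0  # start by assuming the CC model fits the WC context as-is
for size, actual in wc_projects:
    factor = update_factor(factor, cc_model(size), actual)

# Estimate a new WC project by mapping the CC model's prediction.
estimate = map_to_wc(factor, cc_model(8.0))
print(f"factor={factor}, estimate={estimate}")
```

Under these assumptions, a factor below 1 would suggest that the company tends to need less effort than the CC model predicts, which is one way such a mapping could give insight into a company's behaviour relative to others.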
Original language: English
Title of host publication: Proceedings - International Conference on Software Engineering
Publisher: IEEE Computer Society
Chapter: 1
Pages: 446-456
Number of pages: 11
DOIs
Publication status: Published - 31 May 2014
Externally published: Yes

Keywords

  • cross-company learning
  • ensembles of learning machines
  • online learning
  • Software effort estimation
  • transfer learning
