Empirical Evaluation of Defect Projection Models for Widely-deployed Production Software Systems
Defect-occurrence projection is necessary for the development of methods to mitigate the risks of software defect occurrences. In this paper, we examine user-reported software defect-occurrence patterns across twenty-two releases of four widely-deployed, business-critical, production, software systems: a commercial operating system, a commercial middleware system, an open source operating system (OpenBSD), and an open source middleware system (Tomcat). We evaluate the suitability of common defect-occurrence models by first assessing the match between characteristics of widely-deployed production software systems and model structures. We then evaluate how well the models fit real world data. We find that the Weibull model is flexible enough to capture defectoccurrence behavior across a wide range of systems. It provides the best model fit in 16 out of the 22 releases. We then evaluate the ability of the moving averages and the exponential smoothing methods to extrapolate Weibull model parameters using fitted model parameters from historical releases. Our results show that in 50% of our forecasting experiments, these two naïve parameter-extrapolation methods produce projections that are worse than the projection from using the same model parameters as the most recent release. These findings establish the need for further research on parameter-extrapolation methods that take into account variations in characteristics of widely-deployed, production, software systems across multiple releases.