An Automated Defect Prediction Framework using Genetic Algorithms: A Validation of Empirical Studies

  1. Murillo-Morera, Juan
  2. Castro-Herrera, Carlos
  3. Arroyo, Javier
  4. Fuentes-Fernandez, Ruben
Journal:
Inteligencia artificial: Revista Iberoamericana de Inteligencia Artificial

ISSN: 1137-3601, 1988-3064

Year of publication: 2016

Issue title: Inteligencia Artificial (June 2016)

Volume: 19

Issue: 57

Pages: 114-137

Type: Article

Abstract

Today, it is common for software projects to collect measurement data throughout the development process. With these data, defect prediction software can estimate the defect proneness of a software module, with the objective of assisting and guiding software practitioners. With timely and accurate defect predictions, practitioners can focus their limited testing resources on higher-risk areas. This paper reports the results of three empirical studies that use an automated genetic defect prediction framework. This framework generates and compares different learning schemes (preprocessing + attribute selection + learning algorithm) and selects the best one using a genetic algorithm, with the objective of estimating the defect proneness of a software module. The first empirical study is a performance comparison of our framework with the most important framework in the literature. The second is a performance and runtime comparison between our framework and an exhaustive framework. The third is a sensitivity analysis, which is the main contribution of this paper. The performance of the defect prediction models (measured as AUC, Area Under the Curve) was validated using the NASA-MDP and PROMISE data sets. Seventeen data sets from the NASA-MDP (13) and PROMISE (4) projects were analyzed using an N×M-fold cross-validation. A genetic algorithm was used to select the components of the learning schemes automatically, and to assess and report the results. Our results show similar performance between the frameworks, with our framework achieving better runtime than the exhaustive one. Finally, we report the best configuration found by the sensitivity analysis.
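
To illustrate the idea behind the framework, the following is a minimal sketch, in Python with scikit-learn, of selecting a learning scheme (preprocessing + attribute selection + learning algorithm) with a genetic algorithm whose fitness is cross-validated AUC. The component pools, GA parameters (population size, generations, mutation rate), and the synthetic data set are illustrative assumptions for this sketch, not the configuration or data (NASA-MDP, PROMISE) used in the paper.

    # Sketch: GA-based learning-scheme selection with AUC fitness.
    # Component pools and GA settings are illustrative assumptions.
    import random

    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import MinMaxScaler, StandardScaler
    from sklearn.tree import DecisionTreeClassifier

    random.seed(0)

    # Stand-in for a defect data set (module metrics + defect-proneness label).
    X, y = make_classification(n_samples=400, n_features=20, random_state=0)

    # Component pools: a chromosome is one index into each pool, i.e. a
    # learning scheme = preprocessing + attribute selection + learning algorithm.
    PREPROCESSORS = ["passthrough", StandardScaler(), MinMaxScaler()]
    SELECTORS = ["passthrough", SelectKBest(f_classif, k=10), SelectKBest(f_classif, k=5)]
    LEARNERS = [GaussianNB(), DecisionTreeClassifier(random_state=0),
                LogisticRegression(max_iter=1000)]

    def fitness(chrom):
        """Mean AUC of the encoded learning scheme under cross-validation."""
        p, s, l = chrom
        pipe = Pipeline([("prep", PREPROCESSORS[p]),
                         ("select", SELECTORS[s]),
                         ("learn", LEARNERS[l])])
        return cross_val_score(pipe, X, y, cv=5, scoring="roc_auc").mean()

    def random_chromosome():
        return (random.randrange(len(PREPROCESSORS)),
                random.randrange(len(SELECTORS)),
                random.randrange(len(LEARNERS)))

    def crossover(a, b):
        # Uniform crossover over the three genes.
        return tuple(random.choice(pair) for pair in zip(a, b))

    def mutate(chrom, rate=0.2):
        pools = (PREPROCESSORS, SELECTORS, LEARNERS)
        return tuple(random.randrange(len(pool)) if random.random() < rate else gene
                     for gene, pool in zip(chrom, pools))

    # Simple generational GA with elitism and truncation selection.
    population = [random_chromosome() for _ in range(8)]
    for generation in range(5):
        scored = sorted(((fitness(c), c) for c in population), reverse=True)
        best_auc, best = scored[0]
        print(f"gen {generation}: best scheme {best} AUC={best_auc:.3f}")
        parents = [c for _, c in scored[: len(scored) // 2]]
        population = [best]  # keep the best scheme (elitism)
        while len(population) < 8:
            population.append(mutate(crossover(random.choice(parents),
                                               random.choice(parents))))

Each chromosome encodes one scheme as three indices, and its fitness is the mean cross-validated AUC of the corresponding pipeline, matching the performance measure used in the studies; searching this space genetically rather than exhaustively is what yields the runtime advantage reported in the second empirical study.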