Balancing Misclassification Rates in Classification-Tree Models of Software Quality |
| |
Authors: | Taghi M. Khoshgoftaar, Xiaojing Yuan, Edward B. Allen |
| |
Affiliation: | (1) Florida Atlantic University, Boca Raton, Florida, USA; (2) Mississippi State University, Mississippi, USA |
| |
Abstract: | Software product and process metrics can be useful predictors of which modules are likely to have faults during operations. Developers and managers can use such predictions by software quality models to focus enhancement efforts before release. However, in practice, software quality modeling methods in the literature may not produce a useful balance between the two kinds of misclassification rates, especially when there are few faulty modules. This paper presents a practical classification rule in the context of classification tree models that allows appropriate emphasis on each type of misclassification according to the needs of the project. This is especially important when the faulty modules are rare. An industrial case study using classification trees illustrates the tradeoffs. The trees were built using the TREEDISC algorithm, which is a refinement of the CHAID algorithm. We examined two releases of a very large telecommunications system, and built models suited to two points in the development life cycle: the end of coding and the end of beta testing. Both trees had only five significant predictors, out of 28 and 42 candidates, respectively. We interpreted the structure of the classification trees, and we found the models had useful accuracy. |
| |
Keywords: | classification trees, CHAID, TREEDISC, telecommunications, software quality, fault-prone modules, software metrics, knowledge discovery in data bases |
This article is indexed in SpringerLink and other databases. |
|