Fault-prone module detection using large-scale text features based on spam filtering |
| |
Authors: | Hideaki Hata, Osamu Mizuno, Tohru Kikuno |
| |
Affiliation: | (1) Graduate School of Information Science and Technology, Kyoto Institute of Technology, Kyoto, Japan; (2) Graduate School of Information Science and Technology, Osaka University, Osaka, Japan |
| |
Abstract: | This paper proposes an approach to fault-prone module detection that uses large-scale text features and is inspired by spam filtering.
The occurrences of each text feature in a module's source code are counted and used as training data for detection models.
We prepared a naive Bayes classifier and a logistic regression model as detection models. To show the effectiveness
of our approach, we conducted experiments on five open source projects and compared it against a well-known metrics set;
the text-feature models achieved better detection results. The results imply that large-scale text features are useful in constructing practical
detection models, and that measuring sophisticated metrics is not always necessary for detecting fault-prone modules. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|
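
The idea summarized in the abstract, counting text features in a module's source code and feeding the counts to a classifier, can be illustrated with a small sketch. This is not the authors' implementation: the tokenizer regex, the `NaiveBayes` class, the label strings, and the toy training snippets are all hypothetical, and a multinomial naive Bayes with Laplace smoothing is assumed here because the abstract does not say which variant was used.

```python
import math
import re
from collections import Counter

# Assumed tokenization: identifiers, numbers, and single punctuation marks.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")


def tokenize(source):
    """Split source text into identifier, number, and punctuation tokens."""
    return TOKEN_RE.findall(source)


class NaiveBayes:
    """Multinomial naive Bayes over token counts (hypothetical helper)."""

    def __init__(self):
        self.token_counts = {}        # label -> Counter of token occurrences
        self.module_counts = Counter()  # label -> number of training modules
        self.vocab = set()            # every token seen during training

    def train(self, source, label):
        tokens = tokenize(source)
        self.token_counts.setdefault(label, Counter()).update(tokens)
        self.module_counts[label] += 1
        self.vocab.update(tokens)

    def log_scores(self, source):
        """Return an unnormalized log-posterior for each label."""
        total_modules = sum(self.module_counts.values())
        vocab_size = len(self.vocab)
        counts = Counter(tokenize(source))
        scores = {}
        for label, label_tokens in self.token_counts.items():
            total = sum(label_tokens.values())
            score = math.log(self.module_counts[label] / total_modules)
            for token, n in counts.items():
                if token not in self.vocab:
                    continue  # tokens never seen in training carry no evidence
                # Laplace (add-one) smoothing avoids log(0) for unseen pairs.
                score += n * math.log((label_tokens[token] + 1) / (total + vocab_size))
            scores[label] = score
        return scores

    def predict(self, source):
        scores = self.log_scores(source)
        return max(scores, key=scores.get)


if __name__ == "__main__":
    # Toy data for illustration only; real training sets would be whole modules.
    model = NaiveBayes()
    model.train("ptr = malloc(n); strcpy(ptr, src);", "fault-prone")
    model.train("buf = malloc(k); memcpy(buf, src, k);", "fault-prone")
    model.train("return a + b;", "clean")
    model.train("int total = a + b; return total;", "clean")
    print(model.predict("p = malloc(m); strcpy(p, s);"))
```

No code metrics (size, complexity, coupling) are computed anywhere in the sketch; the token counts alone drive the prediction, which is the point the abstract makes about sophisticated metrics not always being necessary.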