
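Extraction Approach for Software Bug Report
Cite this article: LIN Tao, GAO Jian-hua, FU Xue, MA Yan, LIN Yan. Extraction Approach for Software Bug Report[J]. Computer Science, 2016, 43(6): 179-183.
Authors: LIN Tao  GAO Jian-hua  FU Xue  MA Yan  LIN Yan
Affiliations: Department of Computer Science and Engineering, Shanghai Normal University, Shanghai 200234; Department of Computer Science and Engineering, Shanghai Normal University, Shanghai 200234; Department of Computer Science and Engineering, Shanghai Normal University, Shanghai 200234; Department of Computer Science and Engineering, Shanghai Normal University, Shanghai 200234; Department of Information Systems, The University of Auckland, Auckland 92019
Funding: This work was supported by the National Natural Science Foundation of China (61073163, 61373004) and the Shanghai Special Fund for Enterprise Independent Innovation (沪CXY-2013-88).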
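Abstract: The number of software bug reports in software engineering is growing rapidly, and developers are increasingly overwhelmed by the volume of reports. To support goals such as bug fixing and software reuse, it is therefore necessary to study methods for extracting bug reports. This paper proposes an extraction method that first merges synonyms in the bug reports, then builds a vector space model and applies text mining techniques such as term frequency-inverse document frequency (TF-IDF) and information gain to collect the features of words in the reports. It also designs an algorithm that determines sentence complexity in order to select long sentences, and finally introduces a Bayes classifier to this field. The method raises the hit rate and lowers the false alarm rate of bug report extraction. Experiments show that the bug report extraction method based on text mining and a Bayes classifier performs well in terms of area under the receiver operating characteristic curve (0.71), F-score (0.80), and Kappa value (0.75).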

Keywords: Bug report management  Text mining  Bayes classifier  Bug report feature  Vector space model  Sentence complexity
Received: 2015/4/29
Revised: 2015/8/18
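
The pipeline named in the abstract (synonym merging, a vector space model with TF-IDF weights, and a Bayes classifier) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the synonym table, the toy sentences, and their labels are invented, a multinomial naive Bayes with add-one smoothing stands in for the unspecified "Bayes classifier", and the information gain and sentence complexity steps are omitted.

```python
import math
from collections import Counter

# Hypothetical synonym table (not from the paper): variant -> canonical token.
SYNONYMS = {"crash": "fail", "crashes": "fail", "fails": "fail", "bug": "defect"}


def tokenize(sentence):
    """Lowercase, split on whitespace, strip punctuation, merge synonyms."""
    tokens = []
    for raw in sentence.lower().split():
        word = raw.strip(".,;:!?()")
        if word:
            tokens.append(SYNONYMS.get(word, word))
    return tokens


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (an assumed variant)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {t for doc in docs for t in doc}
        self.prior, self.counts, self.totals = {}, {}, {}
        for c in self.classes:
            class_docs = [d for d, y in zip(docs, labels) if y == c]
            self.prior[c] = math.log(len(class_docs) / len(docs))
            self.counts[c] = Counter(t for d in class_docs for t in d)
            self.totals[c] = sum(self.counts[c].values())
        return self

    def predict(self, doc):
        def score(c):
            denom = self.totals[c] + len(self.vocab)
            # Tokens never seen in training are ignored.
            return self.prior[c] + sum(
                math.log((self.counts[c][t] + 1) / denom)
                for t in doc
                if t in self.vocab
            )

        return max(self.classes, key=score)


# Toy, made-up training sentences: 1 = keep in the extract, 0 = drop.
TRAIN = [
    ("The app crashes when saving a file.", 1),
    ("Steps to reproduce the bug are attached.", 1),
    ("Thanks for looking into this.", 0),
    ("I will be away next week.", 0),
]
docs = [tokenize(s) for s, _ in TRAIN]
labels = [y for _, y in TRAIN]
weights = tf_idf(docs)
model = NaiveBayes().fit(docs, labels)
print(model.predict(tokenize("It fails on saving")))
```

In this sketch the TF-IDF weights are computed only to show the feature step; a fuller version would presumably use them, together with information gain, to select which terms the classifier sees.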

Extraction Approach for Software Bug Report
LIN Tao, GAO Jian-hua, FU Xue, MA Yan and LIN Yan. Extraction Approach for Software Bug Report[J]. Computer Science, 2016, 43(6): 179-183.
Authors: LIN Tao  GAO Jian-hua  FU Xue  MA Yan and LIN Yan
Affiliation: Department of Computer Science and Technology, Shanghai Normal University, Shanghai 200234, China; Department of Computer Science and Technology, Shanghai Normal University, Shanghai 200234, China; Department of Computer Science and Technology, Shanghai Normal University, Shanghai 200234, China; Department of Computer Science and Technology, Shanghai Normal University, Shanghai 200234, China; and Department of Information Systems, The University of Auckland, Auckland 92019, New Zealand
Abstract: Bug reports in software engineering are increasing rapidly, and developers are overwhelmed by the large number of accumulated reports. It is therefore necessary to study the extraction of bug reports for tasks such as bug fixing and software reuse. This paper proposes a novel extraction approach. First, synonyms are merged into a single word. Then a vector space model is set up, and text mining methods such as TF-IDF and information gain are used to collect word features in bug reports. In addition, an algorithm determines sentence complexity so that long sentences can be chosen. Finally, a Bayes classifier is introduced to bug report extraction. The approach increases the true positive rate (TPR) and decreases the false positive rate (FPR). Experiments show that bug report extraction based on text mining and a Bayes classifier is competitive in terms of AUC (0.71), F-score (0.80), and Kappa value (0.75).
Keywords:Bug report management  Text mining  Bayes classifier  Bug report feature  Vector space model  Sentence complexity
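
The evaluation measures named in the abstract (TPR, FPR, F-score, Kappa, AUC) can be computed from binary labels and classifier scores as in the sketch below. It uses the standard textbook definitions, which the paper is assumed but not confirmed to follow. The labels and scores are made-up toy data, not the paper's results, and both classes are assumed to be present so that no denominator is zero.

```python
def confusion(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(y_true, y_pred):
    """Return TPR, FPR, F-score, and Cohen's kappa as a dict."""
    tp, fp, fn, tn = confusion(y_true, y_pred)
    n = tp + fp + fn + tn
    tpr = tp / (tp + fn)  # hit rate (recall)
    fpr = fp / (fp + tn)  # false alarm rate
    precision = tp / (tp + fp)
    f_score = 2 * precision * tpr / (precision + tpr)
    p_observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return {"tpr": tpr, "fpr": fpr, "f_score": f_score, "kappa": kappa}


def auc(pos_scores, neg_scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Made-up toy data for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
result = metrics(y_true, y_pred)
area = auc([0.9, 0.8, 0.4], [0.7, 0.3])
print(result, area)
```

The pairwise AUC is quadratic in the number of samples, which is fine for a small illustration; a rank-based computation would be the usual choice at scale.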