Software fault localization model based on linear classification algorithm

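Cite this article:	HE Haijiang (何海江). Software fault localization model based on linear classification algorithm[J]. Computer Engineering and Applications (计算机工程与应用), 2017, 53(21): 42-48. DOI: 10.3778/j.issn.1002-8331.1606-0176

Author:	HE Haijiang (何海江)

Affiliation:	Department of Mathematics and Computer Science, Changsha University (长沙学院数学与计算机科学系), Changsha 410022, China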

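Abstract (translated from the Chinese):	Spectrum-based fault localization (SBFL) methods help programmers reduce the difficulty of software debugging. As a lightweight approach, SBFL only needs to collect the coverage information and the results of the test cases, from which it computes the runtime features of each program statement. The many SBFL methods combine four runtime features into different suspiciousness formulas. However, these formulas are constrained by fixed parameters and cannot adapt to different sets of programs. A machine learning method is therefore proposed that automatically determines the suspiciousness formula for a specific set of programs. First, old versions of the program in which the faulty statements have been labeled are collected. Next, the runtime features of a faulty statement and of a correct statement are subtracted pairwise, and each difference forms one sample of the training set. Finally, a linear function is learned with the classification algorithms of Weka and serves as the fault localization model of that program. On three benchmark datasets (the Siemens suite, space and gzip), the models learned with Logistic, SGD, SMO and LibLinear all outperform the SBFL methods.
Keywords:	classification algorithm; linear model; fault localization; program spectra; software testing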
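
To make the "four runtime features" concrete, here is a minimal Python sketch. It assumes the features are the counts commonly used in the SBFL literature: for each statement, the number of failed and passed tests that executed it (`ef`, `ep`) and that did not execute it (`nf`, `np_`). The abstract does not name the features or any particular formula, so the function names, the toy data and the choice of the well-known Ochiai formula are illustrative assumptions, not the paper's setup.

```python
import math

def spectrum_counts(coverage, failed):
    """Return one (ef, ep, nf, np_) tuple per statement.

    coverage[t][s] is True when test t executed statement s;
    failed[t] is True when test t failed.
    """
    counts = []
    for s in range(len(coverage[0])):
        ef = ep = nf = np_ = 0
        for row, test_failed in zip(coverage, failed):
            if row[s] and test_failed:
                ef += 1      # executed by a failing test
            elif row[s]:
                ep += 1      # executed by a passing test
            elif test_failed:
                nf += 1      # not executed by a failing test
            else:
                np_ += 1     # not executed by a passing test
        counts.append((ef, ep, nf, np_))
    return counts

def ochiai(ef, ep, nf, np_):
    """Ochiai suspiciousness, one example of a fixed formula (np_ is unused)."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

# Toy data (hypothetical): 4 tests x 3 statements.
coverage = [
    [True, True, False],
    [True, False, True],
    [True, True, True],
    [True, True, False],
]
failed = [False, True, True, False]

counts = spectrum_counts(coverage, failed)
scores = [ochiai(*c) for c in counts]
# Statements ordered from most to least suspicious.
ranking = sorted(range(len(scores)), key=lambda s: scores[s], reverse=True)
print(counts)
print(ranking)
```

A formula such as Ochiai has no parameters to tune, which is the limitation the abstract attributes to existing SBFL formulas.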
Software fault localization model based on linear classification algorithm

HE Haijiang. Software fault localization model based on linear classification algorithm[J]. Computer Engineering and Applications, 2017, 53(21): 42-48. DOI: 10.3778/j.issn.1002-8331.1606-0176

Authors:	HE Haijiang

Affiliation:	Department of Mathematics and Computer Science, Changsha University, Changsha 410022, China

Abstract:	Spectrum-based fault localization (SBFL) techniques help developers reduce debugging effort. As a lightweight and promising approach, SBFL collects only the pass/fail result of each test case and the corresponding coverage information. From these data, SBFL computes a runtime spectrum for each program statement. SBFL approaches apply various functions to map four profile features to a suspiciousness score. However, existing functions do not give good accuracy because of their fixed parameters. Therefore, a machine learning method is proposed that automatically constructs a suspiciousness function for a specific program set. First, old versions of a program that contain faulty code are collected. Next, the feature difference between a faulty statement and a non-faulty statement is mapped to an instance in the training dataset. Finally, a linear classification algorithm from Weka is applied to learn a mapping function. The function learned from the old versions is defined as the fault localization model of the program. To assess the validity of the proposed method, an experiment is performed on three benchmark datasets: the Siemens suite, space and gzip. The experimental results show that the proposed method reduces fault localization cost compared with SBFL approaches.

Keywords:	classification algorithm; linear model; fault localization; program spectra; software testing
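
The training step described in the abstract (feature differences between faulty and non-faulty statements, then a linear classifier) can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the paper uses Weka's classifiers, whereas this sketch substitutes a small hand-written logistic regression. The feature vectors, the choice to emit each pair in both directions, and all function names are hypothetical.

```python
import math

def pairwise_differences(features, faulty):
    """Build training samples from (faulty, non-faulty) statement pairs.

    Assumption: each pair yields (faulty - correct) with label 1 and the
    mirrored difference with label 0, so both classes are represented.
    """
    samples = []
    for i in sorted(faulty):
        for j in range(len(features)):
            if j in faulty:
                continue
            diff = [a - b for a, b in zip(features[i], features[j])]
            samples.append((diff, 1))
            samples.append(([-d for d in diff], 0))
    return samples

def train_logistic(samples, epochs=200, lr=0.1):
    """Plain stochastic gradient ascent on the logistic log-likelihood.

    No bias term: the training set is symmetric around the origin.
    """
    w = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for x, y in samples:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for k, xk in enumerate(x):
                w[k] += lr * (y - p) * xk
    return w

def suspiciousness(w, x):
    """The learned linear function, used to score a single statement."""
    return sum(wi * xi for wi, xi in zip(w, x))

# Hypothetical normalized features (ef, ep, nf, np) for three statements
# of an old program version in which statement 2 is labeled faulty.
features = [
    [0.50, 0.50, 0.00, 0.00],
    [0.25, 0.50, 0.25, 0.00],
    [0.50, 0.00, 0.00, 0.50],
]
faulty = {2}

w = train_logistic(pairwise_differences(features, faulty))
scores = [suspiciousness(w, x) for x in features]
ranking = sorted(range(len(scores)), key=lambda s: scores[s], reverse=True)
print(ranking)
```

Because the learned function is linear, a positive score on the difference of two statements is the same as scoring the first statement higher than the second, so a classifier trained on differences can be used directly to rank statements.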
