
Model-domain Data Augmentation Using Generative Adversarial Network for Fault Localization

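Cite this article:	ZHANG Zhuo, LEI Yan, MAO Xiao-Guang, XUE Jian-Xin, CHANG Xi. Model-domain Data Augmentation Using Generative Adversarial Network for Fault Localization[J]. Journal of Software (软件学报), 2024, 35(5): 2289-2306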

Authors:	ZHANG Zhuo, LEI Yan, MAO Xiao-Guang, XUE Jian-Xin, CHANG Xi

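Affiliations:	School of Information Technology & Engineering, Guangzhou College of Commerce, Guangzhou 510700, Guangdong, China; School of Big Data & Software Engineering, Chongqing University, Chongqing 401331, China; College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, Hunan, China; School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai 200127, China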

Funding:	National Natural Science Foundation of China (62272072); Fundamental Research Funds for the Central Universities (2022CDJDX-005)

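Abstract:	Fault localization collects and analyzes the runtime information of a test suite to measure how suspicious each statement is of being faulty. A test suite is built from input-domain data and contains two types of test cases: passing and failing. Because failing test cases are irregularly distributed in the input domain and make up a very small proportion of it, they are usually far fewer than passing test cases. Existing studies show that a shortage of failing test cases causes a class imbalance problem in the test suite, which severely impairs fault localization effectiveness. To address this problem, a model-domain data augmentation approach for fault localization based on a generative adversarial network is proposed. Working in the model domain (i.e., the spectrum information of fault localization) rather than the traditional input domain (i.e., program input), the approach uses a generative adversarial network to synthesize model-domain failing test cases that cover the minimum suspicious set, thereby resolving the class imbalance in the model domain. Experimental results show that the proposed approach substantially improves the effectiveness of 11 typical fault localization methods.
Keywords:	fault localization; test case; generative adversarial network; data augmentation; suspiciousness
Received:	2022-01-07
Revised:	2022-11-17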
Model-domain Data Augmentation Using Generative Adversarial Network for Fault Localization

ZHANG Zhuo, LEI Yan, MAO Xiao-Guang, XUE Jian-Xin, CHANG Xi. Model-domain Data Augmentation Using Generative Adversarial Network for Fault Localization[J]. Journal of Software, 2024, 35(5): 2289-2306

Authors:	ZHANG Zhuo, LEI Yan, MAO Xiao-Guang, XUE Jian-Xin, CHANG Xi

Affiliation:	School of Information Technology & Engineering, Guangzhou College of Commerce, Guangzhou 510700, China;School of Big Data & Software Engineering, Chongqing University, Chongqing 401331, China;College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China;School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai 200127, China

Abstract:	Fault localization collects and analyzes the runtime information of a test suite to evaluate how suspicious each statement is of being faulty. A test suite is constructed from input-domain data and contains two types of test cases: passing and failing. Because failing test cases generally account for a very small portion of the input domain and are irregularly distributed within it, they are usually far fewer than passing ones. Previous work has shown that this shortage of failing test cases leads to a class imbalance problem in the test suite, which severely hampers fault localization effectiveness. To address this problem, this study proposes a model-domain data augmentation approach for fault localization using a generative adversarial network. Working in the model domain (i.e., the spectrum information of fault localization) rather than the traditional input domain (i.e., program input), the approach uses the generative adversarial network to synthesize model-domain failing test cases that cover the minimum suspicious set, thereby addressing the class imbalance problem in the model domain. The experimental results show that the proposed approach significantly improves the effectiveness of 12 representative fault localization approaches.

Keywords:	fault localization; test case; generative adversarial network (GAN); data augmentation; suspiciousness
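
The abstract's notions of spectrum information and suspiciousness can be illustrated with a small sketch. It assumes the Ochiai formula as an example suspiciousness metric (the abstract does not name the methods it evaluates), and the function name `ochiai` and the toy coverage matrix are hypothetical. Each row of the matrix is one test case in the model domain: a vector recording which statements the test executed, paired with a pass/fail label. The example has three passing tests and one failing test, which mirrors the class imbalance the abstract describes. It does not implement the paper's GAN-based synthesis.

```python
import math

def ochiai(coverage, failed):
    """Score each statement with the Ochiai metric (an assumed example metric).

    coverage[i][j] is 1 if test i executed statement j, else 0.
    failed[i] is True if test i failed.
    """
    total_failed = sum(failed)
    scores = []
    for j in range(len(coverage[0])):
        # ef / ep: failing / passing tests that executed statement j
        ef = sum(1 for row, f in zip(coverage, failed) if row[j] and f)
        ep = sum(1 for row, f in zip(coverage, failed) if row[j] and not f)
        denom = math.sqrt(total_failed * (ef + ep))
        scores.append(ef / denom if denom else 0.0)
    return scores

# Toy spectrum: 4 tests x 3 statements, with only one failing test.
coverage = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
]
failed = [False, False, False, True]

scores = ochiai(coverage, failed)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(scores, ranking)
```

With a single failing test, the ranking depends entirely on one coverage vector. A model-domain augmentation approach, as described in the abstract, would add synthesized failing rows to `coverage` and `failed` before the scores are computed.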
