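Model-domain Data Augmentation Using Generative Adversarial Network for Fault Localization
Cite this article: ZHANG Zhuo, LEI Yan, MAO Xiao-Guang, XUE Jian-Xin, CHANG Xi. Model-domain Data Augmentation Using Generative Adversarial Network for Fault Localization[J]. Journal of Software, 2024, 35(5): 2289-2306
Authors: ZHANG Zhuo  LEI Yan  MAO Xiao-Guang  XUE Jian-Xin  CHANG Xi
Affiliations: School of Information Technology & Engineering, Guangzhou College of Commerce, Guangzhou 510700, Guangdong, China; School of Big Data & Software Engineering, Chongqing University, Chongqing 401331, China; College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, Hunan, China; School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai 200127, China
Funding: National Natural Science Foundation of China (62272072); Fundamental Research Funds for the Central Universities (2022CDJDX-005)
Abstract: Fault localization collects and analyzes the runtime information of a test suite to measure how suspicious each statement is of being faulty. A test suite is built from input-domain data and contains two types of test cases: passing and failing. Because failing test cases are irregularly distributed in the input domain and make up a very small share of it, they are usually far fewer than passing test cases. Prior studies have shown that a small number of failing test cases causes a class imbalance problem in the test suite, which severely degrades fault localization effectiveness. To address this problem, this paper proposes a model-domain data augmentation method for fault localization based on a generative adversarial network. Working in the model domain (i.e., the spectrum information used for fault localization) rather than the traditional input domain (i.e., program inputs), the method uses a generative adversarial network to synthesize model-domain failing test cases that cover the minimum suspicious set, thereby addressing class imbalance in the model domain. Experimental results show that the proposed method substantially improves the effectiveness of 11 representative fault localization methods.

Keywords: fault localization  test case  generative adversarial network  data augmentation  suspiciousness
Received: 2022-01-07
Revised: 2022-11-17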

Model-domain Data Augmentation Using Generative Adversarial Network for Fault Localization
ZHANG Zhuo, LEI Yan, MAO Xiao-Guang, XUE Jian-Xin, CHANG Xi. Model-domain Data Augmentation Using Generative Adversarial Network for Fault Localization[J]. Journal of Software, 2024, 35(5): 2289-2306
Authors: ZHANG Zhuo  LEI Yan  MAO Xiao-Guang  XUE Jian-Xin  CHANG Xi
Affiliation: School of Information Technology & Engineering, Guangzhou College of Commerce, Guangzhou 510700, China; School of Big Data & Software Engineering, Chongqing University, Chongqing 401331, China; College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China; School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai 200127, China
Abstract: Fault localization collects and analyzes the runtime information of test case sets to estimate how suspicious each statement is of being faulty. Test case sets are constructed from input-domain data and contain two types of test cases: passing and failing. Since failing test cases generally account for a very small portion of the input domain, and their distribution is usually random, the number of failing test cases is usually far smaller than the number of passing ones. Previous work has shown that the lack of failing test cases leads to a class imbalance problem in test case sets, which severely hampers fault localization effectiveness. To address this problem, this study proposes a model-domain data augmentation approach using a generative adversarial network for fault localization. Working in the model domain (i.e., the spectrum information of fault localization) rather than the traditional input domain (i.e., program input), this approach uses the generative adversarial network to synthesize model-domain failing test cases that cover the minimum suspicious set, so as to address the class imbalance problem in the model domain. The experimental results show that the proposed approach significantly improves the effectiveness of 12 representative fault localization approaches.
Keywords: fault localization  test case  generative adversarial network (GAN)  data augmentation  suspiciousness
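
To make the abstract's terms concrete, the following minimal Python sketch shows what model-domain data can look like: a coverage matrix (the spectrum) with pass/fail labels, from which a suspiciousness score is computed per statement. The Ochiai formula, the toy matrix, and the function name `ochiai` are assumptions chosen for illustration and are not taken from the paper. The extra failing row is written by hand as a stand-in for a synthesized model-domain failing test case; the paper's GAN-based synthesis is not reproduced here.

```python
import math


def ochiai(coverage, failed):
    """Return one Ochiai suspiciousness score per statement.

    coverage[i][j] is 1 if test i executed statement j, else 0.
    failed[i] is True if test i failed.
    """
    total_failed = sum(failed)
    scores = []
    for j in range(len(coverage[0])):
        # ef / ep: failing / passing tests that executed statement j
        ef = sum(1 for row, f in zip(coverage, failed) if row[j] and f)
        ep = sum(1 for row, f in zip(coverage, failed) if row[j] and not f)
        denom = math.sqrt(total_failed * (ef + ep))
        scores.append(ef / denom if denom else 0.0)
    return scores


# Toy spectrum (hypothetical): 4 passing tests, 1 failing test, 3 statements.
coverage = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],  # the only failing test
]
failed = [False, False, False, False, True]
scores = ochiai(coverage, failed)

# Hand-written stand-in for a synthesized failing coverage vector.
augmented = ochiai(coverage + [[1, 0, 1]], failed + [True])

print(scores)
print(augmented)
```

With a single failing test, statements 1 and 2 (zero-based) receive the same score, so the ranking cannot separate them. After the extra failing vector is appended, statement 2 ranks strictly highest. This is only meant to show why additional failing vectors in the model domain can change suspiciousness rankings, not how the paper generates them.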