
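Fine-grained Bug Location Method Based on Source Code Extension Information
Cite this article: LI Xiao-Zhuo, QING Du-Jun, HE Ye-Ping, MA Heng-Tai. Fine-grained Bug Location Method Based on Source Code Extension Information[J]. Journal of Software, 2022, 33(11): 4008-4026.
Authors: LI Xiao-Zhuo  QING Du-Jun  HE Ye-Ping  MA Heng-Tai
Affiliations: National Engineering Research Center of Fundamental Software (Institute of Software, Chinese Academy of Sciences), Beijing 100190; University of Chinese Academy of Sciences, Beijing 100049; National Engineering Research Center of Fundamental Software (Institute of Software, Chinese Academy of Sciences), Beijing 100190; University of Chinese Academy of Sciences, Beijing 100049; State Key Laboratory of Computer Science (Institute of Software, Chinese Academy of Sciences), Beijing 100190
Funding: National Science and Technology Major Project on Core Electronic Devices, High-end General Chips and Basic Software (2014ZX01029101); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA-Y01-01)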
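Abstract: Bug location based on information retrieval (IR) builds a retrieval model on cross-language semantic similarity and uses bug reports to locate errors in source code; the approach is intuitive and broadly applicable. However, traditional IR-based methods treat code as plain text and use only the lexical semantics of source code. In fine-grained bug location, the candidate code therefore carries too little semantic information, accuracy is low, and the usefulness of the results still needs improvement. By analyzing the relationship between code changes and the introduction of bugs in program evolution scenarios, this study proposes a fine-grained bug location method based on source code extension information, which enriches source code semantics with both the explicit lexical semantics of the code and the implicit information of code execution. The semantically related context of each candidate location is used to enlarge the amount of code; the structural semantics of the intermediate form of code execution make fine-grained code units distinguishable; and natural language semantics guide the attention-based generation of code representations, yielding a semantic mapping between fine-grained code and natural language. Together these form the fine-grained bug location method FlowLocator. Experimental results show that, compared with classical IR-based bug location methods, the method significantly improves location accuracy in Top-N rank, mean average precision (MAP), and mean reciprocal rank (MRR).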

Keywords: bug location; evolution program; information retrieval; semantic information; pseudo-siamese network
Received: 2020-09-30
Revised: 2021-03-02
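
The abstract contrasts FlowLocator with classical IR-based bug location, which treats code as plain text and ranks candidates by lexical similarity to the bug report. A minimal sketch of such a baseline follows, assuming TF-IDF weighting with cosine similarity; the function names, the tokenizer, and the toy candidates are illustrative assumptions, not the authors' implementation or any specific baseline from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split camelCase and snake_case identifiers into lowercase word tokens."""
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption; other weighting schemes are equally common).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(bug_report, candidates):
    """Rank candidate code units (name -> source text), most similar first."""
    names = list(candidates)
    docs = [tokenize(bug_report)] + [tokenize(candidates[n]) for n in names]
    vecs = tfidf_vectors(docs)
    scores = {n: cosine(vecs[0], v) for n, v in zip(names, vecs[1:])}
    return sorted(names, key=lambda n: scores[n], reverse=True)


# Hypothetical toy data: two method-level candidates and one bug report.
candidates = {
    "parse_config": "def parse_config(path): return json.load(open(path))",
    "render_page": "def render_page(template, ctx): return template.format(**ctx)",
}
report = "Crash when loading config file: parse fails on empty path"
ranking = rank_candidates(report, candidates)
```

Because the score depends only on shared vocabulary, short method-level or statement-level candidates offer few terms to match, which is the semantic scarcity the abstract describes for fine-grained location.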

Fine-grained Bug Location Method Based on Source Code Extension Information
LI Xiao-Zhuo, QING Du-Jun, HE Ye-Ping, MA Heng-Tai. Fine-grained Bug Location Method Based on Source Code Extension Information[J]. Journal of Software, 2022, 33(11): 4008-4026.
Authors: LI Xiao-Zhuo  QING Du-Jun  HE Ye-Ping  MA Heng-Tai
Affiliation: National Engineering Research Center of Fundamental Software (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; National Engineering Research Center of Fundamental Software (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; State Key Laboratory of Computer Science (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China
Abstract: Bug location based on information retrieval (IR) uses cross-language semantic similarity to construct a retrieval model and locates source code errors through bug reports. However, traditional IR-based bug location methods treat code as pure text and use only the lexical semantic information of source code. In fine-grained bug location this leaves candidate code short of semantics, which lowers accuracy, and the usefulness of the results needs to be improved. By analyzing the relationship between code changes and bug generation in the scenario of program evolution, this study proposes a fine-grained bug location method based on source code extension information, in which the explicit semantic information of code vocabulary and the implicit information of code execution are used together to enrich source code semantics. The semantically related context of location candidate points is used to enrich the code quantity, and the structural semantics of the intermediate language of code execution is used to make fine-grained code distinguishable. Meanwhile, natural language semantics is used to guide the generation of attention-based code language representations, so that a semantic mapping between fine-grained code and natural language is established, realizing the fine-grained bug location method FlowLocator. The experimental results show that, compared with classical IR bug location methods, the location accuracy of this method is significantly improved in Top-N rank, mean average precision (MAP), and mean reciprocal rank (MRR).
Keywords: bug location; evolution program; information retrieval; semantic information; pseudo-siamese network
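
The three measures named in the abstract (Top-N rank, MAP, MRR) can be computed from ranked candidate lists as sketched below. This is a minimal illustration under common definitions, not the paper's evaluation code: the function names and sample data are hypothetical, and average precision is assumed to be normalized by the number of truly buggy items for each report.

```python
def top_n_hit(ranked, relevant, n):
    """True if at least one truly buggy item appears in the first n results."""
    return any(item in relevant for item in ranked[:n])


def reciprocal_rank(ranked, relevant):
    """1 / position of the first buggy item, or 0.0 if none is retrieved."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranked, relevant):
    """Mean of precision@k over the positions k that hold a buggy item."""
    hits, total = 0, 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


def evaluate(queries, n):
    """Aggregate Top-N, MAP, and MRR over (ranked_list, relevant_set) pairs."""
    count = len(queries)
    return {
        "top_n": sum(top_n_hit(r, rel, n) for r, rel in queries) / count,
        "map": sum(average_precision(r, rel) for r, rel in queries) / count,
        "mrr": sum(reciprocal_rank(r, rel) for r, rel in queries) / count,
    }


# Hypothetical results for two bug reports.
queries = [
    (["a", "b", "c"], {"b"}),       # first buggy item at rank 2
    (["x", "y", "z"], {"x", "z"}),  # buggy items at ranks 1 and 3
]
metrics = evaluate(queries, n=1)
```

For this sample, one of the two reports has a buggy item at rank 1, so Top-1 is 0.5; the reciprocal ranks are 1/2 and 1, giving an MRR of 0.75; the average precisions are 1/2 and 5/6, giving a MAP of 2/3.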