ECC Multi-Label Code Smell Detection Method Based on Ranking Loss
Cite this article: Wang Jina, Chen Junhua, Gao Jianhua. ECC Multi-Label Code Smell Detection Method Based on Ranking Loss[J]. Journal of Computer Research and Development, 2021, 58(1): 178-188. DOI: 10.7544/issn1000-1239.2021.20190836
Authors: Wang Jina  Chen Junhua  Gao Jianhua
Affiliation: (Department of Computer Science and Technology, Shanghai Normal University, Shanghai 200234) (wjn_wy1108@163.com)
Abstract: A code smell is a software characteristic caused by bad code or a design problem, and it seriously affects the reliability and maintainability of software systems. In a software system, a single code element may be affected by several code smells at the same time, which significantly reduces software quality. Multi-label classification suits this case: placing code smells with high co-occurrence in the same label group allows the correlation between code smells to be taken into account. However, existing multi-label code smell detection methods do not consider how the order in which multiple code smells are detected in the same code element affects the result. To address this, a multi-label code smell detection method based on ranking loss and an ensemble of classifier chains (ECC) is proposed. The method uses random forest as the base classifier and runs ECC over multiple iterations. With minimizing ranking loss as the objective, it selects a better-performing set of label sequences, which optimizes the code smell detection order and simulates the mechanism by which code smells arise. It detects whether a code element simultaneously exhibits one of three groups of code smells: long method and long parameter list, complex class and message chain, or message chain and blob. The experiments use nine evaluation metrics, and the results show that the proposed method outperforms existing multi-label code smell detection methods, with an average F1 of 97.16%.

Keywords: code smell  random forest  ranking loss  ensemble of classifier chains (ECC)  multi-label classification
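
The abstract names ranking loss as the objective for choosing label sequences but does not spell out the procedure, so the following is only a minimal sketch of that one step, not the authors' implementation. It assumes each candidate chain (one per label order) has already produced per-label scores on a validation set; the function names, the two-label group (long method, long parameter list), the tie convention (a tie counts as a mis-ordered pair), and the toy scores are all hypothetical and chosen for illustration.

```python
from statistics import mean


def ranking_loss(y_true, y_score):
    """Average fraction of (relevant, irrelevant) label pairs ranked wrongly.

    Assumed convention: a pair is wrong when the relevant label's score is
    not strictly greater than the irrelevant label's score.
    """
    per_sample = []
    for truth, scores in zip(y_true, y_score):
        relevant = [s for t, s in zip(truth, scores) if t == 1]
        irrelevant = [s for t, s in zip(truth, scores) if t == 0]
        if not relevant or not irrelevant:
            per_sample.append(0.0)  # no pair exists, so nothing can be mis-ordered
            continue
        wrong = sum(1 for r in relevant for i in irrelevant if r <= i)
        per_sample.append(wrong / (len(relevant) * len(irrelevant)))
    return mean(per_sample)


def select_orders(candidates, y_true, keep):
    """Return the `keep` label orders with the lowest validation ranking loss."""
    ranked = sorted(candidates, key=lambda o: ranking_loss(y_true, candidates[o]))
    return ranked[:keep]


def ensemble_scores(candidates, orders):
    """Average the per-label scores of the selected chains."""
    first = candidates[orders[0]]
    return [
        [mean(candidates[o][i][j] for o in orders) for j in range(len(first[0]))]
        for i in range(len(first))
    ]


# Toy validation data. Columns: [long_method, long_parameter_list].
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]

# Hypothetical scores from two chains; keys are the label orders they used.
# Scores are always stored in the fixed column order above.
candidates = {
    (0, 1): [[0.9, 0.2], [0.3, 0.8], [0.7, 0.6], [0.1, 0.2]],
    (1, 0): [[0.4, 0.6], [0.3, 0.8], [0.9, 0.7], [0.2, 0.1]],
}

best = select_orders(candidates, y_true, keep=1)
combined = ensemble_scores(candidates, best)
print(best, combined)
```

In this toy setup the chain with order (0, 1) ranks every relevant label above every irrelevant one, while the chain with order (1, 0) mis-orders the single pair in the first sample, so only the former is kept. In a fuller pipeline the candidate scores would come from classifier chains trained with a base learner such as random forest, which this sketch leaves out.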
This article is indexed in databases including VIP and Wanfang Data.