
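Long Method Detection Based on a Cost-Sensitive Ensemble Classifier
Cite this article: 刘丽倩, 董东. 基于代价敏感集成分类器的长方法检测[J]. 计算机科学, 2018, 45(Z11): 497-500.
Authors: 刘丽倩, 董东
Affiliation: College of Mathematics and Information Science, Hebei Normal University (河北师范大学数学与信息科学学院), Shijiazhuang 050024
Abstract: Long Method is a software design problem in which a method has grown so long that it needs refactoring. To improve the recognition rate of traditional machine learning approaches for long methods, and in view of the class imbalance typical of code smell data, a cost-sensitive ensemble classifier algorithm is proposed. Building on the traditional decision tree algorithm, an under-sampling strategy resamples the data to generate multiple balanced subsets. Each subset is used to train a base classifier of the same type, and these base classifiers are then combined into an ensemble classifier. Finally, a misclassification cost determined by cognitive complexity is introduced into the ensemble classifier, biasing it toward correctly classifying the minority class. Compared with traditional machine learning algorithms, this approach improves both the precision and the recall of long method detection to some extent.

Keywords: long method; code smell; cost-sensitive; cognitive complexity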

Long Method Detection Based on Cost-sensitive Integrated Classifier
LIU Li-qian and DONG Dong. Long Method Detection Based on Cost-sensitive Integrated Classifier[J]. Computer Science, 2018, 45(Z11): 497-500.
Authors: LIU Li-qian and DONG Dong
Affiliation: College of Mathematics and Information Science, Hebei Normal University, Shijiazhuang 050024, China
Abstract: Long Method is a software design problem in which a method has grown so long that it requires refactoring. To improve the detection rate of traditional machine learning approaches on long methods, a cost-sensitive integrated (ensemble) classifier algorithm is proposed to address the class imbalance of code smell data. Based on the traditional decision tree algorithm, an under-sampling strategy is used to resample the data and generate multiple balanced subsets. Each subset is used to train a base classifier of the same type, and the base classifiers are then combined into an integrated classifier. Finally, a misclassification cost determined by cognitive complexity is introduced into the integrated classifier, which biases it toward correctly classifying the minority class. Compared with traditional machine learning algorithms, this method improves both precision and recall in long method detection.
Keywords: Long method; Code smell; Cost-sensitive; Cognitive complexity
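The pipeline the abstract describes (under-sample into balanced subsets, train one base classifier per subset, combine them, then apply a misclassification cost) can be illustrated with a minimal sketch. This is not the authors' implementation. Several parts are assumptions made for illustration only: a depth-1 decision tree (stump) stands in for the paper's decision tree, the two features (lines of code and a parameter count) are hypothetical, the ensemble's vote share is treated as a rough probability, and `miss_cost_from_complexity` is an invented mapping, since the abstract does not say how cognitive complexity is turned into a cost.

```python
import random
from collections import Counter

def train_stump(samples):
    """Fit a depth-1 decision tree on (features, label) pairs with 0/1 labels."""
    best = None  # (errors, feature index, threshold, left label, right label)
    for j in range(len(samples[0][0])):
        for t in sorted({x[j] for x, _ in samples}):
            left = [y for x, y in samples if x[j] <= t]
            right = [y for x, y in samples if x[j] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    _, j, t, l_lab, r_lab = best
    return lambda x: l_lab if x[j] <= t else r_lab

def balanced_subsets(samples, n_subsets, rng):
    """Under-sample the majority class (label 0) to the minority class size."""
    minority = [s for s in samples if s[1] == 1]
    majority = [s for s in samples if s[1] == 0]
    return [minority + rng.sample(majority, len(minority)) for _ in range(n_subsets)]

def train_ensemble(samples, n_subsets=5, seed=0):
    """Train one base classifier of the same type on each balanced subset."""
    rng = random.Random(seed)
    return [train_stump(subset) for subset in balanced_subsets(samples, n_subsets, rng)]

def miss_cost_from_complexity(cognitive_complexity, scale=10.0):
    """Hypothetical mapping: missing a more complex long method costs more."""
    return 1.0 + cognitive_complexity / scale

def predict(ensemble, x, miss_cost, false_alarm_cost=1.0):
    """Cost-sensitive decision: choose the label with the lower expected cost."""
    p_long = sum(clf(x) for clf in ensemble) / len(ensemble)  # vote share for label 1
    cost_if_predict_normal = p_long * miss_cost
    cost_if_predict_long = (1 - p_long) * false_alarm_cost
    return 1 if cost_if_predict_long <= cost_if_predict_normal else 0

# Toy, made-up data: features are (lines of code, parameter count); label 1 = long method.
normal = [((loc, 1 + loc % 3), 0) for loc in (5, 8, 10, 12, 15, 18, 20, 22, 25, 30, 35, 40)]
long_methods = [((80, 4), 1), ((120, 5), 1), ((150, 6), 1)]
ensemble = train_ensemble(normal + long_methods)

print(predict(ensemble, (130, 3), miss_cost_from_complexity(25)))
print(predict(ensemble, (5, 1), miss_cost_from_complexity(2)))
```

In this sketch the cost only matters when the base classifiers disagree: with two of five votes for "long method", equal costs yield the majority answer, while a higher miss cost tips the decision toward the minority class, which is the bias the abstract describes.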