Similar Literature
1.
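In multi-label classifier chains, the order in which labels are predicted is random, which degrades learning performance and tends to propagate errors along the chain. Taking label order into account, the paper proposes a classifier chain algorithm based on multi-label importance ranking. The algorithm uses the degree of interaction between labels as the measure of label importance; building on label correlations, it ranks the labels by importance and uses the resulting ranking as the order of the classifiers in the chain, thereby resolving the problem of label prediction order. Experiments show that, compared with existing methods, the proposed algorithm classifies multi-label data more stably and effectively on several datasets.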

2.
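In the classifier chain method, determining the order in which labels are learned is crucial. To this end, a classifier chain method based on association rules and topological sequences (TSECC) is proposed. First, a label-dependency measurement strategy based on strong association rules is designed by combining frequent patterns. Next, a directed acyclic graph is built from the dependencies among labels, and all vertices in the graph are topologically sorted. Finally, the resulting topological sequence is used as the label learning order in the classifier chain, and the classifier for each label is updated iteratively in turn. In particular, to reduce the effect that "lonely" labels (those with no label dependency or only weak dependency) have on the prediction of the remaining labels, the "lonely" labels are placed outside the topological sequence and trained with a binary relevance model. Experiments on several public multi-label datasets show that TSECC effectively improves classification performance.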
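The ordering step described in this abstract can be illustrated with a small sketch. It is not the TSECC implementation: the function name `label_order` and the edge format are assumptions made for illustration, and the dependency edges are taken as given instead of being mined from association rules. Labels that appear in no edge are treated as the "lonely" labels and kept out of the chain.

```python
from collections import defaultdict, deque

def label_order(labels, dependencies):
    """Return (chain_order, lonely_labels).

    dependencies: iterable of (src, dst) pairs, meaning dst depends on src,
    so src should be learned first. The graph is expected to be acyclic.
    (Hypothetical helper; the edge format is an assumption.)
    """
    successors = defaultdict(list)
    indegree = {label: 0 for label in labels}
    connected = set()
    for src, dst in dependencies:
        successors[src].append(dst)
        indegree[dst] += 1
        connected.update((src, dst))

    # Labels with no dependency edge at all stay outside the chain.
    lonely = [l for l in labels if l not in connected]

    # Kahn's algorithm over the labels that take part in some dependency.
    queue = deque(l for l in labels if l in connected and indegree[l] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    if len(order) != len(connected):
        raise ValueError("dependency graph contains a cycle")
    return order, lonely
```

In this sketch the chain classifiers would be trained in `chain_order`, while each label in `lonely_labels` would get an independent binary classifier.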

3.
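The presence of code smells has a strongly negative effect on the understandability and maintainability of source code. Researchers usually consider the effect of a single code smell on source code, but studies have shown that the co-existence of multiple code smells and the interactions among them often have a more severe negative impact than any single smell. Because principal component analysis tends to yield hard-to-interpret components when applied to Boolean variables, this paper proposes a code smell correlation analysis method based on principal axis factoring and smell severity. Experiments on 14 code smells across 92 systems extracted 6 factors and revealed a new code smell pair, {Extensive Coupling, Long Parameter List}, in the correlation matrix. Finally, the paper compares and analyzes the advantages of a code smell dataset with severity labels, explains the meaning of each factor, and classifies and names the factors.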

4.
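Multiple kinds of data faults can occur simultaneously in sensor networks. To detect them at the same time, the detection of data faults in sensor networks is modeled with a multi-label classification model. To improve the detection performance of multi-label classifiers, a feature selection algorithm based on multi-label ReliefF and a genetic algorithm is proposed. The method extends ReliefF into a multi-label ReliefF that can evaluate feature subsets; feature selection first uses the genetic algorithm to search for feature subsets and then uses multi-label ReliefF to evaluate them. Experiments on three multi-label classifiers show that the proposed algorithm significantly improves their performance in detecting sensor network data faults.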

5.
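To improve refactoring efficiency, a method is proposed that ranks smelly classes by a code line change index. Previous research on code smells has focused mainly on static lines-of-code metrics, but the number of lines of code changes dynamically throughout project development, and the larger the scale of change in a class, the more likely it is to contain smells. By analyzing how lines of code change over the whole development process, the method proposes a code line change index to rank smelly classes by their scale of change, so that refactoring concentrates on the smelly classes that change the most and refactoring cost is reduced. Comparative experiments show that refactoring smelly classes in this order reduces the residual smell rate and improves refactoring efficiency.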
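The abstract does not give the formula for the change index, so the following is only a sketch under a stated assumption: the index of a class is taken to be the sum of absolute line-count differences between consecutive versions. Both function names are hypothetical.

```python
def change_index(loc_history):
    """Sum of absolute line-count changes between consecutive versions.

    This definition is an assumption; the paper's actual index may differ.
    """
    return sum(abs(b - a) for a, b in zip(loc_history, loc_history[1:]))

def rank_smelly_classes(histories):
    """Sort class names by descending change index (ties broken by name)."""
    return sorted(histories, key=lambda name: (-change_index(histories[name]), name))
```

Given a mapping from each smelly class to its per-version line counts, `rank_smelly_classes` puts the classes with the most churn first, which matches the idea of refactoring the most-changed smelly classes before the others.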

6.
艾成豪, 高建华, 黄子杰. 《计算机工程》 (Computer Engineering), 2022, 48(7): 168-176+198
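Code smells are software characteristics that violate basic design principles or coding conventions; their presence in source code raises the cost and difficulty of maintenance. Among code smell detection methods, machine learning achieves better performance than other approaches. To address the "curse of dimensionality" that training on a large number of features may cause, and the poor generalization of a single model, a code smell detection method driven by hybrid feature selection and ensemble learning is proposed. The weights of all features are computed with ReliefF, XGBoost feature importance, and the Pearson correlation coefficient and then fused; irrelevant features with low fused weights are removed to obtain a feature subset. A two-layer Stacking ensemble model is built: the base classifiers in the first layer are three different tree models, and the second layer uses logistic regression as the meta-classifier, so that the model combines the strengths of diverse models to improve generalization. The feature subset is fed into the Stacking model to perform code smell classification and detection. Experimental results show that the method reduces the feature dimension and, compared with the best base classifier in the first layer of the Stacking model, improves F-measure and G-mean by up to 1.46% and 0.87%, respectively.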
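The weight-fusion step can be sketched as follows. The abstract does not say how the three weight vectors are fused or how the cut-off is chosen, so this example assumes min-max normalization followed by a plain average, and keeps a fixed number of top features. The function names and the `keep` parameter are hypothetical, and the three input lists stand in for precomputed ReliefF weights, XGBoost importances, and absolute Pearson coefficients.

```python
def min_max(values):
    """Scale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def fuse_and_select(feature_names, weight_lists, keep):
    """Average min-max-normalized weights and keep the top `keep` features.

    weight_lists: one list of per-feature weights per scoring method,
    each aligned with feature_names. (Fusion rule is an assumption.)
    """
    normalized = [min_max(w) for w in weight_lists]
    fused = [sum(col) / len(col) for col in zip(*normalized)]
    ranked = sorted(zip(feature_names, fused), key=lambda p: p[1], reverse=True)
    return [name for name, _ in ranked[:keep]]
```

Normalizing first matters because the three scores live on different scales; without it the method with the largest raw values would dominate the average.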

7.
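ECC-MIMLSVM+ is an algorithm in the multi-instance multi-label learning framework. It proposes a classifier-chain-based method, but it does not fully consider the dependencies among labels, and as the number of labels grows the sub-classifier chains become longer, making error propagation more pronounced. To address this, an improved algorithm is proposed that combines ECC-MIMLSVM+ with label dependencies, yielding an ensemble of classifier chains based on label dependency (ELDCT-MIMLSVM+) that strengthens the information links among labels, avoids information loss, and improves classification accuracy. Experiments comparing the proposed algorithm with other algorithms show that it achieves good results.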

8.
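Because the category labels predicted for text intelligence in the pan-entertainment domain form a directed acyclic graph (DAG), a hierarchical multi-label classification method based on optimal paths is proposed that takes the label hierarchy into account. A DAG is built from the existing labels and converted into a tree, which is easier to handle. A local strategy trains a base classifier for each node of the tree, and each node is assigned a contribution value that combines the classifier's output probability with a level weight; the node is set to 1 when its contribution value exceeds a threshold and to 0 otherwise. A depth-first traversal of the tree generates paths, a score is computed for each path, and the highest-scoring path that satisfies the hierarchy constraint is selected as the final prediction set. Four groups of experiments on a public pan-entertainment text dataset show that the method classifies better than classifier chains, binary relevance, SVM multi-label classification, and MLKNN.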
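A rough sketch of the path-selection step follows. Several details are assumptions, since the abstract does not specify them: the contribution is taken as probability times level weight, the path score as the sum of contributions, and the hierarchy constraint as "a node may be predicted only if every ancestor on the path is set to 1". The function `best_path` and its parameters are hypothetical.

```python
def best_path(children, prob, root, level_weight, threshold):
    """Return (path, score) for the highest-scoring path whose nodes are all active.

    children: dict mapping a node to its list of child nodes.
    prob: dict mapping a node to its base classifier's output probability.
    level_weight: function from depth (1 = child of root) to a weight.
    """
    best = ([], 0.0)
    # Iterative depth-first traversal; each item is (node, depth, path, score).
    stack = [(child, 1, [], 0.0) for child in children.get(root, [])]
    while stack:
        node, depth, path, score = stack.pop()
        contribution = prob[node] * level_weight(depth)
        if contribution <= threshold:
            continue  # node is set to 0, so nothing below it may be predicted
        path, score = path + [node], score + contribution
        if score > best[1]:
            best = (path, score)
        stack.extend((c, depth + 1, path, score) for c in children.get(node, []))
    return best
```

Because the score here is a sum, longer active paths tend to win; a mean or product would behave differently, and which one the paper uses is not stated in the abstract.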

9.
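To address the problems that some multi-label text classification algorithms ignore text-term correlation and have limited accuracy, an ensemble multi-label text classification method combining rotation forest and AdaBoost classifiers is proposed. First, the rotation forest algorithm partitions the sample set, and a feature transformation maps each sample subset to a new feature space, producing several new, highly diverse sample subsets. Then, based on the AdaBoost algorithm, multiple AdaBoost base classifiers are built on the sample subsets over several iterations. Finally, the decisions of the base classifiers are fused by probability averaging to make the final label prediction. Experiments on four benchmark datasets show that the method performs well in terms of average precision, coverage, ranking loss, Hamming loss, and one-error.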
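The final fusion step, probability averaging, is simple enough to show directly. This sketch covers only that step and assumes each base classifier has already produced one probability per label; the function name and the 0.5 decision threshold are assumptions, not values taken from the paper.

```python
def average_probabilities(per_classifier_probs, threshold=0.5):
    """Average label probabilities across base classifiers and threshold them.

    per_classifier_probs: one list of per-label probabilities per classifier.
    Returns (predicted_labels, mean_probabilities).
    """
    n = len(per_classifier_probs)
    mean = [sum(col) / n for col in zip(*per_classifier_probs)]
    return [1 if p >= threshold else 0 for p in mean], mean
```

For three classifiers that output `[0.9, 0.2, 0.6]`, `[0.7, 0.4, 0.3]`, and `[0.8, 0.1, 0.5]`, only the first label has a mean probability above 0.5 and is predicted as relevant.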

10.
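Compared with a single type of code smell, co-occurring code smells are more harmful. Most existing empirical studies analyze code smell co-occurrence in desktop applications, and studies of co-occurrence in Android applications are lacking. To study code smell co-occurrence in Android applications and compare it with that in desktop applications, 285 Android applications and 30 desktop applications were examined, and the 10 kinds of smells detected were analyzed. First, the percentage of classes affected by multiple smells is computed from the detection results. Then, a formula is used to compute the frequency of code smell co-occurrence. Finally, the Spearman correlation coefficient is used to analyze the relationship between code smell co-occurrence and application size. The conclusions are: a) in Android applications, classes affected by more than one code smell account for 31.04% of all smelly classes; b) on both platforms, two pairs of code smells, brain class with brain method and god class with brain method, co-occur frequently; c) the occurrence of one smell, the co-occurrence of two smells, and the co-occurrence of three smells each correlate strongly with the size of Android applications.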
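Two of the quantities in this abstract can be sketched in plain Python: the share of smelly classes affected by more than one smell, and the Spearman correlation (computed here as the Pearson correlation of average ranks). The function names and the input layout are assumptions; the paper's co-occurrence frequency formula is not given in the abstract and is not reproduced.

```python
def multi_smell_share(smells_per_class):
    """Percentage of smelly classes that are affected by more than one smell."""
    smelly = [s for s in smells_per_class.values() if s]
    multi = [s for s in smelly if len(s) > 1]
    return 100.0 * len(multi) / len(smelly) if smelly else 0.0

def average_ranks(values):
    """1-based ranks, with tied values given the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho; undefined (division by zero) if either input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

In a study like this one, `x` could hold application sizes and `y` the number of classes with co-occurring smells per application.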

11.

Code smell detection is essential for improving software quality, enhancing software maintainability, and decreasing the risk of faults and failures in a software system. In this paper, we propose a code smell prediction approach based on machine learning techniques and software metrics. The local interpretable model-agnostic explanations (LIME) algorithm is further used to explain the machine learning model's predictions and make them interpretable. The datasets obtained from Fontana et al. were reworked and used to build binary-label and multi-label datasets. The results of 10-fold cross-validation show that tree-based algorithms (mainly Random Forest) perform better than kernel-based and network-based algorithms. Feature selection methods based on a genetic algorithm enhance the accuracy of these machine learning algorithms by selecting the most relevant features in each dataset. Moreover, parameter optimization based on the grid search algorithm significantly enhances the accuracy of all these algorithms. Finally, machine learning techniques have high potential for predicting code smells, which helps detect these smells and improve software quality.


12.
张杨, 东春浩, 刘辉, 葛楚妍. 《软件学报》 (Journal of Software), 2022, 33(5): 1551-1568
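Existing code smell detection methods rely only on code structure information and heuristic rules, pay too little attention to the semantic information embedded in code at different levels, and leave room for improvement in accuracy. To address this, DeepSmell, a code smell detection method based on a pre-trained model and multi-level information, is proposed. First, a static analysis tool extracts code smell instances and multi-level code metric information from programs, and ...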

13.
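Existing multi-label learning algorithms often focus only on the forward projection from the instance space to the label space, and the loss of instance-space information caused by dimensionality reduction during forward projection is often overlooked. To address this, a multi-label classification algorithm based on bidirectional mapping learning is proposed. First, a linear multi-label classification model is built from the forward mapping loss from the instance space to the label space. Then, a reconstruction-loss regularization term is introduced to form a bidirectional mapping model that compensates for the loss of discriminative information during forward mapping. Finally, the bidirectional mapping model is combined with label correlation and instance correlation to fully exploit the latent relationships among labels and among instances, and a nonlinear kernel mapping improves the model's ability to handle nonlinear data. Experimental results show that, compared with several other recent methods, the method improves performance on Hamming loss, one-error, and ranking loss by 17.68%, 17.01%, and 18.57% on average, and improves performance on six evaluation metrics by 12.37% on average, which verifies the effectiveness of the model.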

14.
Code smells are a popular mechanism for finding structural design problems in software systems. Consequently, several tools have emerged to support the detection of code smells. However, the number of smells returned by current tools usually exceeds the number of problems that the developer can deal with, particularly when the effort available for performing refactorings is limited. Moreover, not all code smells are equally relevant to the goals of the system or its health. This article presents a semi-automated approach that helps developers focus on the most critical problems of the system. We have developed a tool that suggests a ranking of code smells, based on a combination of three criteria, namely: past component modifications, important modifiability scenarios for the system, and relevance of the kind of smell. These criteria are complementary and enable our approach to assess the smells from different perspectives. Our approach has been evaluated in two case studies, and the results show that the suggested code smells are useful to developers.
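As an illustration of ranking by a combination of three criteria, the sketch below uses a weighted sum. The abstract does not state how the criteria are combined, so the weighted sum, the equal default weights, the field names `history`, `scenario`, and `kind`, and the assumption that each score lies in [0, 1] are all hypothetical.

```python
def rank_smells(smells, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Sort smell records by a weighted sum of three scores, highest first.

    Each record is a dict with the (assumed) keys:
      history  - score from past component modifications
      scenario - score from important modifiability scenarios
      kind     - relevance of the kind of smell
    """
    w_hist, w_scen, w_kind = weights

    def score(s):
        return w_hist * s["history"] + w_scen * s["scenario"] + w_kind * s["kind"]

    return sorted(smells, key=score, reverse=True)
```

Changing the weights lets a team emphasize, for example, change history over smell kind without touching the ranking code.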

15.
苏珊, 张杨, 张冬雯. 《计算机应用》 (Journal of Computer Applications), 2022, 42(6): 1702-1707
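Code smell detection methods based on heuristics and machine learning have been shown to have certain limitations, and most existing detection methods concentrate on the more common code smells. To address these problems, a deep learning method is proposed to detect Intensive Coupling, Dispersed Coupling, and Shotgun Surgery, three coupling-related code smells that are less commonly targeted by detection. First, the metrics needed for the three smells are extracted and the resulting data are processed. Then, a deep learning model is built that combines a convolutional neural network (CNN) with an attention mechanism; the attention mechanism assigns weights to the input metric features. The dataset was extracted from 21 open-source projects, and the detection method was validated on 10 open-source projects and compared with a CNN model. The results show that Intensive Coupling and Dispersed Coupling obtain better results with the proposed model, with precision of 93.61% and 99.76%, respectively, whereas Shotgun Surgery obtains better results with the CNN model, with precision of 98.59%.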

16.
Context: Code smells are indicators of poor coding and design choices that can cause problems during software maintenance and evolution. Objective: This study investigates in detail the extent to which problems in maintenance projects can be predicted by the detection of currently known code smells. Method: A multiple case study was conducted, in which the problems faced by six developers working on four different Java systems were registered on a daily basis, for a period of up to four weeks. Where applicable, the files associated with the problems were registered. Code smells were detected in the pre-maintenance version of the systems, using the tools Borland Together and InCode. In-depth examination of quantitative and qualitative data was conducted to determine whether the observed problems could be explained by the detected smells. Results: Of the total set of problems, roughly 30% were related to files containing code smells. In addition, interaction effects were observed amongst code smells, and between code smells and other code characteristics, and these effects led to severe problems during maintenance. Code smell interactions were observed between collocated smells (i.e., in the same file), and between coupled smells (i.e., spread over multiple files that were coupled). Conclusions: The role of code smells in overall system maintainability is relatively minor, so complementary approaches are needed to achieve more comprehensive assessments of maintainability. Moreover, to improve the explanatory power of code smells, interaction effects amongst collocated smells and coupled smells should be taken into account during analysis.

17.
A design pattern is a general reusable solution to commonly recurring problems in software projects. Bad smells are symptoms in the source code that possibly indicate the presence of a structural problem that requires refactoring. Although design patterns and bad smells are different concepts, the literature has shown that they may be related and co-occur during the evolution of a software system. This paper presents an empirical study that investigates co-occurrences of design patterns and bad smells and identifies the main factors that contribute to the emergence of the relationship between them. We carried out a case study with five Java systems to: (1) investigate whether the use of design patterns reduces bad smell occurrence, (2) identify co-occurrences of design patterns and bad smells, and (3) identify situations that contribute to the emergence of co-occurrence. As the main result, we found that applying design patterns does not necessarily avoid bad smell occurrences. The results also show that some design patterns, such as composite, factory method, and singleton, are intrinsically modular and might be useful in creating high-quality systems. However, other design patterns, such as adapter-command, proxy, and state-strategy, showed a high frequency of co-occurrence with bad smells; therefore, they require attention in their implementation. Finally, via manual inspection of the components with co-occurrence, we found that the identified co-occurrences appeared due to poor planning and inadequate application of design patterns.

18.
Several code smell detection tools have been developed that provide different results, because smells can be subjectively interpreted, and hence detected, in different ways. In this paper, we perform what is, to the best of our knowledge, the largest experiment applying machine learning algorithms to code smells. We experiment with 16 different machine learning algorithms on four code smells (Data Class, Large Class, Feature Envy, Long Method) and 74 software systems, with 1986 manually validated code smell samples. We found that all algorithms achieved high performance on the cross-validation data set, yet the highest performance was obtained by J48 and Random Forest, while the worst performance was achieved by support vector machines. However, the lower prevalence of code smells, i.e., imbalanced data, in the entire data set caused varying performance that needs to be addressed in future studies. We conclude that the application of machine learning to the detection of these code smells can provide high accuracy (>96%), and only a hundred training examples are needed to reach at least 95% accuracy.
