Relief Feature Selection Algorithm Based on Label Correlation
Original title: 一种基于标签相关度的Relief特征选择算法
Cite as: DING Si-fan, WANG Feng, WEI Wei. Relief Feature Selection Algorithm Based on Label Correlation[J]. Computer Science (计算机科学), 2021, 48(4): 91-96.
Authors: DING Si-fan (丁思凡), WANG Feng (王锋), WEI Wei (魏巍)
Affiliation: School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
Funding: National Natural Science Foundation of China; Applied Basic Research Program of Shanxi Province
Abstract: Feature selection plays a vital role in machine learning and data mining. Relief, an efficient filter-type feature selection algorithm, is widely used because it handles multiple types of data and tolerates noise well. However, the classic Relief algorithm evaluates discrete features in a relatively simple way and, in practice, does not fully exploit the latent relationship between features and class labels, which leaves considerable room for improvement. To address this shortcoming, a discrete-feature evaluation method based on label correlation is proposed. The algorithm takes the characteristics of different feature types into account and gives a distance measure for mixed features. It also redefines how Relief evaluates discrete features, starting from the correlation between discrete features and labels. Experimental results show that, compared with the classic Relief algorithm and several existing feature selection algorithms for mixed data, the improved Relief algorithm achieves higher classification accuracy to varying degrees.

Keywords: Feature selection; Relief; Label correlation; VDM; Decision tree
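
The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It runs a classic Relief weight update over mixed data. Numeric features use a range-normalized absolute difference. Discrete features use a VDM-style difference, taken here as half the L1 distance between the class-conditional distributions P(class | value), so that two discrete values count as different only to the extent that they relate differently to the label. All function names, the scaling to [0, 1], and the toy data are assumptions made for illustration.

```python
import numpy as np


def numeric_diff_matrix(col):
    """Pairwise |a - b| scaled by the feature's range, so values lie in [0, 1]."""
    span = col.max() - col.min()
    if span == 0:
        return np.zeros((len(col), len(col)))
    return np.abs(col[:, None] - col[None, :]) / span


def vdm_diff_matrix(col, y):
    """Pairwise VDM-style difference for one discrete column (assumed form).

    Each value is mapped to its vector P(class | value); the difference between
    two values is half the L1 distance between those vectors (range [0, 1]).
    """
    classes = np.unique(y)
    values, codes = np.unique(col, return_inverse=True)
    probs = np.array(
        [[np.mean(y[codes == k] == c) for c in classes] for k in range(len(values))]
    )
    p = probs[codes]  # shape (n_samples, n_classes)
    return 0.5 * np.abs(p[:, None, :] - p[None, :, :]).sum(axis=2)


def relief_mixed(X_num, X_cat, y, n_samples=None, seed=0):
    """Classic Relief update with a mixed distance (hypothetical helper).

    Returns one weight per feature, numeric columns first, then discrete ones.
    Assumes every class has at least two instances.
    """
    n = len(y)
    diffs = [numeric_diff_matrix(X_num[:, j].astype(float)) for j in range(X_num.shape[1])]
    diffs += [vdm_diff_matrix(X_cat[:, j], y) for j in range(X_cat.shape[1])]
    diffs = np.stack(diffs)  # shape (n_features, n, n)

    dist = diffs.sum(axis=0)  # mixed distance: sum of per-feature differences
    np.fill_diagonal(dist, np.inf)  # an instance is never its own neighbour

    rng = np.random.default_rng(seed)
    m = n if n_samples is None else n_samples
    sampled = rng.choice(n, size=m, replace=False)

    w = np.zeros(diffs.shape[0])
    for i in sampled:
        same = y == y[i]
        hit = np.argmin(np.where(same, dist[i], np.inf))    # nearest same-class instance
        miss = np.argmin(np.where(~same, dist[i], np.inf))  # nearest other-class instance
        w += (diffs[:, i, miss] - diffs[:, i, hit]) / m
    return w


# Toy data (made up): one numeric column, one label-aligned discrete column,
# and one discrete column that is independent of the label.
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
X_num = np.array([[0.10], [0.40], [0.20], [0.30], [0.15], [0.35], [0.25], [0.45]])
X_cat = np.array([[0, 0], [0, 1], [0, 0], [0, 1], [1, 0], [1, 1], [1, 0], [1, 1]])

weights = relief_mixed(X_num, X_cat, y)
print(dict(zip(["numeric", "cat_informative", "cat_noise"], weights.round(3))))
```

In this toy setup the label-independent discrete column has identical class distributions for both of its values, so its VDM-style difference is zero everywhere and it earns no weight, whereas a plain 0/1 mismatch would still register differences for it. That is the intuition behind scoring discrete features through their relationship to the label. The paper's actual evaluation formula may differ.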
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).