Instance Selection Based on Redundant Instance Pair Elimination Algorithm
Cite this article:LIU Lu, GAO Qiang, LIU Yan-heng, SUN Xin. Instance Selection Based on Redundant Instance Pair Elimination Algorithm[J]. Computer Engineering, 2014, 0(1): 177-180
Authors:LIU Lu  GAO Qiang  LIU Yan-heng  SUN Xin
Affiliation:College of Computer Science and Technology, Jilin University, Changchun 130012, China
Funding:National Natural Science Foundation of China (Grant No. 60973136)
Abstract:Instance selection can effectively remove noise and redundant data, but existing methods struggle to improve generalization ability while also reducing the dataset. To address this problem, this paper proposes the Redundant Instance Pair Elimination (RIPE) algorithm for instance selection. It defines the concept of a nearest similar instance pair (the nearest pair of same-class instances), computes the nearest similar instance pairs present in a dataset, and removes the instances that satisfy the removal condition. Simulation experiments on 11 different datasets show that datasets processed by the algorithm clearly improve on the original sample sets in both classification accuracy and storage compression ratio. Compared with the Edited Nearest Neighbor rule (ENN) algorithm, RIPE maintains classification accuracy while raising the average storage compression ratio by more than 35%, fully preserves the data distribution of the original sample set, and achieves a trade-off between classification accuracy and storage compression ratio.

Keywords:instance selection  nearest similar instance pair  k-nearest neighbor  Edited Nearest Neighbor rule (ENN) algorithm  data reduction  machine learning
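
The abstract names three ingredients without giving their formulas: nearest similar (same-class) instance pairs, the ENN baseline, and the storage compression ratio. The Python sketch below illustrates them under loudly stated assumptions and is not the authors' RIPE implementation. It assumes that a pair consists of two same-class instances that are each other's nearest neighbor, that the removal condition is simply "drop the second member of each pair", and that the compression ratio is the fraction of instances removed. All function names and the toy data are hypothetical. Only `enn` follows a well-known rule (Wilson's editing: discard an instance whose label disagrees with the majority of its k nearest neighbors).

```python
import math
from collections import Counter


def nearest_neighbor(i, X):
    """Index of the instance closest to X[i], excluding i itself."""
    return min((j for j in range(len(X)) if j != i),
               key=lambda j: math.dist(X[i], X[j]))


def nearest_same_class_pairs(X, y):
    """ASSUMED definition: pairs (i, j), i < j, that are each other's
    nearest neighbor and carry the same class label."""
    nn = [nearest_neighbor(i, X) for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] == y[j]]


def eliminate_pairs(X, y):
    """HYPOTHETICAL removal rule: drop the second member of every pair.
    Returns the indices of the instances that are kept."""
    dropped = {j for _, j in nearest_same_class_pairs(X, y)}
    return [i for i in range(len(X)) if i not in dropped]


def enn(X, y, k=3):
    """Wilson's ENN: keep an instance only if its label matches the
    majority label of its k nearest neighbors."""
    kept = []
    for i in range(len(X)):
        neighbors = sorted((j for j in range(len(X)) if j != i),
                           key=lambda j: math.dist(X[i], X[j]))[:k]
        majority = Counter(y[j] for j in neighbors).most_common(1)[0][0]
        if majority == y[i]:
            kept.append(i)
    return kept


def compression_ratio(n_original, n_kept):
    """ASSUMED metric: fraction of the original instances removed."""
    return 1 - n_kept / n_original


# Toy data: two clusters plus one "b" instance (index 6) inside the "a" cluster.
X_demo = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (5.0, 5.0), (5.0, 6.0), (6.5, 5.0), (0.4, 0.4)]
y_demo = ["a", "a", "a", "b", "b", "b", "b"]

print(nearest_same_class_pairs(X_demo, y_demo))
print(eliminate_pairs(X_demo, y_demo))
print(enn(X_demo, y_demo, k=3))
```

On this toy set the two rules target different instances: the assumed pair rule thins out a tightly packed same-class pair, while ENN discards the instance whose label conflicts with its neighborhood. The brute-force neighbor search is quadratic in the number of instances and is meant only to keep the sketch self-contained.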
This article is indexed in CNKI, VIP (维普), and other databases.