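A Study of Instance Selection Algorithms Based on Euclidean Distance (基于欧式距离的实例选择算法研究)
Cite this article: HAN Guang-hui (韩光辉). A Study of Instance Selection Algorithms Based on Euclidean Distance [J]. Journal of Shanghai Second Polytechnic University (上海第二工业大学学报), 2010, 27(3): 188-196.
Author: HAN Guang-hui (韩光辉)
Affiliation: College of Mathematics and Computer Science, Hebei University, Baoding 071001, Hebei, China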
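Abstract: Nearest-neighbor classification must store every instance of the training set when the classifier is trained. As a result, a program needs large amounts of storage space and running time. This paper proposes two new instance selection algorithms: the iterative class instance selection algorithm (ISCC, also written ICIS) and the iterative instance selection algorithm based on same-class and different-class instances (IISDC). Each algorithm defines a classification-ability evaluation function that measures the classification ability of every instance; instances with strong classification ability are selected and instances with weak classification ability are deleted. Analysis shows that the time complexity of both algorithms is O(n²). Experiments on real-world databases show that ISCC and IISDC outperform classical algorithms such as FCNN, ICF, and ENN in compression ratio and classification accuracy.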

Keywords: instance selection  noise  nearest-neighbor method  ICIS  IISDC  ENN  FCNN  ICF
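
The abstract does not give the evaluation functions used by the two proposed algorithms, so they are not reproduced here. As a point of reference, the following is a minimal sketch of ENN (Wilson's Edited Nearest Neighbor rule), one of the classical baselines named above, using Euclidean distance. The function names `euclidean` and `enn_select`, the choice `k=3`, and the toy data are assumptions made for illustration only; they do not come from the paper.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def enn_select(points, labels, k=3):
    """Return the indices kept by ENN (Wilson editing).

    An instance is kept only if its label matches the majority label
    of its k nearest neighbors (the instance itself is excluded).
    """
    n = len(points)
    keep = []
    for i in range(n):
        # Distances from instance i to every other instance: O(n) per
        # instance, so O(n^2) distance evaluations overall (the sort
        # used here adds a log factor on top of that).
        neighbors = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: euclidean(points[i], points[j]),
        )[:k]
        votes = Counter(labels[j] for j in neighbors)
        majority_label = votes.most_common(1)[0][0]
        if majority_label == labels[i]:
            keep.append(i)
    return keep


# Toy data (hypothetical): two clusters, plus one mislabeled point
# sitting in the middle of cluster "a".
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (5, 5), (5, 6), (6, 5), (6, 6),
          (0.5, 0.5)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]

kept = enn_select(points, labels, k=3)
retention = len(kept) / len(points)  # fraction of instances retained
print(kept, retention)
```

On this toy set the mislabeled point at `(0.5, 0.5)` is the only instance whose neighbors disagree with its label, so it is the only one ENN discards. ENN is a noise filter and therefore compresses very little, which is why condensation methods are usually compared on compression ratio as well as accuracy.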

A Study of Instance Selection Algorithm Based on Euclidean Distance
HAN Guang-hui. A Study of Instance Selection Algorithm Based on Euclidean Distance [J]. Journal of Shanghai Second Polytechnic University, 2010, 27(3): 188-196.
Author: HAN Guang-hui
Affiliation: College of Mathematics & Computer Science, Hebei University, Baoding 071001, Hebei Province, P.R. China
Abstract: Basic nearest-neighbor classifiers share a common problem: all instances used to train the classifier are stored indiscriminately, so the required memory is large and response time is slow. This paper presents two new instance selection algorithms based on classification contribution functions, named ISCC and IISDC. Two functions are introduced to evaluate the classification ability of each instance, and the instance with the highest classification contribution value is added to the condensed subset. The time complexity of both ISCC and IISDC is O(n²). Compared with traditional methods such as FCNN, ICF, and ENN, the condensed sets obtained by ISCC and IISDC are superior in storage rate and classification accuracy.
Keywords:ICIS  IISDC  ENN  FCNN  ICF
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.