
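An Incremental Learning Vector Quantization Algorithm Based on Sample Density and Classification Error Rate
Cite this article: LI Juan, WANG Yu-Ping. An incremental learning vector quantization algorithm based on sample density and classification error rate. Acta Automatica Sinica, 2015, 41(6): 1187-1200. doi: 10.16383/j.aas.2015.c140311
Authors: LI Juan  WANG Yu-Ping
Affiliation: 1. School of Computer Science and Technology, Xidian University, Xi'an 710071; 2. School of Distance Education, Shaanxi Normal University, Xi'an 710062
Funding: Supported by the National Natural Science Foundation of China (61203372, 61472297)
Abstract: As a simple and mature classification method, the K nearest neighbor (KNN) algorithm is widely used in data mining, pattern recognition, and related fields, but it still suffers from heavy computation, high memory consumption, and long running time. To address these problems, this paper builds on the single-layer competitive learning of incremental learning vector quantization (ILVQ), incorporates a neighborhood concept based on sample density and classification error rate, and proposes a new incremental learning vector quantization method. A competitive learning strategy adaptively inserts, deletes, merges, and splits the neighborhoods of representative points, so that a prototype set of the original dataset is obtained quickly and large-scale data are highly compressed while classification accuracy is preserved. In addition, the traditional nearest neighbor classification algorithm is improved by bringing the sample density and classification error rate of the prototype neighbor set into the nearest neighbor decision criterion. The proposed algorithm quickly generates an effective representative prototype set in a single scan of the training set and has good generality. Experimental results show that, compared with other algorithms, the method maintains or even improves classification accuracy and compression ratio, and also classifies quickly.

Keywords: learning vector quantization   incremental learning   classification error rate   sample density   merge   split
Received: 2014-05-08
Revised: 2014-09-27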

An Incremental Learning Vector Quantization Algorithm Based on Pattern Density and Classification Error Ratio
LI Juan, WANG Yu-Ping. An Incremental Learning Vector Quantization Algorithm Based on Pattern Density and Classification Error Ratio. ACTA AUTOMATICA SINICA, 2015, 41(6): 1187-1200. doi: 10.16383/j.aas.2015.c140311
Authors: LI Juan  WANG Yu-Ping
Affiliation: 1. School of Computer Science and Technology, Xidian University, Xi'an 710071; 2. School of Distance Education, Shaanxi Normal University, Xi'an 710062
Abstract: As a simple and mature classification method, the K nearest neighbor (KNN) algorithm has been widely applied in fields such as data mining and pattern recognition. However, on large datasets it faces serious challenges: a heavy computational load, high memory consumption, and long running time. To address these problems, we build on the single-layer competitive learning of the incremental learning vector quantization (ILVQ) network and propose a new incremental learning vector quantization method that combines pattern density with classification error rate. Using a series of new competitive learning strategies, the proposed method quickly obtains an incremental prototype set from the original training set by adaptively learning, inserting, merging, splitting, and deleting representative points. For large-scale datasets it achieves a higher reduction efficiency while still guaranteeing high classification accuracy. In addition, we improve the classical nearest neighbor classification algorithm by incorporating the pattern density and classification error ratio of the final prototype neighborhood set into the classification decision criteria. The proposed method generates an effective representative prototype set after a single-pass scan of the training dataset, and hence has strong generality. Experimental results show that, compared with its counterpart algorithms, the method not only maintains or even improves classification accuracy and reduction ratio, but also acquires prototypes and classifies quickly.
Keywords: Learning vector quantization  incremental learning  classification error ratio  pattern density  merge  split
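
The abstract describes two ideas that a small example may make more concrete: building a prototype set in a single scan of the training data, and letting each prototype's density and error rate influence the nearest neighbor decision. The Python sketch below is only an illustration under stated assumptions, not the authors' algorithm. The names `Prototype`, `learn_single_pass`, and `classify`, the `radius` and `rate` parameters, and the weighting formula are all hypothetical, and the merge, split, and delete operations mentioned in the abstract are omitted.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Prototype:
    vector: list      # position in feature space
    label: str        # class the prototype represents
    density: int = 1  # number of same-class samples absorbed so far
    errors: int = 0   # times it was nearest to a sample of another class

    @property
    def error_rate(self):
        # Assumed definition: share of wins that were misclassifications.
        return self.errors / (self.density + self.errors)


def learn_single_pass(samples, radius=1.0, rate=0.1):
    """Scan (vector, label) pairs once and return the prototype list."""
    prototypes = []
    for x, y in samples:
        winner = min(prototypes, key=lambda p: math.dist(p.vector, x), default=None)
        if winner is None or math.dist(winner.vector, x) > radius:
            # No prototype covers this region yet: insert a new one.
            prototypes.append(Prototype(list(x), y))
        elif winner.label == y:
            # Same class: pull the winner toward the sample, raise its density.
            winner.vector = [w + rate * (xi - w) for w, xi in zip(winner.vector, x)]
            winner.density += 1
        else:
            # Wrong class won: record the error and insert a prototype for y.
            winner.errors += 1
            prototypes.append(Prototype(list(x), y))
    return prototypes


def classify(prototypes, x, k=3):
    """Vote among the k nearest prototypes (assumed weighting, see lead-in)."""
    nearest = sorted(prototypes, key=lambda p: math.dist(p.vector, x))[:k]
    scores = defaultdict(float)
    for p in nearest:
        # Dense, reliable, nearby prototypes count more.
        weight = p.density * (1.0 - p.error_rate) / (1.0 + math.dist(p.vector, x))
        scores[p.label] += weight
    return max(scores, key=scores.get)
```

In this sketch, classification compares a query against the prototypes rather than the full training set, which is the source of the speed and memory savings the abstract attributes to prototype-based methods; how much is saved depends on how aggressively samples are absorbed, here controlled by the assumed `radius` threshold.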