
A clustering-based annular k-nearest neighbor algorithm
Cite this article: KUANG Zhen-xi, WU Ji-gang, LI Jia-xing. A clustering-based annular k-nearest neighbor algorithm[J]. Computer Engineering & Science, 2019, 41(5): 804-812.
Authors: KUANG Zhen-xi  WU Ji-gang  LI Jia-xing
Affiliation: School of Computers, Guangdong University of Technology, Guangzhou 510006, China
Funding: National Natural Science Foundation of China (61672171, 61702115, 61702114); Key Project of the Natural Science Foundation of Guangdong Province (2018B030311007)
Abstract: The traditional k-nearest neighbor (kNN) algorithm is widely used in data classification, but it performs many redundant distance computations, which makes it slow on high-dimensional data. Meanwhile, the classification algorithms based on landmark-based spectral clustering (LC-kNN and RC-kNN) miss some of the nearest neighbors of the current test point, which lowers their accuracy. To address these problems, we propose a clustering-based annular k-nearest neighbor algorithm (AkNN). Building on a clustering algorithm, the proposed algorithm first groups highly similar data points in the training set into clusters, then sets an annular filter centered on the current test point, and finally applies the kNN algorithm to the points inside the filter to perform the classification. The clustering algorithm can be chosen freely according to practical needs. The algorithm was tested on 6 public data sets from the UCI database. Experimental results show that the AkNN_E and AkNN_H algorithms reduce the amount of computation by 51% on average compared with the kNN algorithm, and improve accuracy by 3% on average compared with the LC-kNN and RC-kNN algorithms. In addition, the proposed algorithm remains effective on 10,000-dimensional data.

Keywords: annular filter  clustering  classification  neighboring centroid group  triangle inequality
Received: 2018-11-15
Revised: 2019-05-25
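
The abstract does not spell out how the annular filter is built, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes Euclidean distance and cluster centers supplied by any clustering algorithm. It uses the triangle inequality d(q, x) >= |d(q, c) - d(x, c)| to skip training points that cannot be among the k nearest. The points that survive this test in a cluster lie in a ring around that cluster's center, which is an assumption of this sketch and may differ from the filter centered on the test point that the paper describes. The names `dist`, `build_index` and `ring_knn` are hypothetical and do not come from the paper.

```python
import heapq
import math
from collections import Counter


def dist(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(points, labels, centers):
    """Assign each training point to its nearest center and cache d(x, c).

    The centers are assumed to come from any clustering algorithm.
    """
    clusters = [[] for _ in centers]
    for p, y in zip(points, labels):
        d_to_centers = [dist(p, c) for c in centers]
        j = min(range(len(centers)), key=d_to_centers.__getitem__)
        clusters[j].append((d_to_centers[j], p, y))
    return clusters


def ring_knn(query, centers, clusters, k):
    """Return (predicted_label, neighbors, n_distance_computations).

    neighbors is a list of (distance, label) pairs sorted by distance.
    """
    q_to_c = [dist(query, c) for c in centers]
    heap = []  # max-heap via negated distances: (-d, tiebreak, label)
    computed = 0
    # Visit the closest clusters first so the search radius shrinks early.
    for j in sorted(range(len(centers)), key=q_to_c.__getitem__):
        for d_xc, p, y in clusters[j]:
            # Triangle inequality: d(q, x) >= |d(q, c) - d(x, c)|.
            lower = abs(q_to_c[j] - d_xc)
            if len(heap) == k and lower >= -heap[0][0]:
                continue  # outside the ring: skip without computing d(q, x)
            d = dist(query, p)
            computed += 1
            if len(heap) < k:
                heapq.heappush(heap, (-d, computed, y))
            elif d < -heap[0][0]:
                heapq.heapreplace(heap, (-d, computed, y))
    neighbors = sorted((-nd, y) for nd, _, y in heap)
    label = Counter(y for _, y in neighbors).most_common(1)[0][0]
    return label, neighbors, computed
```

Calling `build_index` once on the training set and then `ring_knn` per test point returns the majority label among the k nearest neighbors, together with the number of full distance computations actually performed. In this sketch the pruning is exact, so it trades no accuracy for the saved computations. The accuracy and cost figures quoted in the abstract refer to the authors' own algorithms, not to this code.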
This article is indexed in Wanfang Data and other databases.
