Fast kNN Text Classification Algorithm Based on Area Division
Original title: 基于区域划分的kNN文本快速分类算法研究
Cite as: HU Yuan, SHI Bing. Fast kNN Text Classification Algorithm Based on Area Division[J]. Computer Science (计算机科学), 2012, 39(10): 182-186.
Authors: HU Yuan, SHI Bing
Affiliations: 1. School of Computer Science and Technology, Shandong University, Jinan 250101, China; 77675 Troop, PLA, Linzhi 860000, China
2. School of Computer Science and Technology, Shandong University, Jinan 250101, China
Abstract: As a simple, effective, non-parametric classification method, kNN is widely used in text classification. To improve its classification efficiency, we propose a fast kNN text classification algorithm based on area division. The training set is divided into several areas according to its spatial distribution, and the k nearest neighbours of a test sample are then found quickly from the positional relationship between the test sample and each area. This sharply reduces the amount of computation that kNN requires. Mathematical reasoning and experimental results both show that the algorithm significantly improves classification efficiency while leaving the accuracy of the kNN classifier unchanged.

Keywords: Text classification; kNN (K-nearest neighbor) algorithm; Clustering; K-means algorithm
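
The abstract does not spell out how the areas are built or searched, so the following is only a minimal sketch of one common way to realise the idea, not the authors' actual procedure. It assumes Euclidean distance (text classifiers often use cosine similarity instead), builds the areas with a few k-means iterations, and skips an area whenever the triangle inequality shows it cannot contain a closer neighbour. All function names (`build_regions`, `knn_regions`, `classify`) are hypothetical.

```python
import heapq
import math
import random
from collections import Counter


def dist(a, b):
    """Euclidean distance between two equal-length vectors (an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_regions(points, n_regions, n_iter=10, seed=0):
    """Partition the training points with a few k-means iterations.

    Returns a list of (center, radius, member_indices), one per non-empty region.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    centers = [list(points[i]) for i in rng.sample(range(len(points)), n_regions)]
    groups = []
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for i, p in enumerate(points):
            nearest = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            groups[nearest].append(i)
        for c, members in enumerate(groups):
            if members:  # keep the old center if a region becomes empty
                centers[c] = [
                    sum(points[i][d] for i in members) / len(members)
                    for d in range(dim)
                ]
    # The radius is measured against the final center, so the bound below holds.
    return [
        (center, max(dist(points[i], center) for i in members), members)
        for center, members in zip(centers, groups)
        if members
    ]


def knn_regions(query, points, regions, k):
    """Exact k nearest neighbours using region-level pruning.

    Returns (sorted list of (distance, index), number of point distances computed).
    """
    # Triangle inequality: every point in a region is at least
    # dist(query, center) - radius away from the query.
    bounds = sorted(
        ((max(0.0, dist(query, center) - radius), members)
         for center, radius, members in regions),
        key=lambda item: item[0],
    )
    best = []  # max-heap of the k closest so far, stored as (-distance, index)
    computed = 0
    for bound, members in bounds:
        if len(best) == k and bound >= -best[0][0]:
            break  # no remaining region can contain a strictly closer point
        for i in members:
            d = dist(query, points[i])
            computed += 1
            if len(best) < k:
                heapq.heappush(best, (-d, i))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, i))
    return sorted((-nd, i) for nd, i in best), computed


def classify(query, points, labels, regions, k):
    """Majority vote over the k nearest neighbours."""
    neighbours, _ = knn_regions(query, points, regions, k)
    return Counter(labels[i] for _, i in neighbours).most_common(1)[0][0]
```

Because a region is skipped only when its lower bound is no smaller than the current k-th best distance, the neighbour distances match those of a brute-force search, which is consistent with the abstract's claim that accuracy is unchanged. How much computation is saved depends on how well separated the regions are.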
This article is indexed in CNKI, Wanfang Data, and other databases.