
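Research on Gauss Weighed Reorganization K-NN
Cite this article: LIU Zuoguo, CHEN Xiaorong. Research on Gauss Weighed Reorganization K-NN [J]. Journal of Chinese Information Processing, 2015, 29(5): 112-117.
Authors: LIU Zuoguo  CHEN Xiaorong
Affiliation: College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025, China
Funding: National Natural Science Foundation of China (61363028)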
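Abstract: This paper proposes a K-NN text clustering algorithm based on a Gaussian-weighted distance and a cluster reorganization mechanism. It introduces the concept of the K-NN nearest domain and performs K-NN clustering with a Gaussian-weighted nearest-domain algorithm. A Gaussian function assigns each sample a weight according to its distance from the cluster center, and these weights are used to compute cluster distances. The resulting clusters are then reorganized on the basis of nearest-domain weights and cluster density, so that the number of clusters adjusts adaptively. A splitting operator splits sparse clusters and reassigns abnormal samples; a merging operator merges similar clusters. Experiments show that the reorganization mechanism effectively improves clustering accuracy and recall, increases cluster density, and produces more reasonable clusters.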


Keywords: text clustering  K-NN algorithm  Gaussian weighting  nearest domain rule  cluster reorganization

Research on Gauss Weighed Reorganization K-NN
LIU Zuoguo, CHEN Xiaorong. Research on Gauss Weighed Reorganization K-NN [J]. Journal of Chinese Information Processing, 2015, 29(5): 112-117.
Authors: LIU Zuoguo  CHEN Xiaorong
Affiliation: College of Computer Science & Technology, Guizhou University, Guiyang, Guizhou 550025, China
Abstract: This paper presents a K-NN text clustering algorithm that uses a Gauss-weighted distance and a cluster reorganization mechanism. The concept of the nearest domain is proposed and the nearest domain rules are elaborated. A Gauss weighting algorithm is then designed to quantify the distances and weights of samples: each text is weighted by a Gauss function according to its distance from the cluster center, so that distances between clusters can be calculated. The cluster reorganization mechanism then adapts the number of clusters automatically. A splitting operator separates sparse clusters and reassigns abnormal texts, while a consolidating operator combines similar clusters. Clustering experiments show that the reorganization process effectively improves accuracy and recall and makes the result more reasonable by increasing the inner density of clusters.
Keywords:text clustering  K-NN  Gauss weighing  nearest domain rule  cluster reorganization  
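
The abstract does not state the weighting formula, so the following is only a minimal sketch of how a Gauss function might weight cluster members by their distance from the cluster center. It assumes Euclidean distance over dense feature vectors and a standard Gaussian kernel with bandwidth `sigma`; the function names, the `sigma` parameter, and the weighted-mean form of the sample-to-cluster distance are all hypothetical and are not taken from the paper.

```python
import math


def gaussian_weight(distance, sigma=1.0):
    """Assumed kernel: weight decays with distance from the cluster center."""
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(cluster):
    """Component-wise mean of the vectors in a cluster."""
    n = len(cluster)
    return [sum(column) / n for column in zip(*cluster)]


def weighted_cluster_distance(sample, cluster, sigma=1.0):
    """Gaussian-weighted mean distance from a sample to a cluster.

    Members close to the cluster center receive larger weights, so
    outlying members contribute less to the distance. This is an
    illustrative assumption, not the paper's definition. `sigma` should
    match the scale of the data; if it is far too small, every weight
    can underflow to zero and the division below fails.
    """
    center = centroid(cluster)
    weights = [gaussian_weight(euclidean(m, center), sigma) for m in cluster]
    total = sum(weights)
    weighted_sum = sum(w * euclidean(sample, m) for w, m in zip(weights, cluster))
    return weighted_sum / total


if __name__ == "__main__":
    cluster = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [10.0, 0.0]]
    print(weighted_cluster_distance([0.0, 0.0], cluster, sigma=1.0))
```

In the usage example the fourth member lies far from the other three, so its Gaussian weight is close to zero and the weighted distance from the origin falls well below the unweighted mean distance of 2.5.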