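Application of manifold learning algorithms to Chinese text classification
Cite this article:WANG Hong-yuan, FENG Lei, FENG Yan, CHENG Qi-cai. Application of manifold learning algorithms to Chinese text classification [J]. Journal of Shandong University (Engineering Science), 2012, 42(4): 8-12.
Authors:WANG Hong-yuan  FENG Lei  FENG Yan  CHENG Qi-cai
Affiliation:School of Information Science and Engineering, Changzhou University; Changzhou Key Laboratory for Process Perception and Interconnected Technology, Changzhou, Jiangsu 213164, China
Funding:National Natural Science Foundation of China (61070121)
Abstract:The traditional manifold learning algorithm locally linear embedding (LLE) selects neighborhoods by Euclidean distance. When the data set is drawn from several classes, this distance measure cannot recover the correct neighborhood relationships. This study proposes a modified LLE (MLLE) algorithm that alters the distance matrix so that between-class distances are large and within-class distances are small, which keeps each neighborhood within a single class as far as possible. MLLE is applied to Chinese text classification, and the results show that, compared with the traditional algorithm, MLLE markedly improves both the visualization of the classification results and the recognition rate.

Keywords:manifold learning  LLE algorithm  MLLE algorithm  Chinese text classification
Received:2012-05-06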

The manifold learning algorithm's application in the Chinese text clustering
WANG Hong-yuan, FENG Lei, FENG Yan, CHENG Qi-cai. The manifold learning algorithm's application in the Chinese text clustering [J]. Journal of Shandong University of Technology, 2012, 42(4): 8-12.
Authors:WANG Hong-yuan  FENG Lei  FENG Yan  CHENG Qi-cai
Affiliation:Changzhou Key Laboratory for Process Perception and Interconnected Technology, School of Information Science and Engineering, Changzhou University, Changzhou 213164, China
Abstract:The original locally linear embedding (LLE) algorithm selects neighborhoods by Euclidean distance. If the data come from multiple classes, this distance measure cannot yield the correct neighborhood relationships. To solve this problem, a modified LLE (MLLE) algorithm is proposed. MLLE modifies the distance matrix so that distances between classes become larger and distances within a class become smaller, which keeps each neighborhood within a single class as far as possible. Tests on Chinese text clustering show that, compared with the traditional algorithm, MLLE markedly improves both the clustering visualization and the recognition rate.
Keywords:manifold learning  LLE algorithm  MLLE algorithm  Chinese text clustering
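
The abstract does not give the formula MLLE uses to modify the distance matrix, so the sketch below only illustrates the general idea. It assumes a supervised-LLE-style rule, in which a constant penalty proportional to the largest pairwise distance is added to every between-class pair; this rule, the function names, and the parameter `alpha` are assumptions for illustration, not the authors' method. The toy data show how plain Euclidean neighborhoods can cross class boundaries while the adjusted ones stay inside a class.

```python
import numpy as np


def pairwise_euclidean(X):
    """Return the n x n matrix of Euclidean distances between rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def class_aware_distances(X, labels, alpha=1.0):
    """Enlarge between-class distances; within-class distances are unchanged.

    Assumed rule (supervised-LLE style, not taken from the paper):
        D'[i, j] = D[i, j] + alpha * max(D) * [labels[i] != labels[j]]
    """
    labels = np.asarray(labels)
    D = pairwise_euclidean(X)
    different = labels[:, None] != labels[None, :]
    return D + alpha * D.max() * different


def k_nearest(D, k):
    """Indices of the k nearest neighbors of each point, excluding itself."""
    D = D.copy()
    np.fill_diagonal(D, np.inf)  # a point is never its own neighbor
    return np.argsort(D, axis=1)[:, :k]


# Toy data: two classes whose points interleave along the x-axis.
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0],
              [0.1, 0.0], [1.1, 0.0], [2.1, 0.0]])
labels = np.array([0, 0, 0, 1, 1, 1])

plain = k_nearest(pairwise_euclidean(X), k=2)
aware = k_nearest(class_aware_distances(X, labels), k=2)

# True where a selected neighbor shares the class of its query point.
plain_same = labels[plain] == labels[:, None]
aware_same = labels[aware] == labels[:, None]
print("plain neighborhoods all within class:", plain_same.all())
print("adjusted neighborhoods all within class:", aware_same.all())
```

With `alpha=1.0`, every between-class distance is at least as large as the largest within-class distance, so neighborhoods stay inside a class whenever the class has more than `k` members. Smaller values of `alpha` would allow some cross-class neighbors. The remaining LLE steps (reconstruction weights and the embedding) would then run on these neighborhoods unchanged.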
This article is indexed in CNKI, Wanfang Data, and other databases.