
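A Text Classification Method Based on a Semi-Supervised Locally Linear Embedding Algorithm*
Cite this article: XIA Shi-xiong, LI You-wen, ZHOU Yong. A text classification method based on a semi-supervised locally linear embedding algorithm*[J]. Application Research of Computers (计算机应用研究), 2010, 27(1): 64-67.
Authors: XIA Shi-xiong (夏士雄)  LI You-wen (李佑文)  ZHOU Yong (周勇)
Affiliation: School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
Funding: National Natural Science Foundation of China (50674086); Specialized Research Fund for the Doctoral Program of Higher Education (20060290508)
Abstract: The locally linear embedding (LLE) algorithm is designed for unsupervised machine learning and cannot exploit label information. To address this shortcoming, the algorithm is combined with the idea of semi-supervised learning, and a text classification method based on semi-supervised LLE is proposed. Using the manifold structure of the text data together with a small number of labeled samples, the distance matrix in LLE is adjusted in piecewise form; the adjusted matrix is then used for linear reconstruction, which yields the dimensionality reduction. Because semi-supervised LLE still relies on the Euclidean distance, a Gaussian kernel function is used to transform the Euclidean distance, and the resulting kernel distance replaces it, giving a kernel-based semi-supervised LLE algorithm. Finally, simulation experiments verify the effectiveness of the improved algorithms.

Keywords: locally linear embedding algorithm    semi-supervised learning    manifold learning    text classification    kernel function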
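
The abstract does not spell out the kernel distance it refers to. One standard construction, assumed here rather than taken from the paper, is the distance induced in feature space by a Gaussian kernel with bandwidth σ (a free parameter not given in the abstract):

```latex
k(x_i, x_j) = \exp\!\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right),
\qquad
d_k(x_i, x_j)^2 = k(x_i, x_i) - 2\,k(x_i, x_j) + k(x_j, x_j)
                = 2 - 2\exp\!\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right).
```

Under this form, d_k grows monotonically with the Euclidean distance but is bounded above by √2, so very distant points no longer dominate the distance matrix.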

Method based on semi-supervised local linear embedding algorithm for text classification
XIA Shi-xiong, LI You-wen, ZHOU Yong. Method based on semi-supervised local linear embedding algorithm for text classification[J]. Application Research of Computers, 2010, 27(1): 64-67.
Authors:XIA Shi-xiong  LI You-wen  ZHOU Yong
Affiliation: School of Computer Science & Technology, China University of Mining & Technology, Xuzhou, Jiangsu 221116, China
Abstract: The locally linear embedding (LLE) algorithm is limited to unsupervised machine learning. To address this limitation, this paper combines LLE with semi-supervised learning and proposes a text classification method based on a semi-supervised LLE algorithm. First, using the manifold structure of the text data together with a small number of labeled samples, the algorithm revises the distance matrix in LLE with a piecewise function. Second, it linearly reconstructs the samples from the adjusted matrix to achieve dimensionality reduction. Third, to overcome the shortcomings of the Euclidean distance in semi-supervised LLE, a kernel-based semi-supervised LLE algorithm is proposed, which transforms the Euclidean distance with a Gaussian kernel function and replaces it with the resulting kernel distance. Finally, simulation experiments indicate that these algorithms improve text classification performance.
Keywords:LLE  semi-supervised learning  manifold learning  text classification  kernel function
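
The steps named in the abstract (label-aware piecewise adjustment of the distance matrix, linear reconstruction, optional Gaussian kernel distance) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the piecewise rule, so the one in `adjust_distances` (shrink same-class distances, enlarge different-class distances, leave pairs involving an unlabeled sample unchanged) is hypothetical, as are all function names and the parameters `alpha`, `sigma`, and `reg`. Reconstruction weights are computed in the input space for simplicity.

```python
import numpy as np

def pairwise_sq_dists(X):
    """Squared Euclidean distances between all rows of X."""
    sq = np.sum(X ** 2, axis=1)
    return np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

def gaussian_kernel_dist(D2, sigma):
    """Kernel-induced distance sqrt(2 - 2*exp(-D2 / (2*sigma^2)))."""
    return np.sqrt(np.maximum(2.0 - 2.0 * np.exp(-D2 / (2.0 * sigma ** 2)), 0.0))

def adjust_distances(D, labels, alpha=0.5):
    """Hypothetical piecewise rule; labels are ints, -1 marks unlabeled."""
    out = D.copy()
    d_max = D.max()
    known = labels >= 0
    both = known[:, None] & known[None, :]
    same = both & (labels[:, None] == labels[None, :])
    diff = both & ~same
    out[same] = (1.0 - alpha) * D[same]      # pull same-class pairs together
    out[diff] = D[diff] + alpha * d_max      # push different-class pairs apart
    return out                               # other pairs stay unchanged

def lle_embed(X, D, n_neighbors, n_components, reg=1e-3):
    """LLE where neighbors are chosen from the (adjusted) distance matrix D."""
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(D[i])
        nbrs = order[order != i][:n_neighbors]
        Z = X[nbrs] - X[i]                   # neighbors centered on x_i
        G = Z @ Z.T                          # local Gram matrix
        G = G + reg * (np.trace(G) + 1e-12) * np.eye(n_neighbors)
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, nbrs] = w / w.sum()             # weights sum to one
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    _, vecs = np.linalg.eigh(M)              # eigenvalues in ascending order
    return vecs[:, 1:n_components + 1]       # skip the constant eigenvector

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(4, 1, (20, 5))])
    labels = -np.ones(40, dtype=int)
    labels[:3], labels[20:23] = 0, 1         # only a few labeled samples
    D = gaussian_kernel_dist(pairwise_sq_dists(X), sigma=2.0)
    Y = lle_embed(X, adjust_distances(D, labels), n_neighbors=6, n_components=2)
    print(Y.shape)
```

Because the Gaussian kernel distance is a monotone function of the Euclidean distance, it changes the neighbor ranking only in combination with the label-based adjustment, where its bounded range alters how strongly the additive term separates classes. The low-dimensional output `Y` would then be passed to an ordinary classifier.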
This article is indexed in CNKI, Wanfang Data, and other databases.