
Text Classification Algorithm Based on Neighborhood Component Analysis

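Cite this article:	LIU Cong-shan, LI Xiang-bao, YANG Yu-pu. Text Classification Algorithm Based on Neighborhood Component Analysis[J]. Computer Engineering, 2012, 38(15): 139-141.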

Authors:	LIU Cong-shan, LI Xiang-bao, YANG Yu-pu

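Affiliation:	Key Laboratory of System Control and Information Processing, Ministry of Education, Department of Automation, Shanghai Jiaotong University, Shanghai 200240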

Funding:	Supported by the National "863" Program project "Key Technologies of the Cloud Manufacturing Service Platform"

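Abstract:	Building on the Neighborhood Component Analysis (NCA) algorithm, this paper proposes K-NCA, a K-neighborhood component analysis classification algorithm. NCA is used to learn a distance metric on the training set and to reduce its dimensionality. A class skew factor is defined, and the K-nearest-neighbor idea is introduced to estimate the class-conditional probability of a test sample; the class is then decided from this probability, which yields a text classifier. Experimental results show that K-NCA classifies well.
Keywords:	Neighborhood Component Analysis; distance metric learning; dimensionality reduction; K Nearest Neighbor; text classification
Received:	2011-09-29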
Text Classification Algorithm Based on Neighborhood Component Analysis

LIU Cong-shan, LI Xiang-bao, YANG Yu-pu. Text Classification Algorithm Based on Neighborhood Component Analysis[J]. Computer Engineering, 2012, 38(15): 139-141.

Authors:	LIU Cong-shan, LI Xiang-bao, YANG Yu-pu

Affiliation:	Key Laboratory of System Control and Information Processing, Ministry of Education, Department of Automation, Shanghai Jiaotong University, Shanghai 200240, China

Abstract:	This paper proposes K-NCA, a classification algorithm based on Neighborhood Component Analysis (NCA). It uses NCA to learn a Mahalanobis distance measure and to reduce the dimensionality of the input dataset. The algorithm defines a class imbalance factor and introduces the K Nearest Neighbor (KNN) idea to estimate the class-conditional probability of a test sample, and the sample's class label is decided from this probability. A text classifier is built to implement the algorithm. Experimental results show that the K-NCA algorithm can improve the accuracy of text classification.

Keywords:	Neighborhood Component Analysis (NCA); distance metric learning; dimension reduction; K Nearest Neighbor (KNN); text classification
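
The decision stage described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes the linear transform `A` has already been learned by NCA (training is not shown), and it uses inverse class frequency as a stand-in for the class imbalance factor, whose definition the abstract does not give. The name `knca_predict`, the softmax-style neighbor weighting, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def project(A, x):
    """Apply the linear map A (a list of rows) to the vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def knca_predict(A, train_X, train_y, x, k=3):
    """Return (label, probs) for x from its k nearest training samples,
    measured in the space defined by the (assumed pre-learned) transform A."""
    z = project(A, x)

    # Squared Euclidean distance to every training sample after projection.
    dists = []
    for xi, yi in zip(train_X, train_y):
        zi = project(A, xi)
        d2 = sum((p - q) ** 2 for p, q in zip(z, zi))
        dists.append((d2, yi))
    neighbors = sorted(dists, key=lambda t: t[0])[:k]

    # ASSUMPTION: the skew factor is the inverse class frequency, so rare
    # classes are up-weighted. The paper's actual definition may differ.
    counts = Counter(train_y)
    n = len(train_y)
    skew = {c: n / (len(counts) * cnt) for c, cnt in counts.items()}

    # NCA-style weights exp(-d^2); subtracting the smallest distance keeps
    # the exponentials well-scaled without changing the normalized result.
    d_min = neighbors[0][0]
    scores = {c: 0.0 for c in counts}
    for d2, yi in neighbors:
        scores[yi] += skew[yi] * math.exp(-(d2 - d_min))

    total = sum(scores.values())
    probs = {c: s / total for c, s in scores.items()}
    label = max(probs, key=probs.get)
    return label, probs


# Toy usage: a hypothetical 1-D projection that keeps only the first feature.
A = [[1.0, 0.0]]
train_X = [[0.0, 5.0], [0.2, -3.0], [1.0, 4.0], [1.1, -2.0]]
train_y = ["sports", "sports", "finance", "finance"]
label, probs = knca_predict(A, train_X, train_y, [0.1, 0.0], k=3)
print(label, probs)
```

In a real text classifier the rows of `train_X` would be document feature vectors (for example, term weights), and `A` would come from maximizing the NCA objective on the training set.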
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.