Improving the Performance of the Text Classifier Based on Support Vector Machine Using the Common Sense in Text Domain

Original title:	运用文本领域的常识改善基于支撑向量机的文本分类器性能

Citation:	LI Hui, SHI Zhongzhi, XU Zhuoqun. Improving the Performance of the Text Classifier Based on Support Vector Machine Using the Common Sense in Text Domain [J]. Journal of Chinese Information Processing (中文信息学报), 2002, 16(2): 7-13.

Authors:	LI Hui (李辉), SHI Zhongzhi (史忠植), XU Zhuoqun (许卓群)

Affiliation:	1. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences; 2. Department of Computer Science and Technology, Peking University

Funding:	Supported by the National Natural Science Foundation of China (60073019) and by a Major Project of the National Natural Science Foundation of China (69790080).

Abstract:	This paper proposes a method for improving the generalization performance of Chinese text classifiers. A text classifier is generally obtained by applying a machine learning method to a set of training texts. The paper introduces common-sense knowledge about the semantic invariance of text and proposes a way to improve a text classifier by incorporating that knowledge into it. An improved classifier is designed and implemented by combining this approach with a Support Vector Machine. Experiments on Chinese text classification show that applying the semantic-invariance knowledge effectively improves the classifier's generalization performance.

Keywords:	Text Categorization; Synonymy Sub-Document Replacement; Artificial Document Sample; Compatibility Condition; Support Vector Machine
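
Illustration:	The abstract and keywords suggest that artificial document samples are produced by replacing parts of a document with text of the same meaning while the class label stays unchanged. The record gives no details of the procedure, so the following is only a minimal sketch of that general idea. The function name `artificial_samples`, the `EQUIVALENTS` table, and the token-level replacement are all assumptions made for illustration; the paper's compatibility condition and the SVM training step are not modeled here.

```python
from itertools import product

# Hypothetical table of interchangeable expressions (assumed, not from the paper).
EQUIVALENTS = {
    "computer": ["computer", "PC"],
    "fast": ["fast", "quick"],
}


def artificial_samples(tokens, label, equivalents, limit=8):
    """Return up to `limit` labelled variants of `tokens`.

    Each variant swaps some tokens for same-meaning alternatives.
    The label is copied unchanged, which encodes the assumption that
    the replacement does not alter the document's category.
    """
    # For every position, list the tokens that may appear there.
    choices = [equivalents.get(tok, [tok]) for tok in tokens]
    samples = []
    for combo in product(*choices):
        variant = list(combo)
        if variant != tokens:  # skip the unmodified original
            samples.append((variant, label))
        if len(samples) >= limit:
            break
    return samples


doc = ["the", "computer", "is", "fast"]
# Original sample plus its artificial variants, ready for any classifier.
training_set = [(doc, "tech")] + artificial_samples(doc, "tech", EQUIVALENTS)
for tokens, label in training_set:
    print(label, " ".join(tokens))
```

In such a setup the enlarged `training_set` would then be vectorized and passed to an SVM trainer in place of the original samples alone; whether this matches the authors' implementation cannot be determined from this record.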
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.