
An Automatic Text Classification Algorithm That Imitates Humans

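Cite this article:	WANG Shu-Mei, HUANG He-Yan, et al. An automatic text classification algorithm that imitates humans [J]. Computer Science, 2003, 30(3): 44-45.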

Authors:	WANG Shu-Mei, HUANG He-Yan

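Affiliations:	1. Department of Computer Science, Nanjing University of Science and Technology, Nanjing 210014; 2. Research Center of Computer and Language Information Engineering, Chinese Academy of Sciences, Beijing 100083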

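Abstract:	1. Introduction. The Internet holds a large and rapidly growing body of text, and text is a valuable source of information and knowledge. With the rapid development of the Internet, most of the information people need will soon be available online. The Internet is becoming a treasury of human information, but as online information grows explosively, it has become increasingly difficult for people to retrieve the information they need from it. How to obtain useful information quickly and effectively has therefore become a matter of great …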
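Keywords:	automatic text classification algorithm; text information processing; document classification; natural language processing; Internet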
An Automatic Algorithm of Text Categorization Imitating Human's

WANG Shu-Mei, DAI Bao-Cun, HUANG He-Yan, CHEN Zhao-Xiong. An Automatic Algorithm of Text Categorization Imitating Human's[J]. Computer Science, 2003, 30(3): 44-45.

Authors:	WANG Shu-Mei, DAI Bao-Cun, HUANG He-Yan, CHEN Zhao-Xiong

Abstract:	This paper presents a text classification algorithm that imitates how humans classify documents. First, when building the feature vector, the algorithm increases the weight of theme terms, on the assumption that a document's title reflects its content. Second, when computing the cluster center of a document class, it applies a weight parameter to the document vectors to simulate human skimming and skipping behavior, and it increases the weight of features that occur in more positive examples than negative ones. Experiments show that the algorithm greatly improves the performance of a text classification system.

Keywords:	Text categorization; Corpus; Cluster center; Machine learning
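
The abstract does not give the paper's formulas, so the following Python sketch only illustrates the three ideas it names: boosting title terms in the feature vector, weighting documents when averaging them into a class center, and boosting features that occur in more positive than negative examples. The function names, the factors `title_boost` and `feature_boost`, the per-document `doc_weights`, and the use of cosine similarity are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def doc_vector(title, body, title_boost=2.0):
    """Term-frequency vector; each title term adds title_boost (assumed value)."""
    vec = Counter()
    for term in body.lower().split():
        vec[term] += 1.0
    for term in title.lower().split():
        vec[term] += title_boost
    return vec


def class_center(positives, negatives, doc_weights=None, feature_boost=1.5):
    """Weighted mean of the positive vectors.

    doc_weights is a hypothetical stand-in for the abstract's "skimming and
    skipping" weight parameter. Features found in more positive than negative
    documents are multiplied by feature_boost (assumed value).
    """
    if doc_weights is None:
        doc_weights = [1.0] * len(positives)
    total = sum(doc_weights)
    center = Counter()
    for vec, weight in zip(positives, doc_weights):
        for term, value in vec.items():
            center[term] += weight * value / total
    for term in list(center):
        pos_count = sum(1 for v in positives if term in v)
        neg_count = sum(1 for v in negatives if term in v)
        if pos_count > neg_count:
            center[term] *= feature_boost
    return center


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(value * b.get(term, 0.0) for term, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(vec, centers):
    """Return the label whose class center is most similar to vec."""
    return max(centers, key=lambda label: cosine(vec, centers[label]))


# Toy corpus, invented for this example.
sports = [
    doc_vector("football match", "the team won the match"),
    doc_vector("tennis final", "the player won the final"),
]
finance = [
    doc_vector("stock market", "the market fell today"),
    doc_vector("bank rates", "the bank raised rates"),
]
centers = {
    "sports": class_center(sports, finance),
    "finance": class_center(finance, sports),
}
query = doc_vector("match report", "the team played well")
label = classify(query, centers)
print(label)
```

In this toy corpus the word "the" appears equally often in both classes, so it receives no boost, while class-specific terms such as "match" do. That is the effect the abstract attributes to the positive-versus-negative feature weighting.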
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.