
A Fast Text Classification System Based on a New Keyword Extraction Method*

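Cite as:	LUO Jie, CHEN Li, XIA De-lin, WANG Kai. A Fast Text Classification System Based on a New Keyword Extraction Method*[J]. 计算机应用研究 (Application Research of Computers), 2006, 23(4): 32-34.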

Authors:	罗杰 (LUO Jie), 陈力 (CHEN Li), 夏德麟 (XIA De-lin), 王凯 (WANG Kai)

Affiliation:	School of Electronic Information, Wuhan University, Wuhan, Hubei 430079, China

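Abstract:	Keyword extraction is central to automatic text classification and to other text data mining applications. Working from the part-of-speech properties of the language, the system improves on the traditional maximum matching segmentation method and proposes a fast segmentation method (FS) that relies on three small lexicons: verbs, function words, and stop words. It then uses the TFIDF algorithm to select keywords, so that Web documents can be classified quickly and effectively. Experiments show that the method clearly increases classification speed without affecting classification accuracy.
Keywords:	computer application; Chinese information processing; keyword extraction; Web document classification
Article ID:	1001-3695(2006)04-0032-03
Received:	2005-04-06
Revised:	2005-05-26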
Research on Fast Text Classifier Based on New Keywords Extraction Method

LUO Jie, CHEN Li, XIA De-lin, WANG Kai. Research on Fast Text Classifier Based on New Keywords Extraction Method[J]. Application Research of Computers, 2006, 23(4): 32-34.

Authors:	LUO Jie, CHEN Li, XIA De-lin, WANG Kai

Affiliation:	(School of Electronic Information, Wuhan University, Wuhan, Hubei 430079, China)

Abstract:	Keyword extraction is the key step in automatic text classification and other text data mining applications. Taking the part-of-speech properties of natural language into account, this paper proposes a new method called Fast Segmentation (FS), which is based on verbs, function words, and stop words, to improve on the traditional segmentation technique. The FS output is then filtered with the TFIDF [3] algorithm so that Web text can be classified quickly and effectively. Experiments indicate that processing speed improves markedly without reducing classification accuracy.

Keywords:	Computer Application; Natural Language Processing; Keyword Extraction; Web Text Classification
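
The abstract names TFIDF as the filter applied to the segmented words but does not give the exact weighting used in the paper. As a rough illustration only, the following Python sketch uses the common textbook form, term frequency times log(N / document frequency), on documents that are assumed to be already segmented into tokens. The function name `tfidf_keywords`, the `top_k` parameter, and the toy documents are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring terms of each tokenized document.

    Assumption: score = (count / doc length) * log(N / document frequency).
    The paper's actual TFIDF variant is not stated in the abstract.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results


# Toy, pre-segmented documents (hypothetical data).
docs = [
    ["football", "match", "goal", "team", "goal"],
    ["stock", "market", "price", "team"],
    ["football", "team", "coach", "goal"],
]
keywords = tfidf_keywords(docs, top_k=2)
```

In this toy corpus the word "team" occurs in every document, so its score is zero and it is never selected as a keyword; terms concentrated in a single document rank highest.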
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).