
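A text classification algorithm based on an improved term frequency classifier ensemble
Cite this article: LIANG Xiao-na, YU Hong, FAN Li-min, LUO Gui-shuang. A text classification algorithm based on an improved term frequency classifier ensemble[J]. 智能系统学报 (CAAI Transactions on Intelligent Systems), 2010, 5(2): 177-180. DOI: 10.3969/j.issn.1673-4785.2010.02.013
Authors: LIANG Xiao-na  YU Hong  FAN Li-min  LUO Gui-shuang
Affiliation: School of Information Engineering, Dalian Fisheries University, Dalian 116023, Liaoning, China
Funding: Project funded by the Education Department of Liaoning Province; Dalian Youth Fund; Doctoral Start-up Fund of Dalian Fisheries University
Abstract: The Internet holds a massive amount of textual information. Given a set of categories, a text classification system automatically sorts texts into them, helping people mine useful information more effectively. This paper introduces a text classification algorithm based on a term frequency classifier ensemble. The algorithm has low computational cost and high recall, but its precision is low. The causes of the low precision are analyzed, and on that basis a text classification algorithm based on an improved term frequency classifier ensemble is proposed. The improved algorithm adjusts the parameters used in updating text weights, which markedly raises its precision. Finally, experiments verify the performance of the improved algorithm. The results show that the improved algorithm not only classifies more accurately but also behaves more stably.

Keywords: text classification; ensemble learning; term frequency classifier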

A text classification algorithm that uses an improved term frequency classifier ensemble
LIANG Xiao-na, YU Hong, FAN Li-min, LUO Gui-shuang. A text classification algorithm that uses an improved term frequency classifier ensemble[J]. CAAI Transactions on Intelligent Systems, 2010, 5(2): 177-180. DOI: 10.3969/j.issn.1673-4785.2010.02.013
Authors: LIANG Xiao-na  YU Hong  FAN Li-min  LUO Gui-shuang
Affiliation: LIANG Xiao-na, YU Hong, FAN Li-min, LUO Gui-shuang (School of Information Engineering, Dalian Fisheries University, Dalian 116023, China)
Abstract: The internet contains massive amounts of textual information. Text classification systems are essential to sort this chaotic mass into desired categories, enabling people to find the type of information they seek. The authors analyzed a text classification method that is based on a term frequency classifier ensemble. The method has low computational cost and a high recall rate. Unfortunately, the method lacks precision. The causes of its poor precision were analyzed. An improved method was then proposed. In the impro...
Keywords: AdaBoost
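
The abstract contrasts a high recall rate with poor precision. As a minimal sketch of what that combination means, the Python snippet below computes both measures for one category. The function name `precision_recall`, the category labels, and the toy predictions are all hypothetical and are not taken from the paper or its experiments.

```python
def precision_recall(predicted, actual, label):
    """Return (precision, recall) for one category label.

    precision = TP / (TP + FP): share of texts assigned to the label that belong to it.
    recall    = TP / (TP + FN): share of texts belonging to the label that were found.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == label and a == label)
    fp = sum(1 for p, a in pairs if p == label and a != label)
    fn = sum(1 for p, a in pairs if p != label and a == label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data: the classifier over-assigns the "sports" category.
actual = ["sports", "sports", "finance", "finance", "finance", "health"]
predicted = ["sports", "sports", "sports", "sports", "finance", "sports"]

p, r = precision_recall(predicted, actual, "sports")
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case every true "sports" text is retrieved, so recall is perfect, yet most texts labeled "sports" belong to other categories, so precision is low. This is the kind of imbalance the abstract attributes to the original method.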
This article is indexed in CNKI, VIP (维普), Wanfang Data, and other databases.