
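A Fast and Efficient Text Classification Method
Cite this article: Shi Zhiwei, Liu Tao, Wu Gongyi. A Fast and Efficient Text Classification Method [J]. Computer Engineering and Applications, 2005, 41(29): 180-183.
Authors: Shi Zhiwei, Liu Tao, Wu Gongyi
Affiliation: College of Information Technical Science, Nankai University, Tianjin 300071
Abstract: This paper discusses two commonly used text classification algorithms: the vector space method and the k-nearest-neighbor method. The former is fast, but its classification accuracy is usually unsatisfactory. The latter is the opposite: it takes more time to classify, but its results are considerably better. By combining their strengths, the paper proposes a new text classification algorithm, a hybrid of the vector space method and k-nearest neighbor. Experiments show that the new algorithm matches or even exceeds the classification performance of k-nearest neighbor at lower time complexity.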

Keywords: text classification; vector space method; k-nearest neighbor
Article ID: 1002-8331-(2005)29-0180-04
Received: 2005-01
Revised: 2005-01-01

An Effective and Efficient Algorithm for Text Categorization
Shi Zhiwei, Liu Tao, Wu Gongyi. An Effective and Efficient Algorithm for Text Categorization [J]. Computer Engineering and Applications, 2005, 41(29): 180-183.
Authors: Shi Zhiwei, Liu Tao, Wu Gongyi
Affiliation: Department of Information Science, Nankai University, Tianjin 300071
Abstract: This paper discusses two popular algorithms for text categorization: the Vector Space Model (VSM) and k-Nearest Neighbor (kNN). The former is simple and fast, but its precision is often unsatisfactory. The latter, by contrast, spends much more time determining the class label of a query document, but usually achieves better categorization performance. We propose a new algorithm, a hybrid of VSM and kNN, that combines the strengths of the two. An experimental evaluation shows that the new algorithm achieves performance competitive with (or even better than) the well-known kNN algorithm at much lower computational cost.
Keywords: text categorization; VSM; kNN
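
The abstract does not spell out how the two classifiers are combined, so the following is only an illustrative sketch of one plausible arrangement, not the authors' algorithm. It assumes that VSM means comparing the query with one summed term vector per class, that similarity is cosine similarity, and that the hybrid first uses VSM to shortlist a few classes and then runs kNN only over training documents from those classes. All function names, the `shortlist` parameter, and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def class_prototypes(train):
    """Sum the term vectors of each class (assumed VSM class prototype)."""
    sums = defaultdict(Counter)
    for vec, label in train:
        sums[label].update(vec)  # Counter.update adds weights term by term
    return dict(sums)

def vsm_rank(query, protos):
    """Rank classes by similarity between the query and each prototype."""
    return sorted(protos, key=lambda c: cosine(query, protos[c]), reverse=True)

def knn_classify(query, train, k):
    """Similarity-weighted vote among the k most similar training documents."""
    nearest = sorted(train, key=lambda d: cosine(query, d[0]), reverse=True)[:k]
    votes = Counter()
    for vec, label in nearest:
        votes[label] += cosine(query, vec)
    return votes.most_common(1)[0][0]

def hybrid_classify(query, train, protos, k=3, shortlist=2):
    """Assumed hybrid: VSM shortlists classes, then kNN decides among them."""
    keep = set(vsm_rank(query, protos)[:shortlist])
    candidates = [d for d in train if d[1] in keep]
    return knn_classify(query, candidates, k)

# Toy training set (hypothetical): (term-weight vector, class label) pairs.
train = [
    ({"ball": 2, "team": 1}, "sports"),
    ({"team": 2, "score": 1}, "sports"),
    ({"vote": 2, "party": 1}, "politics"),
    ({"party": 2, "law": 1}, "politics"),
    ({"chip": 2, "code": 1}, "tech"),
    ({"code": 2, "bug": 1}, "tech"),
]
protos = class_prototypes(train)
label = hybrid_classify({"team": 1, "score": 1}, train, protos)
```

Under these assumptions the saving comes from the shortlist step: the query is compared with one prototype per class and then only with the documents of the shortlisted classes, rather than with every training document as in plain kNN.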
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.