
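Text classification method based on LASVM-NC and TF.RF
Cite this article: LI Yujian, LI Yuxiong, LENG Qiangkui. Text classification method based on LASVM-NC and TF.RF[J]. Computer Engineering and Applications, 2014(10): 136-140, 265.
Authors: LI Yujian, LI Yuxiong, LENG Qiangkui
Affiliation: College of Computer Science, Beijing University of Technology, Beijing 100124
Funding: National Natural Science Foundation of China (No. 61175004, No. 60775010); Beijing Natural Science Foundation (No. 4112009); Science and Technology Development Project of the Beijing Municipal Education Commission (No. KZ201210005007); Specialized Research Fund for the Doctoral Program of Higher Education (No. 20121103110029).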
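Abstract: The non-convex online support vector machine (LASVM-NC) offers strong noise tolerance and fast training, while the term frequency relevance frequency product (tf.rf) is a highly adaptive text feature with very good classification performance. Combining the two yields a new text classification method, LASVM-NC+tf.rf. Experimental results show that this method performs best among the combinations of LASVM-NC with several other features. Compared with SVM+tf.rf, the resulting classifier not only generalizes better and has a sparser model representation, but is also more robust on noisy data and trains much faster on large-scale data.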

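Keywords: non-convex online support vector machine; support vector machine; feature term; term frequency; relevance frequency; text classification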

Text classification method based on non-convex online support vector machines and term frequency relevance frequency product
LI Yujian, LI Yuxiong, LENG Qiangkui. Text classification method based on non-convex online support vector machines and term frequency relevance frequency product[J]. Computer Engineering and Applications, 2014(10): 136-140, 265.
Authors: LI Yujian, LI Yuxiong, LENG Qiangkui
Affiliation: College of Computer Science, Beijing University of Technology, Beijing 100124, China
Abstract: The non-convex online support vector machine (LASVM-NC) has the advantages of strong noise tolerance and fast training, while the term frequency relevance frequency product (tf.rf) is a highly adaptive text feature with very good classification performance. LASVM-NC+tf.rf is proposed as a new text classification method that combines non-convex online support vector machines with the term frequency relevance frequency product. Experiments show that the method performs better than LASVM-NC combined with several other features. Moreover, compared with SVM+tf.rf, the method produces classifiers with better generalization and sparser models, is more robust on noisy data, and trains much faster on large-scale datasets.
Keywords: non-convex online support vector machine; support vector machines; term weighting; term frequency; relevance frequency; text classification
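The abstract names tf.rf but does not state its formula. As a minimal sketch, assuming the commonly cited definition of relevance frequency, rf = log2(2 + a / max(1, c)), where a and c are the numbers of positive-category and negative-category training documents containing the term, the weight could be computed as below. The function names and the toy documents are hypothetical and are not taken from the paper; the paper's exact variant may differ.

```python
import math
from collections import Counter


def relevance_frequency(a, c):
    """Assumed rf definition: log2(2 + a / max(1, c)).

    a: number of positive-category documents containing the term
    c: number of negative-category documents containing the term
    """
    return math.log2(2 + a / max(1, c))


def tf_rf_vector(doc_tokens, pos_docs, neg_docs):
    """Return {term: tf * rf} for one document (hypothetical helper).

    doc_tokens: list of tokens in the document to be weighted
    pos_docs / neg_docs: training documents, each given as a set of terms
    """
    tf = Counter(doc_tokens)  # raw term frequency within the document
    weights = {}
    for term, count in tf.items():
        a = sum(1 for d in pos_docs if term in d)
        c = sum(1 for d in neg_docs if term in d)
        weights[term] = count * relevance_frequency(a, c)
    return weights


# Toy data: "svm" appears only in positive documents, "cat" only in a negative one.
pos_docs = [{"svm", "kernel"}, {"svm", "margin"}]
neg_docs = [{"cat", "dog"}]
weights = tf_rf_vector(["svm", "svm", "cat"], pos_docs, neg_docs)
```

Under this definition a term concentrated in the positive category receives a larger weight than one that appears mainly in the negative category, which is the sense in which the feature adapts to the category. The resulting vectors would then be passed to the classifier (LASVM-NC in the paper, which is not sketched here).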
This article is indexed in CNKI, VIP (维普), and other databases.