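Application of various kernel function based SVM in spam filtering (基于多种核函数的SVM在垃圾邮件过滤中的应用)
Cite this article: 董建设, 袁占亭, 张秋余. 基于多种核函数的SVM在垃圾邮件过滤中的应用[J]. 计算机应用, 2008, 28(2): 424-427.
Authors: DONG Jian-she (董建设), YUAN Zhan-ting (袁占亭), ZHANG Qiu-yu (张秋余)
Affiliation: Lanzhou University of Technology (all three authors)
Funding: National High Technology Research and Development Program of China (863 Program)
Abstract: Mail vectors were constructed with two models, TF-IDF and Bernoulli. The effect of the CHI dimensionality-reduction strategy on mail classification with a linear support vector machine (SVM) was first tested in detail. Kernel-based SVMs were then introduced into spam filtering: SVMs with linear, polynomial, and radial basis function (RBF) kernels were compared on classification accuracy and training time, the effect of imbalanced training samples on classification was analyzed, and the experimental results were examined from a theoretical standpoint. The experiments show that an SVM classifier with the RBF kernel filters spam comparatively well.

Keywords: support vector machine; spam filtering; kernel function; feature selection
Article ID: 1001-9081(2008)02-0424-04
Received: 2007-08-15
Revised: 2007-10-25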
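
The abstract compares linear, polynomial, and RBF kernels but does not give their formulas or parameter settings. As a minimal sketch, assuming the textbook definitions of these kernels, they can be written in plain Python as below. The function names and the default values of `degree`, `gamma`, and `coef0` are illustrative assumptions, not the settings used in the paper.

```python
import math

def linear_kernel(x, y):
    """Linear kernel: the plain dot product <x, y>."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=2, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree.
    The default parameter values are assumptions for illustration."""
    return (gamma * linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel: exp(-gamma * ||x - y||^2).
    The default gamma is an assumption for illustration."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Two toy mail vectors (hypothetical feature weights).
u = [1.0, 2.0]
v = [3.0, 4.0]
print(linear_kernel(u, v))
print(polynomial_kernel(u, v))
print(rbf_kernel(u, v))
```

The linear kernel needs no parameters, which is one reason it is usually the cheapest of the three to train. The polynomial and RBF kernels each add parameters that have to be tuned.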

Application of various kernel function based SVM in spam filtering
DONG Jian-she, YUAN Zhan-ting, ZHANG Qiu-yu. Application of various kernel function based SVM in spam filtering[J]. Journal of Computer Applications, 2008, 28(2): 424-427.
Authors: DONG Jian-she, YUAN Zhan-ting, ZHANG Qiu-yu
Affiliation: DONG Jian-she, YUAN Zhan-ting, ZHANG Qiu-yu (College of Computer and Communication, Lanzhou University of Technology, Lanzhou, Gansu 730050, China)
Abstract: Support vector machine (SVM) based spam filtering is briefly summarized. Mail vectors were constructed with the TF-IDF model and the Bernoulli model. The effect of the CHI dimensionality-reduction method on mail classification was tested in detail. Kernel-based SVMs were introduced into spam filtering, and the classification accuracy and training time of SVMs with linear, polynomial, and radial basis function (RBF) kernels were compared and analyzed. The analysis indicates that imbalance in the training samples has a large effect on classification accuracy and on the false positive ratio.
Keywords: support vector machine; spam filtering; kernel function; feature selection
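
The abstract names CHI as the dimensionality-reduction step without defining it. The sketch below assumes CHI means the chi-square statistic commonly used for term selection in text classification, computed from a 2x2 term/class contingency table, with each mail represented Bernoulli-style as a set of terms that are either present or absent. The function names, the toy data, and the top-`k` selection rule are hypothetical and are not taken from the paper.

```python
def chi_square(a, b, c, d):
    """Chi-square score of a term for one class.
    a: spam mails containing the term     b: non-spam mails containing it
    c: spam mails without the term        d: non-spam mails without it
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # term or class is degenerate; carries no signal
    return n * (a * d - c * b) ** 2 / denom

def select_terms(docs, labels, k):
    """Return the k terms with the highest chi-square score.
    docs: list of sets of terms (Bernoulli-style presence/absence)
    labels: list of bool, True meaning spam
    Ties are broken alphabetically to keep the result deterministic.
    """
    vocab = set().union(*docs)
    n_spam = sum(labels)
    n_ham = len(labels) - n_spam
    scores = {}
    for term in vocab:
        a = sum(1 for doc, spam in zip(docs, labels) if spam and term in doc)
        b = sum(1 for doc, spam in zip(docs, labels) if not spam and term in doc)
        scores[term] = chi_square(a, b, n_spam - a, n_ham - b)
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]

# Hypothetical toy corpus: two spam mails, two legitimate mails.
docs = [{"free", "win"}, {"free", "offer"}, {"meeting", "report"}, {"report", "free"}]
labels = [True, True, False, False]
print(select_terms(docs, labels, 2))
```

Only the selected terms would then be kept as vector dimensions before training the SVM. How many terms to keep is the quantity the paper reports testing, and it is left here as the free parameter `k`.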
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.