
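Bayesian Email Filtering Model Based on Chinese Metamorphic Word Matching
Cite this article: Wang Xia, Zheng Ning, Xu Ming, Chen Mo. Bayesian Email Filtering Model Based on Chinese Metamorphic Word Matching[J]. Computer Applications and Software (计算机应用与软件), 2010, 27(1): 105-107, 130.
Authors: Wang Xia, Zheng Ning, Xu Ming, Chen Mo
Affiliation: School of Computer, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China
Funding: Natural Science Foundation of Zhejiang Province (Y106176); Science and Technology Program of the Zhejiang Provincial Department of Science and Technology (2007C33058)
Abstract: To address Chinese spam in which feature words have been deliberately altered, this paper proposes a new Bayesian email filtering algorithm based on matching and restoring metamorphic feature words. The improved model automatically detects altered feature words in an email and restores them using the restoration algorithm for the corresponding variation type, so that altered feature words cannot escape matching. The algorithm improves classification accuracy on samples containing metamorphic feature words produced by pinyin substitution, homophone substitution, symbol insertion, and similar techniques. Experiments show that the improved filtering algorithm outperforms the ordinary Bayesian algorithm.

Keywords: Bayesian; spam filtering; metamorphic features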

BAYESIAN EMAIL FILTERING MODEL BASED ON CHINESE METAMORPHIC WORDS MATCHING
Wang Xia, Zheng Ning, Xu Ming, Chen Mo. BAYESIAN EMAIL FILTERING MODEL BASED ON CHINESE METAMORPHIC WORDS MATCHING[J]. Computer Applications and Software, 2010, 27(1): 105-107, 130.
Authors:Wang Xia  Zheng Ning  Xu Ming  Chen Mo
Affiliation: School of Computer, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China
Abstract: To address Chinese spam in which characteristic words have been varied, this paper presents a new Bayesian email filtering algorithm based on matching and restoring metamorphic characteristic words. The improved model automatically detects varied characteristic words in an email and restores them with the recovery algorithm for the corresponding variation type, which prevents the varied characteristic words from escaping matching. The algorithm improves the classification accuracy of the samples...
Keywords: Bayesian; Spam mail filtering; Metamorphic characteristic
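
The abstract describes a two-stage pipeline: restore varied characteristic words (pinyin substitution, homophone substitution, symbol insertion), then classify with a Bayesian filter. The paper's actual algorithms are not given on this page, so the following is only a minimal sketch under stated assumptions. The lexicon `FEATURE_WORDS`, the variant table `VARIANTS`, the gap pattern `_GAP`, and the Bernoulli naive Bayes classifier with Laplace smoothing are all hypothetical choices made for illustration, not the authors' method.

```python
import math
import re
from collections import Counter

# Hypothetical feature-word lexicon (illustrative, not from the paper).
FEATURE_WORDS = ("发票", "免费", "优惠")
# Hypothetical variant table: pinyin and homophone forms -> canonical word.
VARIANTS = {"fapiao": "发票", "发漂": "发票", "mianfei": "免费", "免废": "免费"}
# Up to three inserted symbols (anything that is not CJK or alphanumeric).
_GAP = r"[^\u4e00-\u9fffA-Za-z0-9]{0,3}"


def restore(text):
    """Rewrite known variants, with optional inserted symbols, to canonical words."""
    table = {word: word for word in FEATURE_WORDS}
    table.update(VARIANTS)
    for variant, word in table.items():
        # Allow inserted symbols between every pair of adjacent characters.
        pattern = _GAP.join(re.escape(ch) for ch in variant)
        text = re.sub(pattern, word, text, flags=re.IGNORECASE)
    return text


def features(text):
    """Return the set of feature words present after restoration."""
    restored = restore(text)
    return {word for word in FEATURE_WORDS if word in restored}


class BernoulliNaiveBayes:
    """Bernoulli naive Bayes over FEATURE_WORDS with Laplace smoothing.

    Assumes at least one training message per label before classifying.
    """

    LABELS = ("spam", "ham")

    def __init__(self):
        self.doc_counts = Counter()
        self.word_counts = {label: Counter() for label in self.LABELS}

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(features(text))

    def log_score(self, text, label):
        present = features(text)
        n = self.doc_counts[label]
        score = math.log(n / sum(self.doc_counts.values()))
        for word in FEATURE_WORDS:
            p = (self.word_counts[label][word] + 1) / (n + 2)
            score += math.log(p if word in present else 1.0 - p)
        return score

    def classify(self, text):
        return max(self.LABELS, key=lambda label: self.log_score(text, label))


model = BernoulliNaiveBayes()
model.train("代开发票，优惠", "spam")
model.train("免费领取发票", "spam")
model.train("明天开会", "ham")
model.train("项目进度报告", "ham")
print(restore("代开 fa-piao，mian fei"))
print(model.classify("代开 fa-piao，mian fei"))
```

Restoring before feature extraction means a message written as `fa-piao` contributes the same evidence as one written as `发票`, which is the effect the abstract attributes to the improved model. A real system would need a far larger variant table and a proper Chinese word segmenter.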
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.