Spam filtering method based on improved KNN (基于改进K近邻的垃圾邮件过滤技术)
Cite as: TIAN Ze, YAN Song-yuan, XU Jing-dong. Spam filtering method based on improved KNN[J]. Computer Engineering and Applications (计算机工程与应用), 2007, 43(25): 178-181
Authors: TIAN Ze (田泽), YAN Song-yuan (颜松远), XU Jing-dong (徐敬东)
Affiliation: School of Information Science and Technology, Nankai University, Tianjin 300071, China (all three authors)
Abstract: This paper presents a fast text classification algorithm based on the K-nearest-neighbor (KNN) principle. The algorithm retains the strong classification performance of the original KNN algorithm while improving classification efficiency by compressing the training samples and eliminating comparisons between similarities. Experiments show that, in an e-mail filtering system, the new algorithm classifies better than the binary independence (Bernoulli) model and the multinomial model, both of which are based on the naive Bayes classifier, while its classification time complexity is comparable to theirs, so it can be applied to real-time e-mail filtering.

Keywords: fast KNN algorithm; text classification; spam filtering
Article ID: 1002-8331(2007)25-0178-04
Revised: 2006-12-01
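
The abstract does not say how the training samples are compressed, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that each class is compressed into a single averaged term vector, so that a message is scored once per class instead of being compared with, and ranked against, every training sample. The function names (`tf_vector`, `compress`, `classify`) and the toy training data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tf_vector(text):
    """Return an L2-normalised term-frequency vector for a whitespace-tokenised text."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {term: v / norm for term, v in counts.items()} if norm else {}


def compress(training):
    """Collapse each class into one averaged vector (assumed compression scheme)."""
    summed = defaultdict(lambda: defaultdict(float))
    sizes = Counter()
    for text, label in training:
        sizes[label] += 1
        for term, weight in tf_vector(text).items():
            summed[label][term] += weight
    # Divide by the class size so that larger classes are not favoured.
    return {
        label: {term: w / sizes[label] for term, w in vec.items()}
        for label, vec in summed.items()
    }


def classify(text, prototypes):
    """Score the message against each class vector and return the best label."""
    query = tf_vector(text)
    scores = {
        label: sum(w * proto.get(term, 0.0) for term, w in query.items())
        for label, proto in prototypes.items()
    }
    return max(scores, key=scores.get)


# Toy data, invented for illustration only.
training = [
    ("cheap pills buy now", "spam"),
    ("win money now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project report attached", "ham"),
]
prototypes = compress(training)
print(classify("buy cheap pills", prototypes))
```

Because the vectors are normalised, the dot product with a class vector equals the mean cosine similarity to every sample in that class. Classification cost therefore depends on the number of classes rather than the number of training messages, which is the kind of saving the abstract attributes to sample compression.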
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.