Research on the Application of Statistical Learning Theory to Email Classification |
| |
Affiliation: | Key Laboratory of Intelligent Computing and Signal Processing, Ministry of Education, Anhui University |
| |
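Abstract: | Classification, and automatic text classification in particular, has long been a research focus and a core technology in machine learning and data mining. Methods such as naive Bayes and KNN have received wide attention and developed rapidly in recent years. Building on statistical learning theory, this paper presents a text classification algorithm based on the support vector machine (SVM) method and designs a corresponding spam filtering system. Experiments show that, compared with the naive Bayes method, the algorithm greatly improves classification accuracy and recall, and is worth applying more widely.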
|
Keywords: | machine learning; text classification; spam |
Research and Design of a Spam Filtering System Based on Statistical Learning Theory |
| |
Authors: | TANG Wei, CHENG Jia-xing, JI Xia |
| |
Abstract: | Classification is one of the most important research fields in data mining and machine learning. In recent years, automatic text categorization has been studied extensively and has progressed rapidly. This paper proposes an SVM-based text categorization algorithm grounded in statistical learning theory and designs a corresponding spam email filtering system. A comparison with naive Bayes demonstrates the validity of the system. Finally, some future research directions are given. |
| |
Keywords: | machine learning; text classification; spam |
This article is indexed in CNKI and other databases. |
|