1.
2.
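Research on intelligent spam filtering based on multiple Bayesian networks. Total citations: 2 (self-citations: 0; citations by others: 2)
After analyzing the problems that arise when the naive Bayes method is used for automatic spam filtering, this paper proposes a new automatic spam filtering method based on multiple Bayesian networks. The method uses several classifiers, each built from a Bayesian network, to classify an email simultaneously; the email is judged to be spam if and only if every classifier judges it to be spam. Running several classifiers together and applying a classification threshold reduces, to some extent, the chance of misclassifying useful email as spam. The method also introduces a dynamic learning mechanism that can supplement the training samples during classification, so that it can accommodate the classification criteria of different users.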
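The unanimous-vote rule described in this abstract can be illustrated with a minimal sketch. It does not reproduce the paper's Bayesian-network classifiers: the `SpamScorer` type, the two toy scorers, and the 0.9 threshold are all hypothetical stand-ins chosen only to show how requiring agreement from every classifier protects legitimate email.

```python
from typing import Callable, Sequence

# Hypothetical type: a scorer maps an email's text to an estimated spam probability.
SpamScorer = Callable[[str], float]


def is_spam_unanimous(email: str, scorers: Sequence[SpamScorer],
                      threshold: float = 0.9) -> bool:
    """Flag the email as spam only if every scorer reaches the threshold."""
    if not scorers:
        # With no classifiers available, err on the side of keeping the email.
        return False
    return all(score(email) >= threshold for score in scorers)


# Toy scorers (assumptions for illustration, not Bayesian networks).
def keyword_scorer(email: str) -> float:
    return 0.95 if "free money" in email.lower() else 0.05


def link_scorer(email: str) -> float:
    return 0.95 if "http://" in email.lower() else 0.10
```

With these toy scorers, a message containing both a suspicious phrase and a link is flagged, while a message that trips only one scorer is kept, which is the behavior the abstract credits with reducing false positives.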
3.
4.
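To address the fuzziness of the information contained in email and the asymmetric cost of misclassifying legitimate email versus spam, this paper proposes an email filtering method based on a fuzzy support vector machine with dual memberships. Assigning each sample its own pair of membership values yields an optimal classifier and improves filtering accuracy. Simulation experiments show that the method effectively reduces the misclassification of legitimate email as spam while maintaining high accuracy.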
5.
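In spam filtering, feature words contribute differently to the classification of legitimate email and spam. By defining a classification contribution ratio coefficient, this work applies the idea of feature-word classification contribution to both feature selection and the design of a naive Bayes filter. Experiments on an English corpus show that the approach effectively improves the filter's ability to recognize legitimate email and spam, and lowers its misclassification rate for both.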
6.
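Building on the structure of an intelligent decision support system, this paper proposes a new email filtering model and discusses key issues in Chinese spam filtering, including Chinese word segmentation and the updating of the spam feature knowledge base. An "Intelligent Email Filtering System (IEFS)" was developed, which keeps the spam misclassification rate under a degree of control and effectively curbs the spread of spam.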
7.
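To address filters' misclassification of legitimate email, this paper proposes an improved spam filtering algorithm. The algorithm improves the conditional-entropy estimation used in information gain and combines it with minimum-risk Bayesian decision-making. Experiments were run on an English corpus, and the algorithm was evaluated by recall and precision. The results show that the improved method increases the filter's ability to recognize legitimate email, reduces the misclassification of legitimate email, and lowers users' losses.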
8.
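Building on the structure of an intelligent decision support system, this paper proposes a new email filtering model and discusses key issues in Chinese spam filtering, including Chinese word segmentation and the updating of the spam feature knowledge base. An "Intelligent Email Filtering System (IEFS)" was developed, which keeps the spam misclassification rate under a degree of control and effectively curbs the spread of spam.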
9.
10.
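This paper analyzes the application of Bayesian classification to Chinese spam filtering and proposes a spam filtering technique based on minimum-risk Bayesian decision-making: by choosing an appropriate loss function, it reduces the misclassification of legitimate email as far as possible. Experimental results show that the method is feasible and performs well.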
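As a rough illustration of how a loss function shifts a Bayesian spam decision, the sketch below compares the expected cost of the two possible decisions and picks the cheaper one. The abstract does not give its loss function; the cost values here (a false positive costing nine times as much as a false negative) and the function name `min_risk_decision` are assumptions for illustration only.

```python
def min_risk_decision(p_spam: float,
                      cost_false_positive: float = 9.0,
                      cost_false_negative: float = 1.0) -> str:
    """Choose the label with the lower expected loss.

    p_spam is the posterior probability that the email is spam.
    The cost values are hypothetical, not taken from the paper.
    """
    # Deciding "spam" is wrong only when the email is actually legitimate.
    risk_if_spam = (1.0 - p_spam) * cost_false_positive
    # Deciding "ham" is wrong only when the email is actually spam.
    risk_if_ham = p_spam * cost_false_negative
    return "spam" if risk_if_spam < risk_if_ham else "ham"
```

Under these assumed costs an email is flagged only when its spam probability exceeds 9 / (9 + 1) = 0.9, so a message with probability 0.8 is still delivered. Raising the false-positive cost pushes that threshold higher, which is how the choice of loss function reduces the misclassification of legitimate email.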
11.
A new technique for managing and disseminating Web-based email prefetches messages and generates dynamic pages, displaying them at the network edge. Compared to other popular Web-based email servers, the prefetching and caching emails (PACE) prototype shows improved performance with respect to user-perceived latency. Additionally, PACE's centralized neural-network-based personalized spam filter will filter spam and viruses at the server's origin, thus saving bandwidth. Another major concern for users is email accounts being clogged with spam. Spam filters can be classified as server-side or client-side. Server-side filters are integrated with email servers and filter out spam at the server end.
12.
As the importance of email increases, the amount of malicious email is also increasing, so the need for malicious email filtering is growing. Since it is more economical to combine commodity hardware, consisting of a medium server or PC, with a virtual environment to use as a single server resource and filter malicious email using machine learning techniques, we used a Hadoop MapReduce framework and Naïve Bayes for malicious email filtering. Naïve Bayes was selected because it is one of the top machine learning methods (Support Vector Machine (SVM), Naïve Bayes, K-Nearest Neighbor (KNN), and Decision Tree) in terms of execution time and accuracy. Malicious email was filtered in two ways: with MapReduce programming using Naïve Bayes, a supervised machine learning method, in a Hadoop framework with optimized performance, and with a Python program applying Naïve Bayes in a bare metal server environment without Hadoop. According to a comparison of the accuracy and prediction error rates of the two methods, the Hadoop MapReduce Naïve Bayes method improved the accuracy of spam and ham email identification by a factor of 1.11 and the prediction error rate by a factor of 14.13 compared to the non-Hadoop Python Naïve Bayes method.
13.
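Spam handling is a very important function of email services. After representing a standard email collection in the vector space model and applying dimensionality reduction, this paper uses a neural network ensemble to build an email classifier for filtering. Experiments on a spam corpus show that the method filters spam well.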
14.
15.
Traditional classification methods assume that the training and the test data arise from the same underlying distribution.
However, in several adversarial settings, the test set is deliberately constructed in order to increase the error rates of
the classifier. A prominent example is spam email, where words are transformed to get around word-based features embedded in
a spam filter.
16.
Email spam filtering is typically treated as a binary classification problem that can be solved by machine learning algorithms. We argue that a three-way decision approach gives users a more meaningful way to handle their incoming emails cautiously. A three-way spam filtering system produces three email folders instead of two: a suspected folder is added so that users can examine suspicious emails further, thereby reducing the chances of misclassification. Unlike existing ternary email spam filtering systems, we focus on two issues that are less studied: the computation of the thresholds required to define the three email categories, and the interpretation of the cost-sensitive characteristics of spam filtering. Instead of supplying the thresholds based on an intuitive understanding of the levels of tolerance for errors, we systematically calculate the thresholds based on the decision-theoretic rough set model. A loss function is interpreted as the costs of making classification decisions. The decision with the minimum overall cost is made. Experimental results show that the new approach reduces the error rate of misclassifying a legitimate email as spam and demonstrates better cost-sensitive performance.
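The idea of choosing among three folders by minimum expected cost, and of deriving the two thresholds from the loss function rather than by intuition, can be sketched as follows. The loss table is hypothetical (the abstract gives no values), and the names `LOSSES`, `three_way_decision`, and `thresholds` are illustrative. The thresholds are computed simply as the probabilities at which the expected costs of two neighboring actions tie.

```python
# Hypothetical loss table (illustrative values, not taken from the paper).
# LOSSES[action] = (cost if the email is spam, cost if it is legitimate)
LOSSES = {
    "spam": (0.0, 10.0),      # filing a legitimate email as spam is very costly
    "suspected": (1.0, 2.0),  # deferring to the user has a small cost either way
    "inbox": (4.0, 0.0),      # letting spam into the inbox is moderately costly
}


def expected_cost(action: str, p_spam: float) -> float:
    """Expected loss of an action given the probability that the email is spam."""
    cost_if_spam, cost_if_legit = LOSSES[action]
    return p_spam * cost_if_spam + (1.0 - p_spam) * cost_if_legit


def three_way_decision(p_spam: float) -> str:
    """Pick the folder whose expected cost is lowest."""
    return min(LOSSES, key=lambda action: expected_cost(action, p_spam))


def thresholds() -> tuple:
    """Return (alpha, beta): the tie points between neighboring actions.

    alpha: above it, "spam" is cheaper than "suspected".
    beta:  below it, "inbox" is cheaper than "suspected".
    """
    (pp, pn) = LOSSES["spam"]
    (bp, bn) = LOSSES["suspected"]
    (np_, nn) = LOSSES["inbox"]
    alpha = (pn - bn) / ((pn - bn) + (bp - pp))
    beta = (bn - nn) / ((bn - nn) + (np_ - bp))
    return alpha, beta
```

With this assumed table the thresholds come out to 8/9 and 0.4, so an email with spam probability 0.6 lands in the suspected folder instead of being forced into a binary choice. Changing the losses moves the thresholds, which is the cost-sensitive behavior the abstract describes.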
17.
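Email has brought convenience as the Internet has developed, but the spam that came with it has brought endless annoyance. Addressing the development and current state of anti-spam technology, this paper analyzes the spam filtering techniques currently deployed or under study, as preparatory work for the project team's next step of improving spam filtering methods.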
18.
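Email has brought convenience as the Internet has developed, but the spam that came with it has brought endless annoyance. Addressing the development and current state of anti-spam technology, this paper analyzes the spam filtering techniques currently deployed or under study, as preparatory work for the project team's next step of improving spam filtering methods.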
19.
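Design and implementation of a spam filtering system based on improved Bayes. Total citations: 10 (self-citations: 3; citations by others: 7)
This paper designs and implements a spam filtering system based on an improved Bayesian method. When filtering email, the traditional Bayesian method treats a message as a vector space of unordered keywords, discarding the relationships between words and between sentences. This paper instead treats a message as a partially ordered collection in which sentences are ordered and the keywords within a sentence are unordered but related. This reduces the information lost by the traditional method, and the experimental results are better than those of the traditional method.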