
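An Email Sensitive Information Detection Algorithm
Cite this article: Liu Zihao, Zhuang Yi. An Email Sensitive Information Detection Algorithm[J]. Journal of Computer Research and Development, 2009, 46(Z1).
Authors: Liu Zihao  Zhuang Yi
Affiliation: College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016
Funding: Aviation Fund project; Natural Science Foundation of Jiangsu Province project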
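Abstract: To address the poor support for sensitive-information detection in current email security gateways, this paper studies the Winnow algorithm and the Markov model in depth and, building on the N-Gram language model, proposes an email feature selection method, Markov-Gram. The method selects feature items with the sentence as the unit, which not only preserves more semantic information but also effectively reduces the number of feature items, addressing the "curse of dimensionality". The paper also proposes a method for generating the initial weights used in Winnow training, which incorporates the structural characteristics of email as well as ...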

Keywords: information content security  email filtering  text classification

An Email Sensitive Information Detection Algorithm
Liu Zihao,Zhuang Yi.An Email Sensitive Information Detection Algorithm[J].Journal of Computer Research and Development,2009,46(Z1).
Authors:Liu Zihao  Zhuang Yi
Abstract: Current email security gateways provide poor support for sensitive-information detection. After an in-depth study of the Winnow algorithm and the Markov model, the authors propose an email feature selection method, Markov-Gram. The method treats each sentence in an email, rather than individual words, as the unit for selecting candidate features, which preserves more semantic information and reduces the dimensionality of an email's feature representation. A weight-vector initialization method for the Winnow training process is also proposed; it incorporates the characteristics of email and the secrecy level of keywords in order to reduce the amount of training required. Building on these two contributions, a training method for sensitive-email detection, EWMG, is proposed. Experiments show that the proposed methods outperform the standard Winnow detection algorithm.
Keywords: Winnow  Markov  information content security  email filtering  text classification
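
For readers unfamiliar with Winnow, the following is a minimal sketch of the textbook algorithm with multiplicative promotion and demotion, extended with an optional per-feature starting weight. It is not the authors' EWMG method or their Markov-Gram feature selection, neither of which is specified in this abstract. The function names, the toy vocabulary, the threshold choice (the vocabulary size), and the seeded weights of 4.0 are all assumptions made for illustration only.

```python
from typing import AbstractSet, Dict, List, Optional, Sequence, Tuple

# One training sample: (features present in the email, is the email sensitive?)
Sample = Tuple[AbstractSet[str], bool]


def score(weights: Dict[str, float], features: AbstractSet[str]) -> float:
    """Sum the weights of the features that are present (unknown features count 0)."""
    return sum(weights.get(f, 0.0) for f in features)


def train_winnow(
    samples: Sequence[Sample],
    vocabulary: List[str],
    initial_weights: Optional[Dict[str, float]] = None,
    alpha: float = 2.0,
    max_epochs: int = 50,
) -> Tuple[Dict[str, float], float, int]:
    """Textbook Winnow; returns (weights, threshold, number of corrective updates).

    initial_weights is a hypothetical hook standing in for a keyword-aware
    initialisation; features without an entry start at the usual 1.0.
    """
    seeds = initial_weights or {}
    weights = {f: seeds.get(f, 1.0) for f in vocabulary}
    threshold = float(len(vocabulary))  # assumed threshold: vocabulary size
    updates = 0
    for _ in range(max_epochs):
        mistakes = 0
        for features, is_sensitive in samples:
            predicted = score(weights, features) >= threshold
            if predicted == is_sensitive:
                continue
            mistakes += 1
            # Missed a sensitive email: promote active features.
            # Flagged a harmless email: demote active features.
            factor = alpha if is_sensitive else 1.0 / alpha
            for f in features:
                if f in weights:
                    weights[f] *= factor
        updates += mistakes
        if mistakes == 0:  # a full pass without errors: stop
            break
    return weights, threshold, updates


# Toy data, invented for this sketch.
VOCAB = ["budget", "confidential", "lunch", "meeting", "password"]
SAMPLES: List[Sample] = [
    ({"confidential", "budget"}, True),
    ({"password", "meeting"}, True),
    ({"lunch", "meeting"}, False),
    ({"lunch"}, False),
]

if __name__ == "__main__":
    _, _, uniform_updates = train_winnow(SAMPLES, VOCAB)
    _, _, seeded_updates = train_winnow(
        SAMPLES, VOCAB, initial_weights={"confidential": 4.0, "password": 4.0}
    )
    print("updates with uniform start:", uniform_updates)
    print("updates with seeded start:", seeded_updates)
```

On this toy data the seeded run needs fewer corrective updates than the uniform one, which is the kind of effect the abstract attributes to its initialization method; the sketch says nothing about how large that effect is on real email.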
This article is indexed in Wanfang Data and other databases.