
Chinese Encoding Charsets Blind Identification Algorithm for E-mail Content Filtering

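Cite this article:	Zhu Jia, Li Shenghong, Li Jianhua. Chinese Encoding Charsets Blind Identification Algorithm for E-mail Content Filtering[J]. Computer Engineering and Applications, 2005, 41(10): 131-133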

Authors:	Zhu Jia, Li Shenghong, Li Jianhua

Affiliations:	School of Information Security, Shanghai Jiao Tong University, Shanghai 200030; Department of Information and Electronic Engineering, Shanghai Jiao Tong University, Shanghai 200030

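Funding:	National 863 High-Tech Research and Development Program of China (No. 2003AA142160); Shanghai Science and Technology Commission project "'Soft-Damaged' File Repair System" (No. 035115015)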

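Abstract:	E-mail content filtering is a key topic in information security. This paper presents an algorithm that automatically identifies the encoding of Chinese text. It performs blind identification of the Chinese encodings commonly used in Internet communication today (GB2312, GBK, BIG5, UNICODE) and largely resolves the garbled-text problem. This lowers the false-alarm and missed-detection rates of e-mail content filtering systems and widens the range of messages they can process.
Keywords:	Chinese encoding; e-mail filtering; high-frequency characters; GB2312; GBK; BIG5; UNICODE; UTF
Article ID:	1002-8331-(2005)10-0131-03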
Chinese Encoding Charsets Blind Identification Algorithm for E-mail Content Filtering

Zhu Jia, Li Shenghong, Li Jianhua. Chinese Encoding Charsets Blind Identification Algorithm for E-mail Content Filtering[J]. Computer Engineering and Applications, 2005, 41(10): 131-133

Authors:	Zhu Jia, Li Shenghong, Li Jianhua

Abstract:	E-mail content filtering is an important subject in information security research. This paper introduces an algorithm for identifying multiple Chinese encoding charsets. The algorithm enables blind, automatic identification of most of the Chinese encoding charsets frequently used on the Internet (e.g., GB2312, GBK, BIG5 and UNICODE).

Keywords:	Chinese encoding charset; E-mail filtering; high frequency Chinese characters; GB2312; GBK; BIG5; UNICODE; UTF
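
The abstract does not describe the algorithm's internals, so the following is only a minimal sketch of how blind identification might work, assuming a trial-decode-and-score approach suggested by the keyword "high frequency Chinese characters". The candidate list `CANDIDATES`, the very small `HIGH_FREQ` table, and the functions `score` and `identify_encoding` are hypothetical names introduced for illustration, and UNICODE is assumed here to mean UTF-8 and UTF-16. It is not the authors' method.

```python
# Hypothetical sketch: the candidate list, the tiny high-frequency table,
# and the scoring rule are illustrative assumptions, not the paper's method.

# Candidate encodings, tried in order; earlier entries win ties.
CANDIDATES = ["utf-8", "gb2312", "gbk", "big5", "utf-16"]

# A small sample of very common simplified and traditional characters.
# A practical table would be far larger and derived from a corpus.
HIGH_FREQ = set(
    "的一是不了人我在有他这中大来上国个到说们为子和你地出道也时年得就那要下以生会学"
    "這來國個說們為時會學"
)


def score(text):
    """Return the fraction of characters in text that are high-frequency."""
    if not text:
        return 0.0
    hits = sum(1 for ch in text if ch in HIGH_FREQ)
    return hits / len(text)


def identify_encoding(data):
    """Guess the encoding of raw bytes, or return None if nothing decodes."""
    best_name, best_score = None, -1.0
    for name in CANDIDATES:
        try:
            text = data.decode(name)  # strict decoding rejects invalid bytes
        except UnicodeDecodeError:
            continue
        s = score(text)
        if s > best_score:  # strict '>' keeps the earlier candidate on ties
            best_name, best_score = name, s
    return best_name
```

In a filtering pipeline, a function like `identify_encoding` would run on the raw message body before keyword matching, so that the filter sees decoded text rather than garbled bytes. Because GB2312 is a subset of GBK, text that is valid under both is reported as `gb2312` in this sketch purely as a result of the candidate order.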
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
