
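Unstructured document sensitive data identification and abnormal behavior analysis
Cite this article:YU Bo, WANG Zhihai, SUN Yadong, XIE Fujin, AN Peng. Unstructured document sensitive data identification and abnormal behavior analysis[J]. CAAI Transactions on Intelligent Systems (智能系统学报), 2021, 16(5): 932-939.
Authors:YU Bo  WANG Zhihai  SUN Yadong  XIE Fujin  AN Peng
Affiliation:Beijing Wondersoft Technology Co., Ltd. (北京明朝万达科技股份有限公司), Beijing 100876
Abstract (translated from the Chinese):Classifying and grading data quickly and accurately within massive data sets, and rapidly identifying abnormal user behavior, are important research topics in data security. In data classification and grading, natural language processing has improved accuracy, but key breakthroughs are still needed on three problems: mixed Chinese language styles, the low accuracy of unsupervised learning, and the heavy sample-labeling workload of supervised learning. This paper proposes a multivariate Chinese language model and sample construction based on unsupervised algorithms to address these problems. In user abnormal behavior analysis, excessive dependence on samples leads to low recognition accuracy; this paper proposes using outlier detection to build an abnormal behavior sample library, which removes that dependence. To verify the feasibility of the methods, an experimental system was built and experiments were carried out; the results show that the proposed methods significantly improve the accuracy of data classification and grading and of abnormal behavior analysis.

Keywords:data security  artificial intelligence  classification and grading  language model  user abnormal behavior analysis  sample  natural language processing  supervised learning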

Unstructured document sensitive data identification and abnormal behavior analysis
YU Bo, WANG Zhihai, SUN Yadong, XIE Fujin, AN Peng. Unstructured document sensitive data identification and abnormal behavior analysis[J]. CAAI Transactions on Intelligent Systems, 2021, 16(5): 932-939.
Authors:YU Bo  WANG Zhihai  SUN Yadong  XIE Fujin  AN Peng
Affiliation:Beijing Wondersoft Technology Co., Ltd., Beijing 100876, China
Abstract:Classifying data quickly and accurately in massive data sets and quickly identifying abnormal user behavior are important research topics in data security. In data classification, natural language processing has improved classification accuracy, but breakthroughs are urgently needed on three problems: mixed Chinese language styles, the low accuracy of unsupervised learning, and the heavy sample-labeling workload of supervised learning. This paper proposes a multivariate Chinese language model and sample construction based on unsupervised algorithms to address these problems. In user abnormal behavior analysis, excessive dependence on samples leads to low recognition accuracy, so this paper proposes using outlier detection to build an abnormal behavior sample library. To verify the feasibility of the methods, an experimental system was built and experiments were carried out; the results show that the proposed methods significantly improve the accuracy of data classification and abnormal behavior analysis.
Keywords:data security  artificial intelligence  classification  language model  user abnormal behavior analysis  sample  natural language processing  supervised learning
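
The abstract says that outlier detection is used to build an abnormal behavior sample library, but it does not name the detection algorithm or the behavior features. The following is therefore only a minimal sketch under stated assumptions: it uses a median/MAD-based modified z-score as a stand-in detector, and the record layout (user ID, daily count of accessed documents), the function names, and the 3.5 threshold are all hypothetical and not taken from the paper.

```python
from statistics import median


def modified_z_scores(values):
    """Return the median/MAD-based modified z-score of each value."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All deviations collapse to zero spread; treat nothing as an outlier.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]


def build_abnormal_sample_library(records, threshold=3.5):
    """Split (user_id, count) records into normal and candidate-abnormal lists.

    The threshold of 3.5 is a commonly used rule of thumb for modified
    z-scores, assumed here for illustration only.
    """
    scores = modified_z_scores([count for _, count in records])
    normal, abnormal = [], []
    for record, score in zip(records, scores):
        if abs(score) > threshold:
            abnormal.append(record)
        else:
            normal.append(record)
    return normal, abnormal


# Hypothetical behavior log: documents accessed per user in one day.
records = [
    ("u01", 12), ("u02", 15), ("u03", 11), ("u04", 14),
    ("u05", 13), ("u06", 240), ("u07", 16),
]
normal, abnormal = build_abnormal_sample_library(records)
print(abnormal)
```

In this sketch the flagged records would seed the abnormal sample library without manual labeling, which is the general idea the abstract describes; a real system would presumably use richer, multi-dimensional behavior features and a detector suited to them.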