
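Research on Extraction Method of System Log Template
Cite this article: LIU Hong-Qi, CHEN Yuan-Ping, MA Jian-Hua. Research on Extraction Method of System Log Template[J]. Computer Systems & Applications, 2019, 28(10): 239-244.
Authors: LIU Hong-Qi, CHEN Yuan-Ping, MA Jian-Hua
Affiliations: Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100190, China; Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China; Fujian Longyan Tobacco Industrial Co. Ltd., Longyan 364021, China
Funding: New-Generation ARP Pilot Project (XXH13502-01)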
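Abstract: Extracting log templates is a highly effective way to process massive volumes of system logs. Taking Web system logs as the starting point, this paper extracts log templates with a template extraction method based on the signature tree and, building on it, studies and improves the log preprocessing and template expression generation steps. To address the complex structure common in system logs, a preprocessing method based on text similarity is used to classify log messages; a maximum template matching method is used to solve the low template matching degree caused by inconsistent log formats and word segmentation. Finally, the experiments on this template extraction method are evaluated; the results show that the method reaches an accuracy of 96.4% and that the template matching degree rises substantially.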

Keywords: system log; text similarity; log template; FP-tree; signature tree
Received: 2019/3/22
Revised: 2019/4/17

Research on Extraction Method of System Log Template
LIU Hong-Qi, CHEN Yuan-Ping and MA Jian-Hua. Research on Extraction Method of System Log Template[J]. Computer Systems & Applications, 2019, 28(10): 239-244.
Authors: LIU Hong-Qi, CHEN Yuan-Ping and MA Jian-Hua
Affiliation: Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100190, China; Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China; and Fujian Longyan Tobacco Industrial Co. Ltd., Longyan 364021, China
Abstract: Extracting log templates is a very effective way to handle massive system logs. Taking Web system logs as the entry point, this study extracts log templates using a signature tree model and, building on it, studies and improves the log preprocessing and template expression generation methods. To address the complex structure of system logs, a preprocessing method based on text similarity is adopted to classify log messages. A maximum template matching method is used to solve the low template matching degree caused by inconsistent log formats and word segmentation. Finally, the experiments on this log template extraction method are evaluated. The results show that the method achieves an accuracy of 96.4% and that the template matching degree increases greatly.
Keywords: syslog; text similarity; log template; FP-tree; signature tree