
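Heuristic online log parsing method based on part-of-speech tagging
Cite this article: Jiang Jinzhao, Fu Yuanyuan, Xu Jian. Heuristic online log parsing method based on part-of-speech tagging[J]. Application Research of Computers, 2024, 41(1): 217-221
Authors: Jiang Jinzhao  Fu Yuanyuan  Xu Jian
Affiliation: School of Computer Science and Engineering, Nanjing University of Science and Technology
Funding: Stable Support Project for National Defense Science and Technology Key Laboratories, National Defense Basic Scientific Research Program (WDZC20225250405); National Natural Science Foundation of China (61872186)
Abstract: Existing heuristic log parsing methods rely on log feature representations with weak discriminative power, which leads to low parsing accuracy and poor generalization. To address this, the paper proposes PosParser, a heuristic online log parsing method. The method uses function token sequences, derived from the concept of trigger words, as its feature representation. It includes a two-stage detection method that addresses the tendency of complex logs to be over-parsed, and a post-processing procedure for logs with variable-length parameters. PosParser achieves an average parsing accuracy of 0.952 on 16 real-world log datasets, showing that function token sequences discriminate well between logs and that PosParser parses effectively and robustly.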

Keywords: log analysis  log parsing  trigger word extraction  part-of-speech tagging  system operations and maintenance
Received: 2023-05-13
Revised: 2023-12-14

Heuristic online log parsing method based on part-of-speech tagging
Jiang Jinzhao, Fu Yuanyuan and Xu Jian. Heuristic online log parsing method based on part-of-speech tagging[J]. Application Research of Computers, 2024, 41(1): 217-221
Authors: Jiang Jinzhao  Fu Yuanyuan  Xu Jian
Affiliation: School of Computer Science & Engineering, Nanjing University of Science & Technology
Abstract: To address the low parsing accuracy and poor generalization caused by the insufficient discriminative power of the log feature representations used in existing heuristic log parsing methods, this paper proposes PosParser, a heuristic online log parsing method. The method uses the function token sequence (FTS), derived from the concept of trigger words, as its feature representation. It consists of a two-stage detection method for complex logs that are prone to over-parsing and a post-processing step for logs with variable-length parameters. PosParser achieves an average parsing accuracy of 0.952 on 16 real-life log datasets. The results demonstrate that FTS has adequate distinguishing ability for logs and that PosParser is effective and robust.
Keywords: log analysis   log parsing   trigger word extraction   part-of-speech tagging   system maintenance
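
The abstract does not spell out how a function token sequence is built, so the following is only a toy sketch of the general idea of online log grouping by a token-sequence key. It is not PosParser. The names `feature_key` and `ToyOnlineParser` are hypothetical, and a crude regular expression (tokens containing a digit, `/`, or `=` are treated as parameters) stands in for the part-of-speech tagging the paper uses. The two-stage detection and post-processing steps are not modelled.

```python
import re
from collections import defaultdict

# Assumption: a crude stand-in for part-of-speech tagging. Any token that
# contains a digit, "/" or "=" is treated as a parameter and dropped.
_PARAM = re.compile(r"[\d/=]")


def feature_key(message):
    """Return the tuple of tokens kept as the toy feature sequence."""
    return tuple(tok for tok in message.split() if not _PARAM.search(tok))


class ToyOnlineParser:
    """Groups log messages one at a time by their toy feature key."""

    def __init__(self):
        self.groups = defaultdict(list)

    def add(self, message):
        # Online step: compute the key and append to the matching group,
        # creating the group if this key has not been seen before.
        key = feature_key(message)
        self.groups[key].append(message)
        return key


parser = ToyOnlineParser()
for line in [
    "Connection from 10.0.0.1 closed",
    "Connection from 10.0.0.2 closed",
    "Failed to open /var/log/a.txt",
]:
    parser.add(line)
```

In this sketch the two "Connection" messages share one key and land in the same group, while the "Failed to open" message forms its own group. A real tagger would make different keep-or-drop decisions than the regular expression used here.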
