
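A Machine Learning-Based Method for Automatic Recognition of Logging Functions
Cite this article:JIA Zhou-yang, LIAO Xiang-ke, LIU Xiao-dong, LI Shan-shan, ZHOU Shu-lin, XIE Xin-wei. Logging function recognition based on machine learning technique[J]. Computer Engineering & Science, 2017, 39(1):111-117.
Authors:JIA Zhou-yang  LIAO Xiang-ke  LIU Xiao-dong  LI Shan-shan  ZHOU Shu-lin  XIE Xin-wei
Affiliation:1. College of Computer, National University of Defense Technology
Funding:National Natural Science Foundation of China (61379146, 61272483); Tencent university collaboration project "Research on Log Enhancement Techniques for Large-Scale Open-Source Software Oriented to Failure Detection"
Abstract:As software grows in scale, logs play an increasingly important role in failure detection. However, software logging currently lacks a unified standard and is often shaped by developers' personal habits, which makes automated log analysis in large-scale systems challenging. Recognizing logging functions is a precondition for log analysis and directly affects its results. This paper proposes a machine learning-based method to support automatic log recognition. Through a systematic analysis of widely used large-scale open-source software, it summarizes the main forms in which logging functions are written, extracts the features those forms have in common, and on that basis implements iLog, a machine learning-based tool for automatic log recognition. Experiments show that iLog's ability to identify logging functions is, on average, 76 times that of matching specific keywords, and ten-fold cross-validation gives iLog's results an F-Score of 0.93.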

Keywords:logging function  machine learning  static analysis  code quality  failure detection
Received:2015-06-15
Revised:2017-01-25

Logging function recognition based on machine learning technique
JIA Zhou-yang, LIAO Xiang-ke, LIU Xiao-dong, LI Shan-shan, ZHOU Shu-lin, XIE Xin-wei. Logging function recognition based on machine learning technique[J]. Computer Engineering & Science, 2017, 39(1):111-117.
Authors:JIA Zhou-yang  LIAO Xiang-ke  LIU Xiao-dong  LI Shan-shan  ZHOU Shu-lin  XIE Xin-wei
Affiliation:(College of Computer, National University of Defense Technology, Changsha 410073, China)
Abstract:As software continues to scale up, logging has become indispensable in failure diagnosis. Similar symptoms may be caused by different software bugs, and the most direct evidence is usually found in log messages. However, the development of most large-scale software follows developers' personal habits rather than a common specification, which makes log-related analysis difficult at scale. Recognizing logging functions is a precondition for log analysis and directly affects its results, yet it has received little attention in existing log-related work. To fill this gap, we propose a machine learning method for logging function recognition. By studying widely used software, we summarize three forms of logging functions, extract five common features, and implement iLog, an automated logging function recognition tool based on machine learning. Evaluations show that the recognition ability of iLog is 76 times that of keyword-based approaches. In addition, 10-fold cross-validation yields an average F-Score of 0.93.
Keywords:logging function  machine learning  static analysis  code quality  failure diagnosis  
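
The evaluation named in the abstract, 10-fold cross-validation with an averaged F-Score, can be illustrated with a minimal sketch. It assumes the balanced F-Score (F1), which the abstract does not specify, and the names `f_score`, `k_fold_indices`, `cross_validated_f_score` and the `train_fn` callback are hypothetical. This is not iLog's implementation, and the paper's five features are not reproduced here.

```python
from typing import Callable, List, Sequence, Tuple


def f_score(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Balanced F-Score (F1) for binary labels; returns 0.0 when undefined."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k (train, test) index pairs.

    Fold i tests on indices i, i + k, i + 2k, ... and trains on the rest.
    """
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        splits.append((train, folds[i]))
    return splits


def cross_validated_f_score(
    samples: Sequence,
    labels: Sequence[bool],
    train_fn: Callable[[list, list], Callable[[object], bool]],
    k: int = 10,
) -> float:
    """Average the per-fold F-Score of the classifier returned by train_fn.

    train_fn (hypothetical) takes training samples and labels and returns
    a function mapping one sample to True (logging function) or False.
    """
    scores = []
    for train, test in k_fold_indices(len(samples), k):
        model = train_fn([samples[i] for i in train], [labels[i] for i in train])
        predicted = [model(samples[i]) for i in test]
        scores.append(f_score(predicted, [labels[i] for i in test]))
    return sum(scores) / len(scores)
```

Here `samples` would be per-function feature vectors and `labels` would mark which functions are logging functions. Any classifier can be plugged in through `train_fn`.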