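A Changelog Topic Learning Model Combining an Attention Mechanism and a Siamese Neural Network
Cite this article: Zhang Xin, Huang Wenchao, Xiong Yan. Changelog topic learning model with attention mechanism and siamese neural network[J]. Application Research of Computers (计算机应用研究), 2023, 40(2): 349-353+393.
Authors: Zhang Xin, Huang Wenchao, Xiong Yan
Affiliations: School of Computer Science and Technology, University of Science and Technology of China (all three authors)
Funding: National Key R&D Program of China (2018YFB2100300, 2018YFB0803400); National Natural Science Foundation of China (61972369, 62102385); Anhui Provincial Natural Science Foundation (2108085QF262)
Abstract (translated from the Chinese): To mine changelog information further, the paper proposes an attention-based siamese bidirectional LSTM network model that classifies changelogs to annotate their topics and helps locate code defects. The model introduces a security-oriented word segmentation tool for changelog preprocessing, uses a bidirectional LSTM network to learn changelog semantics, and uses a siamese neural network to address the overfitting patterns inherent in changelogs and to expand the dataset with high-quality samples, improving generalization. Changelogs made up of multiple sentences are trained as sequences, and an attention mechanism distinguishes how much each sentence contributes. For some bug-fix changelogs, the work builds on LLVM tooling to generate a mapping table, searches it with the log content, and locates the defective module in the source code. Extensive experiments show that the proposed model generalizes well and improves accuracy, F1 score, and other metrics by nearly 10% over general-purpose text classification methods, giving good changelog classification and topic learning results.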

Keywords (translated from the Chinese): changelog; siamese neural network; attention mechanism; bidirectional LSTM
Received: 2022/7/16
Revised: 2022/9/21

Changelog topic learning model with attention mechanism and siamese neural network
Zhang Xin, Huang Wenchao and Xiong Yan. Changelog topic learning model with attention mechanism and siamese neural network[J]. Application Research of Computers, 2023, 40(2): 349-353+393.
Authors: Zhang Xin, Huang Wenchao and Xiong Yan
Affiliation: School of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui
Abstract: To mine changelog information further, this paper proposes an attention-based siamese Bi-LSTM network model that classifies changelogs for topic annotation and helps locate code defects. The model introduces a security-oriented word segmentation tool for changelog preprocessing, uses a Bi-LSTM network to learn the contextual semantics of changelogs, and uses a siamese neural network both to address the overfitting patterns inherent in changelogs and to expand the dataset with high-quality samples, which improves generalization. Changelogs composed of multiple sentences are trained as sequences, and the attention mechanism distinguishes the influence of individual sentences. For some changelogs in the defect-repair class, the paper extends the LLVM tool to generate a mapping table, searches it with the log content, and locates the defective module in the source code. Extensive experiments show that the model generalizes well and scores nearly 10% higher than general-purpose text classification methods on accuracy, F1, and other metrics, giving good changelog classification and topic learning results.
Keywords: changelog; siamese neural network; attention mechanism; Bi-LSTM
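
The abstract names two mechanisms that a small example can make concrete: attention weights that distinguish the influence of each sentence in a multi-sentence changelog, and a siamese arrangement in which two changelogs pass through the same encoder. The sketch below is illustrative only and is not the authors' implementation. It assumes each sentence has already been encoded as a fixed-size vector (standing in for Bi-LSTM outputs), uses dot-product scoring against a hypothetical `query` vector, and compares the two pooled vectors with cosine similarity. None of these choices is stated in the abstract.

```python
import math
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(sentence_vecs: Sequence[Vector], query: Vector) -> Vector:
    """Pool sentence vectors into one changelog vector.

    Each sentence is scored by its dot product with `query` (an assumed,
    hypothetical attention parameter); softmax turns the scores into weights.
    """
    weights = softmax([dot(query, v) for v in sentence_vecs])
    dim = len(sentence_vecs[0])
    return [sum(w * v[i] for w, v in zip(weights, sentence_vecs))
            for i in range(dim)]


def cosine(a: Vector, b: Vector) -> float:
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot(a, b) / (na * nb)


def siamese_similarity(log_a: Sequence[Vector],
                       log_b: Sequence[Vector],
                       query: Vector) -> float:
    """Encode both changelogs with the SAME pooling parameters (the siamese
    part), then compare the two encodings with cosine similarity."""
    return cosine(attention_pool(log_a, query), attention_pool(log_b, query))


if __name__ == "__main__":
    # Toy sentence vectors; real inputs would come from a trained encoder.
    fix_log = [[0.9, 0.1], [0.2, 0.8]]
    feature_log = [[0.1, 0.9], [0.3, 0.7]]
    query = [1.0, 0.0]
    print(siamese_similarity(fix_log, feature_log, query))
```

Sharing one `query` between the two branches is what makes the pair siamese: similar changelogs map to nearby vectors because both go through identical parameters. In a trained model those parameters would be learned rather than fixed by hand.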