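GKCI: An Improved Graph Neural Network-Based Method for Key Class Identification
Cite this article: ZHOU Chun-Ying, ZENG Cheng, HE Peng, ZHANG Yan. GKCI: An improved graph neural network-based method for key class identification [J]. Journal of Software, 2023, 34(6): 2509-2525.
Authors: ZHOU Chun-Ying  ZENG Cheng  HE Peng  ZHANG Yan
Affiliations: School of Computer and Information Engineering, Hubei University, Wuhan 430062, Hubei, China; School of Computer and Information Engineering, Hubei University, Wuhan 430062, Hubei, China; School of Cyber Science and Technology, Hubei University, Wuhan 430062, Hubei, China; School of Computer and Information Engineering, Hubei University, Wuhan 430062, Hubei, China; School of Cyber Science and Technology, Hubei University, Wuhan 430062, Hubei, China; Hubei Provincial Engineering Technology Research Center for Education Informatization, Wuhan 430062, Hubei, China
Funding: National Natural Science Foundation of China (62102136, 61902114, 61977021); Key Research and Development Program of Hubei Province (2021BAA184, 2021BAA188); Technology Innovation Special Project of Hubei Province (2019ACA144, 2020AEA008)
Abstract: Researchers treat the key classes of a software system as the starting point for understanding and maintaining it, and defects in key classes pose serious security risks to the system. Identifying key classes can therefore improve software reliability and stability. The usual approach abstracts the software system as a class dependency network and then computes an importance score for each node from predefined metrics and calculation rules. Key classes obtained from such non-trainable frameworks do not make full use of the structural information in the software network. To address this problem, this paper proposes a supervised key class identification method based on graph neural networks. First, the software system is abstracted as a class-level software network, the network embedding method node2vec is used to learn a representation vector for each class node, and a fully connected layer converts each representation vector into a score. Next, an improved graph neural network model aggregates the node scores while taking into account the direction and weight of the dependencies between class nodes. Finally, the model outputs a final score for each class node, and the nodes are sorted in descending order of score to identify the key classes. Comparative experiments against baseline methods on eight open-source Java software systems confirm the effectiveness of the method. The results show that, among the top 10 candidate key classes, the proposed method improves recall by 6.4% and precision by 3.5% over the state-of-the-art method.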

Keywords: key class identification  software network  graph neural network  software measurement
Received: 2022/9/5
Revised: 2022/10/10
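
The score aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not give the aggregation formula, so the weighted sum over incoming dependency edges, the mixing coefficient `alpha`, the edge direction convention, and the function names `aggregate_scores` and `rank_classes` are all assumptions made for illustration. The initial scores stand in for the output of the node2vec embedding followed by the fully connected layer.

```python
from collections import defaultdict


def aggregate_scores(scores, edges, alpha=0.5, rounds=1):
    """Propagate scalar importance scores along directed, weighted edges.

    scores: dict mapping class name -> initial score (a stand-in for the
            dense-layer output; hypothetical values in this sketch).
    edges:  list of (source, target, weight) tuples, read here as
            "source depends on target" (an assumed convention).
    alpha:  hypothetical coefficient mixing a node's own score with the
            weighted sum of the scores of the classes that depend on it.
    """
    # Group incoming edges by target so each node can see its dependents.
    incoming = defaultdict(list)
    for source, target, weight in edges:
        incoming[target].append((source, weight))

    current = dict(scores)
    for _ in range(rounds):
        updated = {}
        for node, own_score in current.items():
            # Weighted sum of the scores of nodes pointing at this node.
            neighbor_sum = sum(
                current[source] * weight for source, weight in incoming[node]
            )
            updated[node] = (1 - alpha) * own_score + alpha * neighbor_sum
        current = updated
    return current


def rank_classes(scores):
    """Return class names sorted by descending score (ties broken by name)."""
    return sorted(scores, key=lambda name: (-scores[name], name))


if __name__ == "__main__":
    initial = {"Core": 0.2, "Util": 0.5, "App": 0.9, "Cli": 0.4}
    dependencies = [("App", "Core", 3.0), ("Cli", "Core", 1.0), ("App", "Util", 1.0)]
    final = aggregate_scores(initial, dependencies)
    print(rank_classes(final))
```

In this toy network `Core` starts with the lowest score but ranks first after one aggregation round, because two classes depend on it and one of those dependencies carries a high weight.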

GKCI: An Improved GNN-based Key Class Identification Method
ZHOU Chun-Ying,ZENG Cheng,HE Peng,ZHANG Yan.GKCI: An Improved GNN-based Key Class Identification Method[J].Journal of Software,2023,34(6):2509-2525.
Authors:ZHOU Chun-Ying  ZENG Cheng  HE Peng  ZHANG Yan
Affiliation:School of Computer and Information Engineering, Hubei University, Wuhan 430062, China;School of Computer and Information Engineering, Hubei University, Wuhan 430062, China;School of Cyber Science and Technology, Hubei University, Wuhan 430062, China;School of Computer and Information Engineering, Hubei University, Wuhan 430062, China;School of Cyber Science and Technology, Hubei University, Wuhan 430062, China;Engineering Technology Research Center for Education Informatization, Hubei Province, Wuhan 430062, China
Abstract: Researchers use key classes as starting points for understanding and maintaining software. Defects in these key classes can pose a significant security risk to the software, so identifying key classes can improve its reliability and stability. Most existing methods are non-trainable: they compute a score for each node according to predefined metrics and calculation rules, and cannot fully utilize the structural information available in the software network. To solve this problem, we propose a supervised deep learning method based on graph neural network technology. First, we model the project as a software network, use the network embedding method node2vec to learn the node representations, and map each node representation to a score through a simple dense network. Second, we improve the aggregation function of Graph Neural Networks (GNNs) so that it aggregates importance scores instead of node embeddings, and we take the direction and weight of the edges between nodes into account when aggregating the scores of neighbor nodes. Finally, we rank the nodes in descending order of the predicted score output by the model. To evaluate the effectiveness of our method, we apply it to eight open-source Java software systems. The experimental results show that our method performs better than the benchmark methods. Among the top 10 key class candidates, our method achieves 6.4% higher recall and 3.5% higher precision than the state-of-the-art method.
Keywords:key class identification  software network  graph neural network  software measurement
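
The evaluation reported in the abstract measures recall and precision among the top 10 candidates. The sketch below shows the standard definitions of these two metrics for a ranked list. The function name `precision_recall_at_k` and the toy class names are hypothetical, and the paper's own evaluation script may differ in details such as tie handling.

```python
def precision_recall_at_k(ranked, true_keys, k=10):
    """Compute precision and recall over the first k entries of a ranking.

    ranked:    list of class names, best candidate first.
    true_keys: set of class names labeled as key classes (ground truth).
    k:         number of top candidates to inspect.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked[:k]
    hits = sum(1 for name in top_k if name in true_keys)
    # Precision: share of the k inspected candidates that are true key classes.
    precision = hits / k
    # Recall: share of all true key classes found among the top k.
    recall = hits / len(true_keys) if true_keys else 0.0
    return precision, recall


if __name__ == "__main__":
    ranking = ["A", "B", "C", "D", "E"]
    labeled = {"A", "C", "F"}
    print(precision_recall_at_k(ranking, labeled, k=4))
```

With the toy ranking above and `k=4`, two of the four inspected candidates are labeled key classes, and two of the three labeled key classes are recovered.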