
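Automatic Function Naming Based on Graph Convolutional Neural Networks
Cite this article: WANG Kun (王堃), LI Zheng (李征), LIU Yong (刘勇). Automatic Function Naming Based on Graph Convolutional Neural Networks [J]. Computer Systems & Applications (计算机系统应用), 2021, 30(8): 256-265.
Authors: WANG Kun (王堃)  LI Zheng (李征)  LIU Yong (刘勇)
Affiliation: College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
Funding: National Natural Science Foundation of China (61902015)
Abstract: Automatic function naming aims to generate a target function name for the input source code, improving code readability and speeding up software development; it is an important research task in software engineering. Existing machine-learning-based techniques mainly encode source code with sequence models and then generate the function name, but they suffer from the long-range dependency problem and have difficulty encoding code structure. To better extract structural and semantic information from programs, this paper proposes TrGCN (a Transformer and GCN based automatic method naming), a neural network model based on the Graph Convolutional Network (GCN). TrGCN uses the self-attention mechanism of the Transformer to alleviate the long-range dependency problem and adopts a Character-word attention mechanism to extract the semantic information of code. TrGCN introduces a GCN-based AST Encoder that enriches the information in the feature vectors of AST nodes and models the structural information of source code well. In the empirical study, three datasets of different sizes are used to evaluate the effectiveness of TrGCN. The results show that TrGCN generates function names better than the currently widely used models code2seq and Sequence-GNNs, improving the F1 score by an average of 5.2% and 2.1%, respectively.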

Keywords: deep learning  graph convolutional neural network  code representation
Received: 2020/11/23
Revised: 2020/12/22
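
The abstract says the AST Encoder is built on graph convolution but gives no layer equations. As a rough illustration only, the sketch below applies one step of the standard GCN propagation rule, ReLU(D^-1/2 (A + I) D^-1/2 H W), to a toy three-node AST. The rule is the generic one from the GCN literature, not necessarily the exact form used in TrGCN; the function names `gcn_layer` and `matmul`, the toy tree, and the identity feature and weight matrices are all assumptions made for this example.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adjacency, features, weights):
    """One generic graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adjacency)
    # Add self-loops so every node also keeps its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Symmetric normalization by the degrees of both endpoints.
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Aggregate neighbour features, apply the linear map, then ReLU.
    projected = matmul(matmul(norm, features), weights)
    return [[max(0.0, v) for v in row] for row in projected]


# Hypothetical toy AST: node 0 = MethodDeclaration, with two children,
# node 1 = Name and node 2 = ReturnStatement (edges treated as undirected).
adjacency = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# One-hot node features and identity weights, chosen only to keep the
# arithmetic easy to follow; a trained model would learn the weights.
hidden = gcn_layer(adjacency, identity, identity)
for row in hidden:
    print([round(v, 3) for v in row])
```

After this step each node's vector mixes in information from its AST neighbours, which is the sense in which a graph convolution can enrich node feature vectors. The two sibling nodes exchange nothing directly because no edge connects them.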

Automatic Function Naming Based on Graph Convolutional Network
WANG Kun, LI Zheng, LIU Yong. Automatic Function Naming Based on Graph Convolutional Network [J]. Computer Systems & Applications, 2021, 30(8): 256-265.
Authors: WANG Kun  LI Zheng  LIU Yong
Affiliation: College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
Abstract: Automatic method naming, an important task in software engineering, aims to generate a target function name for the input source code in order to improve code readability and accelerate software development. Existing machine-learning-based approaches mainly encode the source code with sequence models and then generate the function name. However, these approaches suffer from the long-range dependency problem and have difficulty encoding code structure. To better extract structural and semantic information from programs, we propose an automatic function naming method called TrGCN, based on the Transformer and the Graph Convolutional Network (GCN). TrGCN uses the self-attention mechanism of the Transformer to alleviate the long-range dependency problem and a Character-word attention mechanism to extract the semantic information of code. It also introduces a GCN-based AST Encoder that enriches the feature vectors of AST nodes and models the structural information of the source code well. Empirical studies are conducted on three Java datasets of different sizes. The results show that TrGCN outperforms the widely used models code2seq and Sequence-GNNs in automatic method naming, with F1 scores that are on average 5.2% and 2.1% higher, respectively.
Keywords: deep learning  Graph Convolutional Network (GCN)  code representation
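
The abstract reports F1 scores without saying how they are computed. A minimal sketch of one common convention for method-name prediction follows, assuming precision and recall are measured over case-insensitive sub-tokens of the predicted and reference names. Whether TrGCN's evaluation follows exactly this convention is an assumption. The helper names `subtokens` and `subtoken_f1` are hypothetical.

```python
import re
from collections import Counter


def subtokens(name):
    """Split a camelCase or snake_case identifier into lowercase sub-tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def subtoken_f1(predicted, reference):
    """Return (precision, recall, F1) over the sub-tokens of two names."""
    pred = Counter(subtokens(predicted))
    ref = Counter(subtokens(reference))
    # Multiset intersection counts each matching sub-token at most as often
    # as it occurs in both names.
    true_pos = sum((pred & ref).values())
    if true_pos == 0:
        return 0.0, 0.0, 0.0
    precision = true_pos / sum(pred.values())
    recall = true_pos / sum(ref.values())
    return precision, recall, 2 * precision * recall / (precision + recall)


# The prediction contains one extra sub-token ("file"), so precision drops
# to 2/3 while recall stays at 1.0.
precision, recall, f1 = subtoken_f1("getFileName", "getName")
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

This scores a single name pair; an evaluation over a test set may instead accumulate the matched and unmatched sub-token counts across all examples before computing precision, recall, and F1.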
This article is indexed in Wanfang Data (万方数据) and other databases.