
Survey of Deep Learning Applied in Code Representation (深度学习在代码表征中的应用综述)
Cite this article: XIE Chunli, LIANG Yao, WANG Xia. Survey of Deep Learning Applied in Code Representation[J]. Computer Engineering and Applications (计算机工程与应用), 2021, 57(20): 53-63. DOI: 10.3778/j.issn.1002-8331.2106-0368
Authors: XIE Chunli  LIANG Yao  WANG Xia
Affiliation: School of Computer Science and Technology, Jiangsu Normal University, Xuzhou, Jiangsu 221116, China
Abstract: Code representation is a technique for turning source code into numerical form: it maps code to a set of continuous real-valued vectors and extracts properties hidden inside the code, helping programmers generate or analyze code. It is a core technology and an active research topic for software engineering tasks such as code clone detection, code recommendation, and code plagiarism detection. Researchers have carried out a series of studies on code representation. By the way information is extracted from source code, approaches are divided into text-based, syntax-based, semantics-based, and functionality-based representations. By representation granularity, they are divided into word-based, statement-based, and function-based representations, among other levels. By representation method, they are divided into statistical models, natural-language-based models, and deep-learning-based models. This paper surveys recent progress in deep-learning-based code representation, and summarizes, compares, and analyzes existing work in terms of representation granularity, representation level, representation model, and application scenario. Finally, it analyzes and discusses future development trends in deep-learning-based code representation.

Keywords: deep learning  code representation  representation model  representation granularity
This article is indexed in Wanfang Data (万方数据) and other databases.