
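Survey of Intelligent Code Completion (智能代码补全研究综述)
Cite this article: YANG Bo (杨博), ZHANG Neng (张能), LI Shan-Ping (李善平), XIA Xin (夏鑫). Survey of Intelligent Code Completion [J]. Journal of Software (软件学报), 2020, 31(5): 1435-1453
Authors: YANG Bo (杨博)  ZHANG Neng (张能)  LI Shan-Ping (李善平)  XIA Xin (夏鑫)
Affiliation: College of Computer Science and Technology, Zhejiang University, Hangzhou 310007, Zhejiang, China; Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia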
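Abstract (translated from the Chinese): Code completion is one of the important functions of automated software development and a key component of most modern integrated development environments and source code editors. It provides instant predictions of class names, method names, keywords, and similar elements, assisting developers in writing programs and directly improving software development efficiency. In recent years, the volume of source code and data in open-source software communities has grown steadily and artificial intelligence has made remarkable progress, which has greatly advanced automated software development techniques. Intelligent code completion builds a language model from source code, learns the features of existing code from a corpus, and, based on the contextual code features at the position to be completed, retrieves the most similar matches in the corpus for recommendation and prediction. Compared with traditional code completion, intelligent code completion has become one of the popular directions in software engineering thanks to its high accuracy, multiple forms of completion, and ability to learn iteratively. Researchers have carried out a series of studies on intelligent code completion. According to how these methods represent and use source code information, they can be divided into two research directions: those based on programming language representation and those based on statistical language representation. The former is subdivided into three categories (token sequences, abstract syntax trees, and control/data flow graphs), and the latter into two categories (N-gram models and neural network models). Starting from the perspective of code representation, this paper reviews and summarizes recent research progress on code completion methods, mainly...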

Keywords: code completion  code representation  software development tools
Received: 2019-08-19
Revised: 2019-10-28

Survey of Intelligent Code Completion
YANG Bo, ZHANG Neng, LI Shan-Ping, XIA Xin. Survey of Intelligent Code Completion [J]. Journal of Software, 2020, 31(5): 1435-1453
Authors: YANG Bo  ZHANG Neng  LI Shan-Ping  XIA Xin
Affiliation: College of Computer Science and Technology, Zhejiang University, Hangzhou 310007, China; Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
Abstract: Code completion is one of the crucial functions of automated software development and an essential component of most modern integrated development environments and source code editors. It provides instant predictions of class names, method names, keywords, and similar elements, assisting developers in writing programs and directly improving the efficiency of software development. In recent years, the growing scale of source code and data in the open-source software community, together with outstanding progress in artificial intelligence, has greatly advanced automated software development. Intelligent code completion builds a language model for source code, learns the features of existing code from a corpus, and, based on the contextual code features at the position to be completed, retrieves the most similar matches in the corpus for recommendation and prediction. Compared with traditional code completion, intelligent code completion has become one of the popular directions in software engineering owing to its high accuracy, multiple forms of completion, and ability to learn iteratively. Researchers have conducted a series of studies on intelligent code completion. According to how these methods represent and use source code information, they can be divided into two research directions: programming language representation and statistical language representation. Programming language representation is further divided into three types: token sequences, abstract syntax trees, and control/data flow graphs. Statistical language representation is further divided into two types: N-gram models and neural network models. This paper takes the perspective of code representation and reviews and summarizes the research progress of code completion methods in recent years. 
The main contents include: (1) describing and classifying existing intelligent code completion methods according to code representation; (2) summarizing the experimental validation methods and performance evaluation metrics used in model evaluation; (3) summarizing the key issues in intelligent code completion; (4) discussing the future development of intelligent code completion.
Keywords: code completion  code representation  software development tool
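
The abstract describes intelligent code completion as building a language model from a corpus of source code and using the context at the completion position to rank candidate predictions, with N-gram models named as one statistical category. As a rough illustration only, the following is a minimal sketch of a bigram (2-gram) model over token sequences. The function names `train_bigram` and `complete` and the toy corpus are hypothetical and are not taken from the surveyed paper; methods covered by the survey use far larger corpora, longer contexts, and smoothing.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for every token, how often each other token directly follows it.

    `corpus` is a list of token sequences (hypothetical toy input).
    """
    model = defaultdict(Counter)
    for tokens in corpus:
        for prev, nxt in zip(tokens, tokens[1:]):
            model[prev][nxt] += 1
    return model


def complete(model, context, k=3):
    """Suggest up to k next tokens, ranked by how often they followed
    the last token of `context` in the training corpus."""
    if not context:
        return []
    followers = model.get(context[-1], Counter())
    return [token for token, _ in followers.most_common(k)]


# Toy corpus of already-tokenized code lines (illustrative assumption).
corpus = [
    ["for", "i", "in", "range", "(", "n", ")", ":"],
    ["for", "x", "in", "items", ":"],
    ["for", "i", "in", "range", "(", "10", ")", ":"],
]
model = train_bigram(corpus)
suggestions = complete(model, ["for", "i", "in"])
print(suggestions)
```

Here the only context used is the single preceding token, which is the simplest instance of the N-gram idea; an unseen context yields no suggestions, which is one reason practical N-gram models add smoothing or back-off.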
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.