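Survey of Code Similarity Detection Methods and Tools (代码相似性检测方法与工具综述)
Cite this article: ZHANG Dan, LUO Ping. Survey of Code Similarity Detection Methods and Tools[J]. Computer Science (计算机科学), 2020, 47(3): 5-10.
Authors: ZHANG Dan (张丹), LUO Ping (罗平)
Affiliation: School of Software, Tsinghua University, Beijing 100084; Key Laboratory of Information System Security (Tsinghua University), Ministry of Education, Beijing 100084
Abstract (translated from the Chinese): With the trend toward open-source code, code cloning improves code quality and lowers development cost, but it also affects, to some degree, the stability, robustness, and maintainability of software systems. Code similarity detection is therefore of great significance to the development of computing and information security. To counter the hazards brought by code cloning, academia and industry have proposed many code similarity detection methods. By how far they process the source code information, these methods fall into five categories: text-based, lexical-based, syntax-based, semantics-based, and metrics-based. Corresponding detection tools have also been developed. These tools achieve good detection results, but in the big data era they face a series of challenges brought by ever-growing data scale. This paper surveys code similarity detection methods and compares the five categories in detail; covering both traditional methods and machine learning techniques, it groups the detection tools by the detection method they implement; it evaluates the tools' detection performance against different evaluation criteria, summarizes the preferred tool for each detection method, and gives an outlook on future research directions in code similarity detection.

Keywords: Code clone; Clone detection; Clone evaluation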

Survey of Code Similarity Detection Methods and Tools
ZHANG Dan, LUO Ping. Survey of Code Similarity Detection Methods and Tools[J]. Computer Science, 2020, 47(3): 5-10.
Authors: ZHANG Dan, LUO Ping
Affiliation: School of Software, Tsinghua University, Beijing 100084, China; Key Laboratory of Information System Security (Tsinghua University), Ministry of Education, Beijing 100084, China
Abstract: Open-sourcing code has become a trend in the information technology field. While code cloning improves code quality and reduces software development cost to some extent, it also affects the stability, robustness, and maintainability of software systems. Code similarity detection therefore plays an important role in the development of computing and information security. To counter the hazards brought by code cloning, academia and industry have proposed many code similarity detection methods and developed corresponding tools. According to how far the source code is processed, these methods can be roughly divided into five categories: text-based, lexical-based, syntax-based, semantics-based, and metrics-based. The tools achieve good detection performance in many application scenarios, but they also face a series of challenges brought by ever-increasing data volumes in the big data era. This paper first introduces the code cloning problem and compares the five categories of detection methods in detail. It then classifies and organizes the currently available code similarity detection tools. Finally, it evaluates the detection performance of these tools against various evaluation criteria and discusses future research directions for code similarity detection.
Keywords: Code clone; Clone detection; Clone evaluation
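
To make the lexical-based category named in the abstract more concrete, here is a minimal sketch of one way such a comparison could work. It is not the method of any tool evaluated in the paper. It assumes Python source as input, abstracts identifiers and literals into placeholder tokens, and scores two snippets by the Jaccard overlap of their token 3-grams. The function names (`normalized_tokens`, `ngrams`, `token_similarity`) and the choice of 3-grams are illustrative assumptions.

```python
import io
import keyword
import tokenize

# Token types that carry no lexical content for this comparison.
_SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
         tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}


def normalized_tokens(source: str) -> list[str]:
    """Tokenize Python source, abstracting identifiers and literals.

    Hypothetical helper: renamed variables and changed constants map to
    the same placeholder, so they do not lower the similarity score.
    """
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIP:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")      # any identifier
        elif tok.type == tokenize.NUMBER:
            out.append("NUM")     # any numeric literal
        elif tok.type == tokenize.STRING:
            out.append("STR")     # any string literal
        else:
            out.append(tok.string)  # keywords and operators stay as-is
    return out


def ngrams(tokens: list[str], n: int = 3) -> set[tuple[str, ...]]:
    """Return the set of contiguous token n-grams."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def token_similarity(a: str, b: str, n: int = 3) -> float:
    """Jaccard similarity of the normalized token n-gram sets of a and b."""
    ga, gb = ngrams(normalized_tokens(a), n), ngrams(normalized_tokens(b), n)
    if not ga and not gb:
        return 1.0  # two snippets too short to form n-grams count as equal
    return len(ga & gb) / len(ga | gb)


original = "def add(x, y):\n    return x + y\n"
renamed = "def sum2(a, b):\n    # same logic, new names\n    return a + b\n"
unrelated = "for i in range(10):\n    print(i)\n"

print(token_similarity(original, renamed))
print(token_similarity(original, unrelated))
```

Because identifiers and comments are normalized away, the renamed copy scores as identical to the original, whereas a purely text-based comparison would see several differing lines. The sketch deliberately ignores indentation and program structure, which is where syntax-based and semantics-based methods would go further.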
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.