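Clone group mapping method based on improved vector space model
Cite this article: CHEN Zhuo, ZHANG Liping, WANG Huan, ZHANG Jiujie, WANG Chunhui. Clone group mapping method based on improved vector space model[J]. Journal of Computer Applications, 2016, 36(7): 2031-2037.
Authors: CHEN Zhuo  ZHANG Liping  WANG Huan  ZHANG Jiujie  WANG Chunhui
Affiliation: College of Computer and Information Engineering, Inner Mongolia Normal University, Hohhot 010022
Funding: National Natural Science Foundation of China (61363017, 61462071); Natural Science Foundation of Inner Mongolia (2014MS0613); Inner Mongolia Department of Education project (NJZY14039).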
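Abstract: To address the scarcity and low efficiency of mapping methods for Type-3 clone code, a mapping method based on an improved Vector Space Model (VSM) is proposed. The method introduces the improved VSM into clone code analysis, yielding a clone group mapping method that can effectively map Type-1, Type-2, and Type-3 clone code. First, clone group documents are preprocessed to obtain code documents with useless words removed, and feature items such as the file name and function name of each clone group document are extracted. Second, the word-frequency vector space of the clone groups is extracted and built, and clone group similarity is computed with the cosine algorithm. Then, the clone group mapping is constructed by matching on clone group similarity and feature items, giving the final mapping result. Experiments on five open-source software systems, with manual verification, show that the method keeps recall at no less than 96.1% and precision at no less than 97.1% at low time cost. The results demonstrate the feasibility of the method and provide data support for later software evolution analysis.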

Keywords: code clone; clone group mapping; Vector Space Model (VSM); feature item; word frequency
Received: 2015-12-28
Revised: 2016-03-11

Clone group mapping method based on improved vector space model
CHEN Zhuo, ZHANG Liping, WANG Huan, ZHANG Jiujie, WANG Chunhui. Clone group mapping method based on improved vector space model[J]. Journal of Computer Applications, 2016, 36(7): 2031-2037.
Authors:CHEN Zhuo  ZHANG Liping  WANG Huan  ZHANG Jiujie  WANG Chunhui
Affiliation: College of Computer and Information Engineering, Inner Mongolia Normal University, Hohhot, Nei Mongol 010022, China
Abstract: To address the scarcity and low efficiency of mapping methods for Type-3 clone code, a mapping method based on an improved Vector Space Model (VSM) was proposed. The improved VSM was introduced into clone code analysis to obtain a clone group mapping method that is effective for Type-1, Type-2, and Type-3 clones. Firstly, each clone group document was preprocessed to obtain a code document with useless words removed, and the file name, function name, and other feature items of the clone group document were extracted at the same time. Secondly, the word-frequency vector space of the clone groups was extracted and built, and the similarity between clone groups was calculated with the cosine algorithm. Then, the clone group mapping was constructed by matching on clone group similarity and feature items, giving the final mapping result. Five open-source software systems were tested and the results were verified manually. The proposed method keeps recall at no less than 96.1% and precision at no less than 97.1% at low time cost. The experimental results show that the proposed method is feasible and provides data support for later analysis of software evolution.
Keywords: code clone; clone group mapping; Vector Space Model (VSM); feature item; word frequency
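
The pipeline outlined in the abstract (word-frequency vectors, cosine similarity, then mapping by similarity plus feature-item matching) can be illustrated with a minimal Python sketch. It is not the authors' improved VSM: the stop-word list, the identifier tokenizer, the 0.8 threshold, the function names, and the rule that two groups must share at least one feature item are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter

# Hypothetical list of "useless words"; the paper's actual list is not given in the abstract.
STOP_WORDS = {"int", "void", "return", "if", "else", "for", "while"}


def term_frequencies(code):
    """Count identifier-like tokens in a code string, skipping stop words."""
    tokens = re.findall(r"[A-Za-z_]\w*", code)
    return Counter(t for t in tokens if t not in STOP_WORDS)


def cosine_similarity(a, b):
    """Cosine of the angle between two word-frequency vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def map_clone_groups(old_groups, new_groups, threshold=0.8):
    """Map each old clone group to its most similar new clone group.

    Each group is a dict with "code" (str) and "features" (set of file and
    function names). A candidate must share at least one feature item and
    reach the (assumed) similarity threshold to be mapped.
    """
    mapping = {}
    for old_id, old in old_groups.items():
        old_tf = term_frequencies(old["code"])
        best_id, best_sim = None, 0.0
        for new_id, new in new_groups.items():
            if not (old["features"] & new["features"]):
                continue  # no shared file or function name
            sim = cosine_similarity(old_tf, term_frequencies(new["code"]))
            if sim > best_sim:
                best_id, best_sim = new_id, sim
        if best_id is not None and best_sim >= threshold:
            mapping[old_id] = best_id
    return mapping


# Toy data: the new version of total() gains one statement (a Type-3 style change).
old_version = {
    "g1": {"code": "int total(int a, int b) { int s = a + b; return s; }",
           "features": {"math.c", "total"}},
}
new_version = {
    "h1": {"code": "int total(int a, int b) { int s = a + b; log(s); return s; }",
           "features": {"math.c", "total"}},
    "h2": {"code": "void log_msg(char *msg) { puts(msg); }",
           "features": {"log.c", "log_msg"}},
}
mapping = map_clone_groups(old_version, new_version)
print(mapping)  # {'g1': 'h1'}
```

In this toy case the added statement lowers the cosine similarity only slightly (to about 0.95), so the group is still mapped. Plain word-frequency vectors are sensitive to identifier renaming, which is one reason a sketch like this pairs the similarity score with feature-item matching rather than relying on either alone.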