
1.

董刘李引李娟《计算机工程与设计》2009,30(1)

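Requirements traceability provides strong support for software engineering, but establishing trace links manually is time-consuming, labor-intensive, costly, and hard to maintain. Dynamic requirements tracing applies information retrieval and related techniques to automatically establish trace links between requirements documents and work products, but it still has problems in areas such as tracing precision. Building on existing dynamic requirements tracing methods, this paper analyzes the causes of the precision problem, proposes a method that uses code comments to assist dynamic requirements tracing, improves a dynamic requirements tracing tool, and verifies experimentally that the method improves dynamic tracing results.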

2.

Code snippet recommendation based on a sequence-to-sequence model

闫鑫周宇黄志球《计算机科学与探索》2020,14(5):731-739

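During software development, developers often reuse code to improve development efficiency. Existing work typically uses traditional information retrieval techniques to recommend code. These methods suffer from a mismatch between the high-level intent of a natural-language query and the low-level implementation details of code. This paper proposes DeepCR, a code snippet recommendation method based on a sequence-to-sequence model. The method combines static program analysis with a sequence-to-sequence model to train a natural-language query generation model, generates queries for code snippets, and recommends snippets by computing a similarity score between the generated queries and the natural-language query entered by the developer. The code base is built from data on the Stack Overflow Q&A site, which ensures that the data is realistic. The method's effectiveness is validated by computing the mean reciprocal rank (MRR) and Hit@K of the recommendation results. Experimental results show that DeepCR outperforms existing work and effectively improves code snippet recommendation.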
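The two evaluation metrics named in this abstract, MRR and Hit@K, can be illustrated with a minimal sketch. It uses the standard textbook definitions; the function names and the toy snippet ids are hypothetical and are not taken from the DeepCR paper.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple

Run = Tuple[Sequence[Hashable], Set[Hashable]]  # (ranked results, relevant ids)


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is retrieved."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Run]) -> float:
    """Average the reciprocal rank over all queries."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


def hit_at_k(runs: Iterable[Run], k: int) -> float:
    """Fraction of queries with at least one relevant item in the top k."""
    runs = list(runs)
    if not runs:
        return 0.0
    hits = sum(1 for ranked, rel in runs if any(item in rel for item in ranked[:k]))
    return hits / len(runs)


# Hypothetical toy data: three queries, each with a ranked list of snippet ids.
runs = [
    (["s3", "s1", "s7"], {"s1"}),  # first relevant snippet at rank 2
    (["s2", "s5", "s9"], {"s2"}),  # first relevant snippet at rank 1
    (["s4", "s6", "s8"], {"s0"}),  # no relevant snippet retrieved
]
print("MRR:", mean_reciprocal_rank(runs))
print("Hit@1:", hit_at_k(runs, 1))
```

MRR rewards placing the first correct snippet near the top of the list, while Hit@K only asks whether any correct snippet appears within the first K results.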

3.

A traceability link recovery method based on syntactic analysis

王金水翁伟彭鑫《计算机研究与发展》2015,(3):729-737

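Software requirements traceability is widely recognized as a key factor in the success or failure of software projects. Because most information-retrieval-based requirements tracing methods depend heavily on the quality of the text in software artifacts, this paper proposes a dynamic requirements tracing method based on syntactic analysis. The method extracts from each artifact the index terms most likely to characterize it, and reduces the adverse effect of noise in artifacts on requirements tracing. To validate the method, the trace links established by dynamic requirements tracing methods using different sets of index terms were compared on several software artifacts of different types from different projects. Experimental results show that the syntactic-analysis-based method effectively improves the accuracy of trace links.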

4.

A requirements tracing method based on named entity recognition

王金水薛醒思唐郑熠《计算机应用研究》2016,33(1)

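To address the heavy dependence of text-based requirements tracing methods on text quality, this paper proposes a requirements tracing method that uses named entity recognition to label keywords in artifact documents. The method builds a named entity recognition model from the context of code entities, which solves the problem that abstract syntax trees and regular expressions cannot parse software artifacts that are not in source-code form. After the model identifies the code entities in the software artifacts, the method converts the artifacts into a document collection and performs semantic clustering, and finally creates requirements trace links between artifacts with a mapping algorithm. Experimental results show that, compared with tracing methods based on all terms and on high-weight terms, the method effectively improves the quality of tracing results.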

5.

An automatic tracing method from Chinese documents to source code based on version control

沈力刘洪星李勇华《计算机应用》2018,38(10):2996-3001

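Research on traceability between software documentation and source code makes wide use of information retrieval (IR) techniques, but because Chinese documents and source code are written in different languages, automatic tracing with traditional IR techniques yields low precision. To address this problem, an automatic tracing method from Chinese documents to source code based on version control is proposed. First, combining text-to-source-code heuristic rules, an IR method is used to compute similarity scores between text and source code. Then, the update information committed to the version control system during software development and maintenance is used to correct these scores. Finally, trace links between Chinese documents and source code are determined according to a set threshold. Experimental results show that the improved method achieves somewhat higher precision and recall than the traditional IR method, and that it can extract trace links that the traditional IR method misses.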
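The three steps in this abstract (IR score, correction from version-control history, threshold) can be sketched roughly as follows. This is only an illustration under strong assumptions: the abstract does not give the correction formula, so the additive `boost` per co-change, the `threshold` value, and all names below are hypothetical. The sketch also assumes the document terms have already been mapped into the same vocabulary as the code identifiers.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recover_links(doc_terms, code_terms, co_changes, boost=0.2, threshold=0.5):
    """Return {(doc, file): score} for pairs whose corrected score meets the threshold.

    Assumption: each commit that touched both the document and the file adds
    `boost` to the IR score, capped at 1.0. This is a stand-in for the
    paper's correction step, whose formula the abstract does not state.
    """
    links = {}
    for doc, d_terms in doc_terms.items():
        for path, c_terms in code_terms.items():
            # Step 1: plain IR similarity between document and source file.
            score = cosine(Counter(d_terms), Counter(c_terms))
            # Step 2: correct the score with version-control co-change counts.
            score = min(1.0, score + boost * co_changes.get((doc, path), 0))
            # Step 3: keep the pair only if it reaches the threshold.
            if score >= threshold:
                links[(doc, path)] = score
    return links


# Hypothetical toy data.
doc_terms = {"login.md": ["user", "login", "password"]}
code_terms = {
    "Auth.java": ["login", "password", "token"],
    "Session.java": ["session", "timeout", "user"],
    "Report.java": ["report", "export"],
}
co_changes = {("login.md", "Session.java"): 1}  # committed together once

links = recover_links(doc_terms, code_terms, co_changes)
print(sorted(links))
```

In this toy run, `Session.java` falls below the threshold on text similarity alone and is recovered only because of the co-change correction, which mirrors the abstract's claim that the method finds links missed by IR alone.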

6.

Migration suggestion recommendation for deprecated APIs based on Q&A platforms

奚耀国沈立炜赵文耘《计算机应用与软件》2022,(5):8-16+114

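Developers should promptly migrate away from deprecated APIs, which introduce hidden risks into programs. Incomplete documentation and the complex usage of potential replacement APIs make this work challenging. Developers are used to looking for discussions about migrating deprecated APIs on Q&A sites. However, because of the large number of posts, they often cannot find a suitable replacement API and code examples in a short time. To address this problem, a technique for recommending deprecated-API migration suggestions based on Q&A platforms is proposed. A migration suggestion consists of a text description and code examples related to the replacement API. Given a deprecated API, the technique searches Stack Overflow for discussion posts and extracts text and code examples from them to generate answer snapshots. It classifies the answer snapshots by replacement API, ranks them with reference to post attributes, and simplifies the code examples in the snapshots according to the replacement API. A migration suggestion recommendation tool was developed based on the technique, and experiments show that the tool significantly improves developers' efficiency in migrating deprecated APIs.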

7.

A study of a requirements traceability model for the pre-specification stage

于杨杨丹徐传运文俊浩《计算机科学》2006,33(B12):187-190

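Tracing requirements before they are specified can reduce requirements errors and helps with handling requirements changes and with process improvement in software organizations. Because traditional requirements tracing methods do not adequately support tracing at the pre-specification stage, this paper analyzes the intermediate products generated during requirements analysis and the relationships among them. Based on this analysis, it proposes a pre-specification requirements traceability model consisting of a process model and a data model: the process model describes the tracing process, and the data model describes the original requirements and the intermediate products. Finally, the model is applied to trace the fee-calculation requirements of an equipment rental system. The results show that the model can accurately trace requirements to their sources, detect requirements errors, omissions, and inconsistencies, and eliminate the effects of errors, thereby improving requirements quality and validating the model's effectiveness.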

8.

A survey of information-retrieval-based requirements tracing methods

《计算机应用与软件》2017,(10)

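As an important part of software process management, requirements tracing plays a major role in assuring system quality and coping with requirements changes. With requirements tracing, software engineers can discover dependencies between artifacts, assess requirements coverage, and compute the impact of requirements changes. As software projects grow more complex and the number of software artifacts increases, the automatic recovery and maintenance of trace links is attracting growing attention in industry. In recent years, a great deal of research has been done on automating requirements tracing with information retrieval. This paper surveys information-retrieval-based requirements tracing techniques and analyzes them in depth from three perspectives: technical improvements, supporting tools, and evaluation metrics. On this basis, it discusses development trends and research questions that need further study.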

9.

Context-aware personalized API recommendation

陈晨周宇王永超黄志球《计算机科学》2021,48(12):100-106

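During software development, developers who run into programming difficulties usually search for suitable APIs to complete their programming tasks. Context information and developer profiles play a crucial role in effective API recommendation, yet they have been largely overlooked. This paper therefore proposes a context-aware personalized API recommendation method. The method uses static program analysis to parse code files into abstract syntax trees, extracts information to build a code base, and models developers' API usage preferences. It then computes the semantic similarity between the developer's current query and the queries in the historical code base and retrieves the top-k similar historical queries. Finally, it re-ranks the APIs using query information, method-name information, context information, and the developer's API usage preferences, and recommends them to the developer. The method's effectiveness is validated with the MRR, MAP, Hit, and NDCG metrics by simulating different stages of a programming task. Experimental results show that the method recommends APIs better than the baseline methods and recommends APIs that better match what developers want.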
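Two of the metrics listed in this abstract, MAP and NDCG, are less self-explanatory than MRR and Hit. A minimal sketch with binary relevance follows; it uses one common convention (average precision normalized by the number of relevant items, NDCG with a base-2 logarithmic discount), which may differ in detail from what the paper used. The API names in the toy data are hypothetical.

```python
import math


def average_precision(ranked, relevant):
    """Mean of precision@rank taken at each rank where a relevant item appears."""
    hits = 0
    total = 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Average the per-query average precision over (ranked, relevant) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG@k: discounted gain divided by the ideal gain."""
    dcg = sum(
        1.0 / math.log2(position + 1)
        for position, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    ideal = sum(
        1.0 / math.log2(position + 1)
        for position in range(1, min(len(relevant), k) + 1)
    )
    return dcg / ideal if ideal else 0.0


# Hypothetical toy query: four recommended APIs, two of which are correct.
ranked = ["List.add", "Map.put", "List.sort", "Set.add"]
relevant = {"List.add", "List.sort"}
print("AP:", average_precision(ranked, relevant))
print("NDCG@4:", ndcg_at_k(ranked, relevant, 4))
```

Both metrics reward rankings that place all correct APIs early, not just the first one, which is why they complement MRR and Hit in recommendation studies.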

10.

Automatic generation of code semantic tags based on a software knowledge graph

邢双双刘名威彭鑫《软件学报》2022,33(11):4027-4045

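Code snippets in open-source and enterprise software projects and on software development websites are important software development resources. However, the high-level intent and topics of code reflected in many developers' code search needs are hard to match precisely with information retrieval techniques based on code text. Semantic tags that reflect the overall intent and topic of code therefore play a very important role in improving code search and aiding code comprehension. Existing tag generation techniques mainly target textual content or depend on historical data, and cannot meet the needs of large-scale code semantic annotation and of aiding search and comprehension. To address this problem, KGCodeTagger, a knowledge-graph-based method for automatically generating code semantic tags, is proposed. The method builds a software knowledge graph by extracting concepts and relations from API documentation and software development Q&A text, and uses it as the basis for tag generation. For a given piece of code, the method identifies and extracts general API calls or concept mentions and links them to the relevant concepts in the software knowledge graph. It then identifies other concepts related to the linked concepts as candidates and ranks them by diversity and representativeness to produce the final code semantic tags. Experiments evaluated each step of KGCodeTagger's knowledge graph construction, and the quality of the generated tags was evaluated by comparison with several existing baseline methods. The results show that the knowledge graph construction steps are reasonable and effective, and that the generated tags are high-quality and meaningful and help developers quickly understand the intent of code.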

11.

Applying a smoothing filter to improve IR-based traceability recovery processes: An empirical investigation

《Information and Software Technology》2013,55(4):741-754

Context: Traceability relations among software artifacts often tend to be missing, outdated, or lost. For this reason, various traceability recovery approaches based on Information Retrieval (IR) techniques have been proposed. The performances of such approaches are often influenced by “noise” contained in software artifacts (e.g., recurring words in document templates or other words that do not contribute to the retrieval itself). Aim: As a complement and alternative to stop word removal approaches, this paper proposes the use of a smoothing filter to remove “noise” from the textual corpus of artifacts to be traced. Method: We evaluate the effect of a smoothing filter in traceability recovery tasks involving different kinds of artifacts from five software projects, and applying three different IR methods, namely Vector Space Models, Latent Semantic Indexing, and Jensen–Shannon similarity model. Results: Our study indicates that, with the exception of some specific kinds of artifacts (i.e., tracing test cases to source code), the proposed approach is able to significantly improve the performances of traceability recovery, and to remove “noise” that simple stop word filters cannot remove. Conclusions: The obtained results not only help to develop traceability recovery approaches able to work in presence of noisy artifacts, but also suggest that smoothing filters can be used to improve performances of other software engineering approaches based on textual analysis.
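Of the three IR methods named in this abstract, the Jensen–Shannon similarity model is probably the least familiar. A minimal sketch follows, assuming one common formulation: each artifact is treated as a probability distribution over terms, and similarity is 1 minus the Jensen–Shannon divergence computed with base-2 logarithms. The abstract does not state the exact formulation the paper used, and the function names and toy artifacts are hypothetical.

```python
import math
from collections import Counter


def term_distribution(tokens, vocabulary):
    """Relative frequency of each vocabulary term in a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return [counts[t] / total if total else 0.0 for t in vocabulary]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_similarity(tokens_a, tokens_b):
    """1 - Jensen-Shannon divergence; 1.0 for identical term distributions."""
    vocabulary = sorted(set(tokens_a) | set(tokens_b))
    p = term_distribution(tokens_a, vocabulary)
    q = term_distribution(tokens_b, vocabulary)
    # The mixture m is non-zero wherever p or q is, so the KL terms are defined.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return 1.0 - divergence


# Hypothetical toy artifacts: a use case and two source files as token lists.
use_case = ["user", "login", "password", "login"]
auth_code = ["login", "password", "hash", "user"]
report_code = ["report", "export", "pdf"]
print(js_similarity(use_case, auth_code))
print(js_similarity(use_case, report_code))
```

With base-2 logarithms the divergence lies between 0 and 1, so the similarity does too, and artifacts that share no terms score 0.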

12.

DCTracVis: a system retrieving and visualizing traceability links between source code and documentation

Xiaofan Chen John Hosking John Grundy Robert Amor 《Automated Software Engineering》2018,25(4):703-741

It is well recognized that traceability links between software artifacts provide crucial support in comprehension, efficient development, and effective management of a software system. However, automated traceability systems to date have been faced with two major open research challenges: how to extract traceability links with both high precision and high recall, and how to efficiently visualize links for complex systems because of scalability and visual clutter issues. To overcome the two challenges, we designed and developed a traceability system, DCTracVis. This system employs an approach that combines three supporting techniques, regular expressions, key phrases, and clustering, with information retrieval (IR) models to improve the performance of automated traceability recovery between documents and source code. This combination approach takes advantage of the strengths of the three techniques to ameliorate limitations of IR models. Our experimental results show that our approach improves the performance of IR models, increases the precision of retrieved links, and recovers more correct links than IR alone. After having retrieved high-quality traceability links, DCTracVis then utilizes a new approach that combines treemap and hierarchical tree techniques to reduce visual clutter and to allow the visualization of the global structure of traces and a detailed overview of each trace, while still being highly scalable and interactive. Usability evaluation results show that our approach can effectively and efficiently help software developers comprehend, browse, and maintain large numbers of links.

13.

Design‐code traceability for object‐oriented systems

Giulio Antoniol Bruno Caprile Alessandra Potrich Paolo Tonella 《Annals of Software Engineering》2000,9(1-2):35-58

Traceability is a key issue to ensure consistency among software artifacts of subsequent phases of the development cycle. However, few works have so far addressed the theme of tracing object oriented (OO) design into its implementation and evolving it. This paper presents an approach to checking the compliance of OO design with respect to source code and support its evolution. The process works on design artifacts expressed in the OMT (Object Modeling Technique) notation and accepts C++ source code. It recovers an “as is” design from the code, compares the recovered design with the actual design and helps the user to deal with inconsistencies. The recovery process exploits the edit distance computation and the maximum match algorithm to determine traceability links between design and code. The output is a similarity measure associated to design‐code class pairs, which can be classified as matched and unmatched by means of a maximum likelihood threshold. A graphic display of the design with different green levels associated to different levels of match and red for the unmatched classes is provided as a support to update the design and improve its traceability to the code.
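The edit distance computation mentioned in this abstract can be illustrated with the classic Levenshtein algorithm. The sketch below covers only that ingredient, not the maximum match step or the maximum likelihood threshold, and it compares plain class names, which is an assumption: the abstract does not say which class properties the paper compares. The normalization into a 0 to 1 similarity and the class names are likewise hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute or keep
                )
            )
        previous = current
    return previous[-1]


def name_similarity(a: str, b: str) -> float:
    """Normalize edit distance into [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 - edit_distance(a, b) / longest if longest else 1.0


# Hypothetical design class names compared against code class names.
design_classes = ["AccountManager", "Invoice"]
code_classes = ["AccountMgr", "InvoiceImpl", "Logger"]
for design in design_classes:
    best = max(code_classes, key=lambda code: name_similarity(design, code))
    print(design, "->", best, round(name_similarity(design, best), 2))
```

Picking the best code class independently for each design class, as the loop does, is a greedy simplification; a maximum match algorithm would instead choose the pairing that is best over all classes at once.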

14.

Source code similarity detection based on multiple feature values

展佳俊赵逢禹艾均《计算机技术与发展》2021,(1)

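During software development, it is very common for developers to fulfill requirements through copy-and-paste or modular development. Both approaches can improve development efficiency, but they also lead to large amounts of identical or similar code in a software system. Large amounts of similar code make software maintenance much harder, and such code is also the most common target of refactoring. Source code similarity measurement uses a detection method to analyze how similar pieces of program source code are. The technique is applied in many areas, including code plagiarism detection, code clone detection, software intellectual property protection, and code reuse. To improve the accuracy of code similarity measurement, a source code similarity detection technique based on multiple feature values is proposed. Methods are constructed for extracting features from source code comments, signatures, code text statements, and structure, and a measurement model for source code similarity detection is given. Comparative experiments against Moss, an authoritative code similarity detection system, show that the method detects similar code more accurately.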

15.

Giulio Antoniol Bruno Caprile Alessandra Potrich Paolo Tonella 《Annals of Software Engineering》2000,9(1-4):35-58

Traceability is a key issue to ensure consistency among software artifacts of subsequent phases of the development cycle. However, few works have so far addressed the theme of tracing object oriented (OO) design into its implementation and evolving it. This paper presents an approach to checking the compliance of OO design with respect to source code and support its evolution. The process works on design artifacts expressed in the OMT (Object Modeling Technique) notation and accepts C++ source code. It recovers an “as is” design from the code, compares the recovered design with the actual design and helps the user to deal with inconsistencies. The recovery process exploits the edit distance computation and the maximum match algorithm to determine traceability links between design and code. The output is a similarity measure associated to design‐code class pairs, which can be classified as matched and unmatched by means of a maximum likelihood threshold. A graphic display of the design with different green levels associated to different levels of match and red for the unmatched classes is provided as a support to update the design and improve its traceability to the code. This revised version was published online in June 2006 with corrections to the Cover Date.

16.

Sourcerer: mining and searching internet-scale software repositories

Erik Linstead Sushil Bajracharya Trung Ngo Paul Rigor Cristina Lopes Pierre Baldi 《Data mining and knowledge discovery》2009,18(2):300-336

Large repositories of source code available over the Internet, or within large organizations, create new challenges and opportunities for data mining and statistical machine learning. Here we first develop Sourcerer, an infrastructure for the automated crawling, parsing, fingerprinting, and database storage of open source software on an Internet-scale. In one experiment, we gather 4,632 Java projects from SourceForge and Apache totaling over 38 million lines of code from 9,250 developers. Simple statistical analyses of the data first reveal robust power-law behavior for package, method call, and lexical containment distributions. We then develop and apply unsupervised, probabilistic, topic and author-topic (AT) models to automatically discover the topics embedded in the code and extract topic-word, document-topic, and AT distributions. In addition to serving as a convenient summary for program function and developer activities, these and other related distributions provide a statistical and information-theoretic basis for quantifying and analyzing source file similarity, developer similarity and competence, topic scattering, and document tangling, with direct applications to software engineering and software development staffing. Finally, by combining software textual content with structural information captured by our CodeRank approach, we are able to significantly improve software retrieval performance, increasing the area under the curve (AUC) retrieval metric to 0.92, roughly 10–30% better than previous approaches based on text alone. A prototype of the system is available at: . Erik Linstead, Sushil Bajracharya, and Trung Ngo have contributed equally to this work.

17.

Design-code traceability recovery: selecting the basic linkage properties

G. Antoniol B. Caprile A. Potrich P. Tonella 《Science of Computer Programming》2001,40(2-3):213-234

Traceability ensures that software artifacts of subsequent phases of the development cycle are consistent. Few works have so far addressed the problem of automatically recovering traceability links between object-oriented (OO) design and code entities. Such a recovery process is required whenever there is no explicit support of traceability from the development process. The recovered information can drive the evolution of the available design so that it corresponds to the code, thus providing a still useful and updated high-level view of the system.

Automatic recovery of traceability links can be achieved by determining the similarity of paired elements from design and code. The choice of the properties involved in the similarity computation is crucial for the success of the recovery process. In fact, design and code objects are complex artifacts with several properties attached. The basic anchors of the recovered traceability links should be chosen as those properties (or property combinations) which are expected to be maintained during the transformation of design into code. This may depend on specific practices and/or the development environment, which should therefore be properly accounted for.

In this paper different categories of basic properties of design and code entities will be analyzed with respect to the contribution they give to traceability recovery. Several industrial software components will be employed as a benchmark on which the performances of the alternatives are measured.

18.

Traceability between documentation and code based on software structure

杨丙贤刘超《计算机科学与探索》2014,(6):694-703

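Correctly establishing traceability between software documentation and code is very important for program comprehension, software maintenance, and related tasks. In recent years, most research on traceability between documentation and code has been based on textual lexical similarity and has not made full use of the structural information contained in documentation and code. To address this problem, a method is proposed that combines software structure information with information retrieval models to analyze traceability between documentation and code. By analyzing the structural information of documentation and code, the method improves preprocessing and optimizes the similarity computation, which improves the effectiveness of the overall approach. Experimental results show that the method improves both recall and precision over a purely information-retrieval-based method and extracts more traceability links.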

19.

A fine-grained defect localization method based on extended source code information

李晓卓卿笃军贺也平马恒太《软件学报》2022,33(11):4008-4026

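Information-retrieval-based (IR-based) defect localization builds a retrieval model from cross-language semantic similarity and locates source code errors from defect reports; it is intuitive and broadly applicable. However, traditional IR-based defect localization methods treat code as plain text and use only the lexical semantic information of source code, so in fine-grained defect localization they suffer from low accuracy caused by the sparse semantics of candidate code, and the usefulness of their results still needs improvement. By analyzing the relationship between code changes and the introduction of defects in program evolution scenarios, a fine-grained defect localization method based on extended source code information is proposed, which enriches source code semantics with both the explicit lexical semantic information of code and the implicit information from code execution. The method uses the semantically related context of candidate locations to enrich the amount of code, uses the structural semantics of the intermediate form of code execution to make fine-grained code distinguishable, and uses natural-language semantics to guide attention-based generation of code representations, establishing a semantic mapping between fine-grained code and natural language. The result is the fine-grained defect localization method FlowLocator. Experimental analysis shows that, compared with classic IR-based defect localization methods, the method significantly improves localization accuracy in terms of Top-N rank, mean average precision, and mean reciprocal rank.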