首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到19条相似文献,搜索用时 140 毫秒
1.
张佳琪  孙艳春  黄罡 《计算机应用》2022,42(11):3520-3526
目前有关API学习和代码复用的研究主要集中在对于API调用频繁模式的挖掘、组件化信息的提取以及根据用户的需求和目标功能进行的个性化应用程序接口(API)推荐服务等方面。然而,作为缺少专业知识和经验技能来完成特定使用案例的软件开发初学者,在阅读官方文档之外,往往需要真实的使用案例作为参考。现有代码推荐研究大多为单片段式代码,缺少跨函数的案例选择,这不利于初学者学习构建完整的使用场景或功能模块;同时,从单个函数注释中提取的语义描述也不足以构建学习者对项目中完整功能实现方法的认识。为了解决上述问题,提出了一种基于开源社区分析的API使用案例推荐服务,并以软件开发后端框架Spring Boot为例,构建了跨函数的案例推荐辅助学习服务。随后,通过调查问卷、专家验证等方式验证了所提出的API使用案例推荐服务的可行性和有效性。  相似文献   

2.
近年来,随着代码复用技术不断成熟和Internet上开源项目不断丰富,软件开发人员的开发行为也逐渐发生了变化。如今,软件开发人员在编程过程中越来越多地依赖于开源软件项目提供的功能。然而,在软件复用活动中,由于开源项目文档的不全面以及代码结构的复杂性,软件开发人员往往只能片面地了解项目的某些功能点,使得复用效率不高。针对开源项目代码丰富而文档较少这一现状,提出了一种基于LDA(Latent Dirichlet Allocation)和静态分析的代码功能识别方法,对传统LDA方法进行了扩展,帮助软件开发人员更全面地了解项目的功能点,从而更好地支持代码复用活动。  相似文献   

3.
《计算机科学与探索》2017,(10):1591-1598
开发人员通常通过问答网站的搜索引擎进行相关软件问答文档的搜索。在检索结果中,包含优质代码片段(使用示例)的问答文档往往更受青睐,但如何度量这些文档中代码片段的质量仍是个巨大的挑战。针对这个问题,提出了一种基于代码模式的软件问答文档检索优化方法。该方法能够基于当前检索结果,抽取文档中的代码片段,分析代码片段中的公共代码模式,并基于代码模式度量文档中代码片段的质量,从原有检索结果中向用户推荐高质量的软件问答文档。以软件开发人员在实践过程中遇到的真实问题为基础进行了实验,对比Stack Overflow的搜索结果,所提方法在准确率指标NDCG@5上提升了40%。  相似文献   

4.
开发者应及时迁移会为程序引入隐患的弃用API。不完整的文档和潜在替换API的复杂使用方式为此工作带来了挑战。开发者习惯于在问答网站中寻找关于迁移弃用API的讨论。然而,由于帖子数量较多,往往不能在短时间内找到合适的替换API及代码示例。针对此这一问题,提出基于问答平台的弃用API迁移建议推荐技术。迁移建议由与替换API有关的文本描述和代码示例组成。该技术根据弃用API从Stack Overflow中搜索讨论帖,并从中抽取文本信息和代码示例生成回答快照。根据替换API对回答快照进行分类。参考帖子的属性对回答快照进行排序。根据替换API简化快照中的代码示例。基于该技术开发了迁移建议推荐工具,实验表明该工具能显著提高开发者迁移弃用API的效率。  相似文献   

5.
一种图形用户界面的XML描述方法与工具开发   总被引:1,自引:0,他引:1  
传统的图形用户界面开发与具体的程序设计语言和软件开发平台密切相关。图形用户界面开发的这种紧耦合性对软件开发的后续过程和软件移植以及软件开发各阶段工作的重用造成了很大的困难。在分析传统图形用户界面开发存在问题的基础上,提出一种基于XML的图形用户界面描述方法,使用这种描述方法定义图形用户界面,可以实现图形用户界面定义与具体程序设计语言和开发平台的无关性。在此基础上,开发了一个基于上述图形用户界面描述方法的B/S架构的图形用户界面生成工具。其借助开源的fckeditor编辑器与用户交互,可以友好便捷地编辑图形用户界面,生成符合要求的图形用户界面XML描述文档;进一步地,再通过dom4j解析相应的XML文档,可以自动生成html格式等与具体语言相关的图形用户界面代码文档。详细介绍提出的基于XML的图形用户界面描述方法,并给出相应图形用户界面生成工具的设计思路和应用实例。  相似文献   

6.
在为了完善应用程序编程接口(application programming interface,API)文档,提出了基于程序静态分析和自然语言处理的自动检测和修复API文档缺陷的方法。该方法能够自动检测和修复API文档缺陷。实验中缺陷检测结果的准确率和召回率分别达到74.6%和81.4%,能够较为准确地检测到Java API的文档缺陷。在进一步的实验中还对API文档的修复功能进行了评估,结果表明生成的文档正确且简洁,可以有效地修复API文档缺陷。  相似文献   

7.
李正  吴敬征  李明树 《软件学报》2018,29(6):1716-1738
API(Application Programming Interface,应用程序编程接口)在现代软件开发过程中被广泛使用.开发人员通过调用API快速构建项目,节省了大量的时间.但由于API数量众多、文档不够完善、维护更新不及时等原因,使开发人员在学习使用API的过程中面临着严峻的挑战.同时,一旦API使用不正确,程序可能会出现缺陷甚至严重的安全问题.本文通过对API相关文献的深入调研,对近些年来国内外学者在该研究领域取得的成果进行了系统总结.首先,介绍了API的基本概念并分析出影响API使用的三个关键问题:API文档质量不高,调用规约不完整以及API调用序列难以确定;接着,从API文档、调用规约和API推荐三个主要方面对研究成果进行全面的分析;最后,对未来研究可能面临的挑战进行了展望.  相似文献   

8.
刘石  李合  王啸吟  张路  谢冰 《计算机科学》2009,36(8):165-168
通过示例代码学习简单算法的实现和具体API的使用方式是程序开发人员在软件开发中进行软件复用的高效手段,也是使用代码搜索引擎的主要目的.代码搜索引擎从网页搜索技术发展而来,提供对网络上源代码资源的检索功能,能够有效定位与搜索内容相关的代码,为程序开发人员提供帮助.但现有的代码搜索引擎没有在搜索结果中区别API的实现代码与使用代码,搜索结果存在冗余,导致用户无法快速有效地找到提供有用信息的代码片段.为了使用户更好更快地找到代码搜索目标,阐述了应用语法与语义分析技术从区分API实现代码和使用代码、相似代码聚类、搜索结果摘要3个方面对代码搜索结果进行优化的方法,给出了一个代码搜索引擎的实现,并在实例研究中展示了该方法的有效性.  相似文献   

9.
胡天翔  谢睿  叶蔚  张世琨 《软件学报》2023,34(4):1695-1710
代码摘要通过生成源代码片段的简短自然语言描述, 可帮助开发人员理解代码并减少文档工作. 近期, 关于代码摘要的研究工作主要采用深度学习模型, 这些模型中的大多数都在由独立代码摘要对组成的大型数据集上进行训练. 尽管取得了良好的效果, 这些工作普遍忽略了代码片段和摘要的项目级上下文信息, 而开发人员在编写文档时往往高度依赖这些信息. 针对该问题, 研究了一种与开发者行为和代码摘要工具实现更加一致的代码摘要场景——项目级代码摘要, 其中, 创建了用于项目特定代码摘要的数据集, 该数据集包含800k方法摘要对及其生命周期信息, 用于构建特定时刻准确的项目项目上下文; 提出了一种新颖的深度学习方法, 利用高度相关的代码片段及其相应的摘要来表征上下文语义, 并通过迁移学习整合从大规模跨项目数据集中学到的常识. 实验结果表明: 基于项目级上下文的代码摘要模型不仅能够比通用代码摘要模型获得显著的性能提升, 同时, 针对特定项目能够生成更一致的摘要.  相似文献   

10.
陈晨  周宇  王永超  黄志球 《计算机科学》2021,48(12):100-106
在软件开发的过程中,开发人员在遇到编程困境时通常会检索合适的API来完成编程任务.情境信息和开发者画像在有效的API推荐中起着至关重要的作用,却在很大程度上被忽视了.因而文中提出了一种基于情境感知的API个性化推荐方法.该方法利用程序静态分析技术,对代码文件做抽象语法树解析,提取信息构建代码库,并对开发者API使用偏好建模.然后计算开发者当前查询语句与历史代码库中查询的语义相似度,检索出top-k个相似历史查询.最终利用查询语句信息、方法名信息、情境信息以及开发者API使用偏好信息对API进行重排序并推荐给开发者.通过模拟编程任务开发的不同阶段,使用MRR,MAP,Hit,NDCG评估指标来验证所提方法的有效性.实验结果表明,所提方法的API推荐效果优于基准方法,能够为开发者推荐更想要的API.  相似文献   

11.
Many software libraries, especially those commercial ones, provide API documentation in natural languages to describe correct API usages. However, developers may still write code that is inconsistent with API documentation, partially because many developers are reluctant to carefully read API documentation as shown by existing research. As these inconsistencies may indicate defects, researchers have proposed various detection approaches, and these approaches need many known specifications. As it is tedious to write specifications manually for all APIs, various approaches have been proposed to mine specifications automatically. In the literature, most existing mining approaches rely on analyzing client code, so these mining approaches would fail to mine specifications when client code is not sufficient. Instead of analyzing client code, we propose an approach, called Doc2Spec, that infers resource specifications from API documentation in natural languages. We evaluated our approach on the Javadocs of five libraries. The results show that our approach performs well on real scale libraries, and infers various specifications with relatively high precisions, recalls, and F-scores. We further used inferred specifications to detect defects in open source projects. The results show that specifications inferred by Doc2Spec are useful to detect real defects in existing projects.  相似文献   

12.
沈琦  钱莹  邹艳珍  伍仕骏  谢冰 《软件学报》2021,32(4):1023-1038
在软件复用过程中,简洁清楚的软件功能自然语言描述是帮助复用者快速了解待复用软件项目/代码库的前提和基础.但当前开源软件往往缺乏高质量的软件功能说明文档,使得这一过程变得更加复杂和困难.为此,本文提出了一种融合代码与文档的软件功能特征挖掘方法.该方法以动宾短语的形式描述软件功能特征,通过迭代挖掘软件源代码和以Stack Overflow讨论帖为代表的软件文档,自动提取开源软件的功能特征描述,并构造了层次化的软件功能特征视图.在针对多个开源软件项目的实验中,本文方法可覆盖官方文档中列举的95.38%的软件功能.挖掘结果中语句和功能特征的准确率分别达到了93.78%和92.57%.对比现有工作TaskNav和APITasks,本文方法在平均准确率上分别提升了28.78%和11.56%.  相似文献   

13.
Bug‐finding tools rely on specifications of what is correct or incorrect code. As it is difficult for a tool developer or user to anticipate all possible specifications, strategies for inferring specifications have been proposed. These strategies obtain probable specifications by observing common characteristics of code or execution traces, typically focusing on sequences of function calls. To counter the observed high rate of false positives, heuristics have been proposed for ranking or pruning the results. These heuristics, however, can result in false negatives, especially for rarely used functions. In this paper, we propose an alternate approach to specification inference, in which the user guides the inference process using patterns of code that reflect the user's understanding of the conventions and design of the targeted software project. We focus on specifications describing the correct usage of API functions, which we refer to as API protocols. Our approach builds on the Coccinelle program matching and transformation tool, which allows a user to construct patterns that reflect the structure of the code to be matched. We evaluate our approach on the source code of the Linux kernel, which defines a very large number of API functions with varying properties. Linux is also critical software, implying that fixing even bugs involving rarely used protocols is essential. In our experiments, we use our approach to find over 3000 potential API protocols, with an estimated false positive rate of under 15% and use these protocols to find over 360 bugs in the use of API functions. Copyright © 2012 John Wiley & Sons, Ltd.  相似文献   

14.
Inconsistent identifiers make it difficult for developers to understand source code. In particular, large software systems written by several developers can be vulnerable to identifier inconsistency. Unfortunately, it is not easy to detect inconsistent identifiers that are already used in source code. Although several techniques have been proposed to address this issue, many of these techniques can result in false alarms since such techniques do not accept domain words and idiom identifiers that are widely used in programming practice. This paper proposes an approach to detecting inconsistent identifiers based on a custom code dictionary. It first automatically builds a Code Dictionary from the existing API documents of popular Java projects by using an Natural Language Processing (NLP) parser. This dictionary records domain words with dominant part-of-speech (POS) and idiom identifiers. This set of domain words and idioms can improve the accuracy when detecting inconsistencies by reducing false alarms. The approach then takes a target program and detects inconsistent identifiers of the program by leveraging the Code Dictionary. We provide CodeAmigo, a GUI-based tool support for our approach. We evaluated our approach on seven Java based open-/proprietary- source projects. The results of the evaluations show that the approach can detect inconsistent identifiers with 85.4 % precision and 83.59 % recall values. In addition, we conducted an interview with developers who used our approach, and the interview confirmed that inconsistent identifiers frequently and inevitably occur in most software projects. The interviewees then stated that our approach can help to better detect inconsistent identifiers that would have been missed through manual detection.  相似文献   

15.
汪昕  陈驰  赵逸凡  彭鑫  赵文耘 《软件学报》2019,30(5):1342-1358
开发人员经常需要使用各种应用程序编程接口(application programming interface,简称API)来复用已有的软件框架、类库等.由于API自身的复杂性、文档资料的缺失等原因,开发人员经常会误用API,从而导致代码缺陷.为了自动检测API误用缺陷,需要获得API使用规约,并根据规约对API使用代码进行检测.然而,可用于自动检测的API规约难以获得,而人工编写并维护的代价又很高.针对以上问题,将深度学习中的循环神经网络模型应用于API使用规约的学习及API误用缺陷的检测.在大量的开源Java代码基础上,通过静态分析构造API使用规约训练样本,同时利用这些训练样本搭建循环神经网络学习API使用规约.在此基础上,针对API使用代码进行基于上下文的语句预测,并通过预测结果与实际代码的比较发现潜在的API误用缺陷.对所提出的方法进行实现并针对Java加密相关的API及其使用代码进行了实验评估,结果表明,该方法能够在一定程度上实现API误用缺陷的自动发现.  相似文献   

16.
The process of understanding and reusing software is often time-consuming, especially in legacy code and open-source libraries. While some core code of open-source libraries may be well-documented, it is frequently the case that open-source libraries lack informative API documentation and reliable design information. As a result, the source code itself is often the sole reliable source of information for program understanding activities. In this article, we propose a reverse-engineering approach that can provide assistance during the process of understanding software through the automatic recovery of hidden design patterns in software libraries. Specifically, we use ontology formalism to represent the conceptual knowledge of the source code and semantic rules to capture the structures and behaviors of the design patterns in the libraries. Several software libraries were examined with this approach and the evaluation results show that effective and flexible detection of design patterns can be achieved without using hard-coded heuristics.  相似文献   

17.
Searching application programming interfaces (APIs) is very important for developers to reuse software projects. Existing natural language based API search mainly faces the following challenges. 1) More accurate results are required as software projects evolve to be more heterogeneous and complex. 2) The semantic relationships between APIs (e.g., inheritances between classes, and invocations between methods) need to be illustrated so that developers can better understand their usage scenarios. To deal with these issues, we propose GeAPI, a novel graph embedding based approach for API graph search and recommendation in this paper. First, we build a software project's API graph automatically from its source code and represent each API using graph embedding methods. Second, we search the API graph with a question in natural language, and return the corresponding subgraph that is composed of relevant code elements and their associated relationships, as the best answer of the question. In experiments, we select three well-known open source projects, JodaTime, Apache Lucene and POI, as examples to perform API search tasks. The experimental results show that our approach GeAPI improves F1-score by 10% compared with the existing shortest path based API search approach, while reduces the average response time about 60 times.  相似文献   

18.
Software repositories hold applications that are often categorized to improve the effectiveness of various maintenance tasks. Properly categorized applications allow stakeholders to identify requirements related to their applications and predict maintenance problems in software projects. Manual categorization is expensive, tedious, and laborious – this is why automatic categorization approaches are gaining widespread importance. Unfortunately, for different legal and organizational reasons, the applications’ source code is often not available, thus making it difficult to automatically categorize these applications. In this paper, we propose a novel approach in which we use Application Programming Interface (API) calls from third-party libraries for automatic categorization of software applications that use these API calls. Our approach is general since it enables different categorization algorithms to be applied to repositories that contain both source code and bytecode of applications, since API calls can be extracted from both the source code and byte-code. We compare our approach to a state-of-the-art approach that uses machine learning algorithms for software categorization, and conduct experiments on two large Java repositories: an open-source repository containing 3,286 projects and a closed-source repository with 745 applications, where the source code was not available. Our contribution is twofold: we propose a new approach that makes it possible to categorize software projects without any source code using a small number of API calls as attributes, and furthermore we carried out a comprehensive empirical evaluation of automatic categorization approaches.  相似文献   

19.
源代码检索是软件工程领域的一项重要研究问题,其主要任务是检索和复用软件项目API(application program interface,应用程序接口).随着软件项目的规模越来越大、越来越复杂,当前,源代码检索一方面需要提高基于自然语言API查询的准确性,另一方面需要定位和展示目标API及其相关代码之间的关联,以更好地辅助用户理解API的实现逻辑和使用场景.为此,提出一种基于图嵌入的软件项目源代码检索方法.该方法能够基于软件项目源代码自动构建其代码结构图,并通过图嵌入对源代码进行信息表示.在此基础上,用户可以输入自然语言问题、检索并返回相关的API及其关联信息构成的连通代码子图,从而提高API检索和复用的效率.在以开源项目Apache Lucene和POI为例的检索实验中,该方法检索结果的F1值比现有基于最短路径的方法提高了10%,同时显著缩短了平均响应时间.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号