期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Performance comparison of query-based techniques for anti-pattern detection

《Information and Software Technology》2015

ContextProgram queries play an important role in several software evolution tasks like program comprehension, impact analysis, or the automated identification of anti-patterns for complex refactoring operations. A central artifact of these tasks is the reverse engineered program model built up from the source code (usually an Abstract Semantic Graph, ASG), which is traditionally post-processed by dedicated, hand-coded queries.ObjectiveOur paper investigates the costs and benefits of using the popular industrial Eclipse Modeling Framework (EMF) as an underlying representation of program models processed by four different general-purpose model query techniques based on native Java code, OCL evaluation and (incremental) graph pattern matching.MethodWe provide in-depth comparison of these techniques on the source code of 28 Java projects using anti-pattern queries taken from refactoring operations in different usage profiles.ResultsOur results show that general purpose model queries can outperform hand-coded queries by 2–3 orders of magnitude, with the trade-off of an increased in memory consumption and model load time of up to an order of magnitude.ConclusionThe measurement results of usage profiles can be used as guidelines for selecting the appropriate query technologies in concrete scenarios. 相似文献

2.

Sourcerer: An infrastructure for large-scale collection and analysis of open-source code

《Science of Computer Programming》2014

A large amount of open source code is now available online, presenting a great potential resource for software developers. This has motivated software engineering researchers to develop tools and techniques to allow developers to reap the benefits of these billions of lines of source code. However, collecting and analyzing such a large quantity of source code presents a number of challenges. Although the current generation of open source code search engines provides access to the source code in an aggregated repository, they generally fail to take advantage of the rich structural information contained in the code they index. This makes them significantly less useful than Sourcerer for building state-of-the-art software engineering tools, as these tools often require access to both the structural and textual information available in source code.We have developed Sourcerer, an infrastructure for large-scale collection and analysis of open source code. By taking full advantage of the structural information extracted from source code in its repository, Sourcerer provides a foundation upon which state-of-the-art search engines and related tools can easily be built. We describe the Sourcerer infrastructure, present the applications that we have built on top of it, and discuss how existing tools could benefit from using Sourcerer. 相似文献

3.

Recording concerns in source code using annotations

《Computer Languages, Systems and Structures》2016

A concern can be characterized as a developer׳s intent behind a piece of code, often not explicitly captured in it. We discuss a technique of recording concerns using source code annotations (concern annotations). Using two studies and two controlled experiments, we seek to answer the following 3 research questions: (1) Do programmers׳ mental models overlap? (2) How do developers use shared concern annotations when they are available? (3) Does using annotations created by others improve program comprehension and maintenance correctness, time and confidence? The first study shows that developers׳ mental models, recorded using concern annotations, overlap and thus can be shared. The second study shows that shared concern annotations can be used during program comprehension for the following purposes: hypotheses confirmation, feature location, obtaining new knowledge, finding relationships and maintenance notes. The first controlled experiment with students showed that the presence of annotations significantly reduced program comprehension and maintenance time by 34%. The second controlled experiment was a differentiated replication of the first one, focused on industrial developers. It showed a 33% significant improvement in correctness. We conclude that concern annotations are a viable way to share developers׳ thoughts. 相似文献

4.

Program comprehension of domain-specific and general-purpose languages: replication of a family of experiments using integrated development environments

Tomaž Kosar Sašo Gaberc Jeffrey C. Carver Marjan Mernik 《Empirical Software Engineering》2018,23(5):2734-2763

Domain-specific languages (DSLs) allow developers to write code at a higher level of abstraction compared with general-purpose languages (GPLs). Developers often use DSLs to reduce the complexity of GPLs. Our previous study found that developers performed program comprehension tasks more accurately and efficiently with DSLs than with corresponding APIs in GPLs. This study replicates our previous study to validate and extend the results when developers use IDEs to perform program comprehension tasks. We performed a dependent replication of a family of experiments. We made two specific changes to the original study: (1) participants used IDEs to perform the program comprehension tasks, to address a threat to validity in the original experiment and (2) each participant performed program comprehension tasks on either DSLs or GPLs, not both as in the original experiment. The results of the replication are consistent with and expanded the results of the original study. Developers are significantly more effective and efficient in tool-based program comprehension when using a DSL than when using a corresponding API in a GPL. The results indicate that, where a DSL is available, developers will perform program comprehension better using the DSL than when using the corresponding API in a GPL. 相似文献

5.

Predicting source code changes by mining change history 总被引：1，自引：0，他引：1

Ying A.T.T. Murphy G.C. Ng R. Chu-Carroll M.C. 《IEEE transactions on pattern analysis and machine intelligence》2004,30(9):574-586

Software developers are often faced with modification tasks that involve source which is spread across a code base. Some dependencies between source code, such as those between source code written in different languages, are difficult to determine using existing static and dynamic analyses. To augment existing analyses and to help developers identify relevant source code during a modification task, we have developed an approach that applies data mining techniques to determine change patterns - sets of files that were changed together frequently in the past - from the change history of the code base. Our hypothesis is that the change patterns can be used to recommend potentially relevant source code to a developer performing a modification task. We show that this approach can reveal valuable dependencies by applying the approach to the Eclipse and Mozilla open source projects and by evaluating the predictability and interestingness of the recommendations produced for actual modification tasks on these systems. 相似文献

6.

Understanding the attitudes,knowledge sharing behaviors and task performance of core developers: A longitudinal study

《Information and Software Technology》2014,56(12):1578-1596

ContextPrior research has established that a few individuals generally dominate project communication and source code changes during software development. Moreover, this pattern has been found to exist irrespective of task assignments at project initiation.ObjectiveWhile this phenomenon has been noted, prior research has not sought to understand these dominant individuals. Previous work considering the effect of team structures on team performance has found that core communicators are the gatekeepers of their teams’ knowledge, and the performance of these members was correlated with their teams’ success. Building on this work, we have employed a longitudinal approach to study the way core developers’ attitudes, knowledge sharing behaviors and task performance change over the course of their project, based on the analysis of repository data.MethodWe first used social network analysis (SNA) and standard statistical analysis techniques to identify and select artifacts from ten different software development teams. These procedures were also used to select central practitioners among these teams. We then applied psycholinguistic analysis and directed content analysis (CA) techniques to interpret the content of these practitioners’ messages. Finally, we inspected these core developers’ activities as recorded in system change logs at various points in time during systems’ development.ResultsAmong our findings, we observe that core developers’ attitudes and knowledge sharing behaviors were linked to their involvement in actual software development and the demands of their wider project teams. However, core developers appeared to naturally possess high levels of insightful characteristics, which became evident very early during teamwork.ConclusionsProject performance would likely benefit from strategies aimed at surrounding core developers with other competent communicators. Core developers should also be supported by a wider team who are willing to ask questions and challenge their ideas. Finally, the availability of adequate communication channels would help with maintaining positive team climate, and this is likely to mitigate the negative effects of distance during distributed developments. 相似文献

7.

Software cultures and evolution

Raijlich V. Wilde N. Buckellew M. Page H. 《Computer》2001,34(9):24-28

To work effectively with legacy code, software engineers need to understand a legacy computer program's culture - the combination of the programmer's background, the hardware environment and the programming techniques that guided its creation. Software systems typically pass through a series of stages. During the initial development stage, software developers create a first functioning version of the code. An evolution stage follows, during which developmental efforts focus on extending system capabilities to meet user needs. During the servicing stage, only minor repairs and simple functional changes are possible. In the phase-out stage, the system is essentially frozen, but it still produces value. Finally, during the close-down stage, the developers withdraw the system and possibly replace it. Most of the tasks in the evolution and servicing phases require program comprehension ⁿderstanding how and why a software program functions in order to work with it effectively. Effective comprehension requires viewing a legacy program not simply as a product of inefficiency or stupidity, but instead as an artifact of the circumstances in which it was developed. This information can be an important factor in determining appropriate strategies for the software program's transition from the evolution stage to the servicing or phase-out stage 相似文献

8.

KSAP: An approach to bug report assignment using KNN search and heterogeneous proximity

《Information and Software Technology》2016

ContextBug report assignment, namely, to assign new bug reports to developers for timely and effective bug resolution, is crucial for software quality assurance. However, with the increasing size of software system, it is difficult to assign bugs to appropriate developers for bug managers.ObjectiveThis paper propose an approach, called KSAP (K-nearest-neighbor search and heterogeneous proximity), to improve automatic bug report assignment by using historical bug reports and heterogeneous network of bug repository.MethodWhen a new bug report was submitted to the bug repository, KSAP assigns developers for the bug report by using a two-phase procedure. The first phase is to search historically-resolved similar bug reports to the new bug report by K-nearest-neighbor (KNN) method. The second phase is to rank the developers who contributed to those similar bug reports by heterogeneous proximity.ResultsWe collected bug repositories of Mozilla, Eclipse, Apache Ant and Apache Tomcat6 projects to investigate the performance of the proposed KSAP approach. Experimental results demonstrate that KSAP can improve the recall of bug report assignment between 7.5–32.25% in comparison with the state of art techniques. When there is only a small number of developer collaborations on common bug reports, KSAP has shown its excellence over other sate of art techniques. When we tune the parameters of the number of historically-resolved similar bug reports (K) and the number of developers (Q) for recommendation, KSAP keeps its superiority steadily.ConclusionThis is the first paper to demonstrate how to automatically build heterogeneous network of a bug repository and extract meta-paths of developer collaborations from the heterogeneous network for bug report assignment. 相似文献

9.

SQA-Mashup: A mashup framework for continuous integration

《Information and Software Technology》2015

ContextContinuous Integration (CI) has become an established best practice of modern software development. Its philosophy of regularly integrating the changes of individual developers with the master code base saves the entire development team from descending into Integration Hell, a term coined in the field of extreme programming. In practice, CI is supported by automated tools to cope with this repeated integration of source code through automated builds and testing. One of the main problems, however, is that relevant information about the quality and health of a software system is both scattered across those tools and across multiple views.ObjectiveThis paper introduces a quality awareness framework for CI-data and its conceptional model used for the data integration and visualization. The framework called SQA-Mashup makes use of the service-based mashup paradigm and integrates information from the entire CI-toolchain into a single service.MethodThe research approach followed in our work consists out of (i) a conceptional model for data integration and visualization, (ii) a prototypical framework implementation based on tool requirements derived from literature, and (iii) a controlled user study to evaluate its usefulness.ResultsThe results of the controlled user study showed that SQA-Mashup’s single point of access allows users to answer questions regarding the state of a system more quickly (57%) and accurately (21.6%) than with standalone CI-tools.ConclusionsThe SQA-Mashup framework can serve as one-stop shop for software quality data monitoring in a software development project. It enables easy access to CI-data which otherwise is not integrated but scattered across multiple CI-tools. Our dynamic visualization approach allows for a tailoring of integrated CI-data according to information needs of different stakeholders such as developers or testers. 相似文献

10.

Concept location using program dependencies and information retrieval (DepIR)

《Information and Software Technology》2013,55(4):651-659

ContextThe functionality of a software system is most often expressed in terms of concepts from its problem or solution domains. The process of finding where these concepts are implemented in the source code is known as concept location and it is a prerequisite of software change.ObjectiveWe investigate a static approach to concept location named DepIR that combines program dependency search (DepS) with information retrieval-based search (IR). In this approach, programmers explore the static program dependencies of the source code components retrieved by the IR search engine.MethodThe paper presents an empirical study that compares DepIR with its constituent techniques. The evaluation is based on an empirical method of reenactment that emulates the steps of concept location for 50 past changes mined from software repositories of five software systems.ResultsThe results of the study indicate that DepIR significantly outperforms both DepS and IR.ConclusionDepIR allows developers to perform concept location efficiently. It allows finding concepts even with queries that do not rank the relevant software components highly. Since formulating a good query is not always easy, this tolerance of lower-quality queries significantly broadens the usability of DepIR compared to the traditional IR. 相似文献

11.

MSR4SM: Using topic models to effectively mining software repositories for software maintenance tasks

《Information and Software Technology》2015

ContextMining software repositories has emerged as a research direction over the past decade, achieving substantial success in both research and practice to support various software maintenance tasks. Software repositories include bug repository, communication archives, source control repository, etc. When using these repositories to support software maintenance, inclusion of irrelevant information in each repository can lead to decreased effectiveness or even wrong results.ObjectiveThis article aims at selecting the relevant information from each of the repositories to improve effectiveness of software maintenance tasks.MethodFor a maintenance task at hand, maintainers need to implement the maintenance request on the current system. In this article, we propose an approach, MSR4SM, to extract the relevant information from each software repository based on the maintenance request and the current system. That is, if the information in a software repository is relevant to either the maintenance request or the current system, this information should be included to perform the current maintenance task. MSR4SM uses the topic model to extract the topics from these software repositories. Then, relevant information in each software repository is extracted based on the topics.ResultsMSR4SM is evaluated for two software maintenance tasks, feature location and change impact analysis, which are based on four subject systems, namely jEdit, ArgoUML, Rhino and KOffice. The empirical results show that the effectiveness of traditional software repositories based maintenance tasks can be greatly improved by MSR4SM.ConclusionsThere is a lot of irrelevant information in software repositories. Before we use them to implement a maintenance task at hand, we need to preprocess them. Then, the effectiveness of the software maintenance tasks can be improved. 相似文献

12.

Approximate Structural Context Matching: An Approach to Recommend Relevant Examples 总被引：1，自引：0，他引：1

Holmes R. Walker R.J. Murphy G.C. 《IEEE transactions on pattern analysis and machine intelligence》2006,32(12):952-970

When coding to an application programming interface (API), developers often encounter difficulties, unsure of which class to subclass, which objects to instantiate, and which methods to call. Example source code that demonstrates the use of the API can help developers make progress on their task. This paper describes an approach to provide such examples in which the structure of the source code that the developer is writing is matched heuristically to a repository of source code that uses the API. The structural context needed to query the repository is extracted automatically from the code, freeing the developer from learning a query language or from writing their code in a particular style. The repository is generated automatically from existing applications, avoiding the need for handcrafted examples. We demonstrate that the approach is effective, efficient, and more reliable than traditional alternatives through four empirical studies 相似文献

13.

Does syntax highlighting help programming novices?

Christoph Hannebauer Marc Hesenius Volker Gruhn 《Empirical Software Engineering》2018,23(5):2795-2828

Program comprehension is an important skill for programmers – extending and debugging existing source code is part of the daily routine. Syntax highlighting is one of the most common tools used to support developers in understanding algorithms. However, most research in this area originates from a time when programmers used a completely different tool chain. We examined the influence of syntax highlighting on novices’ ability to comprehend source code. Additional analyses cover the influence of task type and programming experience on the code comprehension ability itself and its relation to syntax highlighting. We conducted a controlled experiment with 390 undergraduate students in an introductory Java programming course. We measured the correctness with which they solved small coding tasks. Each test subject received some tasks with syntax highlighting and some without. The data provided no evidence that syntax highlighting improves novices’ ability to comprehend source code. There are very few similar experiments and it is unclear as of yet which factors impact the effectiveness of syntax highlighting. One major limitation may be the types of tasks chosen for this experiment. The results suggest that syntax highlighting squanders a feedback channel from the IDE to the programmer that can be used more effectively. 相似文献

14.

Studying software evolution using topic models

《Science of Computer Programming》2014

Topic models are generative probabilistic models which have been applied to information retrieval to automatically organize and provide structure to a text corpus. Topic models discover topics in the corpus, which represent real world concepts by frequently co-occurring words. Recently, researchers found topics to be effective tools for structuring various software artifacts, such as source code, requirements documents, and bug reports. This research also hypothesized that using topics to describe the evolution of software repositories could be useful for maintenance and understanding tasks. However, research has yet to determine whether these automatically discovered topic evolutions describe the evolution of source code in a way that is relevant or meaningful to project stakeholders, and thus it is not clear whether topic models are a suitable tool for this task.In this paper, we take a first step towards evaluating topic models in the analysis of software evolution by performing a detailed manual analysis on the source code histories of two well-known and well-documented systems, JHotDraw and jEdit. We define and compute various metrics on the discovered topic evolutions and manually investigate how and why the metrics evolve over time. We find that the large majority (87%–89%) of topic evolutions correspond well with actual code change activities by developers. We are thus encouraged to use topic models as tools for studying the evolution of a software system. 相似文献

15.

Replaying development history to assess the effectiveness of change propagation tools

Ahmed E. Hassan Richard C. Holt 《Empirical Software Engineering》2006,11(3):335-367

As developers modify software entities such as functions or variables to introduce new features, enhance old ones, or fix bugs, they must ensure that other entities in the software system are updated to be consistent with these new changes. Many hard to find bugs are introduced by developers who did not notice dependencies between entities, and failed to propagate changes correctly. Most modern development environments offer tools to assist developers in propagating changes. For example, dependency browsers show static code dependencies between source code entities. Other sources of information such as historical co-change or code layout information could be used by tools to support developers in propagating changes. We present the Development Replay (DR) approach which empirically assess and compares the effectiveness of several not-yet-existing change propagation tools by reenacting the changes stored in source control repositories using these tools. We present a case study of five large open source systems with a total of over 40 years of development history. Our empirical results show that historical co-change information recovered from source control repositories along with code layout information can guide developers in propagating changes better than simple static dependency information.

Richard C. HoltEmail:

相似文献

16.

Early prediction of merged code changes to prioritize reviewing tasks

Yuanrui Fan Xin Xia David Lo Shanping Li 《Empirical Software Engineering》2018,23(6):3346-3393

Modern Code Review (MCR) has been widely used by open source and proprietary software projects. Inspecting code changes consumes reviewers much time and effort since they need to comprehend patches, and many reviewers are often assigned to review many code changes. Note that a code change might be eventually abandoned, which causes waste of time and effort. Thus, a tool that predicts early on whether a code change will be merged can help developers prioritize changes to inspect, accomplish more things given tight schedule, and not waste reviewing effort on low quality changes. In this paper, motivated by the above needs, we build a merged code change prediction tool. Our approach first extracts 34 features from code changes, which are grouped into 5 dimensions: code, file history, owner experience, collaboration network, and text. And then we leverage machine learning techniques such as random forest to build a prediction model. To evaluate the performance of our approach, we conduct experiments on three open source projects (i.e., Eclipse, LibreOffice, and OpenStack), containing a total of 166,215 code changes. Across three datasets, our approach statistically significantly improves random guess classifiers and two prediction models proposed by Jeong et al. (2009) and Gousios et al. (2014) in terms of several evaluation metrics. Besides, we also study the important features which distinguish merged code changes from abandoned ones. 相似文献

17.

Automatically propagating changes from reference implementations to code generation templates

《Information and Software Technology》2015

ContextCode generators can automatically perform some tedious and error-prone implementation tasks, increasing productivity and quality in the software development process. Most code generators are based on templates, which are fundamentally composed of text expansion statements. To build templates, the code of an existing, tested and validated implementation may serve as reference, in a process known as templatization. With the dynamics of software evolution/maintenance and the need for performing changes in the code generation templates, there is a loss of synchronism between the templates and this reference code. Additional effort is required to keep them synchronized.ObjectiveThis paper proposes automation as a way to reduce the extra effort needed to keep templates and reference code synchronized.MethodA mechanism was developed to semi-automatically detect and propagate changes from reference code to templates, keeping them synchronized with less effort. The mechanism was also submitted to an empirical evaluation to analyze its effects in terms of effort reduction during maintenance/evolution templatization.ResultsIt was observed that the developed mechanism can lead to a 50% reduction in the effort needed to perform maintenance/evolution templatization, when compared to a manual approach. It was also observed that this effect depends on the nature of the evolution/maintenance task, since for one of the tasks there was no observable advantage in using the mechanism. However, further studies are needed to better characterize these tasks.ConclusionAlthough there is still room for improvement, the results indicate that automation can be used to reduce effort and cost in the maintenance and evolution of a template-based code generation infrastructure. 相似文献

18.

An Exploratory Study of How Developers Seek, Relate, and Collect Relevant Information during Software Maintenance Tasks

Ko A.J. Myers B.A. Coblenz M.J. Aung H.H. 《IEEE transactions on pattern analysis and machine intelligence》2006,32(12):971-987

Much of software developers' time is spent understanding unfamiliar code. To better understand how developers gain this understanding and how software development environments might be involved, a study was performed in which developers were given an unfamiliar program and asked to work on two debugging tasks and three enhancement tasks for 70 minutes. The study found that developers interleaved three activities. They began by searching for relevant code both manually and using search tools; however, they based their searches on limited and misrepresentative cues in the code, environment, and executing program, often leading to failed searches. When developers found relevant code, they followed its incoming and outgoing dependencies, often returning to it and navigating its other dependencies; while doing so, however, Eclipse's navigational tools caused significant overhead. Developers collected code and other information that they believed would be necessary to edit, duplicate, or otherwise refer to later by encoding it in the interactive state of Eclipse's package explorer, file tabs, and scroll bars. However, developers lost track of relevant code as these interfaces were used for other tasks, and developers were forced to find it again. These issues caused developers to spend, on average, 35 percent of their time performing the mechanics of navigation within and between source files. These observations suggest a new model of program understanding grounded in theories of information foraging and suggest ideas for tools that help developers seek, relate, and collect information in a more effective and explicit manner 相似文献

19.

代码坏味对软件演化影响的实证研究

章晓芳朱灿《软件学报》2019,30(5):1422-1437

代码坏味是指程序设计中存在的不良设计模式或设计缺陷.坏味的存在,被认为会阻碍软件的演化与维护.近年来,研究人员致力于探究坏味产生的影响以及坏味与软件演化之间的关系.已有研究表明,代码坏味会随着软件的演化而不断发生变化.通常,软件的演化将涉及源文件的增加、修改与删除这3类具体操作,了解代码坏味与软件演化中源文件操作的关系,将有助于开发者更好地计划软件开发过程和重构软件代码.因此,针对13种常见的坏味,在8个Java项目共计104个版本中进行了系统的实证研究.研究发现,随着软件版本的演化,含代码坏味的文件在整个项目中的占比在不同的项目中呈现出不同的特征.另外,包含代码坏味的文件更倾向于被修改,而坏味本身与文件的添加或者删除并没有太大的关联.更进一步地,在探究的所有坏味中,有几种特定的坏味对文件的修改产生了显著的影响,且这些坏味文件间存在着明显的重叠.这些发现有助于开发人员更好地了解代码坏味,以便于更好地对软件进行维护. 相似文献

20.

Understanding cost drivers of software evolution: a quantitative and qualitative investigation of change effort in two evolving software systems

Hans Christian Benestad Bente Anda Erik Arisholm 《Empirical Software Engineering》2010,15(2):166-203

Making changes to software systems can prove costly and it remains a challenge to understand the factors that affect the costs of software evolution. This study sought to identify such factors by investigating the effort expended by developers to perform 336 change tasks in two different software organizations. We quantitatively analyzed data from version control systems and change trackers to identify factors that correlated with change effort. In-depth interviews with the developers about a subset of the change tasks further refined the analysis. Two central quantitative results found that dispersion of changed code and volatility of the requirements for the change task correlated with change effort. The analysis of the qualitative interviews pointed to two important, underlying cost drivers: Difficulties in comprehending dispersed code and difficulties in anticipating side effects of changes. This study demonstrates a novel method for combining qualitative and quantitative analysis to assess cost drivers of software evolution. Given our findings, we propose improvements to practices and development tools to manage and reduce the costs. 相似文献