1.
Source code comments are a valuable instrument to preserve design decisions and to communicate the intent of the code to programmers
and maintainers. Nevertheless, commenting source code and keeping comments up-to-date is often neglected for reasons of time
or programmers' obliviousness. In this paper, we investigate whether developers comment their code and to what
extent they add comments or adapt them when they evolve the code. We present an approach to associate comments with source
code entities to track their co-evolution over multiple versions. A set of heuristics is used to decide whether a comment
is associated with its preceding or its succeeding source code entity. We analyzed the co-evolution of code and comments in
eight different open source and closed source software systems. We found with statistical significance that (1) the relative
amount of comments and source code grows at about the same rate; (2) the type of a source code entity, such as a method declaration
or an if-statement, has a significant influence on whether or not it gets commented; (3) in six out of the eight systems,
code and comments co-evolve in 90% of the cases; and (4) surprisingly, API changes and comments do not co-evolve but they
are re-documented in a later revision. As a result, our approach enables a quantitative assessment of the commenting process
in a software system. We can, therefore, leverage the results to provide feedback during development to increase the awareness
of when to add comments or when to adapt comments because of source code changes.
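The abstract above does not list its heuristics, so the following is only a minimal sketch of how a comment might be associated with its preceding or succeeding code entity. The function name `associate_comments`, the `//` marker, and all four rules are assumptions made for illustration, not the authors' method. Markers inside string literals are not handled.

```python
from typing import List, Optional, Tuple


def associate_comments(lines: List[str], marker: str = "//") -> List[Tuple[int, Optional[int]]]:
    """Pair each comment line index with the code line index it likely documents.

    Hypothetical rules, applied in order:
      1. A trailing comment (code before the marker) belongs to its own line.
      2. A full-line comment belongs to the next code line if no blank line
         separates them (succeeding entity).
      3. Otherwise it belongs to the closest preceding code line if no blank
         line separates them (preceding entity).
      4. Otherwise it stays unassociated (None).
    """
    def is_comment(s: str) -> bool:
        return s.strip().startswith(marker)

    def is_blank(s: str) -> bool:
        return not s.strip()

    def is_code(s: str) -> bool:
        return not is_blank(s) and not is_comment(s)

    result: List[Tuple[int, Optional[int]]] = []
    for i, line in enumerate(lines):
        if marker not in line:
            continue
        if is_code(line):
            result.append((i, i))  # rule 1: trailing comment
            continue
        # Rule 2: skip the rest of the comment block, then look for code.
        j = i + 1
        while j < len(lines) and is_comment(lines[j]):
            j += 1
        if j < len(lines) and is_code(lines[j]):
            result.append((i, j))
            continue
        # Rule 3: skip earlier comment lines, then look for code.
        k = i - 1
        while k >= 0 and is_comment(lines[k]):
            k -= 1
        if k >= 0 and is_code(lines[k]):
            result.append((i, k))
            continue
        result.append((i, None))  # rule 4: orphan comment
    return result


example = [
    "// computes the sum",
    "int sum(int a, int b) {",
    "  return a + b; // no overflow check",
    "}",
    "// end of sum",
    "",
    "// orphan",
    "",
]
pairs = associate_comments(example)
```

Tracking such pairs across revisions would then show whether a comment changes together with its associated entity.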
2.
Although function points (FP) are considered superior to source lines of code (SLOC) for estimating software size and monitoring developer productivity, practitioners still commonly use SLOC. One reason for this is that individuals who fill different roles on a development team, such as managers and developers, may perceive the benefits of FP differently. We conducted a survey to determine whether a perception gap exists between managers and developers for FP and SLOC across several desirable properties of software measures. Results suggest managers and developers perceive the benefits of FP differently and indicate that developers better understand the benefits of using FP than managers.
3.
The organization of talent in online communities has been pivotal for the development of open source software. We are currently witnessing a related phenomenon that is at least of equal importance: the ‘open-sourcing’ of digital content through a dramatic increase in user-generated content and the development of appropriate licenses for users to share their works and build on each other's creativity. This article compares and contrasts (a) the objectives of software development vis-à-vis the development of new media content, (b) the organizational forms that have developed in respective online communities, and (c) the role that licensing plays in the production of ‘functional’ vis-à-vis ‘cultural’ goods.
4.
Fabian Trautsch Steffen Herbold Philip Makedonski Jens Grabowski 《Empirical Software Engineering》2018,23(2):1036-1083
The usage of empirical methods has grown common in software engineering. This trend spawned hundreds of publications, whose results are helping to understand and improve the software development process. Due to the data-driven nature of this venue of investigation, we identified several problems within the current state-of-the-art that pose a threat to the replicability and validity of approaches. The heavy re-use of data sets in many studies may invalidate the results in case problems with the data itself are identified. Moreover, for many studies data and/or the implementations are not available, which hinders a replication of the results and, thereby, decreases the comparability between studies. Furthermore, many studies use small data sets, which comprise fewer than 10 projects. This poses a threat especially to the external validity of these studies. Even if all information about the studies is available, the diversity of the tooling used can still make their replication very hard. Within this paper, we discuss a potential solution to these problems through a cloud-based platform that integrates data collection and analytics. We created SmartSHARK, which implements our approach. Using SmartSHARK, we collected data from several projects and created different analytic examples. Within this article, we present SmartSHARK and discuss our experiences regarding the use of it and the mentioned problems. Additionally, we show how we have addressed the issues that we have identified during our work with SmartSHARK.
5.
Source code documentation often contains summaries of source code written by authors. Recently, automatic source code summarization tools have emerged that generate summaries without requiring author intervention. These summaries are designed for readers to be able to understand the high-level concepts of the source code. Unfortunately, there is no agreed-upon understanding of what makes up a “good summary.” This paper presents an empirical study examining summaries of source code written by authors, readers, and automatic source code summarization tools. This empirical study examines the textual similarity between source code and summaries of source code using Short Text Semantic Similarity metrics. We found that readers use source code in their summaries more than authors do. Additionally, this study finds that the accuracy of a human-written summary can be estimated by the textual similarity of that summary to the source code.
6.
7.
8.
9.
Static analysis is a popular tool for detecting the vulnerabilities that cannot be found by means of ordinary testing. The main problem in the development of static analyzers is their low speed. Methods for accelerating such analyzers are described, which include incremental analysis, lazy analysis, and header file caching. These methods make it possible to considerably accelerate the detection of defects and to integrate the static analysis tools in the development environment. As a result, defects in a file edited in the Visual Studio development environment can be detected in 0.5 s or faster, which means that they can be practically detected after each keystroke. Therefore, critical vulnerabilities can be detected and corrected at the stage of coding.
10.
Antecedents of open source software defects: A data mining approach to model formulation, validation and testing (cited by: 1; self-citations: 0; citations by others: 1)
This paper develops, tests, and validates a model for the antecedents of open source software (OSS) defects, using Data and
Text Mining. The public archives of OSS projects are used to access historical data on over 5,000 active and mature OSS projects.
Using domain knowledge and exploratory analysis, a wide range of variables is identified from the process, product, resource,
and end-user characteristics of a project to ensure that the model is robust and considers all aspects of the system. Multiple
Data Mining techniques are used to refine the model and data is enriched by the use of Text Mining for knowledge discovery
from qualitative information. The study demonstrates the suitability of Data Mining and Text Mining for model building. Results
indicate that project type, end-user activity, process quality, team size and project popularity have a significant impact
on the defect density of operational OSS projects. Since many organizations, both for profit and not for profit, are beginning
to use Open Source Software as an economic alternative to commercial software, these results can be used in the process of
deciding what software can be reasonably maintained by an organization.
11.
E. Yu. Sharygin R. A. Buchatskiy R. A. Zhuykov A. R. Sher 《Programming and Computer Software》2017,43(6):353-365
This paper describes the development of a query compiler for the PostgreSQL DBMS based on automatic code specialization methods; these methods allow one to avoid the development and support difficulties typical for classical query compilers by dividing the compiler development problem into two independent subproblems: reduction of overhead costs and implementation of algorithmic improvements. We assert that this decomposition facilitates the solution of both subproblems: the cost reduction can be automated, while the algorithmic improvements can be implemented in the interpreter in the DBMS implementation language. This paper presents methods for online and offline specialization, considers specifics of specialization and binding-time analysis of the PostgreSQL source code, and describes the transition to a push model of execution.
12.
Free and open source development practices in the game community (cited by: 1; self-citations: 0; citations by others: 1)
The free and open source software (FOSS) approach lets a community of like-minded participants develop software systems and related artifacts that are shared freely instead of offered as closed-source commercial products. Free (as in freedom) software and open source are closely related but slightly different approaches and licensing schemes for developing publicly shared software. FOSS development communities don't seem to adopt modern software engineering processes. FOSS communities develop software that's extremely valuable, generally reliable, globally distributed, made available for acquisition at little or no cost, and readily used in its community. Free and open source software development practices give rise to a new view of how complex software systems can be constructed, deployed, and evolved. They rely on lean electronic communication media, virtual project management, and version management mechanisms to coordinate globally dispersed development efforts. These FOSS processes offer new directions for developing complex software systems. We look at the FOSS computer game community to provide examples of common development processes and practices.
13.
This paper advances the existing knowledge of anti-piracy strategies by proposing an open source strategy (OS strategy) to alleviate software piracy based on a qualitative, case-based, exploratory study of eight software firms operating in China. The paper shows that the OS strategy is conditionally adoptable, depending on how users are willing to pay for services (market conditions); how critical and complex software is required for upgrading and modifications (software conditions); and how firms can avoid resource overloading and/or shortage (firm conditions). The paper also identifies several new indicators to assess the effectiveness of the OS strategy against piracy. Managerial implications about how to improve business in a piracy-ridden environment are discussed.
14.
Nachiappan Nagappan E. Michael Maximilien Thirumalesh Bhat Laurie Williams 《Empirical Software Engineering》2008,13(3):289-302
Test-driven development (TDD) is a software development practice that has been used sporadically for decades. With this practice,
a software engineer cycles minute-by-minute between writing failing unit tests and writing implementation code to pass those
tests. Test-driven development has recently re-emerged as a critical enabling practice of agile software development methodologies.
However, little empirical evidence supports or refutes the utility of this practice in an industrial context. Case studies
were conducted with three development teams at Microsoft and one at IBM that have adopted TDD. The results of the case studies
indicate that the pre-release defect density of the four products decreased between 40% and 90% relative to similar projects
that did not use the TDD practice. Subjectively, the teams experienced a 15–35% increase in initial development time after
adopting TDD.
Nachiappan Nagappan is a researcher in the Software Reliability Research group at Microsoft Research. He received his MS and PhD from North Carolina State University in 2002 and 2005, respectively. His research interests are in software reliability, software measurement and empirical software engineering. Dr. E. Michael Maximilien (aka “max”) is a research staff member at IBM’s Almaden Research Center in San Jose, California. Prior to joining ARC, he spent ten years at IBM’s Research Triangle Park, N.C., in software development and architecture. He led various small- to medium-sized teams, designing and developing enterprise and embedded Java™ software; he is a founding member and contributor to three worldwide Java and UML industry standards. His primary research interests lie in distributed systems and software engineering, especially Web services and APIs, mashups, Web 2.0, SOA (service-oriented architecture), and Agile methods and practices. He can be reached through his Web site (maximilien.org) and blog (blog.maximilien.com). Thirumalesh Bhat is a Development Manager at Microsoft Corporation. He has worked on several versions of Windows and other commercial software systems at Microsoft. He is interested in software reliability, testing, metrics and software processes. Laurie Williams is an associate professor of computer science at North Carolina State University. She teaches software engineering and software reliability and testing. Prior to joining NCSU, she worked at IBM for nine years, including several years as a manager of a software testing department and as a project manager for a large software project. She was one of the founders of the XP Universe conference in 2001, the first US-based conference on agile software development. She is also the lead author of the Pair Programming Illuminated book and a co-editor of the Extreme Programming Perspectives book.
15.
Georgia Frantzeskou Stephen MacDonell 《Journal of Systems and Software》2008,81(3):447-460
The use of Source Code Author Profiles (SCAP) represents a new, highly accurate approach to source code authorship identification that is, unlike previous methods, language independent. While accuracy is clearly a crucial requirement of any author identification method, in cases of litigation regarding authorship, plagiarism, and so on, there is also a need to know why it is claimed that a piece of code is written by a particular author. What is it about that piece of code that suggests a particular author? What features in the code make one author more likely than another? In this study, we describe a means of identifying the high-level features that contribute to source code authorship identification, using the SCAP method as a tool. A variety of features are considered for Java and Common Lisp and the importance of each feature in determining authorship is measured through a sequence of experiments in which we remove one feature at a time. The results show that, for these programs, comments, layout features and package-related naming influence classification accuracy whereas user-defined naming, an obvious programmer-related feature, does not appear to influence accuracy. A comparison is also made between the relative feature contributions in programs written in the two languages.
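The abstract does not spell out how SCAP works. It is generally described as building, for each author, a profile of the most frequent byte-level n-grams and attributing a program to the author whose profile shares the most n-grams with it. The sketch below assumes that description. The function names, the n-gram length `n`, and the profile size `size` are illustrative choices, not values taken from the paper.

```python
from collections import Counter
from typing import Dict, Set


def profile(source: bytes, n: int = 3, size: int = 50) -> Set[bytes]:
    """Return the `size` most frequent byte-level n-grams of `source`."""
    grams = Counter(source[i:i + n] for i in range(len(source) - n + 1))
    return {gram for gram, _ in grams.most_common(size)}


def attribute(unknown: bytes, samples: Dict[str, bytes], n: int = 3, size: int = 50) -> str:
    """Pick the author whose profile overlaps most with the unknown program."""
    target = profile(unknown, n, size)
    return max(samples, key=lambda author: len(target & profile(samples[author], n, size)))


# Hypothetical training samples: one concatenated code sample per author.
samples = {
    "alice": b"int main(){int x=0;for(int i=0;i<10;i++){x+=i;}return x;}",
    "bob": b"def main():\n    total = 0\n    for i in range(10):\n        total += i\n    return total\n",
}
unknown = b"def helper():\n    count = 0\n    for j in range(5):\n        count += j\n    return count\n"
guess = attribute(unknown, samples)
```

A feature-removal experiment of the kind the abstract mentions could reuse `attribute` unchanged: strip one feature (for example, comments) from every sample and compare the classification accuracy before and after.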
16.
Prior network-based research on open source software (OSS) development has focused on the benefit of network ties and assumed all network ties play the same role. We adopt a fine-grained view of network relations to investigate the impact of network ties on the success of OSS development. Through examining the development of OSS projects hosted by SourceForge, we find that co-membership among project teams is an effective mechanism for building network ties, through which knowledge and expertise flow across projects in the OSS community and, therefore, contribute to the success of OSS development. However, network ties among projects not only confer benefits but also incur various costs, and due to the different growth patterns of cost and benefit, network ties have a diminishing return to project success. In addition, we find network ties of leader–follower type and follower–leader type are more beneficial to OSS success than other types of ties, and network ties connecting to projects of later development stages are more beneficial than those connecting to projects of earlier stages. Our study provides useful guidelines and suggestions as to how to leverage the knowledge and expertise of others for successful development of OSS projects.
17.
The layers architectural pattern has been widely adopted by the developer community in order to build large software systems. In reality, as the system evolves over time, it rarely remains in conformance with the intended layers pattern, causing a significant degradation of the system's maintainability. As a part of refactoring such a system, practitioners often undertake a mostly manual exercise to discover the intended layers and organize the modules into these layers. In this paper, we present a method for semi-automatically detecting layers in the system and propose a quantitative measurement to compute the amount of non-conformance of the system from the set of layered design principles. We have applied the layer detection method and the non-conformance measurement on a set of open source and proprietary enterprise applications.
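The abstract does not define its non-conformance measurement. As a rough illustration only, one simple way to quantify non-conformance is the fraction of module dependencies that break a strict layering rule. The function name, the layer numbering, and the two violation kinds below (back-calls and skip-calls) are assumptions, not the paper's definition.

```python
from typing import Dict, Iterable, Tuple


def non_conformance(layer_of: Dict[str, int], deps: Iterable[Tuple[str, str]]) -> float:
    """Fraction of dependencies that violate strict layering.

    Assumed rule: a module may depend on its own layer or the layer directly
    below it. A dependency on a higher layer (back-call) or on a layer more
    than one level down (skip-call) counts as a violation.
    """
    dep_list = list(deps)
    if not dep_list:
        return 0.0
    violations = 0
    for source, target in dep_list:
        gap = layer_of[source] - layer_of[target]
        if gap < 0 or gap > 1:
            violations += 1
    return violations / len(dep_list)


# Hypothetical three-layer system; higher numbers are higher layers.
layers = {"ui": 3, "service": 2, "dao": 1}
dependencies = [
    ("ui", "service"),   # allowed: adjacent lower layer
    ("service", "dao"),  # allowed: adjacent lower layer
    ("ui", "dao"),       # skip-call
    ("dao", "service"),  # back-call
]
score = non_conformance(layers, dependencies)
```

A score of 0.0 means every dependency respects the assumed rule, and higher values indicate more architectural drift.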
18.
19.
David Binkley Mark Harman Kiarash Mahdavi 《Journal of Systems and Software》2008,81(12):2287-2298
Programs express domain-level concepts in their source code. It might be expected that such concepts would have a degree of semantic cohesion. This cohesion ought to manifest itself in the dependence between statements all of which contribute to the computation of the same concept. This paper addresses a set of research questions that capture this informal observation. It presents the results of experiments on 10 programs that explore the relationship between domain-level concepts and dependence in source code. The results show that code associated with concepts has a greater degree of coherence, with tighter dependence. This finding has positive implications for the analysis of concepts as it provides an approach to decompose a program into smaller executable units, each of which captures the behaviour of the program with respect to a domain-level concept.
20.
Data mining is acquiring its own identity by refining concepts from other disciplines, developing generic algorithms, and entering new application areas. Engineering design and manufacturing have been affected by the data mining pursuit. This paper outlines areas of product and manufacturing system design that are particularly suitable for data-mining applications. One of the emerging areas is innovation. The key challenges of data mining in the domains discussed in the paper are outlined.