期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

HPCVIEW: A Tool for Top-down Analysis of Node Performance

Mellor-Crummey John Fowler Robert J. Marin Gabriel Tallent Nathan 《The Journal of supercomputing》2001,23(1):81-104

It is increasingly difficult for complex scientific programs to attain a significant fraction of peak performance on systems that are based on microprocessors with substantial instruction-level parallelism and deep memory hierarchies. Despite this trend, performance analysis and tuning tools are still not used regularly by algorithm and application designers. To a large extent, existing performance tools fail to meet many user needs and are cumbersome to use. To address these issues, we developed HPCVIEW—a toolkit for combining multiple sets of program profile data, correlating the data with source code, and generating a database that can be analyzed anywhere with a commodity Web browser. We argue that HPCVIEW addresses many of the issues that have limited the usability and the utility of most existing tools. We originally built HPCVIEW to facilitate our own work on data layout and optimizing compilers. Now, in addition to daily use within our group, HPCVIEW is being used by several code development teams in DoD and DoE laboratories as well as at NCSA. 相似文献

2.

Tuning parallel symbolic execution engine for better performance

Anil Kumar KARNA Jinbo DU Haihao SHEN Hao ZHONG Jiong GONG Haibo YU Xiangning MA Jianjun ZHAO 《Frontiers of Computer Science》2018,12(1):86-100

Symbolic execution is widely used in many code analysis, testing, and verification tools. As symbolic execution exhaustively explores all feasible paths, it is quite time consuming. To handle the problem, researchers have paralleled existing symbolic execution tools (e.g., KLEE). In particular, Cloud9 is a widely used paralleled symbolic execution tool, and researchers have used the tool to analyze real code. However, researchers criticize that tools such as Cloud9 still cannot analyze large scale code. In this paper, we conduct a field study on Cloud9, in which we use KLEE and Cloud9 to analyze benchmarks in C. Our results confirm the criticism. Based on the results, we identify three bottlenecks that hinder the performance of Cloud9: the communication time gap, the job transfer policy, and the cache management of the solved constraints. To handle these problems, we tune the communication time gap with better parameters, modify the job transfer policy, and implement an approach for cache management of solved constraints. We conduct two evaluations on our benchmarks and a real application to understand our improvements. Our results show that our tuned Cloud9 reduces the execution time significantly, both on our benchmarks and the real application. Furthermore, our evaluation results show that our tuning techniques improve the effectiveness on all the devices, and the improvement can be achieved upto five times, depending upon a tuning value of our approach and the behaviour of program under test. 相似文献

3.

Extracting architectural features from source code 总被引：1，自引：1，他引：0

David R. Harris Alexander S. Yeh Howard B. Reubenstein 《Automated Software Engineering》1996,3(1-2):109-138

相似文献

4.

Exploration of the Performance of a Data Mining Application via Hardware Based Monitoring

Thoennes Mathew S. Weems Charles C. 《The Journal of supercomputing》2003,26(1):25-42

The performance of software on modern architectures has grown more and more difficult to predict and analyze, as modern microprocessors have grown more complex. The execution of a program now entails the complex interaction of code, compiler and processor architecture. The current generation of microprocessors is optimized to an existing set of commercial and scientific benchmarks but new applications such as data mining are becoming a significant part of the workload. In this paper we explore the use of performance monitoring hardware to analyze the execution of C4.5, a data mining application, on the IBM Power2 architecture. We see how the data gathered by the hardware can be used to identify potential changes that can be made to the program and the processor micro-architecture to improve performance. We then go on to evaluate changes to C4.5 and to the micro-architecture. Based on our experience, we identify issues that limit the use of performance monitoring hardware in user level tuning and in extending its use to high performance computing environments. 相似文献

5.

Bug busters

Spinellis D. 《Software, IEEE》2006,23(2):92-93

One way to deal with bugs is to avoid them entirely. The approach would be wasteful because we'd be underutilizing the many automated tools and techniques that can catch bugs for us. Most tools for eliminating bugs work by tightening the specifications of what we build. At the program code level, tighter specifications affect the operations allowed on various data types, our program's behavior, and our code's style. Furthermore, we can use many different approaches to verify that our code is on track: the programming language, its compiler, specialized tools, libraries, and embedded tests are our most obvious friends. We can delegate bug busting to code. Many libraries come with hooks or specialized builds that can catch questionable argument values, resource leaks, and wrong ordering of function calls. Bugs many be a fact of life, but they're not inevitable. We have some powerful tools to find them before they mess with our programs, and the good news is that these tools get better every year. 相似文献

6.

Static test case prioritization using topic models

Stephen W. Thomas Hadi Hemmati Ahmed E. Hassan Dorothea Blostein 《Empirical Software Engineering》2014,19(1):182-212

Software development teams use test suites to test changes to their source code. In many situations, the test suites are so large that executing every test for every source code change is infeasible, due to time and resource constraints. Development teams need to prioritize their test suite so that as many distinct faults as possible are detected early in the execution of the test suite. We consider the problem of static black-box test case prioritization (TCP), where test suites are prioritized without the availability of the source code of the system under test (SUT). We propose a new static black-box TCP technique that represents test cases using a previously unused data source in the test suite: the linguistic data of the test cases, i.e., their identifier names, comments, and string literals. Our technique applies a text analysis algorithm called topic modeling to the linguistic data to approximate the functionality of each test case, allowing our technique to give high priority to test cases that test different functionalities of the SUT. We compare our proposed technique with existing static black-box TCP techniques in a case study of multiple real-world open source systems: several versions of Apache Ant and Apache Derby. We find that our static black-box TCP technique outperforms existing static black-box TCP techniques, and has comparable or better performance than two existing execution-based TCP techniques. Static black-box TCP methods are widely applicable because the only input they require is the source code of the test cases themselves. This contrasts with other TCP techniques which require access to the SUT runtime behavior, to the SUT specification models, or to the SUT source code. 相似文献

7.

Transparent runtime parallelization of the R scripting language

Jiangtian LiAuthor Vitae Xiaosong MaAuthor Vitae Srikanth YoginathAuthor Vitae 《Journal of Parallel and Distributed Computing》2011,71(2):157-168

相似文献

8.

Improving Fault Detection in Modified Code ——- A Study from the Telecommunication Industry

下载免费PDF全文

Piotr Tomaszewski Lars Lundberg and Haa kan Grahn 《计算机科学技术学报》2007,22(3):397-409

Many software systems are developed in a number of consecutive releases. In each release not only new code is added but also existing code is often modified. In this study we show that the modified code can be an important source of faults. Faults are widely recognized as one of the major cost drivers in software projects. Therefore, we look for methods that improve the fault detection in the modified code. We propose and evaluate a number of prediction models that increase the efficiency of fault detection. To build and evaluate our models we use data collected from two large telecommunication systems produced by Ericsson. We evaluate the performance of our models by applying them both to a different release of the system than the one they are built on and to a different system. The performance of our models is compared to the performance of the theoretical best model, a simple model based on size, as well as to analyzing the code in a random order （not using any model）. We find that the use of our models provides a significant improvement over not using any model at all and over using a simple model based on the class size. The gain offered by our models corresponds to 38-57% of the theoretical maximum gain. 相似文献

9.

Alattin: mining alternative patterns for defect detection

Suresh Thummalapenta Tao Xie 《Automated Software Engineering》2011,18(3-4):293-323

To improve software quality, static or dynamic defect-detection tools accept programming rules as input and detect their violations in software as defects. As these programming rules are often not well documented in practice, previous work developed various approaches that mine programming rules as frequent patterns from program source code. Then these approaches use static or dynamic defect-detection techniques to detect pattern violations in source code under analysis. However, these existing approaches often produce many false positives due to various factors. To reduce false positives produced by these mining approaches, we develop a novel approach, called Alattin, that includes new mining algorithms and a technique for detecting neglected conditions based on our mining algorithm. Our new mining algorithms mine patterns in four pattern formats: conjunctive, disjunctive, exclusive-disjunctive, and combinations of these patterns. We show the benefits and limitations of these four pattern formats with respect to false positives and false negatives among detected violations by applying those patterns to the problem of detecting neglected conditions. 相似文献

10.

Learning to rank code examples for code search engines

Haoran Niu Iman Keivanloo Ying Zou 《Empirical Software Engineering》2017,22(1):259-291

Source code examples are used by developers to implement unfamiliar tasks by learning from existing solutions. To better support developers in finding existing solutions, code search engines are designed to locate and rank code examples relevant to user’s queries. Essentially, a code search engine provides a ranking schema, which combines a set of ranking features to calculate the relevance between a query and candidate code examples. Consequently, the ranking schema places relevant code examples at the top of the result list. However, it is difficult to determine the configurations of the ranking schemas subjectively. In this paper, we propose a code example search approach that applies a machine learning technique to automatically train a ranking schema. We use the trained ranking schema to rank candidate code examples for new queries at run-time. We evaluate the ranking performance of our approach using a corpus of over 360,000 code snippets crawled from 586 open-source Android projects. The performance evaluation study shows that the learning-to-rank approach can effectively rank code examples, and outperform the existing ranking schemas by about 35.65 % and 48.42 % in terms of normalized discounted cumulative gain (NDCG) and expected reciprocal rank (ERR) measures respectively. 相似文献

11.

A model-extraction approach to verifying concurrent C programs with CADP

M.M. Gallardo P. Merino 《Science of Computer Programming》2012,77(3):375-392

The development of reliable software for industrial critical systems benefits from the use of formal models and verification tools for detecting and correcting errors as early as possible. Ideally, with a complete model-based methodology, the formal models should be the starting point to obtain the final reliable code and the verification step should be done over the high-level models. However, this is not the case for many projects, especially when integrating existing code. In this paper, we describe an approach to verify concurrent C code by automatically extracting a high-level formal model that is suitable for analysis with existing tools. The basic components of our approach are: (1) a method to construct a labeled transition system from the source code, that takes flow control and interaction among processes into account; (2) a modeling scheme of the behavior that is external to the program, namely the functionality provided by the operating system; (3) the use of demand-driven static analyses to make a further abstraction of the program, thus saving time and memory during its verification. The whole proposal has been implemented as an extension of the CADP toolbox, which already provides a variety of analysis modules for several input languages using labeled transition systems as the core model. The approach taken fits well within the existing architecture of CADP which does not need to be altered to enable C program verification. We illustrate the use of the extended CADP toolbox by considering examples of the VLTS benchmark suite and C implementations of various concurrent programs. 相似文献

12.

Supporting OpenMP on Cell

Kevin O’Brien Kathryn O’Brien Zehra Sura Tong Chen Tao Zhang 《International journal of parallel programming》2008,36(3):289-311

The Cell processor is a heterogeneous multi-core processor with one power processing engine (PPE) core and eight synergistic processing engine (SPE) cores. There is a significant amount of ongoing research in programming models and tools that attempts to make it easy to exploit the computation power of the Cell architecture. In our work, we explore supporting OpenMP on the Cell processor. It is attractive to support OpenMP because programmers can continue using their familiar programming model, and existing code can be re-used. We base our work on IBM’s XL compiler, and developed new components in the XL compiler and a new runtime library. Three major issues are addressed: (1) synchronization support on heterogeneous cores; (2) code generation targeting the different instruction sets; (3) data transfers and implement the OpenMP memory model. We present experimental results for some SPEC OMP 2001 and NAS benchmarks to demonstrate the effectiveness of this approach. A visualization tool based on Paraver is also used to provide some insights into actual thread and synchronization behaviors. 相似文献

13.

Privacy and security constraints for code contributions

Rodrigo Andrade Paulo Borba 《Software》2020,50(10):1905-1929

In collaborative software development, developers submit their contributions to repositories that are used to integrate code from various collaborators. To avoid privacy and security issues, code contributions are often reviewed before integration. Although careful manual code review can detect such issues, it might be time-consuming, expensive, and error-prone. Automatic analysis tools can also detect privacy and security issues, but they often demand significant developer effort, or are domain specific, considering fixed framework specific vulnerability sources and sinks. To reduce these problems, in this paper we propose the Salvum policy language to support the specification of constraints that help to protect sensitive information from being inadvertently accessed by specific code contributions. We implement a tool that automatically checks Salvum policies for systems of different technical domains. We also investigate whether Salvum can find policy violations for a number of open-source projects. We find evidence that Salvum helps to detect violations even for well-supported and highly active projects. Moreover, our tool helps to find 80 violations in benchmark projects. 相似文献

14.

RUGRAT: Evaluating program analysis and testing tools and compilers with large generated random benchmark applications

下载免费PDF全文

Ishtiaque Hussain Christoph Csallner Mark Grechanik Qing Xie Sangmin Park Kunal Taneja B. M. Mainul Hossain 《Software》2016,46(3):405-431

Benchmarks are heavily used in different areas of computer science to evaluate algorithms and tools. In program analysis and testing, open‐source and commercial programs are routinely used as benchmarks to evaluate different aspects of algorithms and tools. Unfortunately, many of these programs are written by programmers who introduce different biases, not to mention that it is very difficult to find programs that can serve as benchmarks with high reproducibility of results. We propose a novel approach for generating random benchmarks for evaluating program analysis and testing tools and compilers. Our approach uses stochastic parse trees, where language grammar production rules are assigned probabilities that specify the frequencies with which instantiations of these rules will appear in the generated programs. We implemented our tool for Java and applied it to generate a set of large benchmark programs of up to 5M lines of code each with which we evaluated different program analysis and testing tools and compilers. The generated benchmarks let us independently rediscover several issues in the evaluated tools. Copyright © 2014 John Wiley & Sons, Ltd. 相似文献

15.

Performance analysis of distributed storage clusters based on kernel and userspace traces

Houssem Daoud Michel R. Dagenais 《Software》2021,51(1):5-24

Distributed storage systems are commonly used in modern computing. They are highly scalable and offer data replication and fault tolerance. The complexity of those systems makes them difficult to debug using traditional tools. The existing tools are able to evaluate the overall performance of such systems but they do not provide enough information to find the root cause of performance issues. In this article, we propose a tracing‐based performance analysis framework for storage clusters. We use a tracing strategy that reduces the tracing overhead in production systems. The traces collected from the different storage nodes are correlated and used to generate a data model that represents the cluster. Userspace tracing is used to gather data from the storage daemons, while Kernel tracing is used to provide detailed information about operating system internals such as disk queues, network queues and process scheduling. Efficient data structures are used to store the model and to generate metrics and graphical views. Our tool is used in different real world scenarios and is able to investigate interesting performance problems including I/O latencies, data replication and storage nodes failures. 相似文献

16.

Performance modeling for SPMD message-passing programs

JÜRGEN BREHM PATRICK H. WORLEY MANISH MADHUKAR 《Concurrency and Computation》1998,10(5):333-357

Today's massively parallel machines are typically message-passing systems consisting of hundreds or thousands of processors. Implementing parallel applications efficiently in this environment is a challenging task, and poor parallel design decisions can be expensive to correct. Tools and techniques that allow the fast and accurate evaluation of different parallelization strategies would significantly improve the productivity of application developers and increase throughput on parallel architectures. This paper investigates one of the major issues in building tools to compare parallelization strategies: determining what type of performance models of the application code and of the computer system are sufficient for a fast and accurate comparison of different strategies. The paper is built around a case study employing the performance prediction tool (PerPreT) to predict performance of the parallel spectral transform shallow water model code (PSTSWM) on the Intel Paragon. PSTSWM is a parallel application code that was designed to evaluate different parallel strategies for the spectral transform method as it is used in climate modeling and weather forecasting. Multiple parallel algorithms and algorithm variants are embedded in the code. PerPreT uses a relatively simple algebraic model to predict execution time for SPMD (single program multiple data) parallel applications. Applications are modeled through parameterized formulae for communication and computation, where the parameters include the problem size, the number of processors used to execute the program, and system characteristics (e.g. setup times for communication, link bandwidth and sustained computing performance per processor). In this paper we describe performance models that predict the performance of the different algorithms in PSTSWM accurately enough to allow them to be compared, establishing the feasibility of such a demanding application of performance modeling. We also discuss issues in generating and validating the performance models, emphasizing the practical importance of tools such as PerPreT in such studies. © 1998 John Wiley & Sons, Ltd. 相似文献

17.

Manipulating Wi‐Fi packet traces with WiPal: design and experience

Thomas Claveirole Marcelo Dias de Amorim 《Software》2012,42(5):585-599

Packet traces are important objects in networking, commonly used in a wide set of applications, including monitoring, troubleshooting, measurements, and validation, to cite a few. Many tools exist to produce and process such traces, but they are often too specific; using them as a basis for creating extended tools is then impractical. Some other tools are generic enough, but exhibit performance issues. This paper reports on our experience designing WiPal, a packet trace manipulation framework with a focus on IEEE 802.11. WiPal is designed for performance and re‐usability, while introducing several novel features compared to previous solutions. Besides presenting how WiPal's original design can benefit packet processing programs, we discuss a number of issues a program designer might encounter when writing packet trace processing software. An evaluation of WiPal shows that, albeit generic, it does not impact performance regarding execution speed. WiPal achieves performance levels observed only with specialized code and outperforms some well‐known packet processing programs. Copyright © 2011 John Wiley & Sons, Ltd. 相似文献

18.

Mining Effective Temporal Specifications from Heterogeneous API Data

下载免费PDF全文

吴倩梁广泰王千祥梅宏《计算机科学技术学报》2011,26(6):1061-1075

Temporal specifications for Application Programming Interfaces (APIs) serve as an important basis for many defect detection tools. As these specifications are often not well documented, various approaches have been proposed to automatically mine specifications typically from API library source code or from API client programs. However, the library-based approaches take substantial computational resources and produce rather limited useful specifications, while the client-based approaches suffer from high false positive rates. To address the issues of existing approaches, we propose a novel specification mining approach, called MineHEAD, which exploits heterogeneous API data, including information from API client programs as well as API library source code and comments, to produce effective specifications for defect detection with low cost. In particular, MineHEAD first applies client-based specification mining to produce a collection of candidate specifications, and then exploits the related library source code and comments to identify and refine the real specifications from the candidates. Our evaluation results on nine open source projects show that MineHEAD produces effective specifications with average precision of 97.2%. 相似文献

19.

代码相似性检测方法与工具综述

张丹罗平《计算机科学》2020,47(3):5-10

在代码开源的潮流下,代码克隆在提高代码质量和降低开发成本的同时,一定程度地影响了软件系统的稳定性、健壮性与可维护性。代码相似性检测在计算机与信息安全发展方面具有重要的意义。为应对代码克隆带来的各种危害,目前学术界和工业界提出了很多代码相似性检测的方法,这些方法按照源代码信息处理程度可分为基于文本、词法、语法、语义和度量值5类;并开发了相应的检测工具,这些工具实现了很好的检测效果,但在大数据时代背景下也面临着数据规模不断扩大带来的一系列挑战。文中综述了代码相似性检测的方法,对5类检测方法做了详细比较;结合传统方法与机器学习技术,归类了不同检测方法对应的检测工具;按照不同评价标准评估了检测工具的检测效果,总结了每种检测方法的首选检测工具,并对未来代码相似性检测的研究方向做出了展望。相似文献

20.

A boosting ensemble for the recognition of code sharing in malware

Stanley J. Barr Samuel J. Cardman David M. MartinJr. 《Journal in Computer Virology》2008,4(4):335-345

Research and development efforts have recently started to compare malware variants, as it is believed that malware authors are reusing code. A number of these projects have focused on identifying functions through the use of signature-based classifiers. We introduce three new classifiers that characterize a function’s use of global data. Experiments on malware show that we can meaningfully correlate functions on the basis of their global data references even when their functions share little code. We also present an algorithm that combines existing classifiers and our new ones into an ensemble for correlating functions in two binary programs. For testing, we developed a model for comparing our work to previous signature based classifiers. We then used that model to show how our new combined ensemble classifier dominates the previously reported classifiers. The resulting ensemble can be used by malware analysts when they are comparing two binaries. This technique will allow them to correlate both functions and global data references between the two and will lead to a quick identification of any sharing that is occurring. 相似文献