期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Design and Prototype of a Performance Tool Interface for OpenMP

Mohr Bernd Malony Allen D. Shende Sameer Wolf Felix 《The Journal of supercomputing》2001,23(1):105-128

This paper proposes a performance tools interface for OpenMP, similar in spirit to the MPI profiling interface in its intent to define a clear and portable API that makes OpenMP execution events visible to runtime performance tools. We present our design using a source-level instrumentation approach based on OpenMP directive rewriting. Rules to instrument each directive and their combination are applied to generate calls to the interface consistent with directive semantics and to pass context information (e.g., source code locations) in a portable and efficient way. Our proposed OpenMP performance API further allows user functions and arbitrary code regions to be marked and performance measurement to be controlled using new OpenMP directives. To prototype the proposed OpenMP performance interface, we have developed compatible performance libraries for the Expert automatic event trace analyzer [17, 18] and the TAU performance analysis framework [13]. The directive instrumentation transformations we define are implemented in a source-to-source translation tool called OPARI. Application examples are presented for both Expert and TAU to show the OpenMP performance interface and OPARI instrumentation tool in operation. When used together with the MPI profiling interface (as the examples also demonstrate), our proposed approach provides a portable and robust solution to performance analysis of OpenMP and mixed-mode (OpenMP+MPI) applications. 相似文献

2.

Design and Prototype of a Performance Tool Interface for OpenMP

Bernd Mohr Allen D. Malony Sameer Shende Felix Wolf 《The Journal of supercomputing》2002,23(1):105-128

This paper proposes a performance tools interface for OpenMP, similar in spirit to the MPI profiling interface in its intent to define a clear and portable API that makes OpenMP execution events visible to runtime performance tools. We present our design using a source-level instrumentation approach based on OpenMP directive rewriting. Rules to instrument each directive and their combination are applied to generate calls to the interface consistent with directive semantics and to pass context information (e.g., source code locations) in a portable and efficient way. Our proposed OpenMP performance API further allows user functions and arbitrary code regions to be marked and performance measurement to be controlled using new OpenMP directives. To prototype the proposed OpenMP performance interface, we have developed compatible performance libraries for the Expert automatic event trace analyzer [17, 18] and the TAU performance analysis framework [13]. The directive instrumentation transformations we define are implemented in a source-to-source translation tool called OPARI. Application examples are presented for both Expert and TAU to show the OpenMP performance interface and OPARI instrumentation tool in operation. When used together with the MPI profiling interface (as the examples also demonstrate), our proposed approach provides a portable and robust solution to performance analysis of OpenMP and mixed-mode (OpenMP+MPI) applications. 相似文献

3.

Model-driven monitoring support for the multi-view performance analysis of parallel embedded applications 总被引：1，自引：0，他引：1

J. Reference to Garcí a J. Reference to Entrialgo F. J. Reference to Su rez D. F. Reference to Garcí a 《Performance Evaluation》2000,39(1-4):81-98

This paper describes an approach to carry out performance analysis of parallel embedded applications. The approach is based on measurement, but in addition, the idea of driving the measurement process (application instrumentation and monitoring) by a behavioral model is introduced. Using this model, highly comprehensible performance information can be collected. The whole approach is based on this behavioral model, one instrumentation method and two tools, one for monitoring and the other for visualization and analysis. Each of these is briefly described, and the steps to carry out performance analysis using them are clearly defined. They are explained by means of a case study. Finally, one method to evaluate the intrusiveness of the monitoring approach is proposed, and the intrusiveness results for the case study are presented. 相似文献

4.

The Paradyn parallel performance measurement tool 总被引：1，自引：0，他引：1

Miller B.P. Callaghan M.D. Cargille J.M. Hollingsworth J.K. Irvin R.B. Karavanic K.L. Kunchithapadam K. Newhall T. 《Computer》1995,28(11):37-46

Paradyn is a tool for measuring the performance of large-scale parallel programs. Our goal in designing a new performance tool was to provide detailed, flexible performance information without incurring the space (and time) overhead typically associated with trace-based tools. Paradyn achieves this goal by dynamically instrumenting the application and automatically controlling this instrumentation in search of performance problems. Dynamic instrumentation lets us defer insertion until the moment it is needed (and remove it when it is no longer needed); Paradyn's Performance Consultant decides when and where to insert instrumentation 相似文献

5.

Platform‐independent profiling in a virtual execution environment

Walter Binder Jarle Hulaas Philippe Moret Alex Villazón 《Software》2009,39(1):47-79

Virtual execution environments, such as the Java virtual machine, promote platform‐independent software development. However, when it comes to analyzing algorithm complexity and performance bottlenecks, available tools focus on platform‐specific metrics, such as the CPU time consumption on a particular system. Other drawbacks of many prevailing profiling tools are high overhead, significant measurement perturbation, as well as reduced portability of profiling tools, which are often implemented in platform‐dependent native code. This article presents a novel profiling approach, which is entirely based on program transformation techniques, in order to build a profiling data structure that provides calling‐context‐sensitive program execution statistics. We explore the use of platform‐independent profiling metrics in order to make the instrumentation entirely portable and to generate reproducible profiles. We implemented these ideas within a Java‐based profiling tool called JP. A significant novelty is that this tool achieves complete bytecode coverage by statically instrumenting the core runtime libraries and dynamically instrumenting the rest of the code. JP provides a small and flexible API to write customized profiling agents in pure Java, which are periodically activated to process the collected profiling information. Performance measurements point out that, despite the presence of dynamic instrumentation, JP causes significantly less overhead than a prevailing tool for the profiling of Java code. Copyright © 2008 John Wiley & Sons, Ltd. 相似文献

6.

Performance measurement,visualization and modeling of parallel and distributed programs using the AIMS toolkit

Jerry Yan Sekhar Sarukkai Pankaj Mehra 《Software》1995,25(4):429-461

Writing large-scale parallel and distributed scientific applications that make optimum use of the multiprocessor is a challenging problem. Typically, computational resources are underused due to performance failures in the application being executed. Performance-tuning tools are essential for exposing these performance failures and for suggesting ways to improve program performance. In this paper, we first address fundamental issues in building useful performance-tuning tools and then describe our experience with the AIMS toolkit for tuning parallel and distributed programs on a variety of platforms. AIMS supports source-code instrumentation, run-time monitoring, graphical execution profiles, performance indices and automated modeling techniques as ways to expose performance problems of programs. Using several examples representing a broad range of scientific applications, we illustrate AIMS' effectiveness in exposing performance problems in parallel and distributed programs. 相似文献

7.

A Performance Analysis Tool for PVM Parallel Programs 总被引：1，自引：0，他引：1

ChenWang YinLiu ChangjunJiang ZhaoqingZhang 《计算机工程与应用》2004,40(29):103-105,112

In this paper,we introduce the design and implementation of ParaVT,which is a visual performance analysis and parallel debugging tool.In ParaVT,we propose an automated instrumentation mechanism.Based on this mechanism,ParaVT automatically analyzes the performance bottleneck of parallel applications and provides a visual user interface to monitor and analyze the performance of parallel programs.In addition,it also supports certain extensions. 相似文献

8.

基于网络断层扫描的网格网络性能测量分析

王伟蔡皖东李勇军《计算机科学》2007,34(5):45-47

网格计算通过网络连接来获得一个高性能和高效的计算平台。网格网络的监测和性能测量为网格性能分析、负载平衡、任务调度等提供了重要的科学依据，而成为大规模网格服务的关键组件。现有的几种网格监测方法因缺乏对监测数据的推断分析而无法对网格网络的性能进行测量。通过对网格网络性能测量的特点、GloPerf及传统网络测量技术的分析，提出了基于网络断层扫描的网格网络性能测量方法。研究结果为网格网络性能的测量提供了新的途径。相似文献

9.

An entropy-based algorithm for data elimination in time-driven software instrumentation

Ahmet Özmen^{Author Vitae} 《Journal of Systems and Software》2009,82(5):907-913

While monitoring, instrumented long running parallel applications generate huge amount of instrumentation data. Processing and storing this data incurs overhead, and perturbs the execution. A technique that eliminates unnecessary instrumentation data and lowers the intrusion without loosing any performance information is valuable for tool developers. This paper presents a new algorithm for software instrumentation to measure the amount of information content of instrumentation data to be collected. The algorithm is based on entropy concept introduced in information theory, and it makes selective data collection for a time-driven software monitoring system possible. 相似文献

10.

Clustering algorithm for parallelizing software systems inmultiprocessors environment

Kadamuddi D. Tsai J.J.P. 《IEEE transactions on pattern analysis and machine intelligence》2000,26(4):340-361

A variety of techniques and tools exist to parallelize software systems on different parallel architectures (SIMD, MIMD). With the advances in high-speed networks, there has been a dramatic increase in the number of client/server applications. A variety of client/server applications are deployed today, ranging from simple telnet sessions to complex electronic commerce transactions. Industry standard protocols, like Secure Socket Layer (SSL), Secure Electronic Transaction (SET), etc., are in use for ensuring privacy and integrity of data, as well as for authenticating the sender and the receiver during message passing. Consequently, a majority of applications using parallel processing techniques are becoming synchronization-centric, i.e., for every message transfer, the sender and receiver must synchronize. However, more effective techniques and tools are needed to automate the clustering of such synchronization-centric applications to extract parallelism. The authors present a new clustering algorithm to facilitate the parallelization of software systems in a multiprocessor environment. The new clustering algorithm achieves traditional clustering objectives (reduction in parallel execution time, communication cost, etc.). Additionally, our approach: 1) reduces the performance degradation caused by synchronizations, and 2) avoids deadlocks during clustering. The effectiveness of our approach is depicted with the help of simulation results 相似文献

11.

Development and performance analysis of real‐world applications for distributed and parallel architectures

T. Fahringer P. Blaha A. Hssinger J. Luitz E. Mehofer H. Moritsch B. Scholz 《Concurrency and Computation》2001,13(10):841-868

Several large real‐world applications have been developed for distributed and parallel architectures. We examine two different program development approaches. First, the usage of a high‐level programming paradigm which reduces the time to create a parallel program dramatically but sometimes at the cost of a reduced performance; a source‐to‐source compiler, has been employed to automatically compile programs—written in a high‐level programming paradigm—into message passing codes. Second, a manual program development by using a low‐level programming paradigm—such as message passing—enables the programmer to fully exploit a given architecture at the cost of a time‐consuming and error‐prone effort. Performance tools play a central role in supporting the performance‐oriented development of applications for distributed and parallel architectures. SCALA—a portable instrumentation, measurement, and post‐execution performance analysis system for distributed and parallel programs—has been used to analyze and to guide the application development, by selectively instrumenting and measuring the code versions, by comparing performance information of several program executions, by computing a variety of important performance metrics, by detecting performance bottlenecks, and by relating performance information back to the input program. We show several experiments of SCALA when applied to real‐world applications. These experiments are conducted for a NEC Cenju‐4 distributed‐memory machine and a cluster of heterogeneous workstations and networks. Copyright © 2001 John Wiley & Sons, Ltd. 相似文献

12.

Faust: an integrated environment for parallel programming

Guarna V.A. Jr. Gannon D. Jablonowski D. Malony A.D. Gaur Y. 《Software, IEEE》1989,6(4):20-27

相似文献

13.

基于聚类的响应时间分析方法

华悦姜旭刘振宇丁志刚《计算机应用与软件》2012,29(8):200-201,274

随着互联网应用的迅速发展,互联网技术已经应用于各类专业领域,如医疗卫生应用领域等。使用第三方性能测试工具模拟多用户并发测量应用的性能已经得到绝大多数软件企业的认可。在使用第三方性能测试工具进行性能测试时,响应时间作为最主要的衡量指标为开发者和用户提供了直观的参考数据。传统的第三方测试工具一般采用平衡值对响应时间进行分析,以PACS/MIIS系统中WEB方式打开影像为研究对象,研究了基于聚类的响应时间分析,提供了一种量化分析响应时间的方法。相似文献

14.

A methodology for transparent knowledge specification in a dynamic tuning environment

P. Caymes‐Scutari A. Morajko T. Margalef E. Luque 《Software》2012,42(3):281-302

The increasing use of parallel/distributed applications demands a continuous support to take significant advantages from parallel power. This includes the evolution of performance analysis and tuning tools which automatically allows for obtaining a better behavior of the applications. Different approaches and tools have been proposed and they are continuously evolving to cover the requirements and expectations of users. One such tool is MATE (Monitoring Analysis and Tuning Environment), which provides automatic and dynamic tuning for parallel/distributed applications. The knowledge used by MATE to analyze and take decisions is based on performance models which include a set of performance parameters and a set of mathematical expressions modeling the solution of the performance problem. These elements are used by the tuning environment to conduct the monitoring and analysis steps, respectively. The tuning phase depends on the results of the performance analysis. This paper presents a methodology to specify performance models. Each performance model specification can be automatically and transparently translated into a piece of software code encapsulating the knowledge to be straightforwardly included in MATE. Applying this methodology, the user does not have to be involved in the implementation details of MATE, which makes the usage of the tool more transparent. Copyright © 2011 John Wiley & Sons, Ltd. 相似文献

15.

基于运行时代码修改的动态性能监控关键技术研究

蒋杰徐涵刘杰杨灿群胡庆丰《计算机工程与科学》2009,31(Z1)

代码插桩是程序性能监控的重要环节。动态插桩通过对可执行程序代码的运行时修改支持动态性能监控,有助于降低性能分析工具的构建代价,提高工具易用性。本文首先阐述了Dyninst动态插桩系统的概念抽象与执行机理,然后结合大规模并行程序动态插桩的需要,对DPCL并行程序动态插桩基础设施以及基于MRNet的可扩展通信结构进行了深入分析。相似文献

16.

Sensitivity of Performance Prediction of Message Passing Programs

Girona Sergi Labarta Jesús 《The Journal of supercomputing》2000,17(3):291-298

This paper discusses the issues related to the accuracy of performance prediction tools for message passing programs. We present the results of two sets of experiments to quantify the effect of the instrumentation overhead and variance in the accuracy of Dimemas. The results show that this performance prediction tool can be used with a high level of confidence as the effect of instrumentation overhead on the predicted performance is minimal. We also show that it is possible to carry out instrumentation runs in highly loaded multi-user environments and still be able to accurately analyze the performance of the application as if it had run alone. 相似文献

17.

Dynamic Instrumentation, Performance Monitoring and Analysis of Grid Scientific Workflows

Hong-Linh Truong Thomas Fahringer Schahram Dustdar 《Journal of Grid Computing》2005,3(1-2):1-18

While existing work concentrates on developing QoS models of business workflows and Web services, few tools have been developed to support the monitoring and performance analysis of scientific workflows in Grids. This paper describes novel Grid services for dynamic instrumentation of Grid-based applications, performance monitoring and analysis of Grid scientific workflows. We describe a Grid dynamic instrumentation service that provides a widely accessible interface for other services and users to conduct the dynamic instrumentation of Grid applications during the runtime. We introduce a Grid performance analysis service for Grid scientific workflows. The analysis service utilizes various types of data including workflow graphs, monitoring data of resources, execution status of activities, and performance measurements obtained from the dynamic instrumentation of invoked applications, and provides a rich set of functionalities and features to support the online monitoring and performance analysis of scientific workflows. Workflows and their relevant information including performance metrics are stored and utilized for comparing the performance of constructs of different workflows and for supporting multi-workflow analysis. The work described in this paper is supported in part by the Austrian Science Fund as part of the Aurora Project under contract SFBF1104 and by the European Union through the IST-2002-511385 project K-WfGrid. 相似文献

18.

Designing parallel systems: a performance prediction problem

E Luque R Suppi J Sorribes 《Information and Software Technology》1992,34(12):813-823

PSEE (Parallel System Evaluation Environment) is a software tool that provides a multiprocessor system for research into alternative architectural decisions and experimentation, with such issues as selection, design, tuning, scheduling, clustering and routing policies. PSEE facilitates simulation and performance evaluation as well as a prediction environment for the design and tuning of parallel systems. These tasks involve cycles through programming, simulation, measurement, visualization and modification of parallel system parameters. PSEE includes a parallel programming tool, a simulator for link oriented parallel systems, BOLAS, and a performance evaluation tool, GRAPH. These PSEE modules are tools oriented to support the above tasks in user-friendly, interactive and animated graphical form. PSEE provides quantitative information in a graphical tailored form. This numerical/graphical output helps the user make decisions about his/her particular development. 相似文献

19.

An instrumentation system to measure user performance in interactive systems

Taylor L. Booth Reda Ammar Robert Lenk 《Journal of Systems and Software》1981,2(2):139-146

Many software tools are interactive in nature and require a close match between the user's knowledge of how a task is to be performed and the capabilities the tool provides. This paper describes the current status of an instrumentation and analysis package to measure user performance in an interactive system. A prototype measurement system is considered to evaluate a screen editor and to develop models of user behavior. 相似文献

20.

Rewriting executable files to measure program behavior

James R. Larus Thomas Ball 《Software》1994,24(2):197-218

Inserting instrumentation code in a program is an effective technique for detecting, recording, and measuring many aspects of a program's performance. Instrumentation code can be added at any stage of the compilation process by specially-modified system tools such as a compiler or linker or by new tools from a measurement system. For several reasons, adding instrumentation code after the compilation process—by rewriting the executable file—presents fewer complications and leads to more complete measurements. This paper describes the difficulties in adding code to executable files that arose in developing the profiling and tracing tools qp and qpt. The techniques used by these tools to instrument programs on MIPS and SPARC processors are applicable in other instrumentation systems running on many processors and operating systems. In addition, many difficulties could have been avoided with minor changes to compilers and executable file formats. These changes would simplify this approach to measuring program performance and make it more generally useful. 相似文献