Similar documents
Found 20 similar documents; search took 15 ms.
1.
Dominic A. Varley 《Software》1993,23(4):461-463
The Unix profiling tool gprof produces an execution profile in which the time taken by each routine is added to that of its caller. This communication demonstrates that gprof generates a misleading profile when a routine is called from more than one module.
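The effect described above can be illustrated with a small sketch. It assumes the call-count apportioning usually attributed to gprof, in which a callee's total time is split among its callers in proportion to how often each one called it. The abstract does not spell out the mechanism, and the caller names and timings below are hypothetical.

```python
def apportion_by_call_count(callee_total_time, calls_per_caller):
    """Split a callee's total time among callers in proportion to call counts.

    Assumption: every call is treated as costing the same average time.
    """
    total_calls = sum(calls_per_caller.values())
    return {
        caller: callee_total_time * count / total_calls
        for caller, count in calls_per_caller.items()
    }


# Hypothetical measurements: module A makes one expensive call,
# module B makes nine cheap calls to the same routine.
actual_time = {"A": 9.0, "B": 1.0}   # seconds really spent on behalf of each caller
call_counts = {"A": 1, "B": 9}

estimated_time = apportion_by_call_count(sum(actual_time.values()), call_counts)
print(estimated_time)
```

Under this assumption the profile charges most of the routine's time to B, although A is responsible for most of it.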

2.
    
Paul E. McKenney 《Software》1999,29(3):219-234
Performance can be a critical aspect of software quality; in some systems, poor performance can cause financial loss, physical damage, or even death. In such cases, it is imperative to identify system performance problems before deployment, preferably well before implementation. Unfortunately, the size of most software systems grossly exceeds the capacity of current performance-modelling techniques. Hence, there is a need for techniques to quickly identify the portions of the system that are performance-critical. These portions are often small enough to be modelled directly. This paper describes one such technique, differential profiling. Differential profiling combines two or more conventional profiles of a given program run in different situations or conditions. The technique mathematically combines corresponding buckets of the conventional profiles, then sorts the resulting list by these combined values. Different combining functions are suitable for different situations. This combining of conventional profiles frequently yields much greater insight than could be obtained from either of the conventional profiles. Hence, differential profiling helps to locate difficult-to-find performance bottlenecks, such as those that are distributed widely throughout a large program or system, perhaps by being concealed within macros or inlined functions. This paper also describes how this technique may be used to pinpoint certain types of performance bottlenecks in large programs running on large-scale shared-memory multiprocessors. In this environment, the critical bottleneck might consume only a small fraction of the total CPU time, since typical critical sections can consume at most one CPU's worth of computation. This sort of bottleneck, particularly when widely distributed throughout the program under consideration, is often invisible to traditional profiling techniques. Copyright © 1999 John Wiley & Sons, Ltd.
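The combine-then-sort step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the default combining function (a simple difference), and the bucket names and timings are all assumptions.

```python
def differential_profile(profile_a, profile_b, combine=lambda a, b: b - a):
    """Combine corresponding buckets of two profiles and sort by the combined value.

    profile_a, profile_b: dicts mapping a bucket (e.g. function name) to time.
    combine: the combining function; a difference is assumed here, but a
             ratio or another function may suit other situations.
    """
    buckets = set(profile_a) | set(profile_b)
    combined = {
        name: combine(profile_a.get(name, 0.0), profile_b.get(name, 0.0))
        for name in buckets
    }
    # Largest combined value first, so the most load-sensitive bucket leads.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical profiles of the same program under light and heavy load (seconds).
light_load = {"parse": 2.0, "lock_acquire": 1.0, "render": 5.0}
heavy_load = {"parse": 2.5, "lock_acquire": 6.0, "render": 6.0}

ranked = differential_profile(light_load, heavy_load)
print(ranked)
```

In either profile taken alone, `render` is at least as large as `lock_acquire`; only the combined ranking shows that `lock_acquire` is the bucket that grows with load.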

3.
Accurate, continuous resource monitoring and profiling are critical for enabling performance tuning and scheduling optimization. In desktop grid systems that employ sandboxing, these issues are challenging because (1) subjobs inside sandboxes are executed in a virtual computing environment and (2) the state of this virtual environment within the sandboxes is reset to an initial empty state after a subjob completes. DGMonitor is a monitoring tool which builds a global, accurate, and continuous view of real resource utilization for desktop grids with sandboxing. Our monitoring tool measures performance unobtrusively and reliably, uses a simple performance data model, and is easy to use. Our measurements demonstrate that DGMonitor can scale to large desktop grids (up to 12000 PCs) with low monitoring overhead in terms of resource consumption (less than 0.1% per machine). Though we originally developed DGMonitor with the Entropia DCGrid platform, our tool is easily portable and integrated into other desktop grid systems. In all of these systems, DGMonitor data can support existing and novel information services, particularly for performance tuning and scheduling. In this paper, the high scalability and monitoring power of DGMonitor are demonstrated with the Entropia DCGrid platform and the BOINC platform respectively.

4.
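Program optimization is an important step in improving runtime efficiency, and profiling is the first step of optimization. For sequential languages, the compiler inserts profiling code automatically when a command-line switch is given. Most parallel-language compilers, however, lack this capability. Taking a portable dynamic profiler for the parallel C++ language as an example, this paper addresses the problem from two angles: it first presents a general method for implementing a portable dynamic profiler, and then analyzes an instrumentation tool for pC++.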

5.
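Research on clock synchronization in distributed network monitoring   Total citations: 2, self-citations: 0, citations by others: 2
Fan Xun, Song Cheng 《Computer Applications and Software》2007,24(5):131-132,150
Introduces the CNIC distributed network monitor and its clock synchronization problem, designs a deployment scheme based on the NTP requirements of distributed monitoring, and analyzes the operation of the NTP service.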
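As background for why distributed monitors rely on NTP, the standard NTP clock-offset and round-trip-delay calculation from four timestamps can be sketched as below. The function name and the sample timestamps are hypothetical, and nothing here is taken from the CNIC deployment itself.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Estimate clock offset and round-trip delay from one NTP exchange.

    t1: client send time (client clock)
    t2: server receive time (server clock)
    t3: server send time (server clock)
    t4: client receive time (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # how far the server clock is ahead
    delay = (t4 - t1) - (t3 - t2)            # network round trip, excluding server processing
    return offset, delay


# Hypothetical exchange: the server clock runs about 5 s ahead of the client.
offset, delay = ntp_offset_and_delay(t1=100.0, t2=105.1, t3=105.2, t4=100.3)
print(offset, delay)
```

The offset estimate is exact only when the outbound and return network paths take equal time, which is one reason the placement of NTP servers relative to the monitors matters.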

6.
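The multi-line telephone monitoring system that was designed and built consists of one PC and multiple 8031 microcontrollers, and it successfully achieves real-time acquisition and processing of telephone signals. The system is highly modular and easy to extend, and it can be conveniently configured into different systems according to the number of telephone lines to be managed.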

7.
    
The roofline analysis model is a visually intuitive performance model used to understand hardware performance limitations as well as potential benefits of optimizations for science and engineering applications. Intel Advisor has provided a useful roofline analysis feature since its version 2017 update 2, but it is not widely compatible with other compilers and chip architectures. As an alternative, we have employed Cray Performance Analysis Tools (CrayPat) that are more flexible for multiple compilers and architectures. First, we present our procedure for measuring a reliable computational intensity for roofline analysis. We performed several numerical studies for validation via manually derived reference data as well as data from Intel Advisor. Second, we provide roofline analysis results on Blue Waters for several HPC benchmarks and sparse linear algebra libraries. In addition, we present an example of roofline-based performance projection for a future system.
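The basic roofline bound can be sketched in a few lines: attainable performance is the lesser of the peak compute rate and memory bandwidth multiplied by computational intensity. The machine figures below are hypothetical and are not Blue Waters measurements.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Roofline bound: min(peak compute, memory bandwidth * computational intensity).

    peak_gflops:   peak floating-point rate in GFLOP/s
    bandwidth_gbs: peak memory bandwidth in GB/s
    intensity:     computational intensity in FLOP per byte moved
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Intensity at which a kernel stops being memory-bound and becomes compute-bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical machine: 1000 GFLOP/s peak compute, 100 GB/s memory bandwidth.
PEAK, BANDWIDTH = 1000.0, 100.0

memory_bound = attainable_gflops(PEAK, BANDWIDTH, intensity=2.0)
compute_bound = attainable_gflops(PEAK, BANDWIDTH, intensity=20.0)
print(memory_bound, compute_bound, ridge_point(PEAK, BANDWIDTH))
```

Because the bound depends directly on the intensity value, a measurement error in intensity moves a kernel along the roofline, which is why a reliable intensity measurement is the first step the abstract describes.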

8.
A wireless sensor network (WSN) can be construed as an intelligent, largely autonomous, instrument for scientific observation at fine temporal and spatial granularities and over large areas. The ability to perform spatial analyses over sensor data has often been highlighted as desirable in areas such as environmental monitoring. Whilst there exists research on computing topological changes of dynamic phenomena, existing proposals do not allow for more expressive in-network spatial analysis. This paper addresses the challenges involved in using WSNs to identify, track and report topological relationships between dynamic, transient spatial phenomena and permanent application-specific geometries, focusing on cases where the geometries involved can be characterized by sets of nodes embedded in a finite 2-dimensional space. The approach taken is algebraic, i.e., analyses are expressed as algebraic expressions that compose primitive operations (such as Adjacent, or AreaInside). The main contributions are distributed algorithms for the operations in the proposed algebra and an empirical evaluation of their performance in terms of bit complexity, response time, and energy consumption.

9.
    
Large-scale plant-wide processes have become more common and monitoring of such processes is imperative. This work focuses on establishing a distributed monitoring scheme incorporating multivariate statistical analysis and a Bayesian method for large-scale plant-wide processes. First, the necessity of distributed monitoring is demonstrated by theoretical analysis of the impact of process decomposition on multivariate statistical process monitoring performance. Second, a performance-driven process decomposition method based on a stochastic optimization algorithm is proposed, which aims to achieve the best possible monitoring performance from the process decomposition. Based on the obtained sub-blocks, local monitors are established to characterize local process behaviors, and then a Bayesian fault diagnosis system is established to identify the underlying status of the entire process. The proposed distributed monitoring scheme is applied to a numerical example and the Tennessee Eastman benchmark process. Comparisons with some state-of-the-art methods indicate its efficiency and feasibility.

10.
In this paper we propose a new simulation platform called SIMCAN for analyzing parallel and distributed systems. This platform is intended for testing parallel and distributed architectures and applications. The main characteristics of SIMCAN are flexibility, accuracy, performance, and scalability. Hence, the proposed platform has a modular design that eases the integration of different basic systems on a single architecture. Its design follows a hierarchical schema that includes simple modules, basic systems (computing, memory managing, I/O, and networking), physical components (nodes, switches, …), and aggregations of components. New modules may also be incorporated to include new strategies and components. Also, a graphical configuration tool has been developed to help untrained users with the task of modelling new architectures. Finally, a validation process and some evaluation tests have been performed to evaluate the SIMCAN platform.

11.
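Distributed object technology is an important technique in modern software development; compared with traditional development techniques, it offers better openness and extensibility. Building on an analysis of distributed monitoring system models, this paper uses distributed technology to propose a CORBA-based framework for a distributed monitoring system and, with reference to a practical application, discusses the definition, implementation, and deployment of each system member.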

12.
Execution profiles are important in analyzing the performance of computer programs on a given computer system. However, accurate and complete profiles are difficult to arrive at for programs that follow the client-server model of computing, as in the popular X Window System. In X Window applications, considerable computation is invoked at the display server and this computation is an important part of the overall execution profile. The profiler presented in this paper generates meaningful profiles for X Window applications by estimating the time spent in servicing the messages in the display server. The central idea is to analyze a protocol-level trace of the interaction between the application and the display server and thereby construct an execution profile from the trace and a set of metrics about the target display server. Experience using the profiler for examining bottlenecks is presented.

13.
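Zhang Yufeng 《Microcomputer Development》2006,16(8):69-71
The Itanium2 processor provides a performance monitoring unit, exposed as a set of registers, that captures microarchitectural events while a program runs. This paper describes how to develop relatively high-level performance analysis tools on top of perfmon, the interface Linux provides to the Itanium2 performance monitoring unit, so that the data supplied by the performance monitoring hardware can be processed and used comprehensively.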

14.
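For multicore processors with a shared cache, managing how each core uses the cache is key to fully exploiting multicore performance. Under current cache replacement policies, co-running programs interfere with one another's performance; static cache partitioning addresses this interference by allocating separate cache space to each concurrently running program. To allocate an appropriately sized cache space to a program, the program must be profiled, that is, run multiple times in advance to collect performance data under various cache capacities. This profiling approach is very costly, which limits its practicality. To avoid the need for multiple runs, this paper proposes an optimized profiling technique that requires only a single run. The technique uses online phase analysis to identify a program's execution phases and avoids re-profiling identical phases. It also analyzes the trend relating each phase's performance to changes in cache capacity and skips profiling for capacity changes to which performance is insensitive, reducing overhead. After the run finishes, the performance of each phase under each cache capacity is used to estimate the program's overall performance at each capacity, which then guides static cache partitioning. Experiments show that the technique's overhead is only 7%, that the cache partitioning it guides improves performance by 8% over the unpartitioned case, and that performance drops by only 1% compared with partitioning guided by multi-run profiling.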

15.
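With static analysis alone, a compiler has difficulty prefetching data correctly for memory accesses that follow nonlinear patterns. Profiling, however, can capture a program's memory access patterns at run time, and this information allows data prefetch instructions to be inserted precisely. Building on stride profiling, this paper proposes a new information-collection type, stride iterative, which reflects the actual behavior of memory access instructions during execution more accurately. Combined with alias analysis results, it adjusts prefetches that target the same cache line, yielding better prefetch performance than ordinary data prefetching. On Itanium 2, the 12 integer benchmarks of CPU2000 improve by 8.54% on average, and mcf improves by 77.87%.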
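Plain stride profiling, the starting point the abstract builds on, can be sketched as follows: record the addresses touched by one load instruction, compute the differences between consecutive addresses, and report the most frequent stride. This is a generic illustration with a hypothetical address trace; it does not reproduce the paper's stride iterative collection type or its alias-analysis adjustment.

```python
from collections import Counter


def dominant_stride(addresses):
    """Return (stride, share) for the most frequent address delta of one load.

    addresses: byte addresses accessed by a single load instruction, in order.
    share:     fraction of consecutive accesses that used that stride.
    """
    strides = [later - earlier for earlier, later in zip(addresses, addresses[1:])]
    if not strides:
        return None, 0.0
    stride, hits = Counter(strides).most_common(1)[0]
    return stride, hits / len(strides)


# Hypothetical trace: mostly 64-byte steps with one irregular access.
trace = [0, 64, 128, 192, 200, 264]
stride, share = dominant_stride(trace)
print(stride, share)
```

A compiler could use such a result to decide whether a prefetch for `address + stride` is worthwhile, for instance only when the share exceeds some threshold; that threshold would be a tuning choice and is not taken from the paper.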

16.
Monitoring the changes in data values obtained from the environment (e.g., locations of moving objects) is a primary concern in many fields, for example in pervasive computing environments. The monitoring task is challenging from a double perspective. First and foremost, the environment can be highly dynamic in terms of the rate of data changes. Second, the monitored data are often not available from a single computer/device but are distributed; moreover, the set of data providers can change over time. Therefore, obtaining a global snapshot of the environment and keeping it up-to-date is not easy, especially if the conditions (e.g., network delays) change. In this article, a decentralized, loose, and fault-tolerant monitoring approach based on the use of mobile agents is described. Mobile agents allow easy tracking of the involved computers, carrying the monitoring tasks to wherever they are needed. A deadline-based mechanism is used to coordinate the cooperative agents, which strive to perform their continuous tasks in time while considering data as recent as possible, constantly adapting themselves to new environmental conditions (changing communication and processing delays). This approach has been successfully used in a real environment and experiments were carried out to prove its feasibility and benefits.

17.
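Research on key technologies for grid resource monitoring   Total citations: 2, self-citations: 0, citations by others: 2
Grid resource monitoring obtains the state of the dynamic resources and individual nodes in a grid environment, providing an accurate, real-time, dynamic view of the grid computing environment. GridFerret is a grid monitoring system based on mobile agent technology. Applying mobile agents in a grid environment reduces the connection and communication costs of remote computer networks, takes full advantage of mobile agents, and offers a new direction for the development of grid technology.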

18.
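Monitoring changes in science and technology fields and understanding their development trends is an important task for library and information institutions. With support from the National Key Technology R&D Program and a Chinese Academy of Sciences project, the National Science Library developed an "automatic web-based science and technology monitoring system" suited to domain monitoring. The system aims to help strategic intelligence research teams track, comprehensively and promptly, the online information resources published by important research institutions in a given field. Using information collection, knowledge extraction, and information analysis techniques, it reveals significant developments at the target institutions in strategic planning, research layout, and major research progress, and reflects in depth the state of scientific and technological innovation in the field. The system is already in use by the relevant strategic intelligence research teams, with good service results. The paper examines the system's design approach, technical framework, and implementation, and summarizes and analyzes its application results.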

19.
Chattering alarms (alarms that repeat excessively in a short time interval) create a level of nuisance to the operator. According to industrial alarm standards, no chattering is acceptable. Therefore, reducing the amount of alarm chatter is a primary goal in redesigning alarm parameters. A quantitative measure to assess the degree of chattering for an alarm has recently been proposed. This measure is based on run length distribution, which is the distribution of time differences between consecutive alarms. This chattering index is currently calculated empirically based on alarm data. In this paper, we develop a method to estimate the chattering index based on statistical properties of the process variable as well as alarm parameters. The estimation can be used for developing analytical methods to optimally design alarm parameters for minimal chattering.
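The empirical calculation mentioned above can be sketched from alarm timestamps. The abstract does not state the formula, so this sketch assumes one form of run-length-based index, the mean of 1/r over the observed run-length distribution, where r is the time between consecutive alarms. The function name and the sample timestamps are hypothetical.

```python
from collections import Counter


def chattering_index(alarm_times):
    """Empirical chattering index from alarm timestamps given in seconds.

    Assumed form: sum over run lengths r of P(r) / r, where P(r) is the
    fraction of consecutive-alarm gaps whose length equals r.
    """
    run_lengths = [later - earlier for earlier, later in zip(alarm_times, alarm_times[1:])]
    if not run_lengths:
        return 0.0
    if any(r <= 0 for r in run_lengths):
        raise ValueError("alarm_times must be strictly increasing")
    counts = Counter(run_lengths)
    total = sum(counts.values())
    return sum((count / total) / r for r, count in counts.items())


# Hypothetical data: a one-second burst of alarms versus one alarm every 100 s.
bursty = chattering_index([0, 1, 2, 3, 103])
steady = chattering_index([0, 100, 200, 300])
print(bursty, steady)
```

Under this assumed form, short gaps dominate the index, so the bursty series scores far higher than the steady one.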

20.
    
This paper documents the design and implementation of the IN-Tune software tool suite, which enables a user to collect real-time code and hardware profiling information on Intel-based symmetric multiprocessors running the Linux operating system. IN-Tune provides a virtually non-invasive tool for performance analysis and tuning of programs. Unlike other analysis tools, IN-Tune isolates data with respect to individual threads. It also utilizes performance monitoring hardware registers to permit instrumentation of individual threads as they run in situ, thus collecting data with appropriate considerations for a multiprocessor environment. Data can be sampled using two different mechanisms. First, the user can collect data by making calls to the system upon the occurrence of specific software events. Second, data can be collected at a fixed, fine-grained (e.g. 1–10 microseconds) interval using either software or hardware interrupts. To allow observation of codes for which source code modification is impractical or impossible, a 'shell' task is created which permits monitoring without code modification. Although this work deals with Intel processors and Linux, the widespread availability of performance monitoring registers in modern processors makes this work widely applicable. Copyright © 1999 John Wiley & Sons, Ltd.
