Similar Documents
1.
Predictions based on analytical performance models can be used by efficient scheduling policies to select adequate resources for an optimal execution in terms of throughput and response time. However, developing accurate analytical models of parallel applications is a hard task. The TIA (Tools for Instrumenting and Analysis) modeling framework provides an easy-to-use method for obtaining analytical models of MPI applications. This method is based on model selection techniques and, in particular, on Akaike's information criterion (AIC). In this paper, the AIC-based performance model of the HPL benchmark is first obtained using the TIA modeling framework. The use of this model for runtime estimation under different backfilling policies is then analyzed in the GridSim simulator, and the behavior of these simulations is compared with equivalent simulations based on the theoretical model of HPL provided by its developers.
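The sketch below illustrates the kind of AIC-based model selection the TIA method builds on. Assuming SciPy is available, it fits two hypothetical candidate runtime models T(n, p) to measured runtimes and selects the one with the lower AIC; the candidate model forms, data values, and names (model_a, model_b, aic_score) are illustrative, not TIA's actual candidate set.

```python
import numpy as np
from scipy.optimize import least_squares

# Two illustrative candidate runtime models T(n, p) for an HPL-like code;
# terms are scaled (/1e9, /1e6) to keep coefficients well conditioned.
def model_a(x, c):
    n, p = x
    return c[0] + c[1] * (n**3 / p) / 1e9                      # compute term only

def model_b(x, c):
    n, p = x
    return (c[0] + c[1] * (n**3 / p) / 1e9
            + c[2] * (n**2 / np.sqrt(p)) / 1e6)                # + communication term

def aic_score(model, n_params, X, t):
    """AIC for a least-squares fit: m * ln(RSS / m) + 2k."""
    fit = least_squares(lambda c: model(X, c) - t, x0=np.ones(n_params))
    rss = float(np.sum(fit.fun ** 2))
    return len(t) * np.log(rss / len(t)) + 2 * n_params

# Runtimes from small instrumented executions (illustrative numbers).
X = (np.array([2000.0, 4000.0, 8000.0, 16000.0]),   # problem sizes n
     np.array([4.0, 8.0, 16.0, 32.0]))              # processor counts p
t = np.array([12.0, 45.0, 180.0, 700.0])            # measured times (s)

scores = {name: aic_score(m, k, X, t)
          for name, m, k in [("model_a", model_a, 2), ("model_b", model_b, 3)]}
print("selected by AIC:", min(scores, key=scores.get))
```

For least-squares fits, AIC reduces to m·ln(RSS/m) + 2k, so the extra communication term in model_b is accepted only if it lowers the residual enough to offset the penalty for the additional parameter.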

2.
Design and development costs for extremely large systems could be significantly reduced if there were efficient techniques for evaluating design alternatives and predicting their impact on overall system performance metrics. Because such systems are analytically intractable, simulation is the most common performance evaluation technique for them. However, the long execution times of sequential simulation models often hamper evaluation. The slow speed of sequential model execution has led to growing interest in the use of parallel execution for simulating large-scale systems. Widespread use of parallel simulation, however, has been significantly hindered by a lack of tools for integrating parallel model execution into the overall framework of system simulation. Another obstacle is the cost of model design and maintenance. The simulation environment the authors developed at UCLA attempts to address some of these issues. It consists of three primary components: a parallel simulation language called Parsec (parallel simulation environment for complex systems), its GUI, called Pave, and a portable runtime system that implements the simulation algorithms.

3.
Heterogeneous performance prediction models are valuable tools for accurately predicting application runtime, allowing efficient design space exploration and application mapping. Existing performance models require intricate system architecture knowledge, making the modeling task difficult. In this research, we propose a regression-based performance prediction framework for general purpose graphical processing unit (GPGPU) clusters that statistically abstracts the system architecture characteristics, enabling performance prediction without detailed system architecture knowledge. The regression-based framework targets deterministic synchronous iterative algorithms using our synchronous iterative GPGPU execution model and is broken into two components: the computation component, which models the GPGPU device and host computations, and the communication component, which models the network-level communications. The computation-component regression models use algorithm characteristics such as the number of floating-point operations and total bytes as predictor variables and are trained using several small, instrumented executions of synchronous iterative algorithms that cover a range of floating-point-operations-to-byte requirements. The regression models for network-level communications are developed using micro-benchmarks and employ data transfer size and processor count as predictor variables. Our performance prediction framework achieves prediction accuracy over 90% compared with the actual implementations for several tested GPGPU cluster configurations. The end goal of this research is to offer the scientific computing community an accurate and easy-to-use performance prediction framework that empowers users to optimally utilize heterogeneous resources. Copyright © 2013 John Wiley & Sons, Ltd.
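As a rough illustration of the computation-component idea, the sketch below fits an ordinary least-squares model with floating-point operations and total bytes as predictor variables, then extrapolates to a larger problem. All numbers and the linear model form are invented for illustration and are not taken from the paper.

```python
import numpy as np

# Training data from small instrumented runs: each row holds the
# algorithm characteristics used as predictors (FLOPs, bytes moved).
X = np.array([[1.2e9, 4.0e8],
              [2.4e9, 8.0e8],
              [4.8e9, 1.6e9],
              [9.6e9, 3.2e9]])
t = np.array([0.031, 0.058, 0.112, 0.220])          # measured times (s)

# Fit t ~ b0 + b1*flops + b2*bytes by ordinary least squares.
A = np.hstack([np.ones((len(X), 1)), X])
coef, *_ = np.linalg.lstsq(A, t, rcond=None)

# Predict the computation time for a larger, unseen configuration.
flops, nbytes = 1.9e10, 6.4e9
pred = coef @ np.array([1.0, flops, nbytes])
print(f"predicted GPGPU computation time: {pred:.3f} s")
```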

4.
Today's massively parallel machines are typically message-passing systems consisting of hundreds or thousands of processors. Implementing parallel applications efficiently in this environment is a challenging task, and poor parallel design decisions can be expensive to correct. Tools and techniques that allow fast and accurate evaluation of different parallelization strategies would significantly improve the productivity of application developers and increase throughput on parallel architectures. This paper investigates one of the major issues in building tools to compare parallelization strategies: determining what types of performance models of the application code and of the computer system are sufficient for a fast and accurate comparison of different strategies. The paper is built around a case study employing the performance prediction tool PerPreT to predict the performance of the parallel spectral transform shallow water model code (PSTSWM) on the Intel Paragon. PSTSWM is a parallel application code designed to evaluate different parallel strategies for the spectral transform method as used in climate modeling and weather forecasting. Multiple parallel algorithms and algorithm variants are embedded in the code. PerPreT uses a relatively simple algebraic model to predict execution time for SPMD (single program multiple data) parallel applications. Applications are modeled through parameterized formulae for communication and computation, where the parameters include the problem size, the number of processors used to execute the program, and system characteristics (e.g. setup times for communication, link bandwidth, and sustained computing performance per processor). We describe performance models that predict the performance of the different algorithms in PSTSWM accurately enough for them to be compared, establishing the feasibility of such a demanding application of performance modeling. We also discuss issues in generating and validating the performance models, emphasizing the practical importance of tools such as PerPreT in such studies. © 1998 John Wiley & Sons, Ltd.
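A minimal sketch of a PerPreT-style algebraic SPMD model, with parameterized computation and communication terms. The concrete formulae for PSTSWM are not given in the abstract, so the expressions, parameter values, and the two compared strategies below are hypothetical.

```python
def t_comp(n, p, flops_per_point, rate):
    """Computation: n grid points split over p processors, rate in flop/s."""
    return (n * flops_per_point / p) / rate

def t_comm(n_msgs, msg_bytes, t_setup, bandwidth):
    """Communication: n_msgs messages, each costing setup + bytes/bandwidth."""
    return n_msgs * (t_setup + msg_bytes / bandwidth)

def t_total(n, p, flops_per_point, rate, n_msgs, msg_bytes, t_setup, bandwidth):
    return (t_comp(n, p, flops_per_point, rate)
            + t_comm(n_msgs, msg_bytes, t_setup, bandwidth))

# Compare two hypothetical parallelization strategies at p = 64:
# strategy B halves the message count but doubles the message size.
common = dict(n=1e7, p=64, flops_per_point=100, rate=5e8,
              t_setup=1e-4, bandwidth=3e8)
a = t_total(n_msgs=128, msg_bytes=8.0e4, **common)
b = t_total(n_msgs=64, msg_bytes=1.6e5, **common)
print(f"strategy A: {a:.4f} s   strategy B: {b:.4f} s")
```

Because the model is a closed-form expression in the problem size and processor count, sweeping alternative strategies across machine sizes costs essentially nothing compared with running or simulating them.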

5.
Digital twin, as an effective means of fusing physical and virtual spaces, has attracted increasing attention in the past few years. Based on ultra-fidelity models, more accurate services, e.g. real-time monitoring and failure prediction, can be achieved. Against this background, some scholars have studied theories and methods for modeling the various features of physical objects, and others have studied how to use the Internet of Things to realize the connections and interactions that keep the virtual and physical spaces consistent. In this process a new question arises: how to update digital twin models once they become inconsistent with the actual situation. To solve this problem, this paper first proposes a general digital twin model update framework and then explores update methods for multi-dimension models. The cutting tool is the core component of machine tools, which are key equipment in industry, and precise cutting tool models are essential for realizing the digitalization and servitization of machine tools. This paper therefore takes a cutting tool as the application object to discuss how to conduct physics-model updates based on the proposed framework and methods. Through model update, a more accurate and up-to-date tool wear model can be obtained, which contributes to prognostics and health management for machine tools.

6.
A Functional Abstract Notation (FAN) is proposed for the specification and design of parallel algorithms by means of skeletons - high-level patterns with parallel semantics. The main weakness of current programming systems based on skeletons is that the user is still responsible for finding the most appropriate skeleton composition for a given application and a given parallel architecture.

We describe a transformational framework for the development of skeletal programs which aims to fill this gap. The framework makes use of transformation rules that are semantic equivalences among skeleton compositions. For a given problem, an initial, possibly inefficient skeleton specification is refined by applying a sequence of transformations. Transformations are guided by a set of performance prediction models which forecast the behavior of each skeleton and the performance benefits of different rules. The design process is supported by a graphical tool which locates applicable transformations and provides performance estimates, thereby helping the programmer navigate the program refinement space. We give an overview of the FAN framework and exemplify its use with performance-directed program derivations for simple case studies. Our experience can be viewed as a first feasibility study of methods and tools for transformational, performance-directed parallel programming using skeletons.
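The sketch below shows the shape of a performance-guided transformation decision: two hypothetical skeleton cost models (not FAN's actual ones) predict the runtime of a pipeline and of an equivalent worker farm, and the rewrite rule is applied only when the prediction improves.

```python
import math

# Hypothetical cost models for two skeleton compositions over m items:
# a two-stage pipeline is throughput-bound by its slowest stage, while a
# farm of w workers runs g(f(x)) sequentially per item, plus overhead.
def pipe_cost(t_f, t_g, m):
    return (t_f + t_g) + (m - 1) * max(t_f, t_g)

def farm_cost(t_f, t_g, m, workers, overhead_per_item):
    return math.ceil(m / workers) * (t_f + t_g) + overhead_per_item * m

# The rule pipe(f, g) == farm(seq(f; g)) is a semantic equivalence;
# apply it only when the predicted runtime improves.
m, t_f, t_g = 10_000, 2e-3, 5e-4
before = pipe_cost(t_f, t_g, m)
after = farm_cost(t_f, t_g, m, workers=8, overhead_per_item=1e-5)
print("apply transformation" if after < before else "keep pipeline")
```

With these invented stage times the pipeline is badly unbalanced (the 2 ms stage dominates), so the cost models favor rewriting to the farm; with balanced stages the decision would flip.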

7.
The POEMS project is creating an environment for end-to-end performance modeling of complex parallel and distributed systems, spanning the domains of application software, runtime and operating system software, and hardware architecture. Toward this end, the POEMS framework supports composition of component models from these different domains into an end-to-end system model. This composition can be specified using a generalized graph model of a parallel system, together with interface specifications that carry information about component behaviors and evaluation methods. The POEMS Specification Language compiler will generate an end-to-end system model automatically from such a specification. The components of the target system may be modeled using different modeling paradigms and at various levels of detail. Therefore, evaluation of a POEMS end-to-end system model may require a variety of evaluation tools including specialized equation solvers, queuing network solvers, and discrete event simulators. A single application representation based on static and dynamic task graphs serves as a common workload representation for all these modeling approaches. Sophisticated parallelizing compiler techniques allow this representation to be generated automatically for a given parallel program. POEMS includes a library of predefined analytical and simulation component models of the different domains and a knowledge base that describes performance properties of widely used algorithms. The paper provides an overview of the POEMS methodology and illustrates several of its key components. The modeling capabilities are demonstrated by predicting the performance of alternative configurations of Sweep3D, a benchmark for evaluating wavefront application technologies and high-performance parallel architectures.

8.
An accurate and efficient model of a commercial multiprocessor bus is developed. Four important characteristics of the bus design are modeled: asynchronous memory write operations; in-order delivery of responses to processor read requests; priority scheduling of memory responses; and upper bounds on the number of outstanding processor requests. A two-level hierarchical model employing both Markov chain and mean value analysis techniques for analyzing queueing networks is used. The model is shown to accurately predict measured system performance for two parallel program workloads that have different memory access characteristics. The results provide evidence that analytic queueing models can be extremely accurate in spite of the simplifying assumptions required for model tractability. Model estimates are compared against detailed simulation of the bus to investigate in more detail the likely source of small model inaccuracies. The use of the analytical model for assessing system design tradeoffs is also illustrated.
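As a flavor of the mean value analysis component, here is textbook exact MVA for a single-class closed queueing network. The paper's two-level Markov/MVA bus model is considerably more elaborate; asynchronous writes, in-order delivery, priorities, and request bounds are not captured in this sketch, and the demands below are invented.

```python
# Textbook exact MVA for a single-class closed network of queueing
# centers (no think time); service_demands[k] is the mean total demand
# a request places on center k, in seconds.
def mva(service_demands, n_customers):
    K = len(service_demands)
    q = [0.0] * K                       # queue lengths with n-1 customers
    x = 0.0
    for n in range(1, n_customers + 1):
        # Residence time: own demand plus the queue found on arrival.
        r = [service_demands[k] * (1.0 + q[k]) for k in range(K)]
        x = n / sum(r)                  # throughput, by Little's law
        q = [x * r[k] for k in range(K)]
    return x, q

# Example: 8 processors contending for a memory controller and a bus.
throughput, queues = mva([0.005, 0.002], n_customers=8)
print(f"throughput = {throughput:.1f} requests/s, queues = {queues}")
```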

9.
Application software development for high-performance parallel computing (HPC) is a nontrivial process; its complexity can be attributed primarily to the increased degrees of freedom that must be resolved and tuned in such an environment. Performance prediction tools enable a developer to evaluate available design alternatives and can assist in HPC application software development. In this paper we first present a novel "interpretive" approach for accurate and cost-effective performance prediction. The approach has been used to develop an interpretive HPF/Fortran 90D application performance prediction framework, whose accuracy and usability are experimentally validated. We then outline the stages typically encountered during application software development for HPC and highlight the significance and requirements of a performance prediction tool at the relevant stages. Numerical results using benchmarking kernels and application codes demonstrate the application of the interpretive performance prediction framework at different stages of the HPC application software development process.

10.
A Parallel Computation Time Model and the Performance of Parallel Computer Systems
乔香珍 (Qiao Xiangzhen), 《计算机学报》 (Chinese Journal of Computers), 1998, 21(5): 413-418
This paper analyzes the factors that affect the performance of parallel algorithms and application programs, focusing on new techniques in shared-memory parallel system architectures and new features of parallel software systems. An improved model of parallel computation time is proposed, and principles and examples for improving the performance of parallel algorithms and software are given.
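The abstract does not give the improved time model's form; as a hedged illustration of the kind of refinement it describes, the sketch below extends an Amdahl-style model with explicit synchronization and memory-contention overhead terms. Both the functional form and the constants are invented.

```python
# Invented illustrative model: Amdahl-style serial/parallel split plus
# explicit synchronization and memory-contention overhead terms.
def parallel_time(t_serial, frac_parallel, p, t_sync, t_contention):
    t_seq = t_serial * (1.0 - frac_parallel)    # inherently serial part
    t_par = t_serial * frac_parallel / p        # perfectly divisible part
    return t_seq + t_par + t_sync * p**0.5 + t_contention * (p - 1)

for p in (1, 4, 16, 64):
    t = parallel_time(t_serial=100.0, frac_parallel=0.95, p=p,
                      t_sync=0.05, t_contention=0.02)
    print(f"p={p:3d}: time={t:7.2f} s  speedup={100.0 / t:5.2f}x")
```

The overhead terms make the predicted speedup roll off well below the Amdahl limit as p grows, which is the qualitative effect of the architectural factors the paper analyzes.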

11.
Using Geographic Information Systems (GIS) as an environmental modeling framework allows modelers to use database, data visualization, and analytical tools in a single integrated environment. Environmental modelers can take advantage of GIS through one of two general approaches: loose coupling or tight coupling. Loosely coupled modeling primarily exploits the database and visualization tools in GIS and can be improved by capitalizing on GIS analytical tools and techniques. Conversely, tightly coupled models, which are completely encapsulated within a GIS environment, take full advantage of the database, visualization, and analysis capabilities of a GIS. This paper discusses these two general strategies for integrating environmental models with GIS, a case study integrating a groundwater flow model with GIS, and the improvements needed for GIS integration in environmental modeling.

12.
This paper is concerned with the analytical modeling of computer architectures to aid in the design of high-level language-directed computer architectures - computers that execute programs in a high-level language directly. The design procedure for these computers is at best described as ad hoc. To systematize the design procedure, we introduce analytical models that predict the performance of parallel computations on concurrent computers. We model computers as queueing networks and parallel computations as precedence graphs. The proposed models are simple and lead to computationally efficient procedures for predicting the performance of parallel computations on concurrent computers. We demonstrate the use of these models in the design of high-level language-directed computer architectures.
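A minimal sketch of the precedence-graph half of such a model: the makespan lower bound of a parallel computation is the critical path of its task graph. The queueing-network half (contention at processors and memories) is omitted here, and the task durations below are made up.

```python
# tasks: {name: duration}; deps: {name: [predecessor names]}.
def critical_path(tasks, deps):
    finish = {}                      # memoized earliest finish times
    def f(t):
        if t not in finish:
            finish[t] = tasks[t] + max((f(u) for u in deps.get(t, [])),
                                       default=0.0)
        return finish[t]
    return max(f(t) for t in tasks)

# A small fork-join task graph: load, two parallel FFTs, then combine.
tasks = {"load": 2.0, "fft_a": 5.0, "fft_b": 5.0, "combine": 1.0}
deps = {"fft_a": ["load"], "fft_b": ["load"], "combine": ["fft_a", "fft_b"]}
print(critical_path(tasks, deps))    # 2 + 5 + 1 = 8.0
```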

13.
This paper gives an overview of two related tools that we have developed to provide more accurate measurement and modelling of the performance of message-passing communication and application programs on distributed memory parallel computers. MPIBench uses a very precise, globally synchronised clock to measure the performance of MPI communication routines. It can generate probability distributions of communication times, not just the average values produced by other MPI benchmarks, giving useful insights into the MPI communication performance of parallel computers, in particular how performance is affected by network contention. The Performance Evaluating Virtual Parallel Machine (PEVPM) provides a simple, fast and accurate technique for modelling and predicting the performance of message-passing parallel programs. It uses a virtual parallel machine to simulate the execution of the parallel program, and the effects of network contention can be accurately modelled by sampling from the probability distributions generated by MPIBench. These tools are particularly useful on clusters with commodity Ethernet networks, where relatively high latencies, network congestion and TCP problems can significantly affect communication performance, which is difficult to model accurately using other tools. Experiments with example parallel programs demonstrate that PEVPM gives accurate performance predictions on commodity clusters. We also show that modelling communication performance using average times, rather than sampling from probability distributions, can give misleading results, particularly for programs running on large numbers of processors.
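The toy model below captures why PEVPM samples from distributions instead of using averages: when p processes synchronize, a step takes as long as the slowest of p messages, so rare slow messages dominate. The two-point distribution stands in for a measured MPIBench histogram and is invented for illustration.

```python
import random

random.seed(1)

def one_comm():
    # Two-point stand-in for a measured communication-time histogram:
    # usually ~100 us, but ~1 ms in 5% of cases (network contention).
    return 1e-3 if random.random() < 0.05 else 1e-4

def step_time_sampled(p, trials=10_000):
    # A synchronised step finishes when the slowest of p messages does.
    return sum(max(one_comm() for _ in range(p))
               for _ in range(trials)) / trials

p = 64
mean_comm = 0.95 * 1e-4 + 0.05 * 1e-3         # average single-message time
print(f"average-based model : {mean_comm * 1e6:7.1f} us per step")
print(f"sampling-based model: {step_time_sampled(p) * 1e6:7.1f} us per step")
```

With 64 processes, almost every step contains at least one slow message, so the sampled step time is close to 1 ms while the average-based estimate stays under 150 us: exactly the kind of misleading result the paper warns about.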

14.
We present an analytical framework for the performance analysis of CSMA/CA-based wireless mesh networks. The framework provides an accurate throughput-delay evaluation for both saturated and unsaturated cases. An efficient algorithm that determines the collision domain for each node, based on both the interference range and the routing in the network, is presented. As another important application of the framework, we develop an analytic model that yields closed-form expressions for delay in terms of multipath routing variables. A flow-deviation algorithm is used to derive the optimal flow over a given set of routes for any number of classes. The model takes into account the effects of neighbor interference and hidden terminals, and tools are provided to make it feasible for the performance analysis and optimization of large-scale networks. Numerical results are presented for different network topologies and compared with simulation studies.
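A minimal instance of the flow-deviation idea on two routes: total M/M/1-style delay is minimized by repeatedly shifting a small amount of flow toward the route with the smaller marginal delay. Capacities, total flow, and step size are illustrative; the real algorithm handles arbitrary route sets and multiple traffic classes.

```python
# Split total flow F across two routes with capacities c1, c2 so as to
# minimise f1/(c1-f1) + f2/(c2-f2); the marginal delay of f/(c-f) is
# c/(c-f)^2. Assumes F < c1 + c2 and a feasible starting split.
def flow_deviate(F, c1, c2, step=1e-3, iters=20_000):
    f1 = min(F / 2, 0.99 * c1)
    def marginal(f, c):
        return c / (c - f) ** 2
    for _ in range(iters):
        f2 = F - f1
        if marginal(f1, c1) > marginal(f2, c2) and f1 > step:
            f1 -= step                      # deviate flow toward route 2
        elif marginal(f1, c1) < marginal(f2, c2) and f1 + step < min(F, c1):
            f1 += step                      # deviate flow toward route 1
    return f1, F - f1

f1, f2 = flow_deviate(F=1.5, c1=1.0, c2=1.0)
print(f"optimal split: {f1:.3f} / {f2:.3f}")  # symmetric case: 0.75 each
```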

15.
In this paper we develop and assess the accuracy of two analytical models that capture the behavior of network hosts when subjected to heavy load such as that of Gigabit Ethernet. The first analytical model is based on Markov processes and queuing theory, and the second is a pure Markov process. In order to validate the models and assess their accuracy, two different numerical examples are presented. The two numerical examples use system parameters that are realistic and appropriate for modern hardware. Both analytical models give closed-form solutions that facilitate the study of a number of important system performance metrics. These metrics include throughput, latency, stability condition, CPU utilizations of interrupt handling and protocol processing, and CPU availability for user applications. The two models give mathematically equivalent closed-form solutions for all metrics except for latency. To address latency, we compare the results of both models with the results of a discrete-event simulation. The latency accuracy of the two models is assessed relative to simulation in terms of differences and percentage errors. The paper shows that the second model is more accurate.
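For a flavor of the closed-form metrics such models yield, here is a much simplified single-queue sketch, with an M/M/1 server standing in for the host CPU. The paper's Markov and queueing models are substantially richer, and the rates below are illustrative.

```python
# M/M/1 sketch: Poisson packet arrivals at `lam` pkt/s, exponential
# service at `mu` pkt/s; the system is stable only while lam < mu.
def mm1_metrics(lam, mu):
    if lam >= mu:
        raise ValueError("unstable: arrival rate must be below service rate")
    rho = lam / mu                        # CPU utilisation by packet work
    return {"throughput_pps": lam,
            "cpu_utilisation": rho,
            "mean_latency_s": 1.0 / (mu - lam),
            "cpu_available_for_user": 1.0 - rho}

print(mm1_metrics(lam=80_000, mu=100_000))
```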

16.
Eun Jung, Ki Hwan, Chita R. Performance Evaluation, 2005, 60(1-4): 275-302
The growing use of clusters in diverse applications, many of which have real-time constraints, requires quality-of-service (QoS) support from the underlying cluster interconnect. All prior studies on QoS-aware cluster routers/networks have used simulation for performance evaluation. In this paper, we present an analytical model for a wormhole-switched router with QoS provisioning. In particular, the model captures message blocking due to wormhole switching in a pipelined router, and bandwidth sharing due to a rate-based scheduling mechanism, called VirtualClock. Then we extend the model to a hypercube-style cluster network. Average message latency for different traffic classes and deadline missing probability for real-time applications are computed using the model.

We evaluate a 16-port router and hypercubes of different dimensions with a mixed workload of real-time and best-effort (BE) traffic. Comparison with simulation results shows that the single-router and network models are quite accurate in providing performance estimates and thus can be used as efficient design tools.


17.
Recent achievements in computer and information technology provide the tools needed to extend probabilistic seismic hazard mapping from its traditional engineering use to many other applications, such as risk mitigation, disaster management, post-disaster recovery planning, and catastrophe loss estimation and risk management. Because knowledge of the factors controlling seismic hazards is incomplete, uncertainties are associated with every step involved in developing and using seismic hazard models. While some of these uncertainties can be controlled by more accurate and reliable input data, most of the data and assumptions used in seismic hazard studies carry high uncertainties that propagate to the final results. This paper describes a new methodology for the assessment of seismic hazard. The proposed approach offers a practical means of capturing spatial variations in seismological and tectonic characteristics, allowing better treatment of their uncertainties. GIS raster-based data models are used to represent geographical features in a cell-based system. The cell-based source model proposed in this paper provides a framework for incorporating many geographically referenced seismotectonic factors into seismic hazard modelling, for example seismic source boundaries, rupture geometry, seismic activity rate, focal depth, and the choice of attenuation functions. The proposed methodology improves several aspects of the standard analytical tools currently used for assessing and mapping regional seismic hazard, makes the best use of recent advances in computer hardware and software, and is well structured for implementation with conventional GIS tools.
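The sketch below illustrates the cell-based idea: every raster cell receives a hazard value computed from geographically referenced point sources through an attenuation function. The attenuation form, source list, and grid parameters are all invented placeholders, not the paper's method or any published relation.

```python
import math

def attenuation(magnitude, dist_km):
    # Placeholder ground-motion attenuation: grows with magnitude,
    # decays with distance (not a published relation).
    return math.exp(0.5 * magnitude - 1.2 * math.log(dist_km + 10.0) - 1.0)

def hazard_grid(nx, ny, cell_km, sources):
    """sources: list of (x_km, y_km, magnitude) point sources."""
    grid = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            x, y = (i + 0.5) * cell_km, (j + 0.5) * cell_km
            for sx, sy, mag in sources:
                d = math.hypot(x - sx, y - sy)
                grid[j][i] = max(grid[j][i], attenuation(mag, d))
    return grid

g = hazard_grid(nx=20, ny=20, cell_km=5.0,
                sources=[(25.0, 40.0, 6.5), (80.0, 10.0, 7.0)])
print(f"peak cell value: {max(max(row) for row in g):.3f}")
```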

18.
We present a framework for massively parallel climate impact simulations: the parallel System for Integrating Impact Models and Sectors (pSIMS). This framework comprises a) tools for ingesting and converting large amounts of data to a versatile datatype based on a common geospatial grid; b) tools for translating this datatype into custom formats for site-based models; c) a scalable parallel framework for performing large ensemble simulations, using any one of a number of different impacts models, on clusters, supercomputers, distributed grids, or clouds; d) tools and data standards for reformatting outputs to common datatypes for analysis and visualization; and e) methodologies for aggregating these datatypes to arbitrary spatial scales such as administrative and environmental demarcations. By automating many time-consuming and error-prone aspects of large-scale climate impacts studies, pSIMS accelerates computational research, encourages model intercomparison, and enhances reproducibility of simulation results. We present the pSIMS design and use example assessments to demonstrate its multi-model, multi-scale, and multi-sector versatility.

19.
Parallelism has become a way of life for many scientific programmers. A significant challenge in bringing the power of parallel machines to these programmers is providing them with a suite of software tools similar to the tools that sequential programmers currently utilize. Unfortunately, writing correct parallel programs remains a challenging task; in particular, automatic or semi-automatic testing tools for parallel programs are lacking. This paper takes a first step in developing an approach to providing all-uses coverage for parallel programs. A testing framework and theoretical foundations for structural testing are presented, including test data adequacy criteria and a hierarchy, formulation and illustration of all-uses testing problems, classification of all-uses test cases for parallel programs, and both theoretical and empirical results regarding what can be achieved with all-uses coverage for parallel programs. Copyright © 2003 John Wiley & Sons, Ltd.

20.
Artificial gas lift is widely used in the oil industry to enhance oil recovery. Active feedback control of this process would stabilize operating modes that may otherwise be unstable and would increase oil production. However, control strategies are normally constrained to using surface-measured process variables. Down-hole measurements would improve the performance of the control system but are technically hard to obtain because instruments would have to be placed in harsh conditions. State observation may be a feasible alternative to down-hole measurements. The recent development of a new, accurate model of the artificial gas lift process enables increased observation accuracy by accounting for the pressure and density distribution along the well depth. In addition, the new approach to nonlinear model treatment proposed in this paper makes the observer for the gas lift highly computationally efficient. The paper presents an approach to the design of a novel sliding mode observer for the gas lift process. The observer uses multiple linearized models representing deviations from a set of equilibrium points; these models are then combined to produce estimates of the gas lift process variables. The approach is supported by simulations.
