
1.

Global adaptive quadrature for the approximate computation of multidimensional integrals on a distributed-memory multiprocessor

Marco Lapegna 《Concurrency and Computation》1992,4(6):413-426

In this paper we discuss the problem of computing a multidimensional integral on a MIMD distributed-memory multiprocessor. Adaptive quadrature is known as a good approach to the problem of achieving accuracy and reliability while attempting to minimize the number of function evaluations. The implementation makes use of dynamic data structures that manage the partition into subintervals. On a distributed-memory multiprocessor, each processor can execute code and manipulate data structures in its own local memory only, and data are sent from one processor to another by explicit message passing. Efficient implementation of an adaptive algorithm for multidimensional quadrature on a parallel computer is quite difficult because of the need for continuous information exchange between processors. Our algorithm is based on a global adaptive strategy which dynamically balances the workload and reduces the data communication between processors in order to use the message-passing environment efficiently. The results and timings for several tests are given.
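The global adaptive strategy mentioned in this abstract can be illustrated with a minimal sequential sketch. It is not the paper's algorithm: it assumes a one-dimensional integrand, a single processor, Simpson's rule as the basic formula, and a heap as the dynamic data structure holding the subintervals. The names `simpson`, `estimate` and `global_adaptive` are hypothetical.

```python
import heapq
import math


def simpson(f, a, b):
    """Simpson's rule on the single panel [a, b]."""
    m = 0.5 * (a + b)
    return (b - a) / 6.0 * (f(a) + 4.0 * f(m) + f(b))


def estimate(f, a, b):
    """Return (integral estimate, error estimate) for [a, b].

    The error estimate compares one Simpson panel with two half panels.
    """
    m = 0.5 * (a + b)
    coarse = simpson(f, a, b)
    fine = simpson(f, a, m) + simpson(f, m, b)
    return fine, abs(fine - coarse)


def global_adaptive(f, a, b, tol=1e-8, max_splits=10_000):
    """Globally adaptive quadrature: always bisect the worst subinterval."""
    value, err = estimate(f, a, b)
    # heapq is a min-heap, so the error is negated to pop the largest first.
    heap = [(-err, a, b, value)]
    total_err = err
    for _ in range(max_splits):
        if total_err <= tol:
            break
        neg_err, lo, hi, _ = heapq.heappop(heap)
        total_err += neg_err  # remove the parent's error contribution
        mid = 0.5 * (lo + hi)
        for left, right in ((lo, mid), (mid, hi)):
            v, e = estimate(f, left, right)
            heapq.heappush(heap, (-e, left, right, v))
            total_err += e
    # Sum the current estimates of all subintervals in the partition.
    return math.fsum(item[3] for item in heap), total_err


if __name__ == "__main__":
    approx, err = global_adaptive(math.sin, 0.0, math.pi)
    print(approx, err)
```

In a distributed-memory setting each processor would hold part of this heap, which is where the load balancing and communication issues discussed in the abstract arise.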

2.

Multiprogramming and memory contention

Alan Jay Smith 《Software》1980,10(7):531-552

We study memory contention during multiprogramming when programs are free to compete for page frames. A random walk between the possible partitions of memory over the set of active programs is used to model memory contention and calculate throughput. Our model of contention takes into account program characteristics by using miss ratio curves, and also considers memory size and page fetch latency. With the aid of numerous trace-driven simulations, we are able to verify our model, finding good agreement both in the observed distribution of memory among competing programs and in CPU utilization. We find that for high ratios of secondary to primary memory access time and under conditions of high memory contention, small programs with compact working sets are able to run with far less than expected interference from larger, more diffuse programs. In the case of multiprogramming the same program several times, we find that observed partition distributions are not necessarily even and that higher than expected levels of CPU use are observed. Lower ratios of access time are found to yield different results; programs compete on a more even basis and partition memory relatively more evenly than with higher ratios.

3.

Multiprogramming with virtual memory—a queueing model

M. Hofri M. Yadin 《Information Sciences》1976,11(3):187-221

A multiprogramming computing system which utilizes a virtual memory operating system, with paging-on-demand, is defined in queueing-theoretic terms. The validity and possible uses of such a model are discussed. Several quantities and measures of effectiveness, such as paging time, total system response time, and memory requirements are computed. The discussion and analysis place emphasis on exact, computable results.

4.

Free-Lagrange hydrodynamics with a distributed-memory parallel processor

Roy Williams 《Parallel Computing》1988,7(3):439-443

The PIFL (Parallel Irregular Free-Lagrange) code solves two-dimensional hydrodynamics with the mesh vertices moving with the fluid, with no rezoning. The irregular mesh is made of triangles and each processor deals with one or more connected domains of fluid. After each time step the mesh is topologically restructured, mesh points may be created or destroyed, and there is a local load-balance. Every few steps there is a global load balance. The code runs on a hypercube under Cubix and is designed to run most efficiently in the limit of a large number of large-memory processors.

5.

Multiprogramming für DV-Anlagen mittlerer Leistung [Multiprogramming for medium-performance data processing systems]

Dr. W. Bundke 《Computing》1973,11(3):213-220

The current state of hardware technology makes it possible to run medium-performance computers in multiprogramming mode as well. The structure and strategy of a supplementary program for semi-automatic multiprogramming operation are described. By these means the computers can be used more economically with only a small additional outlay on software.

6.

Parallel ILP for distributed-memory architectures

Nuno A. Fonseca Ashwin Srinivasan Fernando Silva Rui Camacho 《Machine Learning》2009,74(3):257-279

The growth of machine-generated relational databases, both in the sciences and in industry, is rapidly outpacing our ability to extract useful information from them by manual means. This has brought into focus machine learning techniques like Inductive Logic Programming (ILP) that are able to extract human-comprehensible models for complex relational data. The price to pay is that ILP techniques are not efficient: they can be seen as performing a form of discrete optimisation, which is known to be computationally hard; and the complexity is usually some super-linear function of the number of examples. While little can be done to alter the theoretical bounds on the worst-case complexity of ILP systems, some practical gains may follow from the use of multiple processors. In this paper we survey the state-of-the-art on parallel ILP. We implement several parallel algorithms and study their performance using some standard benchmarks. The principal findings of interest are these: (1) of the techniques investigated, one that simply constructs models in parallel on each processor using a subset of data and then combines the models into a single one, yields the best results; and (2) sequential (approximate) ILP algorithms based on randomized searches have lower execution times than (exact) parallel algorithms, without sacrificing the quality of the solutions found. This is an extended version of the paper entitled Strategies to Parallelize ILP Systems, published in the Proceedings of the 15th International Conference on Inductive Logic Programming (ILP 2005), vol. 3625 of LNAI, pp. 136–153, Springer-Verlag.

7.

Compiling programs for distributed-memory multiprocessors (Cited by: 1; self-citations: 0; citations by others: 1)

David Callahan Ken Kennedy 《The Journal of supercomputing》1988,2(2):151-169

We describe a new approach to programming distributed-memory computers. Rather than having each node in the system explicitly programmed, we derive an efficient message-passing program from a sequential shared-memory program annotated with directions on how elements of shared arrays are distributed to processors. This article describes one possible input language for describing distributions and then details the compilation process and the optimization necessary to generate an efficient program. Research supported by Intel.

8.

A Simple, Object-Based View of Multiprogramming

Jayadev Misra 《Formal Methods in System Design》2002,20(1):23-45

Object-based sequential programming has had a major impact on software engineering. However, object-based concurrent programming remains elusive as an effective programming tool. The class of applications that will be implemented on future high-bandwidth networks of processors will be significantly more ambitious than the current applications (which are mostly involved with transmissions of digital data and images), and object-based concurrent programming has the potential to simplify designs of such applications. Many of the programming concepts developed for databases, object-oriented programming and designs of reactive systems can be unified into a compact model of concurrent programs that can serve as the foundation for designing these future applications. We propose a model of multiprograms and a discipline of programming that addresses the issues of reasoning (e.g., understanding) and efficient implementation. The major point of departure is the disentanglement of sequential and multiprogramming features. We propose a sparse model of multiprograms that distinguishes these two forms of computations and allows their disciplined interactions.

9.

Low-cost task scheduling for distributed-memory machines (Cited by: 2; self-citations: 0; citations by others: 2)

Radulescu A. van Gemund A.J.C. 《Parallel and Distributed Systems, IEEE Transactions on》2002,13(6):648-658

In compile-time task scheduling for distributed-memory systems, list scheduling is generally accepted as an attractive approach, since it pairs low cost with good results. List-scheduling algorithms schedule tasks in order of their priority. This priority can be computed either (1) statically, before the scheduling, or (2) dynamically, during the scheduling. In this paper, we show that list scheduling with statically-computed priorities (LSSP) can be performed at a significantly lower cost than existing approaches, without sacrificing performance. Our approach is general, i.e. it can be applied to any LSSP algorithm. The low complexity is achieved by using low-complexity methods for the most time-consuming parts in list-scheduling algorithms, i.e. processor selection and task selection, preserving the criteria used in the original algorithms. We exemplify our method by applying it to the MCP (Modified Critical Path) algorithm. Using an extension of this method, we can also reduce the time complexity of a particular class of list scheduling with dynamic priorities (LSDP) [including algorithms such as DLS (Dynamic Level Scheduling), ETF (Earliest Task First) and ERT (Earliest Ready Task)]. Our results confirm that the modified versions of the list-scheduling algorithms obtain a performance comparable to their original versions, yet at a significantly lower cost. We also show that the modified versions of the list-scheduling algorithms consistently outperform multi-step algorithms, such as DSC-LLB (Dynamic Sequence Clustering with List Load Balancing), which also have higher complexity, and clearly outperform algorithms in the same class of complexity, such as CPM (Critical Path Method).
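List scheduling with statically computed priorities can be sketched as follows. This is a plain, unoptimized illustration and not the low-cost method of the paper, nor MCP itself: it assumes the static priority is the bottom level of a task (longest path to an exit task, communication included), that a communication cost is paid only when predecessor and successor run on different processors, and that the processor with the earliest start time is chosen. The function `list_schedule` and its argument layout are hypothetical.

```python
from functools import lru_cache


def list_schedule(costs, edges, num_procs):
    """Schedule a task DAG using statically computed priorities.

    costs: {task: execution time}
    edges: {(pred, succ): communication time, paid only across processors}
    Returns {task: (processor, start, finish)}.
    """
    succs = {t: [] for t in costs}
    preds = {t: [] for t in costs}
    for u, v in edges:
        succs[u].append(v)
        preds[v].append(u)

    @lru_cache(maxsize=None)
    def bottom_level(t):
        # Longest path from t to an exit task, including communication.
        return costs[t] + max(
            (edges[(t, s)] + bottom_level(s) for s in succs[t]), default=0
        )

    # Priorities are fixed before any task is placed (the "static" part).
    priority = {t: bottom_level(t) for t in costs}

    proc_free = [0] * num_procs
    placed = {}
    remaining = set(costs)
    while remaining:
        ready = [t for t in remaining if all(p in placed for p in preds[t])]
        # Highest priority first; the task name breaks ties deterministically.
        task = min(ready, key=lambda t: (-priority[t], t))
        best = None
        for proc in range(num_procs):
            data_ready = max(
                (placed[p][2] + (0 if placed[p][0] == proc else edges[(p, task)])
                 for p in preds[task]),
                default=0,
            )
            start = max(proc_free[proc], data_ready)
            if best is None or start < best[1]:
                best = (proc, start)
        proc, start = best
        finish = start + costs[task]
        placed[task] = (proc, start, finish)
        proc_free[proc] = finish
        remaining.remove(task)
    return placed


if __name__ == "__main__":
    costs = {"a": 2, "b": 3, "c": 1, "d": 2}
    edges = {("a", "b"): 1, ("a", "c"): 1, ("b", "d"): 1, ("c", "d"): 1}
    for task, slot in sorted(list_schedule(costs, edges, 2).items()):
        print(task, slot)
```

The ready-list scan and the loop over all processors are the two steps the abstract identifies as the most time-consuming; this sketch makes no attempt to reduce their cost.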

10.

Random Injection Control of Multiprogramming in Virtual Memory

《IEEE transactions on pattern analysis and machine intelligence》1978,(1):2-17

We propose a new method for the control of a multiprogrammed virtual memory computer system. A mathematical model solved by decomposition permits us to justify that the method avoids thrashing. Simulation experiments are used to test the robustness of the predictions of the mathematical model when certain simplifying assumptions are relaxed and when a slightly simpler control technique based on the same principle is used. Comparisons are given with the case where an "optimal" control is used and with that with no control. We also provide a simulation evaluating the estimators used in an implementation of the control, as well as the responsiveness of the controlled system to transients in the workload.

11.

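A load balancing method based on a multi-path planning genetic algorithm

Yu Yanfang (余燕芳) 《Computer Simulation (计算机仿真)》2008,25(12)

In load balancing, the load scheduling method is the core component, and its quality directly affects the performance of the balanced system. This paper proposes a server-side load balancing algorithm based on a multi-path planning genetic algorithm. Drawing on the mechanisms of natural selection and natural genetics in the biological world, the method simulates the process of natural evolution to search for an optimal solution, providing a new computational model for the load balancing problem. In addition, taking the best result after multi-path planning (multiple crossovers or mutations) greatly improves the optimization performance of the multi-path planning genetic algorithm. The method reduces the response time of server-side requests and raises server CPU utilization, thereby improving system performance. Numerical examples show that the method is feasible, correct, and effective.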
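The idea of generating several crossover or mutation variants and keeping the best one can be sketched as below. The abstract does not give the encoding, fitness function or operators, so everything here is an assumption: a chromosome assigns each request to a server, fitness is the load of the busiest server (lower is better), and selection keeps the better half of the population. The names `makespan`, `best_of_variants` and `balance`, and all parameter values, are hypothetical. The sketch needs at least two requests.

```python
import random


def makespan(assign, loads, num_servers):
    """Load of the busiest server under the given request assignment."""
    totals = [0.0] * num_servers
    for request, server in enumerate(assign):
        totals[server] += loads[request]
    return max(totals)


def best_of_variants(parent_a, parent_b, loads, num_servers, rng, tries):
    """Build several crossover-plus-mutation children and keep the best."""
    best = None
    for _ in range(tries):
        cut = rng.randrange(1, len(parent_a))      # one-point crossover
        child = parent_a[:cut] + parent_b[cut:]
        pos = rng.randrange(len(child))            # single-gene mutation
        child[pos] = rng.randrange(num_servers)
        if best is None or (
            makespan(child, loads, num_servers) < makespan(best, loads, num_servers)
        ):
            best = child
    return best


def balance(loads, num_servers, pop_size=20, generations=50, tries=4, seed=0):
    """Search for an assignment of requests to servers with a small makespan."""
    rng = random.Random(seed)
    n = len(loads)
    pop = [[rng.randrange(num_servers) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda a: makespan(a, loads, num_servers))
        survivors = pop[: pop_size // 2]           # the better half survives
        children = [
            best_of_variants(*rng.sample(survivors, 2), loads, num_servers, rng, tries)
            for _ in range(pop_size - len(survivors))
        ]
        pop = survivors + children
    return min(pop, key=lambda a: makespan(a, loads, num_servers))


if __name__ == "__main__":
    loads = [5, 3, 8, 2, 7, 4, 6, 1]
    plan = balance(loads, num_servers=3)
    print(plan, makespan(plan, loads, 3))
```

Because the survivors are carried over unchanged, the best makespan in the population never gets worse from one generation to the next.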

12.

Data management for a class of iterative computations on distributed-memory MIMD systems

M. C. Cornea-Hasegan Dan C. Marinescu Zhongyun Zhang 《Concurrency and Computation》1994,6(3):205-229

The paper discusses data management techniques for mapping a large data space onto the memory hierarchy of a distributed memory MIMD system. Experimental results for structural biology computations using the Molecular Replacement Method are presented.

13.

Loop parallelisation for pvm-based distributed-memory systems

《International Journal of Computer Mathematics》2012,89(3):265-278

Writing programs for a distributed-memory system (DMS) is a difficult job. In this paper, a method for parallelising sequential programs for DMS is presented. The input programs are C programs and the output parallel versions are programs containing routines for the Parallel Virtual Machine (PVM). PVM allows a group of computers in a network to be specified as a DMS and provides the routines for task activation and communication. The main task in parallelising a program is to process the loops in the source program and determine whether any data dependences exist. If the loop iterations are independent, the loop body is transformed into tasks that run independently under PVM.
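The two steps named in this abstract, checking a loop for dependences between iterations and splitting independent iterations into tasks, can be sketched in a simplified form. This is not the paper's method and does not use PVM: it assumes every array subscript has the affine form `a*i + b`, checks dependences by brute-force enumeration over the iteration range, and only computes the iteration blocks a task would receive. The names `has_loop_carried_dependence` and `split_iterations` are hypothetical.

```python
def has_loop_carried_dependence(writes, reads, n):
    """Check whether two different iterations touch the same array element.

    writes, reads: lists of (array_name, a, b), meaning the loop body
    accesses array_name[a*i + b] in iteration i, for i in range(n).
    """
    writer = {}  # (array, index) -> iteration that wrote the element
    for i in range(n):
        for arr, a, b in writes:
            key = (arr, a * i + b)
            if key in writer and writer[key] != i:
                return True  # two iterations write the same element
            writer[key] = i
    for i in range(n):
        for arr, a, b in reads:
            j = writer.get((arr, a * i + b))
            if j is not None and j != i:
                return True  # one iteration reads what another writes
    return False


def split_iterations(n, num_tasks):
    """Partition range(n) into num_tasks contiguous blocks of near-equal size."""
    base, extra = divmod(n, num_tasks)
    chunks, start = [], 0
    for t in range(num_tasks):
        size = base + (1 if t < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


if __name__ == "__main__":
    # for i: a[i] = b[i] + 1     -> iterations are independent
    print(has_loop_carried_dependence([("a", 1, 0)], [("b", 1, 0)], 100))
    # for i: a[i] = a[i-1] + 1   -> iteration i needs the result of i-1
    print(has_loop_carried_dependence([("a", 1, 0)], [("a", 1, -1)], 100))
    print(split_iterations(10, 3))
```

A real compiler would use a symbolic dependence test instead of enumeration, since loop bounds are often unknown at compile time.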

14.

Kernel-Kernel communication in a shared-memory multiprocessor

Eliseu M. Chaves Prakash Ch. Das Thomas J. Leblanc Brian D. Marsh Michael L. Scott 《Concurrency and Computation》1993,5(3):171-191

In the standard kernel organization on a bus-based multiprocessor, all processors share the code and data of the operating system; explicit synchronization is used to control access to kernel data structures. Distributed-memory multicomputers use an alternative approach, in which each instance of the kernel performs local operations directly and uses remote invocation to perform remote operations. Either approach to interkernel communication can be used in a large-scale shared-memory multiprocessor. In the paper we discuss the issues and architectural features that must be considered when choosing between remote memory access and remote invocation. We focus in particular on experience with the Psyche multiprocessor operating system on the BBN Butterfly Plus. We find that the Butterfly architecture is biased towards the use of remote invocation for kernel operations that perform a significant number of memory references, and that current architectural trends are likely to increase this bias in future machines. This conclusion suggests that straightforward parallelization of existing kernels (e.g. by using semaphores to protect shared data) is unlikely in the future to yield acceptable performance. We note, however, that remote memory access is useful for small, frequently-executed operations, and is likely to remain so.

15.

Edison—a multiprocessor language

Brinch Hansen 《Software》1981,11(4):325-359

This paper defines a programming language called Edison. The language is suitable both for teaching the principles of concurrent programming and for designing reliable real-time programs for multiprocessor systems. Edison is block structured and includes modules, concurrent statements, and when statements.

16.

Balancing the performance of block multithreaded distributed-memory systems

W.M. Zuberek 《Simulation Modelling Practice and Theory》2011,19(5):1318-1329

The performance of modern computer systems is increasingly often limited by long latencies of accesses to the memory subsystems. Instruction-level multithreading is an architectural approach to tolerating such long latencies by switching instruction threads rather than waiting for the completion of memory operations. The paper studies performance limitations in distributed-memory block multithreaded systems and determines conditions for such systems to be balanced. Event-driven simulation of a timed Petri net model of a simple distributed-memory system confirms the derived performance results.

17.

The Paradigm compiler for distributed-memory multicomputers (Cited by: 1; self-citations: 0; citations by others: 1)

Banerjee P. Chandy J.A. Gupta M. Hodges E.W. IV. Holm J.G. Lain A. Palermo D.J. Ramaswamy S. Su E. 《Computer》1995,28(10):37-47

To harness the computational power of massively parallel distributed-memory multicomputers, users must write efficient software. This process is laborious because of the absence of global address space. The programmer must manually distribute computations and data across processors and explicitly manage communication. The Paradigm (PARAllelizing compiler for DIstributed-memory, General-purpose Multicomputers) project at the University of Illinois addresses this problem by developing automatic methods for the efficient parallelization of sequential programs. A unified approach efficiently supports regular and irregular computations using data and functional parallelism.

18.

Synthetic models of distributed-memory parallel programs

David A. Poplawski 《Journal of Parallel and Distributed Computing》1991,12(4)

This paper deals with the construction and use of simple synthetic programs that model the behavior of more complex, real parallel programs. Synthetic programs can be used in many ways: to construct an easily ported suite of benchmark programs, to experiment with alternate parallel implementations of a program without actually writing them, and to predict the behavior and performance of an algorithm on a new or hypothetical machine. Synthetic programs are constructed easily from scratch and from existing programs and can even be constructed using nothing but information obtained from traces of the real program's execution.

19.

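A processor allocation method for shared-memory multiprocessor systems under the parallel programming model (Cited by: 1; self-citations: 0; citations by others: 1)

陈蓉西 李克清 (Chen Rongxi, Li Keqing) 《Computer Engineering and Applications (计算机工程与应用)》2000,36(6):63-64,77

Based on an assumed system architecture and programming model, this paper analyzes several existing processor allocation methods for shared-memory multiprocessor systems and their shortcomings, proposes an improved method, and discusses its implementation.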

20.

Hardware monitoring of a multiprocessor system

Liu A.-C. Parthasarathi R. 《Micro, IEEE》1989,9(5):44-51

The Testbed for Distributed Processing, or Ted, consists of Intel Corp.'s iSBC 8086 single board computers (SBCs) organized into groups or clusters. Each cluster consists of several SBCs that communicate via a shared memory. Intercluster communication occurs through an Ethernet interface. A hardware monitor designed and implemented to handle the monitoring activities within a cluster in the Ted system is described. By using specified patterns and don't-care masks, the system can detect accesses to selected data, addresses, or blocks of addresses. This function helps monitor events such as the access or usage of a memory location or a group of mailbox addresses. It also determines the amount of time consumed by the performance of specific operations.
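Matching against a pattern with a don't-care mask can be modelled in software with two bitwise operations. The sketch below only illustrates the matching rule and is not a description of the Ted monitor hardware. The functions `matches` and `count_hits`, the example addresses, and the convention that a 1 bit in the mask means "ignore this bit" are all assumptions.

```python
def matches(value, pattern, dont_care):
    """True if value equals pattern on every bit not set in dont_care."""
    return ((value ^ pattern) & ~dont_care) == 0


def count_hits(trace, pattern, dont_care):
    """Count the addresses in a bus trace that match the pattern."""
    return sum(1 for address in trace if matches(address, pattern, dont_care))


if __name__ == "__main__":
    # Watch a 16-byte block starting at 0x2F40: the low four bits are ignored.
    pattern, dont_care = 0x2F40, 0x000F
    trace = [0x2F3F, 0x2F40, 0x2F4A, 0x2F4F, 0x2F50, 0x1F40]
    print(count_hits(trace, pattern, dont_care))
```

Setting the mask to zero detects accesses to a single address, while setting a run of low-order mask bits detects accesses to an aligned block, which corresponds to the "blocks of addresses" case in the abstract.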