首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
The cost of synchronizing a multicomputer increases with system size. For large multicomputers, the time and resources spent to enable each node to estimate the clock value of every other node in the system can be prohibitive. We show how to reduce the cost of synchronization by assigning each node to one or more groups, then having each node estimate the clock values of only those nodes with which it shares a group. Since each node estimates the clock value of only a subset of the nodes, the cost of synchronization can be significantly reduced. We also provide a method for computing the maximum skew between any two nodes in the multicomputer, and a method for computing the maximum time between synchronizations. We also show how the fault tolerance of the synchronization algorithm may be determined  相似文献   

2.
This paper presents in detail an efficient and simple state estimation method suitable for real-time calculations in large power systems. According to this method state estimators of large power systems can be obtained easily using single computers as well as multicomputer systems. Various problems connected with use of this method are discussed and some important numerical results are given.  相似文献   

3.
We present a new concurrent LU-decomposition algorithm based on implicit pivoting of both rows and columns. This algorithm is, to a large extent, independent of the distribution of the matrix over the concurrent processes. As a result, it can be used in programs with dynamically varying data distributions. Another advantage is that most pivoting strategies are easily incorporated. We also introduce two new, intrinsically concurrent, pivoting strategies: multirow and multicolumn pivoting. With this program, we study the performance of concurrent LU-decomposition as a function of data distribution and pivoting strategy. We show that LU-decomposition with some pivoting strategies is both faster and numerically more stable than LU-decomposition without pivoting. Experimental evidence on the Symult 2010 and the iPSC/2 shows that, for performance considerations, pivoting is equivalent to randomizing the data distribution.  相似文献   

4.
Shared nothing multiprocessor architecture is known to be more scalable to support very large databases. Compared to other join strategies, a hash-based join algorithm is particularly efficient and easily parallelized for this computation model. However, this hardware structure is very sensitive to the skew in tuple distribution. Unless the parallel hash join algorithm includes some dynamic load balancing mechanism, the skew effect can severely deteriorate the system performance. In this paper, we investigate this issue. In particular, three parallel hash join algorithms are presented. We implement a simulator to study the effectiveness of these schemes. The simulation model is validated by comparing the simulation results to those produced by the actual implementation of the algorithms running on a multiprocessor system. Our performance study indicates that a naive approach is not able to provide tangible savings. However, the carefully designed strategies can offer substantial improvement over conventional techniques for a wide range of skew conditions  相似文献   

5.
This paper proposes a fault-tolerant distributed subcube management scheme for hypercube multicomputer systems. Gracefully degradable subcube management is supported by a data structure, called the distributed subcube table (DST), and a fault-tolerant broadcast protocol, called the reliably synchronized broadcast (RSB). In an n-dimensional hypercube, DST is the collection of 2n local subcube tables (LSTs), DST={LST0, LST, ..., LST2-1 n}, where LST, is a bit-mapped table assigned to Nx, a fault-free node whose address is x. LSTx, ∀x, is n+1 bits long, and it records the status (free/busy) of certain subcubes adjacent to Nx. The RSB diagnoses and avoids faults during interprocessor communication to prevent faulty nodes from being allocated for job execution. In addition to possessing a fault-tolerant design, our scheme can also achieve comparable or better performance than existing centralized schemes, as verified by extensive simulation  相似文献   

6.
The reliability of processors is an important issue for designing a massively parallel processing system for which fault-tolerant computing is crucial. In order to achieve high system reliability and availability, a faulty processor (node) when found should be replaced by a fault-free processor. Within a multiprocessor system, the technique of identifying faulty nodes by constructing tests on the nodes and interpreting the test outcomes is known as system-level diagnosis. The topological structure of a multicomputer system can be modeled by a graph of which the vertices and edges correspond to nodes and links of the system, respectively. This work presents a system-level diagnosis algorithm for a generalized hypercube which is an attractive variance of a hypercube. The proposed algorithm is based on the PMC model and can isolate all faulty nodes to within a set which contains at most one fault-free node. If the total number of nodes to be diagnosed in a generalized hypercube is N, the proposed algorithm can run in O(Nlog?N) time, and being superior to Yang??s algorithm proposed in 2004, it can diagnose not only a hypercube but also a generalized hypercube.  相似文献   

7.
This paper describes a system-level diagnosis algorithm for hypercube multicomputer systems. The algorithm is based on the PMC model and can isolate all faulty processors to within a set that contains at most one fault-free processor. If we denote by N the total number of processors in a hypercube system to be diagnosed, then, based on the judiciously designed data structures, the algorithm can run in O(Nlog2N) time; whereas the best-known diagnosis algorithm, the YML algorithm, runs in O(N2.5) time. Consequently, the new algorithm is remarkably superior to the YML algorithm in terms of the time cost.  相似文献   

8.
For non-fully connected heterogeneous computational systems, we introduce a modified digraph model and determine sufficient conditions under which mutual information reconciliation can be achieved. We show a method of designing mutual information reconciliation algorithms for such systems.  相似文献   

9.
The paper deals with the implementation of global time in multicomputer systems. After a formalization of the synchronization problem, techniques to estimate the synchronization delay and to compensate the drift error are proposed. Then SYNC_WAVE, a clock synchronization algorithm where the values of a reference clock are diffused in a wave-like manner, is described. SYNC_WAVE has no provision for fault-tolerance and is specially designed to introduce low CPU and communication overhead, in order to support performance analysis applications efficiently. An implementation of the devised algorithm in a transputer-based system is presented, showing the accuracy results obtained. Finally SYNC_WAVE is compared to other synchronization algorithms and several of its possible applications are suggested.  相似文献   

10.
The M-Machine is an experimental multicomputer being developed to test architectural concepts motivated by the constraints of modern semiconductor technology and the demands of programming systems. The M-Machine computing nodes are connected with a 3-D mesh network; each node is a multithreaded processor incorporating 9 function units, on-chip cache, and local memory. The multiple function units are used to exploit both instruction-level and thread-level parallelism. A user accessible message passing system yields fast communication and synchronization between nodes. Rapid access to remote memory is provided transparently to the user with a combination of hardware and software mechanisms. This paper presents the architecture of the M-Machine and describes how its mechanisms attempt to maximize both single thread performance and overall system throughput. The architecture is complete and the MAP chip, which will serve as the M-Machine processing node, is currently being implemented.  相似文献   

11.
For the most general, so-called byzantine, failures that lie in an arbitrary, malevolent behavior of the faulty computer, a method of distributed system diagnosis was proposed. It is applicable to the partially connected multicomputer systems featuring certain structural characteristics and enables the system to detect and identify the manifested failures on its own without an outside aid.Translated from Avtomatika i Telemekhanika, No. 2, 2005, pp. 148–157.Original Russian Text Copyright © 2005 by Grishin, Lobanov, Sirenko.  相似文献   

12.
We investigate the problem of scheduling parallel tasks with precedence constraints on mesh connected multicomputer systems. It is still an open problem on whether there exists an approximation algorithm with finite asymptotic worst-case and/or average-case performance bound for this scheduling problem. As an early attempt to solve our problem, we propose and analyze the performance of a level-by-level scheduling algorithm LL. In fact, we solve a special case of the problem when all tasks request for square submeshes and run on a square mesh system whose size is a power of 2. There are three basic techniques in algorithm LL, i.e., the level-by-level scheduling strategy for handling precedence constraints, the largest-task-first algorithm for scheduling tasks in the same level, and the two-dimensional buddy system for system partitioning and processor allocation. Algorithm LL does not have a finite worst-case performance bound; however, it has quite acceptable average-case performance. The main contribution of the paper is to show that under the assumptions that task sizes are independent and identically distributed (i.i.d.) random variables with a common probability distribution, and that task execution times are i.i.d. random variables with finite mean and variance, and that the probability distributions of task sizes and execution times are independent of each other, for wide task graphs and typical task size distributions, algorithm LL has an asymptotic average-case performance bound about two for all probability distributions of task execution times.  相似文献   

13.
The hypercube and torus are two important message-passing network architectures of high-performance multicomputers. Analytical models of multicomputer networks under the non-bursty Poisson traffic have been widely reported. Motivated by the convincing evidence of bursty and batch arrival nature of traffic generated by many real-world parallel applications in high-performance computing environments, we develop a new and concise analytical model in this paper for hypercube and torus networks in the presence of batch message arrivals modelled by the compound Poisson process with geometrically distributed batch sizes. The average degree of virtual channel multiplexing is derived by employing a Markov chain which can capture the batch arrival nature. An attractive advantage of the model is its constant computation complexity independent of the network size. The accuracy of the analytical performance results is validated against those obtained from simulation experiments of an actual system.  相似文献   

14.
Programming a hypercube multicomputer   总被引:1,自引:0,他引:1  
Ranka  S. Won  Y. Sahni  S. 《Software, IEEE》1988,5(5):69-77
The issues encountered by programmers are explored. A brief overview of parallel architectures is followed by an example problem of image-template matching. The programming consideration for this problem are discussed  相似文献   

15.
An important issue in getting the agent technology into mainstream software development is the development of appropriate methodologies for developing agent-oriented systems. This paper presents an approach to model distributed systems based on a goal-oriented requirements acquisition. These models are acquired as instances of a conceptual meta-model. The latter can be represented as a graph where each node captures a concept such as, e.g., goal, action, agent, or scenario, and where the edges capture semantic links between such abstractions. This approach is supported by a modeling language, the ANote, which presents views that capture the most important modeling aspects according to the concept currently under consideration.  相似文献   

16.
Squared error clustering algorithms for single-instruction multiple-data (SIMD) hypercubes are presented. The algorithms are shown to be asymptotically faster than previously known algorithms and require less memory per processing element (PE). For a clustering problem with N patterns, M features per pattern, and K clusters, the algorithms complete in O(k+log NM ) steps on NM processor hypercubes. This is optimal up to a constant factor. These results are extended to the case in which NMK processors are available. Experimental results from a multiple-instruction, multiple-data (MIMD) medium-grain hypercube are also presented  相似文献   

17.
In this paper, the formalism of Relational Transition Systems (RTSes) is used to model data-intensive reactive systems, and four RTS models of reactive systems based on temporal logic programming, production systems, recurrence equations, and Petri nets are presented. The paper also describes different methods of comparison of the expressive powers of various RTSes in terms of the trajectories they can generate and carries out this comparison for the four RTS formalisms. It is shown that these formalisms have the same expressive power in the deterministic case. The paper also compares expressive powers of non-deterministic production systems and non-deterministic temporal logic programming systems. It is shown that, although the two formalisms are incomparable in the general case, their restricted versions are isomorphic to each other. Received December 7, 1993 / January 26, 1995  相似文献   

18.
Efficient routing of messages is a key to the performance of multicomputers. Multicast communication refers to the delivery of the same message from a source node to an arbitrary number of destination nodes. While multicast communication is highly demanded in many applications, most of the existing multicomputers do not directly support this service; rather it is indirectly supported by multiple one-to-one or broadcast communications, which result in more network traffic and a waste of system resources. The authors study routing evaluation criteria for multicast communication under different switching technologies. Multicast communication in multicomputers is formulated as a graph theoretical problem. Depending on the evaluation criteria and switching technologies, they study three optimal multicast communication problems, which are equivalent to the finding of the following three subgraphs: optimal multicast path, optimal multicast cycle, and minimal Steiner tree, where the interconnection of a multicomputer defines a host graph. They show that all these optimization problems are NP-complete for the popular 2D-mesh and hypercube host graphs. Heuristic multicast algorithms for these routing problems are proposed  相似文献   

19.
Adaptive networks are a novel class of dynamical networks whose topologies and states coevolve. Many real-world complex systems can be modeled as adaptive networks, including social networks, transportation networks, neural networks and biological networks. In this paper, we introduce fundamental concepts and unique properties of adaptive networks through a brief, non-comprehensive review of recent literature on mathematical/computational modeling and analysis of such networks. We also report our recent work on several applications of computational adaptive network modeling and analysis to real-world problems, including temporal development of search and rescue operational networks, automated rule discovery from empirical network evolution data, and cultural integration in corporate merger.  相似文献   

20.
The execution of a concurrent computation by a network of processors requires a routing algorithm that is deadlock free. Many routing algorithms proposed for processor networks have the potential of deadlock due to the cyclic topology of the network. In this paper we first formalize the concept of message routing. Next, we show a method by which a deadlock-free routing algorithm can be constructed out of a given routing algorithm. Finally the method is illustrated by constructing deadlock-free routing algorithms for cartesian product processor networks. Peter A.J. Hilbers received the B.S. and M.S. degrees in mathematics, and the Ph.D. degree in computer science from Groningen University, Groningen, The Netherlands, in 1979, 1983, and 1989, respectively. From 1988 to 1989 he was an Assistant Professor with the Department of Computing Science at Groningen University. Currently he is a research engineer in the Department of Mathematics and Systems Engineering at the Koninklijke/Shell-Laboratorium, Amsterdam (Shell Research B.V.). His research intersts are program derivation and correctness, concurrency, and processor networks. Johan J. Lukkien received the B.S. and M.S. degrees in mathematics from Groningen University, Groningen, The Netherlands in 1982 and 1986 respectively. Currently he is a Ph.D. student at the Department of Computing Science, Groningen University. His research area is the construction and verification of concurrent programs.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号