Similar literature
Found 20 similar documents; search took 31 ms.
1.
This paper describes a new sequential diagnosis algorithm for hypercubes. The algorithm is based on the PMC model and it assumes the existence of a central observer for syndrome decoding. If we denote the total number of processors in a given hypercube by N, then the algorithm achieves Θ([formula]) degree of diagnosability using only O(N) tests over all iterations of diagnosis and repair. The aggregated syndrome decoding time is also shown to be O(N) for this algorithm. The number of iterations of diagnosis and repair needed by the algorithm is O(log N).

2.
Li Jie, Pan Yi, Shen Hong. The Journal of Supercomputing, 2003, 24(3): 251-258
Topological sort of an acyclic graph has many applications such as job scheduling and network analysis. Due to its importance, it has been tackled on many models. Dekel et al. [3] proposed an algorithm for solving the problem in O(log^2 N) time on the hypercube or shuffle-exchange networks with O(N^3) processors. Chaudhuri [2] gave an O(log N) algorithm using O(N^3) processors on a CRCW PRAM model. On the LARPBS (Linear Arrays with a Reconfigurable Pipelined Bus System) model, Li et al. [5] showed that the problem for a weighted directed graph with N vertices can be solved in O(log N) time by using N^3 processors. In this paper, a more efficient topological sort algorithm is proposed on the same LARPBS model. We show that the problem can be solved in O(log N) time by using N^3/log N processors. We show that the algorithm has better time and processor complexities than the best algorithm on the hypercube, and has the same time complexity but better processor complexity than the best algorithm on the CRCW PRAM model.
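For readers unfamiliar with the underlying problem, the following is a minimal sequential sketch of topological sorting using Kahn's algorithm. It is not the LARPBS algorithm from the abstract; the function name `topological_sort` and the edge-list input format are assumptions made for illustration.

```python
from collections import deque


def topological_sort(num_vertices, edges):
    """Return one topological order of vertices 0..num_vertices-1.

    Sequential Kahn's algorithm (illustrative only, not the parallel
    LARPBS algorithm). `edges` is a list of (u, v) pairs meaning u -> v.
    Raises ValueError if the graph contains a cycle.
    """
    successors = [[] for _ in range(num_vertices)]
    in_degree = [0] * num_vertices
    for u, v in edges:
        successors[u].append(v)
        in_degree[v] += 1

    # Vertices with no incoming edges can be emitted immediately.
    ready = deque(v for v in range(num_vertices) if in_degree[v] == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in successors[u]:
            in_degree[v] -= 1
            if in_degree[v] == 0:
                ready.append(v)

    if len(order) != num_vertices:
        raise ValueError("graph contains a cycle")
    return order
```

This runs in time linear in the number of vertices and edges on a single processor, which is the baseline the parallel bounds above are improving on in terms of elapsed time.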

3.
Maximal outerplanar graphs constitute an important class of graphs, often encountered in various applications, e.g., computational geometry, robotics, etc. In this paper, we propose a parallel algorithm for testing the isomorphism of maximal outerplanar graphs. Given the ordered adjacency lists of the two graphs, the proposed algorithm tests their isomorphism in O(log N) time using N processors, for graphs with N nodes on an EREW shared memory model, as well as on a hypercube architecture. When the adjacency matrices of the graphs are given, this algorithm can be redesigned on N^2 processors to run in O(log N) time.

4.
The reliability of processors is an important issue in designing a massively parallel processing system for which fault-tolerant computing is crucial. In order to achieve high system reliability and availability, a faulty processor (node), when found, should be replaced by a fault-free processor. Within a multiprocessor system, the technique of identifying faulty nodes by constructing tests on the nodes and interpreting the test outcomes is known as system-level diagnosis. The topological structure of a multicomputer system can be modeled by a graph whose vertices and edges correspond to nodes and links of the system, respectively. This work presents a system-level diagnosis algorithm for a generalized hypercube, which is an attractive variant of a hypercube. The proposed algorithm is based on the PMC model and can isolate all faulty nodes to within a set which contains at most one fault-free node. If the total number of nodes to be diagnosed in a generalized hypercube is N, the proposed algorithm can run in O(Nlog?N) time, and, improving on Yang's algorithm proposed in 2004, it can diagnose not only a hypercube but also a generalized hypercube.

5.
The distance transform (DT) is an image computation tool which can be used to extract information about the shape and the position of the foreground pixels relative to each other. It converts a binary image into a grey-level image, where each pixel has a value corresponding to the distance to the nearest foreground pixel. The time complexity of computing the distance transform depends entirely on the distance metric used. In particular, the more exact the distance transform is, the worse the execution time becomes. Nowadays, quite often thousands of images are processed in a limited time. It seems quite impossible for a sequential computer to compute the distance transform in real time. In order to provide efficient distance transform computation, it is highly desirable to develop a parallel algorithm for this operation. In this paper, based on the diagonal propagation approach, we first provide an O(N^2) time sequential algorithm to compute the chessboard distance transform (CDT) of an N × N image, which is a DT using the chessboard distance metric. Based on the proposed sequential algorithm, the CDT of a 2D binary image array of size N × N can be computed in O(log N) time on the EREW PRAM model using O(N^2/log N) processors, O(log log N) time on the CRCW PRAM model using O(N^2/log log N) processors, and O(log N) time on the hypercube computer using O(N^2/log N) processors. Following the mapping proposed by Lee and Horng, the algorithm for the medial axis transform is also efficiently derived. The medial axis transform of a 2D binary image array of size N × N can be computed in O(log N) time on the EREW PRAM model using O(N^2/log N) processors, O(log log N) time on the CRCW PRAM model using O(N^2/log log N) processors, and O(log N) time on the hypercube computer using O(N^2/log N) processors. The proposed parallel algorithms are composed of a set of prefix operations. In each prefix operation phase, only the increment (add-one) and minimum operations are employed. So, the algorithms are especially efficient in practical applications.
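As a rough illustration of what a chessboard distance transform computes, here is a sketch of the classic two-pass raster-scan method. It is a swapped-in textbook technique, not the diagonal propagation algorithm described in the abstract, although it also runs in O(N^2) time on an N × N image and uses only add-one and minimum operations. The function name and the list-of-lists image format are assumptions.

```python
def chessboard_distance_transform(image):
    """Chessboard (8-neighbour) distance from each pixel to the nearest
    foreground pixel, via the classic two-pass raster scan.

    `image` is a non-empty list of equal-length rows; truthy entries are
    foreground. Illustrative sketch, not the paper's diagonal propagation.
    """
    rows, cols = len(image), len(image[0])
    inf = rows + cols  # exceeds any chessboard distance inside the image
    dist = [[0 if image[r][c] else inf for c in range(cols)]
            for r in range(rows)]

    # Forward pass: propagate from the W, NW, N and NE neighbours.
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, -1), (-1, -1), (-1, 0), (-1, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    dist[r][c] = min(dist[r][c], dist[rr][cc] + 1)

    # Backward pass: propagate from the E, SE, S and SW neighbours.
    for r in reversed(range(rows)):
        for c in reversed(range(cols)):
            for dr, dc in ((0, 1), (1, 1), (1, 0), (1, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    dist[r][c] = min(dist[r][c], dist[rr][cc] + 1)
    return dist
```

If the image has no foreground pixel at all, every entry stays at or above the sentinel `rows + cols`, which a caller can treat as "undefined".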

6.
The derivation of the prefixes of a given sequence (prefix computation) and the fast reduction of a tree to a single node (tree contraction) are two useful primitives for many applications on parallel computers. It is well known that certain special cases of the two problems can be solved efficiently on the hypercube. Here we extend this result to a large family of parallel computers. The family of parallel computers is based on a novel interconnection scheme called the generalized Fibonacci cube that encompasses both the hypercube and the second-order Fibonacci cube in [8]. Specifically, we show that the k-th order Fibonacci tree of size N can be reduced to a single node in O(log N) steps on a k-th order Fibonacci cube with N nodes (processors). Assuming that O(log N) data items are on each of the N processors, we also show that the prefixes can be computed in O(log N) steps on the k-th order Fibonacci cube.
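To make the prefix computation primitive concrete, the sketch below simulates the well-known recursive-doubling scan, in which every position is updated "simultaneously" in each round and about log2 N rounds suffice. It is a sequential simulation of a shared-memory style scan, not the Fibonacci cube algorithm from the abstract; the function name and the returned round count are assumptions for illustration.

```python
def prefix_by_doubling(values, combine=lambda a, b: a + b):
    """Inclusive prefix computation by recursive doubling.

    Returns (prefixes, rounds). Each round models one parallel step in
    which position i combines its value with the one `stride` places to
    its left. `combine` must be associative. Illustrative simulation only.
    """
    result = list(values)
    n = len(result)
    stride = 1
    rounds = 0
    while stride < n:
        previous = result[:]  # snapshot: all updates read the old round
        for i in range(stride, n):
            result[i] = combine(previous[i - stride], previous[i])
        stride *= 2
        rounds += 1
    return result, rounds
```

For example, `prefix_by_doubling([1, 2, 3, 4, 5])` produces the running sums in three rounds; passing `combine=min` gives running minima instead.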

7.
We present a unified parallel algorithm for constructing various search trees. The tree construction is based on a unified scheme, called bottom-level balancing, which constructs a perfectly balanced search tree having a uniform distribution of keys. The algorithm takes O(log log N) time using N/log log N processors on the EREW PRAM model, and O(1) time with N processors on the CREW PRAM model, where N is the number of keys in the tree.

8.
This paper considers the histogramming problem on the hypercube. An N-PE hypercube is used to process an N^(1/2) × N^(1/2) digitized image in which each pixel has a gray-level value between 0 and M − 1. In general, M, the range of gray-level values, is much smaller than N, the number of pixels being processed. Our algorithm generates the histogram of the image in O(log M * log N) time using radix sort and efficient data movement operations. This technique can be implemented on butterfly, shuffle-exchange and fat pyramid organizations.
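The idea of obtaining a histogram from a sorted pixel sequence can be sketched sequentially as follows: once the gray levels are sorted, each level occupies one contiguous run, and the run lengths are the histogram counts. This sketch uses Python's built-in `sorted` in place of the radix sort and hypercube data movement mentioned in the abstract; the function name and parameters are assumptions.

```python
def histogram_by_sorting(pixels, num_levels):
    """Histogram of gray levels in range(num_levels), computed by sorting
    the pixels and measuring the length of each run of equal values.

    Illustrative sequential sketch; not the hypercube algorithm.
    """
    counts = [0] * num_levels
    ordered = sorted(pixels)
    run_start = 0
    for i in range(1, len(ordered) + 1):
        # A run ends at the end of the data or where the value changes.
        if i == len(ordered) or ordered[i] != ordered[run_start]:
            counts[ordered[run_start]] = i - run_start
            run_start = i
    return counts
```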

9.
We study the message complexity of the Election Problem in hypercube networks, when the processors have a “Sense of Direction,” i.e., the capability to distinguish between adjacent communication links according to some globally consistent scheme. We present two models of Sense of Direction, which differ regarding the way the labeling of the links in the network is done: either by matching based on dimensions or by distance along a Hamiltonian cycle. In the dimension model, we give an optimal linear algorithm which uses the natural dimensional labeling of the communication links. We prove that, in the distance-based case, the graph symmetry of the hypercube is broken and, thus, the leader Election does not require a global maximum-finding algorithm: O(1) messages suffice to select the leader, whereas the Θ(N) messages are required only for the final broadcasting. Finally, we study the communication cost of changing one orientation labeling to the other and prove that O(N) messages suffice.

10.
A novel reconfigurable network referred to as the Reconfigurable Multi-Ring Network (RMRN) is described. The RMRN is shown to be a truly scalable network, in that each node in the network has a fixed degree of connectivity and the reconfiguration mechanism ensures a network diameter of O(log2N) for an N-processor network. Algorithms for the 2-D mesh and the SIMD n-cube are shown to map very elegantly onto the RMRN. Basic message passing and reconfiguration primitives for the SIMD RMRN are designed which could be used as building blocks for more complex parallel algorithms. The RMRN is shown to be a viable architecture for image processing and computer vision problems via the parallel computation of the Hough transform. The parallel implementation of the Y-angle Hough transform of an N × N image is shown to have an asymptotic complexity of O(Y log2Y + log2N) on the SIMD RMRN with O(N^2) processors. This compares favorably with the O(Y + log2N) optimal algorithm for the same Hough transform on the MIMD n-cube with O(N^2) processors.

11.
This paper describes a new factorization of the inverse of the joint-space inertia matrix M. In this factorization, M^(-1) is directly obtained as the product of a set of sparse matrices wherein, for a serial chain, only the inversion of a block-tridiagonal matrix is needed. In other words, this factorization reduces the inversion of a dense matrix to that of a block-tridiagonal one. As a result, this factorization leads to both an optimal serial and an optimal parallel algorithm, that is, a serial algorithm with a complexity of O(N) and a parallel algorithm with a time complexity of O(log N) on a computer with O(N) processors. The novel feature of this algorithm is that it first calculates the interbody forces. Once these forces are known, the accelerations are easily calculated. We discuss the extension of the algorithm to the task of calculating the forward dynamics of a kinematic tree consisting of a single main chain plus any number of short side branches. We also show that this new factorization of M^(-1) leads to a new factorization of the operational-space inverse inertia, Λ^(-1), in the form of a product involving sparse matrices. We show that this factorization can be exploited for optimal serial and parallel computation of Λ^(-1), that is, a serial algorithm with a complexity of O(N) and a parallel algorithm with a time complexity of O(log N) on a computer with O(N) processors.

12.
We propose the time division multiplexed hypercube (TDM-cube) and the time/wavelength division multiplexed mesh (TWDM-mesh). The TDM-cube is an extension of the earlier work by Thompson on the dilated slipped banyan network, DSB. While the DSB(N) provides the complete connection among N users in O(N) time via time division multiplexing, the TDM-cube(N) implements the binary hypercube interconnection among N users in O(log2 N) time. The TWDM-mesh(n^2) uses a DSB(n), and combines TDM and WDM. Like the Bus-Mesh, it requires at most 2 hops to send a packet from one node to any other node. The TWDM-mesh has a much higher network throughput than the Bus-Mesh. Both the TDM-cube and TWDM-mesh require only one fixed-wavelength transmitter/receiver per node, and they have a simple column control and dilated operation. The performance in terms of scalability, delay, and throughput is considered.

13.
An O(N^2) heuristic algorithm is presented that embeds all binary trees, with dilation 2 and small average dilation, into the optimal-sized hypercube. The heuristic relies on a conjecture about all binary trees containing a perfect matching. It provides a practical and robust technique for mapping binary trees into the hypercube and ensures that the communication load is evenly distributed across the network assuming any shortest path routing strategy. One contribution of this work is the identification of a rich collection of binary trees that can be easily mapped into the hypercube.

14.
As a generalization of the precise and pessimistic diagnosis strategies of system-level diagnosis of multicomputers, the t/k diagnosis strategy can significantly improve the self-diagnosing capability of a system at the expense of no more than k fault-free processors (nodes) being mistakenly diagnosed as faulty. In the case k ≥ 2, to our knowledge, there is no known t/k diagnosis algorithm for general diagnosable systems or for any specific system. The hypercube is a popular topology for interconnecting processors of multicomputers. It is known that an n-dimensional cube is (4n − 9)/3-diagnosable. This paper addresses the (4n − 9)/3 diagnosis of the n-dimensional cube. By exploring the relationship between a largest connected component of the 0-test subgraph of a faulty hypercube and the distribution of the faulty nodes over the network, the fault diagnosis of an n-dimensional cube can be reduced to those of two constituent (n − 1)-dimensional cubes. On this basis, a diagnosis algorithm is presented. Given that there are no more than 4n − 9 faulty nodes, this algorithm can isolate all faulty nodes to within a set in which at most three nodes are fault-free. The proposed algorithm can operate in O(N log2 N) time, where N = 2^n is the total number of nodes of the hypercube. The work of this paper provides insight into developing efficient t/k diagnosis algorithms for larger values of k and for other types of interconnection networks.

15.
In this paper we consider the following problem of computing a map of geometric minimal cuts (called the MGMC problem): Given a graph G = (V, E) and a planar rectilinear embedding of a subgraph H = (V_H, E_H) of G, compute the map of geometric minimal cuts induced by axis-aligned rectangles in the embedding plane. The MGMC problem is motivated by the critical area extraction problem in VLSI designs and finds applications in several other fields. In this paper, we propose a novel approach based on a mix of geometric and graph algorithm techniques for the MGMC problem. Our approach first shows that, unlike the classic min-cut problem on graphs, the number of all rectilinear geometric minimal cuts is bounded by a low polynomial, O(n^3). Our algorithm for identifying geometric minimal cuts runs in O(n^3 log n (log log n)^3) expected time, which can be reduced to O(n log n (log log n)^3) when the maximum size of the cut is bounded by a constant, where n = |V_H|. Once geometric minimal cuts are identified, we show that the problem can be reduced to computing the L_∞ Hausdorff Voronoi diagram of axis-aligned rectangles. We present the first output-sensitive algorithm to compute this diagram, which runs in O((N + K) log^2 N log log N) time and O(N log^2 N) space, where N is the number of rectangles and K is the complexity of the Hausdorff Voronoi diagram. Our approach settles several open problems regarding the MGMC problem.

16.
The reconfigurable array with slotted optical buses (RASOB) has recently received a lot of attention from the research community. In this paper, we first discuss the reconfiguration methods and communication capabilities of the RASOB architecture. Then, we use this architecture for the implementation of efficient sorting algorithms on the 1D RASOB and the 2D RASOB. Our parallel sorting algorithm on the 1D RASOB is based on an efficient divide-and-conquer scheme. It sorts N data items using N processors in O(k) communication cycles, where k is the size of the data items to be sorted in bits. We further develop a parallel sorting algorithm on the 2D RASOB based on the sorting algorithm on the 1D RASOB in conjunction with the well-known Rotatesort algorithm. Similarly, this algorithm sorts N data items on a 2D RASOB of size N in O(k) communication cycles. These sorting algorithms are much more efficient than state-of-the-art sorting algorithms on reconfigurable arrays of processors with electronic buses using the same number of processors.

17.
Previous research on developing parallel triangulation algorithms concentrated on triangulating planar point sets. O(log^3 n) running time algorithms using O(n) processors have been developed in Refs. 1 and 2. Atallah and Goodrich(3) presented a data structure that can be viewed as a parallel analogue of the sequential plane-sweeping paradigm, which can be used to triangulate a planar point set in O(log n log log n) time using O(n) processors. Recently Merks(4) described an algorithm for triangulating point sets which runs in O(log n) time using O(n) processors, and is thus optimal. In this paper we develop a parallel algorithm for triangulating simplicial point sets in arbitrary dimensions based on the idea of the sequential algorithm presented in Ref. 5. The algorithm runs in O(log^2 n) time using O(n/log n) processors. The algorithm has O(n log n) as the product of the running time and the number of processors; i.e., an optimal speed-up.

18.
This paper presents a principle for creating Almost Optimal Dynamical 2-3 trees based on the theory of Miller et al., and gives a searching algorithm, an insertion algorithm and a deletion algorithm for these 2-3 trees. Experimental results given in this paper indicate that these 2-3 trees perform very well in terms of node-visit cost. We discuss the asymptotic properties of the 2-3 trees as N→∞, and evaluate their approximate height, h = log_(2.45)(N+1), where N is the number of nodes of a 2-3 tree. Finally, this paper analyses the time complexities of the algorithms, which are O(log_(2.45)(N+1)).

19.
In this paper a general technique for reducing processors in simulation without any increase in time is described. This results in an O(√(log n)) time algorithm for simulating one step of PRIORITY on TOLERANT with a processor-time product of O(n log log n), the same as that for simulating PRIORITY on ARBITRARY. This is used to obtain an O(log n/log log n + √(log n) (log log m − log log n)) time algorithm for sorting n integers from the set {0, ..., m − 1}, m ≥ n, with a processor-time product of O(n log log m log log n) on a TOLERANT CRCW PRAM. New upper and lower bounds for the ordered chaining problem on an allocated COMMON CRCW model are also obtained. The algorithm for ordered chaining takes O(log n/log log n) time on an allocated PRAM of size n. It is shown that this result is the best possible (up to a constant multiplicative factor) by obtaining a lower bound of Ω(r log n/(log r + log log n)) for finding the first (leftmost) live processor on an allocated COMMON PRAM of size n of r-slow virtual processors (one processor simulates r processors of the allocated PRAM). As a result, for the ordered chaining problem, the processor-time product has to be at least Ω(n log n/log log n) for any poly-logarithmic time algorithm. The algorithm for the ordered chaining problem yields an O(log N/log log N) time algorithm for (stable) sorting of n integers from the set {0, ..., m − 1} with n processors on a COMMON CRCW PRAM; here N = max(n, m). In particular, if m = n^O(1), then sorting takes Θ(log n/log log n) time on both TOLERANT and COMMON CRCW PRAMs. The processor-time product for TOLERANT is O(n (log log n)^2). The algorithm for COMMON uses n processors.

20.
An analysis of the efficiency of the alpha-beta algorithm is carried out based on a probabilistic model in which terminal node scores depend on random branch values. Explicit expressions are derived for the expected number of terminal nodes scored for the cases of uniform trees of fanout N and of depths 2 and 3. For trees of depth 2, the expected number is of order O(NHN); for trees of depth 3, the expected number is of order O(N2). An upper bound on the expected number of terminal nodes scored for trees of depth 4 is shown to be no greater than O(N2HN2) and no less than O(N2).
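The quantity analysed above, the number of terminal nodes scored, can be made concrete with a small sketch of alpha-beta pruning that counts the leaves it evaluates. The nested-list tree representation and the function name are assumptions for illustration, and the sketch takes explicit leaf scores rather than the probabilistic branch-value model of the abstract.

```python
import math


def alpha_beta(node, alpha=-math.inf, beta=math.inf, maximizing=True):
    """Alpha-beta search over a game tree given as nested lists.

    A non-list node is a terminal score. Returns (value, leaves_scored),
    where leaves_scored counts the terminal nodes actually evaluated.
    Illustrative sketch with a hypothetical tree representation.
    """
    if not isinstance(node, list):
        return node, 1

    scored = 0
    best = -math.inf if maximizing else math.inf
    for child in node:
        value, count = alpha_beta(child, alpha, beta, not maximizing)
        scored += count
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:
            break  # remaining siblings cannot change the result
    return best, scored
```

On the depth-2 tree `[[3, 5], [2, 9]]` with a maximizing root, the search returns value 3 after scoring three of the four leaves, because the leaf 9 is cut off once 2 is seen.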
