首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
A parallel sorting algorithm for sorting n elements evenly distributed over 2d p nodes of a d-dimensional hypercube is presented. The average running time of the algorithm is O((n log n)/p+p log 2n). The algorithm maintains a perfect load balance in the nodes by determining the (kn/p)th elements (k1,. . ., (p-1)) of the final sorted list in advance. These p-1 keys are used to partition the sorted sublists in each node to redistribute data to the nodes to be merged in parallel. The nodes finish the sort with an equal number of elements (n/ p) regardless of the data distribution. A parallel selection algorithm for determining the balanced partition keys in O(p log2n) time is presented. The speed of the sorting algorithm is further enhanced by the distance-d communication capability of the iPSC/2 hypercube computer and a novel conflict-free routing algorithm. Experimental results on a 16-node hypercube computer show that the sorting algorithm is competitive with the previous algorithms and faster for skewed data distributions  相似文献   

2.
An adaptive parallel algorithm for inducing a priority queue structure on an n-element array is presented. The algorithm is extended to provide optimal parallel construction algorithms for three other heap-like structures useful in implementing double-ended priority queues, namely min-max heaps, deeps, and min-max-pair heaps. It is shown that an n-element array can be made into a heap, a deap, a min-max heap, or a min-max-pair heap in O(log n+(n /p)) time using no more than n/log n processors, in the exclusive-read-exclusive-write parallel random-access machine model  相似文献   

3.
A hypercube algorithm to solve the list ranking problem is presented. Let n be the length of the list, and let p be the number of processors of the hypercube. The algorithm described runs in time O(n/p) when n=Ω(p 1+ε) for any constant ε>0, and in time O(n log n/p+log3 p) otherwise. This clearly attains a linear speedup when n=Ω(p 1+ε). Efficient balancing and routing schemes had to be used to achieve the linear speedup. The authors use these techniques to obtain efficient hypercube algorithms for many basic graph problems such as tree expression evaluation, connected and biconnected components, ear decomposition, and st-numbering. These problems are also addressed in the restricted model of one-port communication  相似文献   

4.
It is shown that there is a continuously parameterized family F of n-dimensional single-input single-output (SISO) stabilizable detectable linear system Σ(p) which contains at least one realization of each reduced, strictly proper transfer function of McMillan degree not exceeding n. The parameterization map p→Σ(p) is a polynomial function in 2n indeterminates from an open convex polyhedron in R2n to the linear space of all SISO n-dimensional linear systems  相似文献   

5.
Parallel algorithms for several important combinatorial problems such as the all nearest smaller values problem, triangulating a monotone polygon, and line packing are presented. These algorithms achieve linear speedups on the pipelined hypercube, and provably optimal speedups on the shuffle-exchange and the cube-connected-cycles for any number p of processors satisfying 1⩽pn/((log3n)(loglog n)2), where n is the input size. The lower bound results are established under no restriction on how the input is mapped into the local memories of the different processors  相似文献   

6.
It is shown that for a given p (1<pn ), the n-cube network can tolerate up to p2(n-p)-1 processor failures and remains connected provided that at most p neighbors of any nonfaulty processor are allowed to fail. This generalizes the result for p=n-1, obtained by A.-M Esfahanian (1989). It is also shown that the n-cube network with n⩾5 remains connected provided that at most two neighbors of any processor are allowed to fail  相似文献   

7.
Two arrays of numbers sorted in nondecreasing order are given: an array A of size n and an array B of size m, where n<m. It is required to determine, for every element of A, the smallest element of B (if one exists) that is larger than or equal to it. It is shown how to solve this problem on the EREW PRAM (exclusive-read exclusive-write parallel random-access machine) in O(logm logn/log log m) time using n processors. The solution is then extended to the case in which fewer than n processors are available. This yields an EREW PRAM algorithm for the problem whose cost is O(n log m, which is O(m)) for nm/log m. It is shown how the solution obtained leads to an improved parallel merging algorithm  相似文献   

8.
Algorithms are proposed for eigenvalue assignment (EVA) by constant as well as dynamic output feedback. The main algorithm is developed for single-input, multioutput systems and the results are then extended to multiinput, multioutput systems. In computing the feedback, use is made of the fact that the closed-loop eigenvalues can almost always be assigned arbitrarily close to the desired locations in the complex plane, provided the system satisfies the condition m+ p>n, where m, p, and n are , respectively, the number of inputs, outputs and states of the system. The EVA problem has been treated as a converse of the algebraic eigenvalue problem. The proposed algorithms are based on the implicitly shifted QR algorithm for solving the algebraic eigenvalue problem. The performance of the algorithms is illustrated by several numerical examples  相似文献   

9.
Using a directed acyclic graph (DAG) model of algorithms, the authors focus on processor-time-minimal multiprocessor schedules: time-minimal multiprocessor schedules that use as few processors as possible. The Kung, Lo, and Lewis (KLL) algorithm for computing the transitive closure of a relation over a set of n elements requires at least 5n-4 parallel steps. As originally reported. their systolic array comprises n2 processing elements. It is shown that any time-minimal multiprocessor schedule of the KLL algorithm's dag needs at least n2/3 processing elements. Then a processor-time-minimal systolic array realizing the KLL dag is constructed. Its processing elements are organized as a cylindrically connected 2-D mesh, when n=0 mod 3. When n≠0 mod 3, the 2-D mesh is connected as a torus  相似文献   

10.
The binary-image-compression problem is analyzed using irreducible cover of maximal rectangles. A bound on the minimum-rectangular-cover problem for image compression is given under certain conditions that previously have not been analyzed. It is demonstrated that for a simply connected image, the irreducible cover proposed uses less than four times the number of the rectangles in a minimum cover. With n pixels in a square, the parallel algorithm for obtaining the irreducible cover uses (n/log n) concurrent-read-exclusive write (CREW) processors in O(log n) time  相似文献   

11.
An algorithm for convolving a k×k window of weighting coefficients with an n×n image matrix on a pyramid computer of O(n2) processors in time O(logn+k2), excluding the time to load the image matrix, is presented. If k=Ω (√log n), which is typical in practice, the algorithm has a processor-time product O(n 2 k2) which is optimal with respect to the usual sequential algorithm. A feature of the algorithm is that the mechanism for controlling the transmission and distribution of data in each processor is finite state, independent of the values of n and k. Thus, for convolving two {0, 1}-valued matrices using Boolean operations rather than the typical sum and product operations, the processors of the pyramid computer are finite-state  相似文献   

12.
Dynamic scheduling of hard real-time tasks and real-time threads   总被引:1,自引:0,他引:1  
The authors investigate the dynamic scheduling of tasks with well-defined timing constraints. They present a dynamic uniprocessor scheduling algorithm with an O(n log n) worst-case complexity. The preemptive scheduling performed by the algorithm is shown to be of higher efficiency than that of other known algorithms. Furthermore, tasks may be related by precedence constraints, and they may have arbitrary deadlines and start times (which need not equal their arrival times). An experimental evaluation of the algorithm compares its average case behavior to the worst case. An analytic model used for explanation of the experimental results is validated with actual system measurements. The dynamic scheduling algorithm is the basis of a real-time multiprocessor operating system kernel developed in conjunction with this research. Specifically, this algorithm is used at the lowest, threads-based layer of the kernel whenever threads are created  相似文献   

13.
An O(n2) time serial algorithm is developed for obtaining the medial axis transform (MAT) of an n×n image. An O(log n) time CREW PRAM algorithm and an O(log2 n) time SIMD hypercube parallel algorithm for the MAT are also developed. Both of these use O(n2) processors. Two problems associated with the MAT, the area and perimeter reporting problem, are studied. An O(log n) time hypercube algorithm is developed for both of them, where n is the number of squares in the MAT, and the algorithms use O(n2) processors  相似文献   

14.
The problem of distributed leader election in an asynchronous complete network, in the presence of faults that occurred prior to the execution of the election algorithm, is discussed. Failures of this type are encountered, for example, during a recovery from a crash in the network. For a network with n processors, k of which start the algorithm that uses at most O(n log k +n+kt) messages is presented and shown to be optimal. An optimal algorithm for the case where the identities of the neighbors are known is also presented. It is noted that the order of the message complexity of a t-resilient algorithm is not always higher than that of a nonresilient one. The t-resilient algorithm is a systematic modification of an existing algorithm for a fault-free network  相似文献   

15.
A distributed knot detection algorithm for general graphs is presented. The knot detection algorithm uses at most O(n log n+m) messages and O(m+n log n) bits of memory to detect all knots' nodes in the network (where n is the number of nodes and m is the number of links). This is compared to O(n2) messages needed in the best algorithm previously published. The knot detection algorithm makes use of efficient cycle detection and clustering techniques. Various applications for the knot detection algorithms are presented. In particular, its importance to deadlock detection in store and forward communication networks and in transaction systems is demonstrated  相似文献   

16.
A family of intervals on the real line provides a natural model for a vast number of scheduling and VLSI problems. Recently, a number of parallel algorithms to solve a variety of practical problems on such a family of intervals have been proposed in the literature. The authors develop computational tools and show how they can be used for the purpose of devising cost-optimal parallel algorithms for a number of interval-related problems, including finding a largest subset of pairwise nonoverlapping intervals, a minimum dominating subset of intervals, along with algorithms to compute the shortest path between a pair of intervals and, based on the shortest path, a parallel algorithm to find the center of the family of intervals. More precisely, with an arbitrary family of n intervals as input, all the algorithms run in O(log n) time using O(n) processors in the EREW-PRAM model of computation  相似文献   

17.
A mechanism for scheduling communications in a network in which individuals exchange information periodically according to a fixed schedule is presented. A proper k edge-coloring of the network is considered to be a schedule of allowed communications such that an edge of color i can be used only at times i modulo k. Within this communication scheduling mechanism, the information exchange problem known as gossiping is considered. It is proved that there is a proper k edge-coloring such that gossip can be completed in a path of n edges in a certain time for nk⩾1. Gossip can not be completed in such a path any earlier under any proper k edge-coloring. In any tree of bounded degree Δ and diameter d, gossip can be completed under a proper Δ edge-coloring in time (Δ-1)d +1. In a k edge-colored cycle of n vertices, other time requirements of gossip are determined  相似文献   

18.
Let ξ be a random variable over a finite set with an arbitrary probability distribution. Improvements to a fast method of generating sample values for ξ in constant time are suggested. The proposed modification reduces the time required for initialization to O( n). For a simple genetic algorithm, this improvement changes an O(g n 1n n) algorithm into an O(g n) algorithm (where g is the number of generations, and n is the population size)  相似文献   

19.
Using a directed acyclic graph (DAG) model of algorithms, the paper focuses on time-minimal multiprocessor schedules that use as few processors as possible. Such a processor-time-minimal scheduling of an algorithm's DAG first is illustrated using a triangular shaped 2-D directed mesh (representing, for example, an algorithm for solving a triangular system of linear equations). Then, algorithms represented by an n×n×n directed mesh are investigated. This cubical directed mesh is fundamental; it represents the standard algorithm for computing matrix product as well as many other algorithms. Completion of the cubical mesh required 3n-2 steps. It is shown that the number of processing elements needed to achieve this time bound is at least [3n2/4]. A systolic array for the cubical directed mesh is then presented. It completes the mesh using the minimum number of steps and exactly [3n 2/4] processing elements it is processor-time-minimal. The systolic array's topology is that of a hexagonally shaped, cylindrically connected, 2-D directed mesh  相似文献   

20.
The eigenstructure assignment problem with output feedback is studied for systems satisfying the condition p+m> n. The main tool used is the concept of (C, A, B)-invariance and two coupled Sylvester equations, the solution of which leads to the computation of an output stabilizing feedback. A computationally efficient algorithm for the solution of these two coupled equations, which leads to the computation of a desired output feedback, is presented  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号