1.

Balanced parallel sort on hypercube multiprocessors

Abali B. Ozguner F. Bataineh A. 《Parallel and Distributed Systems, IEEE Transactions on》1993,4(5):572-581

A parallel sorting algorithm for sorting n elements evenly distributed over p = 2^d nodes of a d-dimensional hypercube is presented. The average running time of the algorithm is O((n log n)/p + p log² n). The algorithm maintains a perfect load balance in the nodes by determining the (kn/p)th elements (k = 1, ..., p-1) of the final sorted list in advance. These p-1 keys are used to partition the sorted sublists in each node to redistribute data to the nodes to be merged in parallel. The nodes finish the sort with an equal number of elements (n/p) regardless of the data distribution. A parallel selection algorithm for determining the balanced partition keys in O(p log² n) time is presented. The speed of the sorting algorithm is further enhanced by the distance-d communication capability of the iPSC/2 hypercube computer and a novel conflict-free routing algorithm. Experimental results on a 16-node hypercube computer show that the sorting algorithm is competitive with the previous algorithms and faster for skewed data distributions.
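
The load-balancing idea in this abstract can be illustrated with a small sequential simulation. The sketch below is not the authors' hypercube implementation: the function name `balanced_partition_sort` is hypothetical, a global sort stands in for the parallel selection step, and it assumes distinct keys with p dividing n.

```python
from bisect import bisect_right
from heapq import merge


def balanced_partition_sort(sublists):
    """Sequentially simulate splitter-based redistribution over p nodes.

    Assumptions of this sketch: keys are distinct and p divides n.
    """
    p = len(sublists)
    local = [sorted(s) for s in sublists]            # each node sorts its own data
    n = sum(len(s) for s in local)
    # Stand-in for the parallel selection step: find the (k*n/p)-th smallest keys.
    everything = sorted(x for s in local for x in s)
    splitters = [everything[k * n // p - 1] for k in range(1, p)]

    result = []
    for j in range(p):
        pieces = []
        for s in local:
            # Node j receives the keys in (splitters[j-1], splitters[j]].
            lo = bisect_right(s, splitters[j - 1]) if j > 0 else 0
            hi = bisect_right(s, splitters[j]) if j < p - 1 else len(s)
            pieces.append(s[lo:hi])
        result.append(list(merge(*pieces)))          # node j merges what it received
    return result


skewed = [[1, 2, 3, 100], [4, 5, 6, 200], [7, 8, 9, 300], [10, 11, 12, 400]]
for node, keys in enumerate(balanced_partition_sort(skewed)):
    print(node, keys)
```

Because the splitters are exact order statistics of the whole input, every simulated node ends with n/p keys even when the input is skewed.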

2.

Optimal parallel initialization algorithms for a class of priority queues

Olariu S. Wen Z. 《Parallel and Distributed Systems, IEEE Transactions on》1991,2(4):423-429

An adaptive parallel algorithm for inducing a priority queue structure on an n-element array is presented. The algorithm is extended to provide optimal parallel construction algorithms for three other heap-like structures useful in implementing double-ended priority queues, namely min-max heaps, deaps, and min-max-pair heaps. It is shown that an n-element array can be made into a heap, a deap, a min-max heap, or a min-max-pair heap in O(log n + n/p) time using no more than n/log n processors, in the exclusive-read-exclusive-write parallel random-access machine model.

3.

Efficient algorithms for list ranking and for solving graph problems on the hypercube

Ryu K.W. Jaja J. 《Parallel and Distributed Systems, IEEE Transactions on》1990,1(1):83-90

A hypercube algorithm to solve the list ranking problem is presented. Let n be the length of the list, and let p be the number of processors of the hypercube. The algorithm described runs in time O(n/p) when n = Ω(p^(1+ε)) for any constant ε > 0, and in time O(n log n/p + log³ p) otherwise. This clearly attains a linear speedup when n = Ω(p^(1+ε)). Efficient balancing and routing schemes had to be used to achieve the linear speedup. The authors use these techniques to obtain efficient hypercube algorithms for many basic graph problems such as tree expression evaluation, connected and biconnected components, ear decomposition, and st-numbering. These problems are also addressed in the restricted model of one-port communication.
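
For readers unfamiliar with list ranking, the problem itself can be sketched with the classic pointer-jumping technique. This is a swapped-in textbook method simulated sequentially, not the hypercube algorithm of the paper; the function name `list_rank` and the convention that the tail points to itself are assumptions of the sketch.

```python
def list_rank(succ):
    """Rank every node of a linked list by pointer jumping.

    succ[i] is the successor of node i; the tail points to itself
    (a convention assumed by this sketch). Returns rank[i], the number
    of links between node i and the tail.
    """
    n = len(succ)
    nxt = list(succ)
    rank = [0 if nxt[i] == i else 1 for i in range(n)]
    # Each round doubles the distance covered by every pointer, so about
    # log2(n) rounds suffice; extra rounds leave the result unchanged.
    for _ in range(n.bit_length()):
        # Build new arrays from the old ones to mimic a synchronous parallel step.
        new_rank = [rank[i] + rank[nxt[i]] for i in range(n)]
        new_nxt = [nxt[nxt[i]] for i in range(n)]
        rank, nxt = new_rank, new_nxt
    return rank


# The list 3 -> 0 -> 2 -> 1, with node 1 as the tail.
print(list_rank([2, 1, 1, 0]))
```

Each round performs O(n) work, so the simulation does O(n log n) total work, which is exactly the inefficiency that work-optimal algorithms such as the one in the abstract avoid.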

4.

A smoothly parameterized family of stabilizable, observable linear systems containing realizations of all transfer functions of McMillan degree not exceeding n

Pait F. Morse A.S. 《Automatic Control, IEEE Transactions on》1991,36(12):1475-1477

It is shown that there is a continuously parameterized family F of n-dimensional single-input single-output (SISO) stabilizable detectable linear systems Σ(p) which contains at least one realization of each reduced, strictly proper transfer function of McMillan degree not exceeding n. The parameterization map p→Σ(p) is a polynomial function in 2n indeterminates from an open convex polyhedron in R²ⁿ to the linear space of all SISO n-dimensional linear systems.

5.

Optimal algorithms on the pipelined hypercube and related networks

JaJa J. Ryu K.W. 《Parallel and Distributed Systems, IEEE Transactions on》1993,4(5):582-591

Parallel algorithms for several important combinatorial problems such as the all nearest smaller values problem, triangulating a monotone polygon, and line packing are presented. These algorithms achieve linear speedups on the pipelined hypercube, and provably optimal speedups on the shuffle-exchange and the cube-connected-cycles for any number p of processors satisfying 1 ⩽ p ⩽ n/((log³ n)(log log n)²), where n is the input size. The lower bound results are established under no restriction on how the input is mapped into the local memories of the different processors.
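
The all nearest smaller values problem mentioned in this abstract can be stated concretely with the standard sequential stack-based solution. This is only a reference sketch of the problem, not the parallel algorithm of the paper; the function name `nearest_smaller_values` and the choice to report indices (with `None` when no smaller value exists) are assumptions.

```python
def nearest_smaller_values(a):
    """For each position, return the index of the nearest strictly smaller
    value to its left and to its right (None if there is none)."""
    n = len(a)
    left, right = [None] * n, [None] * n

    stack = []  # indices whose values increase from bottom to top
    for i in range(n):
        while stack and a[stack[-1]] >= a[i]:
            stack.pop()
        left[i] = stack[-1] if stack else None
        stack.append(i)

    stack = []
    for i in range(n - 1, -1, -1):
        while stack and a[stack[-1]] >= a[i]:
            stack.pop()
        right[i] = stack[-1] if stack else None
        stack.append(i)

    return left, right


left, right = nearest_smaller_values([3, 1, 4, 1, 5, 9, 2, 6])
print(left)
print(right)
```

Each index is pushed and popped at most once per pass, so the sequential cost is O(n); this is the baseline against which a linear speedup is measured.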

6.

Generalized measures of fault tolerance in n-cube networks

Oh A.D. Choi H.-A. 《Parallel and Distributed Systems, IEEE Transactions on》1993,4(6):702-703

It is shown that for a given p (1 < p ⩽ n), the n-cube network can tolerate up to p·2^(n-p) - 1 processor failures and remains connected provided that at most p neighbors of any nonfaulty processor are allowed to fail. This generalizes the result for p = n-1, obtained by A.-M Esfahanian (1989). It is also shown that the n-cube network with n ⩾ 5 remains connected provided that at most two neighbors of any processor are allowed to fail.

7.

Parallel binary search

Akl S.G. Meijer H. 《Parallel and Distributed Systems, IEEE Transactions on》1990,1(2):247-250

Two arrays of numbers sorted in nondecreasing order are given: an array A of size n and an array B of size m, where n < m. It is required to determine, for every element of A, the smallest element of B (if one exists) that is larger than or equal to it. It is shown how to solve this problem on the EREW PRAM (exclusive-read exclusive-write parallel random-access machine) in O(log m log n/log log m) time using n processors. The solution is then extended to the case in which fewer than n processors are available. This yields an EREW PRAM algorithm for the problem whose cost is O(n log m), which is O(m) for n ⩽ m/log m. It is shown how the solution obtained leads to an improved parallel merging algorithm.
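
The problem defined in this abstract has a simple sequential reference solution: one binary search in B per element of A, for O(n log m) total work, which matches the cost quoted above. The sketch below uses Python's standard `bisect` module; the function name `smallest_geq` is hypothetical, and the EREW PRAM scheduling that is the paper's actual contribution is not modeled.

```python
from bisect import bisect_left


def smallest_geq(A, B):
    """For each a in A, return the smallest element of sorted B that is
    greater than or equal to a, or None if no such element exists."""
    out = []
    for a in A:
        j = bisect_left(B, a)          # first index with B[j] >= a
        out.append(B[j] if j < len(B) else None)
    return out


print(smallest_geq([2, 5, 9], [1, 3, 5, 7, 8]))
```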

8.

Numerical algorithms for eigenvalue assignment by constant and dynamic output feedback

Misra P. Patel R.V. 《Automatic Control, IEEE Transactions on》1989,34(6):579-588

Algorithms are proposed for eigenvalue assignment (EVA) by constant as well as dynamic output feedback. The main algorithm is developed for single-input, multioutput systems and the results are then extended to multiinput, multioutput systems. In computing the feedback, use is made of the fact that the closed-loop eigenvalues can almost always be assigned arbitrarily close to the desired locations in the complex plane, provided the system satisfies the condition m + p > n, where m, p, and n are, respectively, the number of inputs, outputs, and states of the system. The EVA problem has been treated as a converse of the algebraic eigenvalue problem. The proposed algorithms are based on the implicitly shifted QR algorithm for solving the algebraic eigenvalue problem. The performance of the algorithms is illustrated by several numerical examples.

9.

A processor-time-minimal systolic array for transitive closure

Scheiman C.J. Cappello P.R. 《Parallel and Distributed Systems, IEEE Transactions on》1992,3(3):257-269

Using a directed acyclic graph (DAG) model of algorithms, the authors focus on processor-time-minimal multiprocessor schedules: time-minimal multiprocessor schedules that use as few processors as possible. The Kung, Lo, and Lewis (KLL) algorithm for computing the transitive closure of a relation over a set of n elements requires at least 5n-4 parallel steps. As originally reported, their systolic array comprises n² processing elements. It is shown that any time-minimal multiprocessor schedule of the KLL algorithm's DAG needs at least n²/3 processing elements. Then a processor-time-minimal systolic array realizing the KLL DAG is constructed. Its processing elements are organized as a cylindrically connected 2-D mesh when n ≡ 0 (mod 3). When n ≢ 0 (mod 3), the 2-D mesh is connected as a torus.
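
As background for what the systolic array computes, the transitive closure of a relation can be obtained sequentially with Warshall's algorithm. This is a swapped-in textbook method shown only to fix the problem statement; it does not reproduce the KLL algorithm or its schedule, and the function name `transitive_closure` and the Boolean-matrix input format are assumptions.

```python
def transitive_closure(adj):
    """Warshall's algorithm on an n x n Boolean adjacency matrix.

    Returns a new matrix reach with reach[i][j] True exactly when j is
    reachable from i by a path of one or more edges.
    """
    n = len(adj)
    reach = [list(row) for row in adj]   # copy so the input is not modified
    for k in range(n):                   # allow k as an intermediate vertex
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


# Relation 0 -> 1 -> 2 on three elements.
chain = [[False, True, False],
         [False, False, True],
         [False, False, False]]
for row in transitive_closure(chain):
    print(row)
```

The triple loop performs O(n³) operations on one processor, which is the work a systolic array spreads across many processing elements.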

10.

A new method of image compression using irreducible covers of maximal rectangles

Cheng Y. Iyengara S.S. Kashyap R.L. 《IEEE transactions on pattern analysis and machine intelligence》1988,14(5):651-658

The binary-image-compression problem is analyzed using irreducible cover of maximal rectangles. A bound on the minimum-rectangular-cover problem for image compression is given under certain conditions that previously have not been analyzed. It is demonstrated that for a simply connected image, the irreducible cover proposed uses less than four times the number of the rectangles in a minimum cover. With n pixels in a square, the parallel algorithm for obtaining the irreducible cover uses (n/log n) concurrent-read-exclusive write (CREW) processors in O(log n) time.

11.

Two-dimensional convolution on a pyramid computer

Chang J.H. Ibarra O.H. Pong T.-C. Sohn S.M. 《IEEE transactions on pattern analysis and machine intelligence》1988,10(4):590-593

An algorithm for convolving a k×k window of weighting coefficients with an n×n image matrix on a pyramid computer of O(n²) processors in time O(log n + k²), excluding the time to load the image matrix, is presented. If k = Ω(√log n), which is typical in practice, the algorithm has a processor-time product O(n² k²), which is optimal with respect to the usual sequential algorithm. A feature of the algorithm is that the mechanism for controlling the transmission and distribution of data in each processor is finite state, independent of the values of n and k. Thus, for convolving two {0, 1}-valued matrices using Boolean operations rather than the typical sum and product operations, the processors of the pyramid computer are finite-state.

12.

Dynamic scheduling of hard real-time tasks and real-time threads (total citations: 1; self-citations: 0; citations by others: 1)

Schwan K. Zhou H. 《IEEE transactions on pattern analysis and machine intelligence》1992,18(8):736-748

The authors investigate the dynamic scheduling of tasks with well-defined timing constraints. They present a dynamic uniprocessor scheduling algorithm with an O(n log n) worst-case complexity. The preemptive scheduling performed by the algorithm is shown to be of higher efficiency than that of other known algorithms. Furthermore, tasks may be related by precedence constraints, and they may have arbitrary deadlines and start times (which need not equal their arrival times). An experimental evaluation of the algorithm compares its average case behavior to the worst case. An analytic model used for explanation of the experimental results is validated with actual system measurements. The dynamic scheduling algorithm is the basis of a real-time multiprocessor operating system kernel developed in conjunction with this research. Specifically, this algorithm is used at the lowest, threads-based layer of the kernel whenever threads are created.

13.

Serial and parallel algorithms for the medial axis transform

Jenq J.-F. Sahni S. 《IEEE transactions on pattern analysis and machine intelligence》1992,14(12):1218-1224

An O(n²) time serial algorithm is developed for obtaining the medial axis transform (MAT) of an n×n image. An O(log n) time CREW PRAM algorithm and an O(log² n) time SIMD hypercube parallel algorithm for the MAT are also developed. Both of these use O(n²) processors. Two problems associated with the MAT, the area and perimeter reporting problem, are studied. An O(log n) time hypercube algorithm is developed for both of them, where n is the number of squares in the MAT, and the algorithms use O(n²) processors.

14.

Optimal distributed t-resilient election in complete networks

Itai A. Kutten S. Wolfstahl Y. Zaks S. 《IEEE transactions on pattern analysis and machine intelligence》1990,16(4):415-420

The problem of distributed leader election in an asynchronous complete network, in the presence of faults that occurred prior to the execution of the election algorithm, is discussed. Failures of this type are encountered, for example, during a recovery from a crash in the network. For a network with n processors, k of which start the algorithm, an algorithm that uses at most O(n log k + n + kt) messages is presented and shown to be optimal. An optimal algorithm for the case where the identities of the neighbors are known is also presented. It is noted that the order of the message complexity of a t-resilient algorithm is not always higher than that of a nonresilient one. The t-resilient algorithm is a systematic modification of an existing algorithm for a fault-free network.

15.

An efficient distributed knot detection algorithm

Cidon I. 《IEEE transactions on pattern analysis and machine intelligence》1989,15(5):644-649

A distributed knot detection algorithm for general graphs is presented. The knot detection algorithm uses at most O(n log n+m) messages and O(m+n log n) bits of memory to detect all knots' nodes in the network (where n is the number of nodes and m is the number of links). This is compared to O(n²) messages needed in the best algorithm previously published. The knot detection algorithm makes use of efficient cycle detection and clustering techniques. Various applications for the knot detection algorithms are presented. In particular, its importance to deadlock detection in store and forward communication networks and in transaction systems is demonstrated.

16.

Optimal parallel algorithms for problems modeled by a family of intervals

Olariu S. Schwing J.L. Zhang J. 《Parallel and Distributed Systems, IEEE Transactions on》1992,3(3):364-374

A family of intervals on the real line provides a natural model for a vast number of scheduling and VLSI problems. Recently, a number of parallel algorithms to solve a variety of practical problems on such a family of intervals have been proposed in the literature. The authors develop computational tools and show how they can be used for the purpose of devising cost-optimal parallel algorithms for a number of interval-related problems, including finding a largest subset of pairwise nonoverlapping intervals, a minimum dominating subset of intervals, along with algorithms to compute the shortest path between a pair of intervals and, based on the shortest path, a parallel algorithm to find the center of the family of intervals. More precisely, with an arbitrary family of n intervals as input, all the algorithms run in O(log n) time using O(n) processors in the EREW-PRAM model of computation.
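
One of the problems listed here, finding a largest subset of pairwise nonoverlapping intervals, has a well-known sequential greedy solution that sorts by right endpoint. The sketch below shows that sequential baseline only, not the EREW-PRAM algorithm of the paper; the function name `max_nonoverlapping` is hypothetical, and intervals are assumed closed, so intervals that merely touch count as overlapping.

```python
def max_nonoverlapping(intervals):
    """Greedy selection of a largest set of pairwise nonoverlapping intervals.

    intervals is an iterable of (left, right) pairs treated as closed
    intervals (an assumption of this sketch).
    """
    chosen = []
    last_end = None
    # Scanning by increasing right endpoint and always keeping the interval
    # that ends first leaves the most room for the remaining ones.
    for left, right in sorted(intervals, key=lambda iv: iv[1]):
        if last_end is None or left > last_end:
            chosen.append((left, right))
            last_end = right
    return chosen


print(max_nonoverlapping([(1, 4), (2, 3), (3, 5), (6, 8), (7, 9)]))
```

The sort dominates the running time at O(n log n), so a parallel algorithm using O(n) processors for O(log n) time has the same total cost.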

17.

Network communication in edge-colored graphs: gossiping

Liestman A.L. Richards D. 《Parallel and Distributed Systems, IEEE Transactions on》1993,4(4):438-445

A mechanism for scheduling communications in a network in which individuals exchange information periodically according to a fixed schedule is presented. A proper k edge-coloring of the network is considered to be a schedule of allowed communications such that an edge of color i can be used only at times i modulo k. Within this communication scheduling mechanism, the information exchange problem known as gossiping is considered. It is proved that there is a proper k edge-coloring such that gossip can be completed in a path of n edges in a certain time for n ⩾ k ⩾ 1. Gossip cannot be completed in such a path any earlier under any proper k edge-coloring. In any tree of bounded degree Δ and diameter d, gossip can be completed under a proper Δ edge-coloring in time (Δ-1)d + 1. In a k edge-colored cycle of n vertices, other time requirements of gossip are determined.

18.

A linear algorithm for generating random numbers with a given distribution

Vose M.D. 《IEEE transactions on pattern analysis and machine intelligence》1991,17(9):972-975

Let ξ be a random variable over a finite set with an arbitrary probability distribution. Improvements to a fast method of generating sample values for ξ in constant time are suggested. The proposed modification reduces the time required for initialization to O(n). For a simple genetic algorithm, this improvement changes an O(g n ln n) algorithm into an O(g n) algorithm (where g is the number of generations, and n is the population size).
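
The constant-time sampling scheme with linear-time setup described here is commonly known as the alias method. The following is a minimal sketch in that spirit, written from the general technique rather than from the paper's own pseudocode; the function names `build_alias` and `sample` are hypothetical, and the input is assumed to be a list of probabilities summing to 1.

```python
import random


def build_alias(probs):
    """Build alias tables in O(n) time for a finite distribution."""
    n = len(probs)
    scaled = [p * n for p in probs]        # rescale so the average weight is 1
    prob = [1.0] * n                       # chance of keeping column i's own outcome
    alias = list(range(n))                 # fallback outcome for column i
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s] = scaled[s]                # column s keeps s with this chance
        alias[s] = l                       # and otherwise yields l
        scaled[l] = scaled[l] + scaled[s] - 1.0   # l donated (1 - scaled[s])
        (small if scaled[l] < 1.0 else large).append(l)
    # Leftover columns keep prob 1.0; this absorbs floating-point round-off.
    return prob, alias


def sample(prob, alias, rng=random):
    """Draw one outcome in O(1): pick a column, then a biased coin flip."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


prob, alias = build_alias([0.5, 0.25, 0.125, 0.125])
print([sample(prob, alias) for _ in range(10)])
```

Every index enters and leaves the two work lists a bounded number of times, which is what keeps table construction linear in n.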

19.

A processor-time-minimal systolic array for cubical mesh algorithms

Cappello P. 《Parallel and Distributed Systems, IEEE Transactions on》1992,3(1):4-13

Using a directed acyclic graph (DAG) model of algorithms, the paper focuses on time-minimal multiprocessor schedules that use as few processors as possible. Such a processor-time-minimal scheduling of an algorithm's DAG is first illustrated using a triangular shaped 2-D directed mesh (representing, for example, an algorithm for solving a triangular system of linear equations). Then, algorithms represented by an n×n×n directed mesh are investigated. This cubical directed mesh is fundamental; it represents the standard algorithm for computing matrix product as well as many other algorithms. Completion of the cubical mesh requires 3n-2 steps. It is shown that the number of processing elements needed to achieve this time bound is at least [3n²/4]. A systolic array for the cubical directed mesh is then presented. It completes the mesh using the minimum number of steps and exactly [3n²/4] processing elements; it is processor-time-minimal. The systolic array's topology is that of a hexagonally shaped, cylindrically connected, 2-D directed mesh.

20.

Output feedback eigenstructure assignment using two Sylvester equations

Syrmos V.L. Lewis F.L. 《Automatic Control, IEEE Transactions on》1993,38(3):495-499

The eigenstructure assignment problem with output feedback is studied for systems satisfying the condition p + m > n. The main tool used is the concept of (C, A, B)-invariance and two coupled Sylvester equations, the solution of which leads to the computation of an output stabilizing feedback. A computationally efficient algorithm for the solution of these two coupled equations, which leads to the computation of a desired output feedback, is presented.