首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
A parallel sorting algorithm for sorting n elements evenly distributed over 2d p nodes of a d-dimensional hypercube is presented. The average running time of the algorithm is O((n log n)/p+p log 2n). The algorithm maintains a perfect load balance in the nodes by determining the (kn/p)th elements (k1,. . ., (p-1)) of the final sorted list in advance. These p-1 keys are used to partition the sorted sublists in each node to redistribute data to the nodes to be merged in parallel. The nodes finish the sort with an equal number of elements (n/ p) regardless of the data distribution. A parallel selection algorithm for determining the balanced partition keys in O(p log2n) time is presented. The speed of the sorting algorithm is further enhanced by the distance-d communication capability of the iPSC/2 hypercube computer and a novel conflict-free routing algorithm. Experimental results on a 16-node hypercube computer show that the sorting algorithm is competitive with the previous algorithms and faster for skewed data distributions  相似文献   

2.
Pin minimization is an important issue for massively parallel architectures because the number of processing elements that can be placed on a chip, board, or chassis is often pin limited. A d-dimensional bused hypercube interconnection network is presented that allows nodes to simultaneously (in one clock tick) exchange data across any dimension using only d+1 ports per node rather than 2d. Despite this near two-to-one reduction, the network also allows nodes that are two dimensions apart to simultaneously exchange data; as a result, certain routings can be performed in nearly half the time. The network is shown to be a special case of a general construction in which any set of d permutations can be performed, in one clock tick, using only d+1 ports per node. A lower-bound technique is also presented and used to establish the optimality of the network, as well as that of several other new bused networks  相似文献   

3.
Simple formulas are presented to compute the internally balanced minimal realization and the singular decomposition of the Hankel operator of a given continuous-time p×m stable transfer function matrix E(s)/d(s). The proposed formulas involve the Schwarz numbers of d(s) and the singular eigenvalues-eigenmatrices of a suitable finite matrix. Similar results are also obtained for a given discrete-time transfer function matrix  相似文献   

4.
A hypercube algorithm to solve the list ranking problem is presented. Let n be the length of the list, and let p be the number of processors of the hypercube. The algorithm described runs in time O(n/p) when n=Ω(p 1+ε) for any constant ε>0, and in time O(n log n/p+log3 p) otherwise. This clearly attains a linear speedup when n=Ω(p 1+ε). Efficient balancing and routing schemes had to be used to achieve the linear speedup. The authors use these techniques to obtain efficient hypercube algorithms for many basic graph problems such as tree expression evaluation, connected and biconnected components, ear decomposition, and st-numbering. These problems are also addressed in the restricted model of one-port communication  相似文献   

5.
The job scheduling problem in a partitionable mesh-connected system in which jobs require square meshes and the system is a square mesh whose size is a power of two is discussed. A heuristic algorithm of time complexity O(n(log n+log p)), in which n is the number of jobs to be scheduled and p is the size of the system is presented. The algorithm adopts the largest-job-first scheduling policy and uses a two-dimensional buddy system as the system partitioning scheme. It is shown that, in the worst case, the algorithm produces a schedule four times longer than an optimal schedule, and, on the average, schedules generated by the algorithm are twice as long as optimal schedules  相似文献   

6.
It is shown that for a given p (1<pn ), the n-cube network can tolerate up to p2(n-p)-1 processor failures and remains connected provided that at most p neighbors of any nonfaulty processor are allowed to fail. This generalizes the result for p=n-1, obtained by A.-M Esfahanian (1989). It is also shown that the n-cube network with n⩾5 remains connected provided that at most two neighbors of any processor are allowed to fail  相似文献   

7.
Exact analytical expressions are obtained for the likelihood and likelihood gradient stationary autoregressive moving average (ARMA) models. Denote the sample size by N, the autoregressive order by p, and the moving average order by q. The calculation of the likelihood requires (p+2q+1)N +o(N) multiply-add operations, and the calculation of the likelihood gradient requires (2p+6q+2)N+o(N) multiply-add operations. These expressions may be used to obtain an iterative, Newton-Raphson-type converging algorithm, with superlinear convergence rate, that computes the maximum-likelihood estimator in (2 p+6q+2)N+o(N) multiply-add operations per iteration  相似文献   

8.
The problem of absolute stability in a vibrational feedback controller is introduced and discussed. It is shown that for any rational G(s)=n(s)/d(s ) with d(s) Hurwitz and deg d(s) -deg n(s)=1 there exists a linear dynamic periodic controller that ensures, in a certain sense, the infinite sector of absolute stability. This implies that an additional dynamical element, inserted in the feedback loop, may lead to improvements in the robustness of nonlinear systems  相似文献   

9.
An indirect adaptation algorithm to adaptively implement a control weighted one-step-ahead suboptimal control law previously derived for the control of stochastic linear dynamic systems having polynomial nonlinearities is presented. Since identification is separated from the calculation of controller design and a single adaptation algorithm is used, the indirect scheme provides more flexibility and has the advantage that few estimated parameters and less memory space are required in the case when the system time delay d and/or degree p of the polynomial are large. It has been shown that the proposed indirect scheme has the same global convergence property as the direct scheme given by Zhang and Lang (see ibid., vol.AC-32, no.8 (1987))  相似文献   

10.
A simultaneous access design of a dictionary machine which supports insert, delete, and search operations is presented. The design is able to handle p accesses simultaneously and allows redundant accesses to occur. In the design, processors performing insert or delete operations are free to perform other tasks after submitting their accesses to the design; processors that perform search operations get their response in O(log N) time. Compared to all sequential access designs of a dictionary which require O(p ) time to process p accesses, the presented design provides much higher throughput; specifically, O(p/log p) times better. It also provides a fast mechanism to avoid the sequential access bottleneck in any large multiprocessor system  相似文献   

11.
The problem of estimating the autoregressive (AR)-order and the AR parameters of a causal, stable, single input single output (SISO) autoregressive moving average (ARMA) (p,q) model, excited by an unobservable i.i.d. process, is addressed. The observed output is corrupted by additive colored Gaussian noise, whose power spectral density is unknown. The ARMA model may be mixed-phase, and have inherent all-pass factors and repeated poles. It is shown that consistent AR parameter estimates can be obtained via the normal equations based on (p+1) 1-D slices of the mth-order ( m>2) cumulant. It is shown via a counterexample that consistent AR estimates cannot, in general, be obtained from a subset of these p+1 slices. Necessary and sufficient conditions for the existence of a full-rank slice are also derived  相似文献   

12.
13.
Consideration is given to transforming depth p-nested for loop algorithms into q-dimensional systolic VLSI arrays where 1⩽qp-1. Previously, there existed complete characterizations of correct transformation only for the cases where q=p-1 or q=1. This gap is filled by giving formal necessary and sufficient conditions for correct transformation of a p-nested loop algorithm into a q-dimensional systolic array for any q, 1⩽qp-1. Practical methods are presented. The techniques developed are applied to the automatic design of special purpose and programmable systolic arrays. The results also contribute toward automatic compilation onto more general purpose programmable arrays. Synthesis of linear and planar systolic array implementations for a three-dimensional cube-graph algorithm and a reindexed Warshall-Floyd path-finding algorithm are used to illustrate the method  相似文献   

14.
Theorems for order-determination without a priori knowledge of upper bounds on the order in MIMO dynamic systems are developed. Also, deterministic procedures are introduced to determine orders and estimate parameters simultaneously by recursively computing the order-determining quantity Sn(a,b,k), which plays a crucial role in order-determination procedures, and the least-squares estimate &thetas;n(a,b) of &thetas;(p,q), with p and q denoting the true orders  相似文献   

15.
It is shown that there is a continuously parameterized family F of n-dimensional single-input single-output (SISO) stabilizable detectable linear system Σ(p) which contains at least one realization of each reduced, strictly proper transfer function of McMillan degree not exceeding n. The parameterization map p→Σ(p) is a polynomial function in 2n indeterminates from an open convex polyhedron in R2n to the linear space of all SISO n-dimensional linear systems  相似文献   

16.
Efficient parallel processing of image contours   总被引:1,自引:0,他引:1  
Describes two parallel algorithms for ranking the pixels on a curve in O (log N) time using either an EREW or CREW PRAM model. The algorithms accomplish this with N processors for a √N×√N image. After applying such an algorithm to an image, it is possible to move the pixels from a curve into processors having consecutive addresses. This is important because one can subsequently apply many algorithms to the curve (such as piecewise linear approximation algorithms or point in polygon tests) using segmented scan operations (i.e. parallel prefix operations). Scan operations can be executed in logarithmic time on many interconnection networks, such as hypercube, tree, butterfly, and shuffle exchange machines as well as on the EREW PRAM. The algorithms were implemented on the hypercube structured Connection Machine, and various performance tests were conducted  相似文献   

17.
Necessary and sufficient criteria in the frequency domain for robust stability are given under the assumption that coefficients of a characteristic polynomial belong to a transformed lp-ball. Three cases are considered in detail: p =∞ (interval uncertainty), p=2 (ellipsoidal uncertainty), and p=1 (octahedral uncertainty). It is shown that frequency-domain-stability criteria are efficient when one deals with robust stability problems. The frequency-domain approach considered can be extended to various problems of robust stability  相似文献   

18.
The authors analyze the problem in which each node of the binary hypercube independently generates packets according to a Poisson process with rate λ; each of the packets is to be broadcast to all other nodes. Assuming unit packet length and no other communications taking place, it is observed that the system can be stable in steady-state only if the load factor ρ≡λ (2d-1)/d satisfies ρ<1 where d is the dimensionality (diameter) of the hypercube. Moreover, the authors establish some lower bounds for the steady-state average delay D per packet and devise and analyze two distributed routing schemes that are efficient in the sense that stability is maintained for all ρ<ρ* where ρ* does not depend on the dimensionality d of the network, while the average delay D per packet satisfies DKd(1+ρ) for small values of ρ (with constant K). The performance evaluation is rigorous for one scheme, while for the other the authors resort to approximations and simulations  相似文献   

19.
Algorithms are proposed for eigenvalue assignment (EVA) by constant as well as dynamic output feedback. The main algorithm is developed for single-input, multioutput systems and the results are then extended to multiinput, multioutput systems. In computing the feedback, use is made of the fact that the closed-loop eigenvalues can almost always be assigned arbitrarily close to the desired locations in the complex plane, provided the system satisfies the condition m+ p>n, where m, p, and n are , respectively, the number of inputs, outputs and states of the system. The EVA problem has been treated as a converse of the algebraic eigenvalue problem. The proposed algorithms are based on the implicitly shifted QR algorithm for solving the algebraic eigenvalue problem. The performance of the algorithms is illustrated by several numerical examples  相似文献   

20.
It is shown that on the set of m-input p-output minimal nth-order state-space systems the graph topology and the induced Euclidean quotient topology are identified. The author considers the set Lnp×m of m -input p-output nth-order minimal state-space systems. The author presents three lemmas and a corollary from which a theorem is proved stating that the graph topology and the quotient Euclidean topology are identical on a quotient space Ln p×m/~. Since the graph topology is constructed to be weak, and the quotient Euclidean topology is intuitively strong, this result is unexpected  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号