首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 406 毫秒
1.
The problem of estimating the autoregressive (AR)-order and the AR parameters of a causal, stable, single input single output (SISO) autoregressive moving average (ARMA) (p,q) model, excited by an unobservable i.i.d. process, is addressed. The observed output is corrupted by additive colored Gaussian noise, whose power spectral density is unknown. The ARMA model may be mixed-phase, and have inherent all-pass factors and repeated poles. It is shown that consistent AR parameter estimates can be obtained via the normal equations based on (p+1) 1-D slices of the mth-order ( m>2) cumulant. It is shown via a counterexample that consistent AR estimates cannot, in general, be obtained from a subset of these p+1 slices. Necessary and sufficient conditions for the existence of a full-rank slice are also derived  相似文献   

2.
Consideration is given to transforming depth p-nested for loop algorithms into q-dimensional systolic VLSI arrays where 1⩽qp-1. Previously, there existed complete characterizations of correct transformation only for the cases where q=p-1 or q=1. This gap is filled by giving formal necessary and sufficient conditions for correct transformation of a p-nested loop algorithm into a q-dimensional systolic array for any q, 1⩽qp-1. Practical methods are presented. The techniques developed are applied to the automatic design of special purpose and programmable systolic arrays. The results also contribute toward automatic compilation onto more general purpose programmable arrays. Synthesis of linear and planar systolic array implementations for a three-dimensional cube-graph algorithm and a reindexed Warshall-Floyd path-finding algorithm are used to illustrate the method  相似文献   

3.
A simultaneous access design of a dictionary machine which supports insert, delete, and search operations is presented. The design is able to handle p accesses simultaneously and allows redundant accesses to occur. In the design, processors performing insert or delete operations are free to perform other tasks after submitting their accesses to the design; processors that perform search operations get their response in O(log N) time. Compared to all sequential access designs of a dictionary which require O(p ) time to process p accesses, the presented design provides much higher throughput; specifically, O(p/log p) times better. It also provides a fast mechanism to avoid the sequential access bottleneck in any large multiprocessor system  相似文献   

4.
A finite-order stationary and minimum-phase ARMA (autoregressive moving-average) (p,q) model is equivalent to an infinite-order AR (autoregressive) model. Two methods of estimating the parameters of the ARMA (p,q) model by solving only linear equations are based on or closely related to this equivalence relation. One method was derived directly from the equivalence relation by D. Graupe et al. (ibid., vol.AC-20, p.104-107, Feb. 1975). The other was derived by S. Li and B.W. Dickinson (ibid., vol.AC-31, p.275-278, Mar. 1986 and IEEE Trans. Acoust. Speech Signal Process., vol.ASSP-36, p.502-512, Apr. 1988) based on an iterated least-squares regression approach. The end results bear close resemblance to those of Graupe et al. The two methods are compared, and ways to improve the parameter estimates are suggested  相似文献   

5.
Theorems for order-determination without a priori knowledge of upper bounds on the order in MIMO dynamic systems are developed. Also, deterministic procedures are introduced to determine orders and estimate parameters simultaneously by recursively computing the order-determining quantity Sn(a,b,k), which plays a crucial role in order-determination procedures, and the least-squares estimate &thetas;n(a,b) of &thetas;(p,q), with p and q denoting the true orders  相似文献   

6.
An efficient parallel algorithm is presented for convolution on a mesh-connected computer with wraparound. The algorithm does not require a broadcast feature for data values, as assumed by previously proposed algorithms. As a result, the algorithm is applicable to both SIMD and MIMD meshes. For an N×N image and a M×M template, the previous algorithms take O (M2q) time on an N×N mesh-connected multicomputer (q is the number of bits in each entry of the convolution matrix). The algorithms have complexity O(M2r), where r=max {number of bits in an image entry, number of bits in a template entry}. In addition to not requiring a broadcast capability, these algorithms are faster for binary images  相似文献   

7.
An application-specific architecture for the parallel calculation of the decimation in time and radix 2 fast Hartley (FHT) and Fourier (FFT) transforms is presented. A real sequence with N=2n data items is considered as input. The system calculates the FHT and the FFT in n and n+1 stages. respectively. The modular and regular parallel architecture is based on a constant geometry algorithm using butterflies of four data items and the perfect unshuffle permutation. With this permutation, the mapping of the algorithm in VLSI technology is simplified and the communications among processors are minimized. Organization of the processor memory based on first-in, first-out (FIFO) queues facilitates a systolic data flow and permits the implementation in a direct way of the complex data movements and address sequences of the transforms. This is accomplished by means of simple multiplexing operations, using hardwired control. The total calculation time is (Nlog2N)/4Q cycles for the FHT and N(1+log2N)/4Q cycles for the FFT, where Q is the number of processors ( Q= 2q, QN/4)  相似文献   

8.
A parallel sorting algorithm for sorting n elements evenly distributed over 2d p nodes of a d-dimensional hypercube is presented. The average running time of the algorithm is O((n log n)/p+p log 2n). The algorithm maintains a perfect load balance in the nodes by determining the (kn/p)th elements (k1,. . ., (p-1)) of the final sorted list in advance. These p-1 keys are used to partition the sorted sublists in each node to redistribute data to the nodes to be merged in parallel. The nodes finish the sort with an equal number of elements (n/ p) regardless of the data distribution. A parallel selection algorithm for determining the balanced partition keys in O(p log2n) time is presented. The speed of the sorting algorithm is further enhanced by the distance-d communication capability of the iPSC/2 hypercube computer and a novel conflict-free routing algorithm. Experimental results on a 16-node hypercube computer show that the sorting algorithm is competitive with the previous algorithms and faster for skewed data distributions  相似文献   

9.
A linear-time algorithm is developed to perform all odd (even) length circular shifts of data in an SIMD (single-instruction-stream, multiple-data-stream) hypercube. As an application, the algorithm is used to obtain an O(M2+log N) time and O(1) memory per processor algorithm to compute the two-dimensional convolution of an N×N image and an M×M template on an N2 processor SIMD hypercube. This improves the previous best complexity of O(M2 log M+log N)  相似文献   

10.
A hypercube algorithm to solve the list ranking problem is presented. Let n be the length of the list, and let p be the number of processors of the hypercube. The algorithm described runs in time O(n/p) when n=Ω(p 1+ε) for any constant ε>0, and in time O(n log n/p+log3 p) otherwise. This clearly attains a linear speedup when n=Ω(p 1+ε). Efficient balancing and routing schemes had to be used to achieve the linear speedup. The authors use these techniques to obtain efficient hypercube algorithms for many basic graph problems such as tree expression evaluation, connected and biconnected components, ear decomposition, and st-numbering. These problems are also addressed in the restricted model of one-port communication  相似文献   

11.
A closed-form expression has been reported in the literature for LN, the number of digital line segments of length N that correspond to lines of the form y=ax+β, O⩽α, β<1. The authors prove an asymptotic estimate for LN that might prove useful for many applications, namely, LN=N 32+O(N2 log N). An application to an image registration problem is given  相似文献   

12.
The theorem states that every block square matrix satisfies its own m-D (m-dimensional, m⩾1) matrix characteristic polynomial. The exact statement and a simple proof of this theorem are given. The theorem refers to a matrix A subdivided into m blocks, and hence having dimension at least m. The conclusion is that every square matrix A with dimension M satisfies several m-D characteristic matrix polynomials with degrees N1 . . ., N m, such that N1+ . . . +Nm M  相似文献   

13.
Efficient parallel processing of image contours   总被引:1,自引:0,他引:1  
Describes two parallel algorithms for ranking the pixels on a curve in O (log N) time using either an EREW or CREW PRAM model. The algorithms accomplish this with N processors for a √N×√N image. After applying such an algorithm to an image, it is possible to move the pixels from a curve into processors having consecutive addresses. This is important because one can subsequently apply many algorithms to the curve (such as piecewise linear approximation algorithms or point in polygon tests) using segmented scan operations (i.e. parallel prefix operations). Scan operations can be executed in logarithmic time on many interconnection networks, such as hypercube, tree, butterfly, and shuffle exchange machines as well as on the EREW PRAM. The algorithms were implemented on the hypercube structured Connection Machine, and various performance tests were conducted  相似文献   

14.
It is shown that for a given p (1<pn ), the n-cube network can tolerate up to p2(n-p)-1 processor failures and remains connected provided that at most p neighbors of any nonfaulty processor are allowed to fail. This generalizes the result for p=n-1, obtained by A.-M Esfahanian (1989). It is also shown that the n-cube network with n⩾5 remains connected provided that at most two neighbors of any processor are allowed to fail  相似文献   

15.
The job scheduling problem in a partitionable mesh-connected system in which jobs require square meshes and the system is a square mesh whose size is a power of two is discussed. A heuristic algorithm of time complexity O(n(log n+log p)), in which n is the number of jobs to be scheduled and p is the size of the system is presented. The algorithm adopts the largest-job-first scheduling policy and uses a two-dimensional buddy system as the system partitioning scheme. It is shown that, in the worst case, the algorithm produces a schedule four times longer than an optimal schedule, and, on the average, schedules generated by the algorithm are twice as long as optimal schedules  相似文献   

16.
It is shown that there is a continuously parameterized family F of n-dimensional single-input single-output (SISO) stabilizable detectable linear system Σ(p) which contains at least one realization of each reduced, strictly proper transfer function of McMillan degree not exceeding n. The parameterization map p→Σ(p) is a polynomial function in 2n indeterminates from an open convex polyhedron in R2n to the linear space of all SISO n-dimensional linear systems  相似文献   

17.
Semigroup and prefix computations on two-dimensional mesh-connected computers with multiple broadcasting (2-MCCMBs) are studied. Previously, only square 2-MCCMBs with N processing elements were considered for semigroup computations of N data items, and O(N1/6) time was required. It is found that square machines are not the best form for semigroup computations, and an O(N1/8)-time algorithm is derived on an N5/8×N3/8 rectangular 2-MCCMB. This time complexity can be further reduced to O(N1/9) if fewer processing elements are used. Parallel algorithms for prefix computations with the same time complexities are derived  相似文献   

18.
In a general algebraic framework, starting with a bicoprime factorization P=NprD-1 Npl, a right-coprime factorization Np Dp-1, a left-coprime factorization D-1pNp, and the generalized Bezout identities associated with the pairs (Np, Dp) and (D˜ p, N˜p) are obtained. The set of all H-stabilizing compensators for P in the unity-feedback configuration S(P, C) are expressed in terms of (Npr, D, N pt) and the elements of the Bezout identity. The state-space representation P=C(sI-A)-1B is included as an example  相似文献   

19.
Performing reduction operations with distributed memory machines whose interconnection networks are reconfigurable is considered. The focus is on machines whose interconnection graph can be configured as any graph of maximum degree d. The best way of interconnecting the p processors as a function of p,d and some problem- and machine-dependent parameters that characterize the ratio communication/arithmetic for the reduction operation are discussed. Experiments on transputer-based networks are in good accordance with the theoretical results  相似文献   

20.
Parallel algorithms on SIMD (single-instruction stream multiple-data stream) machines for hierarchical clustering and cluster validity computation are proposed. The machine model uses a parallel memory system and an alignment network to facilitate parallel access to both pattern matrix and proximity matrix. For a problem with N patterns, the number of memory accesses is reduced from O(N 3) on a sequential machine to O(N2) on an SIMD machine with N PEs  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号