Similar Documents (20 results)
1.
Two arrays of numbers sorted in nondecreasing order are given: an array A of size n and an array B of size m, where n < m. It is required to determine, for every element of A, the smallest element of B (if one exists) that is larger than or equal to it. It is shown how to solve this problem on the EREW PRAM (exclusive-read exclusive-write parallel random-access machine) in O(log m log n / log log m) time using n processors. The solution is then extended to the case in which fewer than n processors are available. This yields an EREW PRAM algorithm for the problem whose cost is O(n log m), which is O(m) for n ≤ m/log m. It is shown how the solution obtained leads to an improved parallel merging algorithm.
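For orientation, a minimal sequential baseline for the same searching problem is sketched below: each element of A is matched to its successor in B by binary search. This is only a reference point, not the EREW PRAM algorithm of the abstract, and the array values are made up.

```python
# Sequential baseline (not the EREW PRAM algorithm): for each a in sorted A,
# find the smallest element of sorted B that is >= a via binary search.
from bisect import bisect_left

def successors(A, B):
    """Return, for each a in A, the smallest b in B with b >= a, or None."""
    out = []
    for a in A:
        i = bisect_left(B, a)          # first index with B[i] >= a
        out.append(B[i] if i < len(B) else None)
    return out

if __name__ == "__main__":
    A = [2, 5, 9]
    B = [1, 3, 4, 7, 8]
    print(successors(A, B))            # [3, 7, None]
```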

2.
A distributed knot detection algorithm for general graphs is presented. The knot detection algorithm uses at most O(n log n + m) messages and O(m + n log n) bits of memory to detect all knot nodes in the network (where n is the number of nodes and m is the number of links). This compares with the O(n^2) messages needed by the best previously published algorithm. The knot detection algorithm makes use of efficient cycle detection and clustering techniques. Various applications of the knot detection algorithm are presented. In particular, its importance to deadlock detection in store-and-forward communication networks and in transaction systems is demonstrated.
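As a point of reference only, a centralized (non-distributed) sketch follows. It treats a knot, read here as a strongly connected component with more than one node and no edges leaving it, and finds such components with Kosaraju's algorithm; it is not the message-passing algorithm of the abstract, and the example graph is made up.

```python
# Centralized sketch: a knot is taken to be an SCC with >1 node and no outgoing edges.
from collections import defaultdict

def knot_nodes(nodes, edges):
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)

    # Pass 1: record finish order with an iterative DFS on the forward graph.
    order, seen = [], set()
    for s in nodes:
        if s in seen:
            continue
        stack = [(s, iter(fwd[s]))]
        seen.add(s)
        while stack:
            u, it = stack[-1]
            for v in it:
                if v not in seen:
                    seen.add(v)
                    stack.append((v, iter(fwd[v])))
                    break
            else:
                order.append(u)
                stack.pop()

    # Pass 2: peel SCCs off the reversed graph in reverse finish order (Kosaraju).
    comp, sccs = {}, []
    for s in reversed(order):
        if s in comp:
            continue
        scc, stack = [], [s]
        comp[s] = len(sccs)
        while stack:
            u = stack.pop()
            scc.append(u)
            for v in rev[u]:
                if v not in comp:
                    comp[v] = len(sccs)
                    stack.append(v)
        sccs.append(scc)

    # Keep SCCs with more than one node and no edge leaving the component.
    result = set()
    for scc in sccs:
        members = set(scc)
        if len(scc) > 1 and all(v in members for u in scc for v in fwd[u]):
            result |= members
    return result

if __name__ == "__main__":
    # 1 -> 2 -> 3 -> 1 forms a knot; node 0 can reach it but is not part of it.
    print(knot_nodes([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 1)]))  # {1, 2, 3}
```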

3.
A parallel sorting algorithm for sorting n elements evenly distributed over the p = 2^d nodes of a d-dimensional hypercube is presented. The average running time of the algorithm is O((n log n)/p + p log^2 n). The algorithm maintains a perfect load balance in the nodes by determining the (kn/p)th elements (k = 1, ..., p-1) of the final sorted list in advance. These p-1 keys are used to partition the sorted sublists in each node to redistribute data to the nodes to be merged in parallel. The nodes finish the sort with an equal number of elements (n/p) regardless of the data distribution. A parallel selection algorithm for determining the balanced partition keys in O(p log^2 n) time is presented. The speed of the sorting algorithm is further enhanced by the distance-d communication capability of the iPSC/2 hypercube computer and a novel conflict-free routing algorithm. Experimental results on a 16-node hypercube computer show that the sorting algorithm is competitive with previous algorithms and faster for skewed data distributions.
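A sequential sketch of just the load-balancing idea follows: the (kn/p)th keys of the final order are used as splitters so that every node ends with exactly n/p elements. The splitters are obtained here by a plain global sort rather than the paper's parallel selection, distinct keys are assumed, and no hypercube routing is modelled; the sample data are made up.

```python
# Sketch of balanced partitioning: splitters are the (k*n/p)-th keys of the
# final sorted order, so every "node" ends up with exactly n/p elements.
from bisect import bisect_right

def balanced_partition_sort(sublists):
    p = len(sublists)
    local = [sorted(s) for s in sublists]              # each node sorts locally
    everything = sorted(x for s in local for x in s)   # splitters via a full sort (cheat)
    n = len(everything)
    assert n % p == 0, "sketch assumes n is a multiple of p and distinct keys"
    splitters = [everything[k * n // p - 1] for k in range(1, p)]   # p-1 keys

    # Each node cuts its sorted sublist at the splitters and "sends" piece k
    # to node k; node k then merges (here: sorts) what it receives.
    buckets = [[] for _ in range(p)]
    for s in local:
        lo = 0
        for k, key in enumerate(splitters):
            hi = bisect_right(s, key)
            buckets[k].extend(s[lo:hi])
            lo = hi
        buckets[p - 1].extend(s[lo:])
    return [sorted(b) for b in buckets]                # each bucket holds n/p keys

if __name__ == "__main__":
    parts = balanced_partition_sort([[9, 1, 4, 7], [2, 8, 3, 5], [6, 0, 11, 10]])
    print(parts, [len(b) for b in parts])
```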

4.
An adaptive parallel algorithm for inducing a priority queue structure on an n-element array is presented. The algorithm is extended to provide optimal parallel construction algorithms for three other heap-like structures useful in implementing double-ended priority queues, namely min-max heaps, deaps, and min-max-pair heaps. It is shown that an n-element array can be made into a heap, a deap, a min-max heap, or a min-max-pair heap in O(log n + n/p) time using no more than n/log n processors, in the exclusive-read exclusive-write parallel random-access machine model.
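For contrast, the sequential counterpart that the parallel construction competes with is Floyd's O(n) bottom-up heapify; a minimal sketch for an ordinary min-heap follows (the deap and min-max variants from the abstract are not covered).

```python
# Floyd's bottom-up heap construction: sift down every internal node, O(n) total.
def build_min_heap(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):
        j = i
        while True:
            small = j
            for c in (2 * j + 1, 2 * j + 2):     # children of j
                if c < n and a[c] < a[small]:
                    small = c
            if small == j:
                break
            a[j], a[small] = a[small], a[j]
            j = small
    return a

if __name__ == "__main__":
    print(build_min_heap([9, 3, 7, 1, 8, 2, 5]))   # root holds the minimum
```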

5.
Parallel implementations of the extended square-root covariance filter (ESRCF) for tracking applications are developed. The decoupling technique and special properties of the tracking Kalman filter (KF) are employed to reduce computational requirements and to increase parallelism. Applying the decoupling technique to the ESRCF results in time and measurement updates of m decoupled (n/m)-dimensional matrices instead of one coupled n-dimensional matrix, where m denotes the tracking dimension and n the number of state elements. The updates of the m decoupled matrices are found to require approximately a factor of m fewer processing elements and clock cycles than the update of one coupled matrix. The transformation of the Kalman gain that accounts for the decoupling is found to be straightforward to implement. The sparse nature of the measurement matrix and the sparse, banded nature of the transition matrix are exploited to simplify matrix multiplications.
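A toy illustration of the decoupling idea only, assuming the tracking axes do not interact: the covariance time update can then be carried out on m small (n/m)-dimensional blocks and agrees with the coupled block-diagonal update. The square-root (ESRCF) form and the processor mapping of the abstract are not reproduced, and the model matrices are made up.

```python
# Decoupling sketch: with a block-diagonal model, m small covariance updates
# reproduce the one coupled n-dimensional update.
import numpy as np
from scipy.linalg import block_diag

def coupled_update(P, F, Q):
    return F @ P @ F.T + Q                      # standard covariance time update

def decoupled_update(P_blocks, F_block, Q_block):
    return [F_block @ P @ F_block.T + Q_block for P in P_blocks]

if __name__ == "__main__":
    dt = 1.0
    F1 = np.array([[1.0, dt], [0.0, 1.0]])      # per-axis constant-velocity model
    Q1 = 0.01 * np.eye(2)
    m = 2                                       # two tracking axes, e.g. x and y
    P_blocks = [np.eye(2) for _ in range(m)]

    P, F, Q = block_diag(*P_blocks), block_diag(F1, F1), block_diag(Q1, Q1)
    print(np.allclose(coupled_update(P, F, Q),
                      block_diag(*decoupled_update(P_blocks, F1, Q1))))   # True
```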

6.
It is shown that the existence of duplicate values in some attribute columns has a significant impact on the computational complexity of the sorting and joining operations, especially when the number of distinct tuple values is a small fraction of the total number of tuples. The authors characterize a multirelation M(n, L) by its cardinality n and the number of distinct elements L it contains. Under this characterization, the worst-case complexity of sorting such a multirelation with binary comparisons as basic operations is investigated. Upper and lower bounds on the number of three-branch comparisons needed to sort such a multirelation are established. The methodology used to study the complexity of sorting is then applied to the natural join operation. It is shown that the existence of duplicate values in the join attribute columns can be exploited to reduce the computational complexity of the natural join operation.
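A small sketch of why duplicates help: only the L distinct values are sorted (about O(L log L) comparisons) and their multiplicities are expanded afterwards. The counting pass below uses hashing, so it sits outside the abstract's pure comparison model; it is for illustration only, and the column values are made up.

```python
# Sort a column with many duplicates by sorting only its L distinct values.
from collections import Counter

def sort_multirelation(column):
    counts = Counter(column)      # one pass over n values, L distinct keys
    ordered = sorted(counts)      # comparison work on L values only
    return [v for v in ordered for _ in range(counts[v])]

if __name__ == "__main__":
    col = ["red", "blue", "red", "red", "blue", "green"]
    print(sort_multirelation(col))  # ['blue', 'blue', 'green', 'red', 'red', 'red']
```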

7.
A method for analyzing the lengths of memory queues when the network is conflict-free is described. An algorithm based on this method is shown to efficiently determine upper and lower bounds on the queue length. The analysis indicates that the strategy of using hashing to spread data across memory modules is a good one. Results show that if the size of the system is increased while maintaining a constant ratio of the numbers of processors to memories, then, asymptotically, the slowdown in performance from conflicts at the memory modules is Θ(log m / log log m). For m and n less than 100,000 and λ between 0.25 and 4.0, the graphical data confirm this growth rate.
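A toy balls-into-bins simulation of hashed memory requests gives a feel for that growth rate; it is not the paper's analytical queue model, and the parameters below are made up.

```python
# Each processor issues one request hashed to a random memory module; the mean
# maximum queue grows slowly with m, in line (up to constants) with
# Theta(log m / log log m).
import math
import random

def average_max_queue(processors, memories, trials=200, seed=0):
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        queue = [0] * memories
        for _ in range(processors):
            queue[rng.randrange(memories)] += 1   # hashed placement
        total += max(queue)
    return total / trials

if __name__ == "__main__":
    for m in (64, 256, 1024, 4096):
        predicted = math.log(m) / math.log(math.log(m))
        print(m, round(average_max_queue(m, m), 2), round(predicted, 2))
```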

8.
The problem of determining whether a polytope P of n×n matrices is D-stable, that is, whether each point in P has all its eigenvalues in a given nonempty, open, convex, conjugate-symmetric subset D of the complex plane, is discussed. An approach that checks the D-stability of certain faces of P is used. In particular, for each D and n the smallest integer m is determined such that D-stability of every m-dimensional face guarantees D-stability of P. It is shown that, without further information describing the particular structure of a polytope, either (2n-4)-dimensional or (2n-2)-dimensional faces need to be checked for D-stability, depending on the structure of D. Thus more work needs to be done before a computationally tractable algorithm for checking D-stability can be devised.
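As a quick screen only, the sketch below checks Hurwitz stability (D taken as the open left half-plane) at the polytope's vertices; as the abstract points out, vertex stability is necessary but not sufficient, since faces of dimension up to 2n-4 or 2n-2 must be examined in general. The vertex matrices are made up.

```python
# Necessary-condition screen: every vertex of the polytope must itself be stable.
import numpy as np

def vertices_hurwitz(vertices):
    """True if all eigenvalues of every vertex matrix lie in the open left half-plane."""
    return all(np.max(np.linalg.eigvals(V).real) < 0 for V in vertices)

if __name__ == "__main__":
    V1 = np.array([[-1.0, 0.5], [0.0, -2.0]])
    V2 = np.array([[-3.0, 1.0], [-1.0, -1.0]])
    print(vertices_hurwitz([V1, V2]))   # True: both vertices are Hurwitz stable
```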

9.
A novel discrete relaxation architecture
The discrete relaxation algorithm (DRA) is a computational technique that enforces arc consistency (AC) in a constraint satisfaction problem (CSP). The original sequential AC-1 algorithm suffers from O(n^3 m^3) time complexity, and even the optimal sequential AC-4 algorithm is O(n^2 m^2) for an n-object, m-label DRA problem. Sample problem runs show that these algorithms are all too slow to meet the needs of useful, real-time CSP applications. A parallel DRA5 algorithm that reaches a lower bound of O(nm) (where the number of processors is polynomial in the problem size) is given. A fine-grained, massively parallel hardware computer architecture has been designed for the DRA5 algorithm. For practical problems, many orders of magnitude of efficiency improvement can be reached on such a hardware architecture.
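For reference, a sequential AC-1-style sketch follows: every arc is revised until no domain changes. This is the baseline the abstract starts from, not the parallel DRA5 design, and the tiny CSP below is made up.

```python
# AC-1-style arc consistency: repeatedly revise all arcs until a full pass
# changes nothing.
def ac1(domains, constraints):
    """domains: {var: set(labels)}; constraints: {(x, y): predicate(a, b)}."""
    changed = True
    while changed:
        changed = False
        for (x, y), ok in constraints.items():
            # Keep only labels of x that have some support in y's domain.
            supported = {a for a in domains[x] if any(ok(a, b) for b in domains[y])}
            if supported != domains[x]:
                domains[x] = supported
                changed = True
    return domains

if __name__ == "__main__":
    doms = {"x": {1, 2, 3}, "y": {1, 2, 3}}
    cons = {("x", "y"): lambda a, b: a < b, ("y", "x"): lambda a, b: a > b}
    print(ac1(doms, cons))   # x loses label 3, y loses label 1
```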

10.
The authors investigate the computing capabilities of formal McCulloch-Pitts neurons when errors are permitted in decisions. They assume that m decisions are to be made on a randomly specified m-set of points in n-space and that an error tolerance of εm decision errors is allowed, with 0 ≤ ε < 1/2. The authors are interested in how large an m can be selected such that the neuron makes reliable decisions within the prescribed error tolerance. Formal results for two protocols for error tolerance, a random error protocol and an exhaustive error protocol, are obtained. The results demonstrate that a formal neuron has a computational capacity that is linear in n and that this rate of capacity growth persists even when errors are tolerated in the decisions.

11.
The author considers an indirect adaptive unity-feedback controller consisting of an mth-order SISO (single-input, single-output) compensator controlling an nth-order strictly proper SISO plant. It is shown that exponential convergence of the plant parameter estimation error, as well as asymptotic time invariance and global exponential stability of the controlled closed-loop system, can be guaranteed by requiring that the reference input have at least 2n + m points of spectral support.

12.
An efficiently computable metric for comparing polygonal shapes
A method for comparing polygons that is a metric, invariant under translation, rotation, and change of scale, reasonably easy to compute, and intuitive is presented. The method is based on the L2 distance between the turning functions of the two polygons. It works for both convex and nonconvex polygons and runs in time O(mn log mn), where m is the number of vertices in one polygon and n is the number of vertices in the other. Some examples showing that the method produces answers that are intuitively reasonable are presented.
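A simplified sketch of the representation follows: each polygon's turning function (cumulative turning angle against normalized arc length) is built, and the squared difference is integrated at a fixed alignment. The actual metric additionally minimizes over rotations and the choice of starting point, which this sketch omits; the example polygons are made up.

```python
# Turning-function sketch: compare two polygons by the L2 distance between
# their turning functions at a fixed alignment (no rotation/start optimization).
import math

def turning_function(poly):
    """Breakpoints (s_i, theta_i): heading angle theta_i on arc [s_i, s_{i+1})."""
    n = len(poly)
    edges = [(poly[(i + 1) % n][0] - poly[i][0],
              poly[(i + 1) % n][1] - poly[i][1]) for i in range(n)]
    lengths = [math.hypot(dx, dy) for dx, dy in edges]
    total = sum(lengths)
    steps, s = [], 0.0
    theta = math.atan2(edges[0][1], edges[0][0])
    for i in range(n):
        steps.append((s, theta))
        s += lengths[i] / total
        nxt = edges[(i + 1) % n]
        turn = math.atan2(nxt[1], nxt[0]) - math.atan2(edges[i][1], edges[i][0])
        theta += math.atan2(math.sin(turn), math.cos(turn))   # wrap turn to (-pi, pi]
    return steps

def step_value(steps, s):
    v = steps[0][1]
    for t, th in steps:
        if t <= s:
            v = th
    return v

def l2_turning_distance(p, q):
    f, g = turning_function(p), turning_function(q)
    cuts = sorted({t for t, _ in f} | {t for t, _ in g} | {1.0})
    area = sum((step_value(f, a) - step_value(g, a)) ** 2 * (b - a)
               for a, b in zip(cuts, cuts[1:]))
    return math.sqrt(area)

if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    rectangle = [(0, 0), (2, 0), (2, 1), (0, 1)]
    print(l2_turning_distance(square, square))      # 0.0
    print(l2_turning_distance(square, rectangle))   # nonzero (about 0.64)
```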

13.
An algorithm for convolving a k×k window of weighting coefficients with an n×n image matrix on a pyramid computer of O(n^2) processors in time O(log n + k^2), excluding the time to load the image matrix, is presented. If k = Ω(√(log n)), which is typical in practice, the algorithm has a processor-time product O(n^2 k^2), which is optimal with respect to the usual sequential algorithm. A feature of the algorithm is that the mechanism for controlling the transmission and distribution of data in each processor is finite-state, independent of the values of n and k. Thus, for convolving two {0, 1}-valued matrices using Boolean operations rather than the typical sum and product operations, the processors of the pyramid computer are finite-state.
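For reference, the underlying operation is an ordinary windowed convolution; the sketch below computes it sequentially with zero padding and does not model the pyramid computer's data movement. The example image and kernel are made up.

```python
# Plain sequential k x k convolution over an n x n image with zero padding.
def convolve(image, kernel):
    n, k = len(image), len(kernel)
    off = k // 2
    out = [[0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            acc = 0
            for i in range(k):
                for j in range(k):
                    rr, cc = r + i - off, c + j - off
                    if 0 <= rr < n and 0 <= cc < n:   # zero padding outside the image
                        acc += kernel[i][j] * image[rr][cc]
            out[r][c] = acc
    return out

if __name__ == "__main__":
    img = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]   # 3x3 box filter
    print(convolve(img, box))   # [[2, 2, 1], [2, 3, 2], [1, 2, 2]]
```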

14.
The number of distinct entries among the m^(2n) entries of the nth Kronecker power of an m×m matrix is derived. An algorithm to find the value of each entry of the Kronecker power is presented.
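A brute-force check of small cases is easy to write (this is not the paper's derivation): with generic entries, here distinct primes so that products identify multisets uniquely, the distinct entries of the nth power correspond to multisets of n entries of A, giving C(m^2 + n - 1, n); structured matrices with repeated or zero entries give fewer.

```python
# Count distinct entries of the nth Kronecker power and compare with the
# multiset count for a generic matrix.
from math import comb
import numpy as np

def distinct_kronecker_entries(a, n):
    power = a
    for _ in range(n - 1):
        power = np.kron(power, a)        # build the nth Kronecker power
    return len(np.unique(power))

if __name__ == "__main__":
    m, n = 2, 3
    a = np.array([[2, 3], [5, 7]])       # distinct primes: a "generic" matrix
    print(distinct_kronecker_entries(a, n), comb(m * m + n - 1, n))  # 20 20
```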

15.
A hypercube algorithm to solve the list ranking problem is presented. Let n be the length of the list, and let p be the number of processors of the hypercube. The algorithm described runs in time O(n/p) when n = Ω(p^(1+ε)) for any constant ε > 0, and in time O((n log n)/p + log^3 p) otherwise. This clearly attains a linear speedup when n = Ω(p^(1+ε)). Efficient balancing and routing schemes had to be used to achieve the linear speedup. The authors use these techniques to obtain efficient hypercube algorithms for many basic graph problems such as tree expression evaluation, connected and biconnected components, ear decomposition, and st-numbering. These problems are also addressed in the restricted model of one-port communication.
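For orientation, the classic PRAM formulation of list ranking is pointer jumping (Wyllie's algorithm); the sketch below simulates it sequentially and does not model the hypercube balancing and routing that give the paper its speedup. The example list is made up.

```python
# Pointer jumping for list ranking: succ[i] is the next element (i itself at
# the tail); rank[i] becomes the distance from i to the tail after O(log n) rounds.
def list_rank(succ):
    n = len(succ)
    rank = [0 if succ[i] == i else 1 for i in range(n)]
    nxt = list(succ)
    for _ in range(max(1, n).bit_length()):               # O(log n) jumping rounds
        rank = [rank[i] + rank[nxt[i]] for i in range(n)]  # synchronous update
        nxt = [nxt[nxt[i]] for i in range(n)]              # pointer jump
    return rank

if __name__ == "__main__":
    # List 3 -> 1 -> 4 -> 0 -> 2, stored as successor pointers (2 is the tail).
    succ = [2, 4, 2, 1, 0]
    print(list_rank(succ))   # [1, 3, 0, 4, 2]
```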

16.
An O(n^2) time serial algorithm is developed for obtaining the medial axis transform (MAT) of an n×n image. An O(log n) time CREW PRAM algorithm and an O(log^2 n) time SIMD hypercube parallel algorithm for the MAT are also developed. Both of these use O(n^2) processors. Two problems associated with the MAT, the area and perimeter reporting problems, are studied. An O(log n) time hypercube algorithm is developed for both of them, where n is the number of squares in the MAT, and the algorithms use O(n^2) processors.
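A sequential sketch of one core computation follows: for every 1-pixel, the side of the largest all-1 square whose upper-left corner lies at that pixel, found with a simple O(n^2) dynamic program. The MAT keeps only the maximal such squares, and that filtering step (as well as the parallel versions in the abstract) is omitted; the test image is made up.

```python
# O(n^2) dynamic program: size[r][c] is the side of the largest all-1 square
# whose upper-left corner is at pixel (r, c).
def square_sizes(img):
    n = len(img)
    size = [[0] * n for _ in range(n)]
    for r in range(n - 1, -1, -1):            # sweep bottom-up, right-to-left
        for c in range(n - 1, -1, -1):
            if img[r][c]:
                if r == n - 1 or c == n - 1:
                    size[r][c] = 1
                else:
                    size[r][c] = 1 + min(size[r + 1][c],
                                         size[r][c + 1],
                                         size[r + 1][c + 1])
    return size

if __name__ == "__main__":
    img = [[1, 1, 0],
           [1, 1, 1],
           [1, 1, 1]]
    print(square_sizes(img))   # [[2, 1, 0], [2, 2, 1], [1, 1, 1]]
```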

17.
A unified analytical model for computing the task-based dependability (TBD) of hypercube architectures is presented. A hypercube is deemed operational as long as a task can be executed on the system. The technique can compute both reliability and availability for two types of task requirement: the I-connected model and the subcube model. The I-connected TBD assumes that a connected group of at least I working nodes is required for task execution. The subcube TBD needs at least an m-cube in an n-cube, m ≤ n, for task execution. The dependability is computed by multiplying the probability that x nodes (x ≥ I or x ≥ 2^m) are working in an n-cube at time t by the conditional probability that the hypercube can satisfy either of the two task requirements from x working nodes. Recursive models are proposed for the two types of task requirement to find the connection probability. The subcube requirement is extended to find multiple subcubes for analyzing multitask dependability. The analytical results are validated through extensive simulation.
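A brute-force Monte Carlo sketch of the subcube requirement alone is shown below: each node is up independently with probability q, and the estimate is the probability that some m-subcube is entirely up. It does not reproduce the paper's recursive analytical model, the enumeration is only feasible for small n, and the parameters are made up.

```python
# Monte Carlo estimate of the subcube task requirement in an n-cube.
from itertools import combinations, product
import random

def subcubes(n, m):
    """Yield every m-subcube of the n-cube as a tuple of its node labels."""
    for free in combinations(range(n), m):
        fixed = [i for i in range(n) if i not in free]
        for bits in product((0, 1), repeat=len(fixed)):
            base = sum(b << i for i, b in zip(fixed, bits))
            yield tuple(base | sum(fb << i for i, fb in zip(free, fbits))
                        for fbits in product((0, 1), repeat=m))

def subcube_dependability(n, m, q, trials=2000, seed=1):
    cubes = list(subcubes(n, m))
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        up = [rng.random() < q for _ in range(2 ** n)]   # each node up with prob. q
        if any(all(up[v] for v in cube) for cube in cubes):
            hits += 1
    return hits / trials

if __name__ == "__main__":
    print(subcube_dependability(n=4, m=2, q=0.9))   # close to 1 for q = 0.9
```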

18.
The eigenstructure assignment problem with output feedback is studied for systems satisfying the condition p + m > n. The main tools used are the concept of (C, A, B)-invariance and two coupled Sylvester equations, the solution of which leads to the computation of an output stabilizing feedback. A computationally efficient algorithm for the solution of these two coupled equations, which leads to the computation of a desired output feedback, is presented.
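As a building block only, a single (uncoupled) Sylvester equation A X + X B = Q can be solved directly with SciPy; the coupling and the feedback construction from the abstract are not reproduced, and the matrices below are made up.

```python
# Solve one standard Sylvester equation A X + X B = Q.
import numpy as np
from scipy.linalg import solve_sylvester

A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # eigenvalues -1, -2
B = np.array([[3.0, 0.0], [0.0, 4.0]])     # no eigenvalue of -B matches A's
Q = np.eye(2)

X = solve_sylvester(A, B, Q)
print(np.allclose(A @ X + X @ B, Q))       # True
```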

19.
Algorithms are proposed for eigenvalue assignment (EVA) by constant as well as dynamic output feedback. The main algorithm is developed for single-input, multi-output systems, and the results are then extended to multi-input, multi-output systems. In computing the feedback, use is made of the fact that the closed-loop eigenvalues can almost always be assigned arbitrarily close to the desired locations in the complex plane, provided the system satisfies the condition m + p > n, where m, p, and n are, respectively, the number of inputs, outputs, and states of the system. The EVA problem is treated as a converse of the algebraic eigenvalue problem, and the proposed algorithms are based on the implicitly shifted QR algorithm for solving that problem. The performance of the algorithms is illustrated by several numerical examples.
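For contrast only, the snippet below performs full state-feedback pole placement with SciPy's place_poles; it is not the output-feedback EVA algorithm of the abstract, and the double-integrator example and target poles are made up.

```python
# State-feedback pole placement for contrast; the abstract assigns eigenvalues
# by *output* feedback under m + p > n, which this snippet does not do.
import numpy as np
from scipy.signal import place_poles

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator
B = np.array([[0.0], [1.0]])
target = [-1.0, -2.0]

K = place_poles(A, B, target).gain_matrix
print(np.sort(np.linalg.eigvals(A - B @ K).real))   # [-2. -1.]
```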

20.
The design of distributed algorithms for the single-source shortest-path problem is discussed, for an asynchronous directed network in which some of the edges may have negative weights, so that a cycle of negative total weight may also exist. The only existing solution in the literature for this problem is due to K.M. Chandy and J. Misra (1982), and it has, in the worst case, an unbounded message complexity. A synchronous version of the Chandy-Misra algorithm is described and studied, and it is proved that for a network with m edges and n nodes, the worst-case message and time complexities of this algorithm are O(mn) and O(n), respectively. This algorithm is then combined with an efficient synchronizer to yield an asynchronous protocol that retains the same message and time complexities.
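A centralized, round-by-round simulation in the spirit of the synchronous algorithm is sketched below: in each of at most n-1 rounds every edge is relaxed once, so the message count is O(mn) and the round count O(n). The distributed protocol and the synchronizer from the abstract are not modelled, and the example graph is made up.

```python
# Synchronous Bellman-Ford-style relaxation with negative edge weights; a
# further improving round after n-1 rounds signals a negative-weight cycle.
def synchronous_sssp(n, edges, source):
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):                       # at most n-1 synchronous rounds
        new = list(dist)
        for u, v, w in edges:                    # every edge "sends" one message
            if dist[u] + w < new[v]:
                new[v] = dist[u] + w
        if new == dist:
            break
        dist = new
    if any(dist[u] + w < dist[v] for u, v, w in edges):
        raise ValueError("negative-weight cycle reachable from the source")
    return dist

if __name__ == "__main__":
    edges = [(0, 1, 4), (0, 2, 2), (2, 1, -3), (1, 3, 1)]
    print(synchronous_sssp(4, edges, 0))   # [0, -1, 2, 0]
```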
