首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
2.
Cluster Computing for Determining Three-Dimensional Protein Structure   总被引:1,自引:0,他引:1  
Determining the three-dimensional structure of proteins is crucial to efficient drug design and understanding biological processes. One successful method for computing the molecule’s shape relies on inter-atomic distance bounds provided by Nuclear Magnetic Resonance spectroscopy. The accuracy of computed structures as well as the time required to obtain them are greatly improved if the gaps between the upper and lower distance-bounds are reduced. These gaps are reduced most effectively by applying the tetrangle inequality, derived from the Cayley-Menger determinant, to all atom-quadruples. However, tetrangle-inequality bound-smoothing is an extremely computation intensive task, requiring O(n4) time for an n-atom molecule. To reduce computation time, we propose a novel coarse-grained parallel algorithm intended for a Beowulf-type cluster of PCs. The algorithm employs pn/6 processors and requires O(n4/p) time and O(p2) communications, where n is the number of atoms in a molecule. The number of communications is at least an order of magnitude lower than in the earlier parallelizations. Our implementation utilized processors with at least 59% efficiency (including the communication overhead)—an impressive figure for a non-embarrassingly parallel problem on a cluster of workstations.  相似文献   

3.
We present processor-time optimal parallel algorithms for several problems onn ×n digitized image arrays, on a mesh-connected array havingp processors and a memory of sizeO(n 2) words. The number of processorsp can vary over the range [1,n 3/2] while providing optimal speedup for these problems. The class of image problems considered here includes labeling the connected components of an image; computing the convex hull, the diameter, and a smallest enclosing box of each component; and computing all closest neighbors. Such problems arise in medium-level vision and require global operations on image pixels. To achieve optimal performance, several efficient data-movement and reduction techniques are developed for the proposed organization.This research was supported in part by the National Science Foundation under Grant IRI-8710836 and in part by DARPA under Contract F33615-87-C-1436 monitored by the Wright Patterson Airforce Base.  相似文献   

4.
We present efficient algorithms for solving several fundamental graph-theoretic problems on a Linear Array with a Reconfigurable Pipelined Bus System (LARPBS), one of the recently proposed models of computation based on optical buses. Our algorithms include finding connected components, minimum spanning forest, biconnected components, bridges and articulation points for an undirected graph. We compute the connected components and minimum spanning forest of a graph in O(log n) time using O(m+n) processors where m and n are the number of edges and vertices in the graph and m=O(n 2) for a dense graph. Both the processor and time complexities of these two algorithms match the complexities of algorithms on the Arbitrary and Priority CRCW PRAM models which are two of the strongest PRAM models. The algorithms for these two problems published by Li et al. [7] have been considered to be the most efficient on the LARPBS model till now. Their algorithm [7] for these two problems require O(log n) time and O(n 3/log n) processors. Hence, our algorithms have the same time complexity but require less processors. Our algorithms for computing biconnected components, bridges and articulation points of a graph run in O(log n) time on an LARPBS with O(n 2) processors. No previous algorithm was known for these latter problems on the LARPBS.  相似文献   

5.
The list marking problem involves marking the nodes of an ℓ-node linked list stored in the memory of a (p, n)-PRAM, when only the position of the head of the list is initially known, while the remaining list nodes are stored in arbitrary memory locations. Under the assumption that cells containing list nodes bear no distinctive tags distinguishing them from other cells, we establish anΩ(min{ℓ, n/p}) randomized lower bound for ℓ-node lists and present a deterministic algorithm whose running time is within a logarithmic additive term of this bound. Such a result implies that randomization cannot be exploited in any significant way in this setting. For the case where list cells are tagged in a way that differentiates them from other cells, the above lower bound still applies to deterministic algorithms, while we establish a tight

bound for randomized algorithms. Therefore, in the latter case, randomization yields a better performance for a wide range of parameter values.  相似文献   

6.
This paper presents new algorithms for solving some geometric problems on a shared memory parallel computer, where concurrent reads are allowed but no two processors can simultaneously attempt to write in the same memory location. The algorithms are quite different from known sequential algorithms, and are based on the use of a new parallel divide-and-conquer technique. One of our results is an O(log n) time, O(n) processor algorithm for the convex hull problem. Another result is an O(log n log log n) time, O(n) processor algorithm for the problem of selecting a closest pair of points among n input points.  相似文献   

7.
Given ann-vertex simple polygon we address the following problems: (i) find the shortest path between two pointss andd insideP, and (ii) compute the shortestpath tree between a single points and each vertex ofP (which implicitly represents all the shortest paths). We show how to solve the first problem inO(logn) time usingO(n) processors, and the more general second problem inO(log2 n) time usingO(n) processors, and the more general second problem inO(log2 n) time usingO(n) processors for any simple polygonP. We assume the CREW RAM shared memory model of computation in which concurrent reads are allowed, but no two processors should attempt to simultaneously write in the same memory location. The algorithms are based on the divide-and-conquer paradigm and are quite different from the known sequential algorithmsResearch supported by the Faculty of Graduate Studies and Research (McGill University) grant 276-07  相似文献   

8.
We present an optimal parallel algorithm for computing a cycle separator of ann-vertex embedded planar undirected graph inO(logn) time onn/logn processors. As a consequence, we also obtain an improved parallel algorithm for constructing a depth-first search tree rooted at any given vertex in a connected planar undirected graph in O(log2 n) time on n/logn processors. The best previous algorithms for computing depth-first search trees and cycle separators achieved the same time complexities, but withn processors. Our algorithms run on a parallel random access machine that permits concurrent reads and concurrent writes in its shared memory and allows an arbitrary processor to succeed in case of a write conflict.A preliminary version of this paper appeared as Improved Parallel Depth-First Search in Undirected Planar Graphs in theProceedings of the Third Workshop on Algorithms and Data Structures, 1993, pp. 407–420.Supported in part by NSF Grant CCR-9101385.  相似文献   

9.
We describe two new parallel algorithms, one conservative and another optimistic, for discrete-event simulation on an exclusive-read exclusive-write parallel random-access machine (EREW PRAM). The target physical systems are bounded degree networks which are represented by logic circuits. Employing p processors, our conservative algorithm can simulate up to O(p) independent messages of a system with n logical processes in O(log n) time. The number of processors, p, can be optimally varied in the range 1 ≤ pn. To identify independent messages, this algorithm also introduces a novel scheme based on a variable size time window. Our optimistic algorithm is designed to reduce the rollback frequency and the memory requirement to save past states and messages. The optimistic algorithm also simulates O(p) earliest messages on a p-processor computer in O(log n) time. To our knowledge, such a theoretical efficiency in parallel simulation algorithms, conservative or optimistic, has been achieved for the first time.  相似文献   

10.
Previous research on developing parallel triangulation algorithms concentrated on triangulating planar point sets.O(log3 n) running time algorithms usingO(n) processors have been developed in Refs. 1 and 2. Atallah and Goodrich(3) presented a data structure that can be viewed as a parallel analogue of the sequential plane-sweeping paradigm, which can be used to triangulate a planar point set inO(logn loglogn) time usingO(n) processors. Recently Merks(4) described an algorithm for triangulating point sets which runs inO(logn) time usingO(n) processors, and is thus optimal. In this paper we develop a parallel algorithm for triangulating simplicial point sets in arbitrary dimensions based on the idea of the sequential algorithm presented in Ref. 5. The algorithm runs inO(log2 n) time usingO(n/logn) processors. The algorithm hasO(n logn) as the product of the running time and the number of processors; i.e., an optimal speed-up.  相似文献   

11.
Xin He  Yaacov Yesha 《Algorithmica》1990,5(1):129-145
We develop efficient parallel algorithms for ther-dominating set and thep-center problems on trees. On a concurrent-read exclusive-write PRAM, our algorithm for ther-dominating set problem runs inO(logn log logn) time withn processors. The algorithm for thep-center problem runs inO(log2 n log logn) time withn processors.Xin He was supported in part by an Ohio State University Presidential Fellowship, and by the Office of Research and Graduate Studies of Ohio State University. Yaacov Yesha was supported in part by the National Science Foundation under Grant No. DCR-8606366.  相似文献   

12.
分布式存储的并行串匹配算法的设计与分析   总被引:7,自引:0,他引:7  
陈国良  林洁  顾乃杰 《软件学报》2000,11(6):771-778
并行串匹配算法的研究大都集中在PRAM(parallel random access machine)模型上,其他更为实际的模型上的并行串匹配算法的研究相对要薄弱得多.该文采用将最优串行算法并行化的技术,利用模式串的周期性质,巧妙地将改进的KMP(Knuth-Morris-Pratt)算法并行化,提出了一个简便、高效且具有良好可扩放性的分布式串匹配算法,其计算复杂度为O(n/p+m),通信复杂度为O(ulogp相似文献   

13.
We present an algorithm to solve the subset‐sum problem (SSP) of capacity c and n items with weights wi,1≤in, spending O(n(mwmin)/p) time and O(n + mwmin) space in the Concurrent Read/Concurrent Write (CRCW) PRAM model with 1≤pmwmin processors, where wmin is the lowest weight and , improving both upper‐bounds. Thus, when nc, it is possible to solve the SSP in O(n) time in parallel environments with low memory. We also show OpenMP and CUDA implementations of this algorithm and, on Graphics Processing Unit, we obtained better performance than the best sequential and parallel algorithms known so far. Copyright © 2015 John Wiley & Sons, Ltd.  相似文献   

14.
A stringw isprimitive if it is not a power of another string (i.e., writingw =v k impliesk = 1. Conversely,w is asquare ifw =vv, withv a primitive string. A stringx issquare-free if it has no nonempty substring of the formww. It is shown that the square-freedom of a string ofn symbols over an arbitrary alphabet can be tested by a CRCW PRAM withn processors inO(logn) time and linear auxiliary space. If the cardinality of the input alphabet is bounded by a constant independent of the input size, then the number of processors can be reduced ton/logn without affecting the time complexity of this strategy. The fastest sequential algorithms solve this problemO(n logn) orO(n) time, depending on whether the cardinality of the input alphabet is unbounded or bounded, and either performance is known to be optimal within its class. More elaborate constructions lead to a CRCW PRAM algorithm for detecting, within the samen-processors bounds, all positioned squares inx in timeO(logn) and using linear auxiliary space. The fastest sequential algorithms solve this problem inO(n logn) time, and such a performance is known to be optimal.This research was supported, through the Leonardo Fibonacci Institute, by the Istituto Trentino di Cultura, Trento, Italy. Additional support was provided by the French and Italian Ministries of Education, by the National Research Council of Italy, by the British Research Council Grant SERC-E76797, by NSF Grant CCR-89-00305, by NIH Library of Medicine Grant ROI LM05118, by AFOSR Grant 90-0107, and by NATO Grant CRG900293.  相似文献   

15.
Computing an optimal solution to the knapsack problem is known to be NP-hard. Consequently, fast parallel algorithms for finding such a solution without using an exponential number of processors appear unlikely. An attractive alternative is to compute an approximate solution to this problem rapidly using a polynomial number of processors. In this paper, we present an efficient parallel algorithm for finding approximate solutions to the 0–1 knapsack problem. Our algorithm takes an , 0 < < 1, as a parameter and computes a solution such that the ratio of its deviation from the optimal solution is at most a fraction of the optimal solution. For a problem instance having n items, this computation uses O(n5/2/3/2) processors and requires O(log3n + log2nlog(1/)) time. The upper bound on the processor requirement of our algorithm is established by reducing it to a problem on weighted bipartite graphs. This processor complexity is a significant improvement over that of other known parallel algorithms for this problem.  相似文献   

16.
The recent availability of detailed geographic data permits terrain applications to process large areas at high resolution. However the required massive data processing presents significant challenges, demanding algorithms optimized for both data movement and computation. One such application is viewshed computation, that is, to determine all the points visible from a given point p. In this paper, we present an efficient algorithm to compute viewsheds on terrain stored in external memory. In the usual case where the observer’s radius of interest is smaller than the terrain size, the algorithm complexity is θ(scan(n 2)) where n 2 is the number of points in an n × n DEM and scan(n 2) is the minimum number of I/O operations required to read n 2 contiguous items from external memory. This is much faster than existing published algorithms.  相似文献   

17.
In this paper we give efficient parallel algorithms for solving a number of visibility and shortest-path problems for simple polygons. Our algorithms all run inO(logn) time and are based on the use of a new data structure for implicitly representing all shortest paths in a simple polygonP, which we call thestratified decomposition tree. We use this approach to derive efficient parallel methods for computing the visibility ofP from an edge, constructing the visibility graph of the vertices ofP (using an output-sensitive number of processors), constructing the shortest-path tree from a vertex ofP, and determining all-farthest neighbors for the vertices inP. The computational model we use is the CREW PRAM.This research was announced in preliminary form in theProceedings of the 6th ACM Symposium on Computational Geometry, 1990, pp. 73–82. The research of Michael T. Goodrich was supported by the National Science Foundation under Grants CCR-8810568 and CCR-9003299, and by the NSF and DARPA under Grant CCR-8908092.  相似文献   

18.
A planar monotone circuit (PMC) is a Boolean circuit that can be embedded in the plane and that contains only AND and OR gates. A layered PMC is a PMC in which all input nodes are in the external face, and the gates can be assigned to layers in such a way that every wire goes between gates in successive layers. Goldschlager, Cook and Dymond, and others have developed NC 2 algorithms to evaluate a layered PMC when the output node is in the same face as the input nodes. These algorithms require a large number of processors (Ω(n 6 ), where n is the size of the input circuit). In this paper we give an efficient parallel algorithm that evaluates a layered PMC of size n in time using only a linear number of processors on an EREW PRAM. Our parallel algorithm is the best possible to within a polylog factor, and is a substantial improvement over the earlier algorithms for the problem. Received April 18, 1994; revised April 7, 1995.  相似文献   

19.
Optimal and near optimal parallel algorithms for several fundamental problems are proposed for a parallel organization consistmg of n processors, each having access to a row and a column of an n × n array of memory modules. Parallel computations are implemented on such an organization by decomposing them into alternating orthogonal processing phases. A number of efficient data movement techniques are developed for the proposed organization which lead to optimal or near optimal solutions to several communication-intensive problems such as sorting, performing permutations, list ranking (data dependent parallel prefix), and problems on graphs represented by an unsorted list of n2 edges. It is also shown that the proposed organization is capable of simulating any fixed-degree network on n2 processors with O(n) loss in time, which is optimal. Finally, an enhanced organization having p processors, 1 ≤ pn2, and O(n2) memory locations is presented, and is shown to provide optimal speedups for adjacency-matrix based graph problems for any number of processors in the range [1, n3/2].  相似文献   

20.
Parallel algorithms for the problems of selection and searching on sorted matrices are formulated. The selection algorithm takesO(lognlog lognlog*n) time withO(n/lognlog*n) processors on an EREW PRAM. This algorithm can be generalized to solve the selection problem on a set of sorted matrices. The searching algorithm takesO(log logn) time withO(n/log logn) processors on a Common CRCW PRAM, which is optimal. We show that no algorithm using at mostnlogcnprocessors,c≥ 1, can solve the matrix search problem in time faster than Ω(log logn) and that Ω(logn) steps are needed to solve this problem on any model that does not allow concurrent writes.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号