Similar Documents
20 similar documents found.
1.
We present a parallel algorithm for solving the next element search problem on a set of line segments, using a BSP-like model referred to as the coarse grained multicomputer (CGM). The algorithm requires O(1) communication rounds (h-relations with h = O(n/p)), O((n/p) log n) local computation, and O((n/p) log p) memory per processor, assuming n/p ≥ p. Our result implies solutions to the point location, trapezoidal decomposition, and polygon triangulation problems. A simplified version for axis-parallel segments requires only O(n/p) memory per processor, and we discuss an implementation of this version. As in a previous paper by Devillers and Fabri (Int. J. Comput. Geom. Appl. 6 (1996), 487–506), our algorithm is based on a distributed implementation of segment trees, which are of size O(n log n). This paper improves on op. cit. in several ways: (1) It studies the more general next element search problem, which also solves, e.g., planar point location. (2) The algorithms require only O((n/p) log n) local computation instead of O(log p · (n/p) log n). (3) The algorithms require only O((n/p) log p) local memory instead of O((n/p) log n).
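To make the problem concrete, here is a minimal sequential sketch (in Python, with illustrative names) of next element search for horizontal segments: for a query point, report the first segment hit by a vertical ray shot upward. The paper answers many such queries with a segment tree distributed over p processors; this brute-force baseline only pins down the problem statement.

```python
def next_element_above(point, segments):
    """point: (x, y); segments: horizontal segments given as (x1, x2, y)."""
    px, py = point
    best = None
    for x1, x2, y in segments:
        if x1 <= px <= x2 and y > py:        # segment spans px and lies above
            if best is None or y < best[2]:  # keep the lowest such segment
                best = (x1, x2, y)
    return best

segments = [(0, 4, 3.0), (1, 6, 1.5), (2, 5, 2.0)]
print(next_element_above((2.5, 0.0), segments))   # (1, 6, 1.5)
```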

2.
We modify the concept of LLL-reduction of lattice bases in the sense of Lenstra, Lenstra, Lovász (Factoring polynomials with rational coefficients, Math. Ann. 261 (1982) 515–534) towards a faster reduction algorithm. We organize LLL-reduction in segments of the basis. Our SLLL-bases approximate the successive minima of the lattice in nearly the same way as LLL-bases. For integer lattices of dimension n given by a basis of length 2^O(n), SLLL-reduction runs in O(n^(5+ε)) bit operations for every ε > 0, compared to O(n^(7+ε)) for the original LLL and to O(n^(6+ε)) for the LLL-algorithms of Schnorr (A more efficient algorithm for lattice reduction, Journal of Algorithms 9 (1988) 47–62) and Storjohann (Faster Algorithms for Integer Lattice Basis Reduction, TR 249, Swiss Federal Institute of Technology, ETH-Zurich, Department of Computer Science, Zurich, Switzerland, July 1996). We present an even faster algorithm for SLLL-reduction via iterated subsegments running in O(n^3 log n) arithmetic steps. Householder reflections are shown to provide better accuracy than Gram–Schmidt for orthogonalizing LLL-bases in floating point arithmetic.
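For orientation, below is a plain textbook LLL reduction (δ = 3/4) in exact rational arithmetic; it recomputes the Gram–Schmidt data from scratch at every step, which is simple but far from the paper's complexity. SLLL reorganizes exactly this process into segments of the basis, and the paper argues for Householder reflections instead of Gram–Schmidt in floating point. Names and the example basis are illustrative only.

```python
from fractions import Fraction

def dot(u, v):
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def gram_schmidt(b):
    """Orthogonalization b* and coefficients mu, recomputed from scratch."""
    n = len(b)
    mu = [[Fraction(0)] * n for _ in range(n)]
    star = []
    for i in range(n):
        v = [Fraction(x) for x in b[i]]
        for j in range(i):
            mu[i][j] = dot(b[i], star[j]) / dot(star[j], star[j])
            v = [vi - mu[i][j] * sj for vi, sj in zip(v, star[j])]
        star.append(v)
    return star, mu

def size_reduce(b, k):
    for j in range(k - 1, -1, -1):            # subtract rounded mu multiples
        _, mu = gram_schmidt(b)
        q = round(mu[k][j])
        if q:
            b[k] = [x - q * y for x, y in zip(b[k], b[j])]

def lll(basis, delta=Fraction(3, 4)):
    b = [list(v) for v in basis]
    n, k = len(b), 1
    while k < n:
        size_reduce(b, k)
        star, mu = gram_schmidt(b)
        lhs = dot(star[k], star[k])
        rhs = (delta - mu[k][k - 1] ** 2) * dot(star[k - 1], star[k - 1])
        if lhs >= rhs:                         # Lovasz condition holds
            k += 1
        else:
            b[k - 1], b[k] = b[k], b[k - 1]    # swap and step back
            k = max(k - 1, 1)
    return b

print(lll([[1, 1, 1], [-1, 0, 2], [3, 5, 6]]))
```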

3.
We present a divide-and-conquer based algorithm for optimal quantum compression/decompression, using O(n (log^4 n) log log n) elementary quantum operations. Our result provides the first quasi-linear time algorithm for asymptotically optimal (in size and fidelity) quantum compression and decompression. We also outline the quantum gate array model to bring about this compression in a quantum computer. Our method uses various classical algorithmic tools to significantly improve the bound from the previous best known bound of O(n^3) for this operation.

4.
In this paper we describe scalable parallel algorithms for building the convex hull and a triangulation of n coplanar points. These algorithms are designed for the coarse grained multicomputer model: p processors with O(n/p) ≫ O(1) local memory each, connected to some arbitrary interconnection network. They scale over a large range of values of n and p, assuming only that n ≥ p^(1+ε) (ε > 0), and require time O((T_sequential/p) + T_s(n, p)), where T_s(n, p) refers to the time of a global sort of n data on a p-processor machine. Furthermore, they involve only a constant number of global communication rounds. Since computing either the 2D convex hull or a triangulation requires time T_sequential = Θ(n log n), these algorithms either run in optimal time, Θ((n log n)/p), or in sort time, T_s(n, p), for the interconnection network in question. These results become optimal when T_sequential/p dominates T_s(n, p) or for interconnection networks like the mesh for which optimal sorting algorithms exist.
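As a point of reference for T_sequential, the standard Θ(n log n) sequential hull below (Andrew's monotone chain, in Python) is the building block each CGM processor would run on its O(n/p) local points after the global sort; merging the p partial hulls is where the constant number of communication rounds is spent. This is the single-processor baseline only, not the CGM algorithm itself.

```python
def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:                                   # lower hull, left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):                         # upper hull, right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]                  # endpoints appear once

print(convex_hull([(0, 0), (1, 1), (2, 2), (2, 0), (0, 2), (1, 0)]))
# [(0, 0), (2, 0), (2, 2), (0, 2)]
```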

5.
We introduce a GPU-based parallel vertex substitution (pVS) algorithm for the p-median problem using the CUDA architecture by NVIDIA. pVS is developed based on the best profit search algorithm, an implementation of vertex substitution (VS) that is shown to produce reliable solutions for p-median problems. In our approach, each candidate solution in the entire search space is allocated to a separate thread, rather than dividing the search space into parallel subsets. This strategy maximizes the usage of the GPU parallel architecture and results in a significant speedup and robust solution quality. Computationally, pVS reduces the worst case complexity from sequential VS's O(p · n^2) to O(p · (n − p)) on each thread by parallelizing computational tasks in the GPU implementation. We tested the performance of pVS on two sets of test cases (including 40 network instances from OR-lib) and compared the results against a CPU-based sequential VS implementation. Our results show that pVS achieved a speed gain ranging from 10 to 57 times over the traditional VS in all test network instances.
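A sequential Python sketch of the vertex substitution move that pVS parallelizes: every (facility out, candidate in) swap is an independent evaluation, which is what the GPU version hands to separate CUDA threads. The distance matrix and starting solution here are illustrative only.

```python
def cost(dist, medians):
    """Total distance of every vertex to its closest open median."""
    return sum(min(dist[i][m] for m in medians) for i in range(len(dist)))

def vertex_substitution(dist, p):
    n = len(dist)
    medians = set(range(p))                  # arbitrary starting solution
    improved = True
    while improved:
        improved = False
        best = (cost(dist, medians), None, None)
        for out in list(medians):            # every (out, in) swap is an
            for inn in range(n):             # independent evaluation; pVS
                if inn in medians:           # assigns each one to a thread
                    continue
                c = cost(dist, medians - {out} | {inn})
                if c < best[0]:
                    best = (c, out, inn)
        if best[1] is not None:
            medians = medians - {best[1]} | {best[2]}
            improved = True
    return medians, cost(dist, medians)

dist = [[0, 2, 3, 7], [2, 0, 4, 5], [3, 4, 0, 6], [7, 5, 6, 0]]
print(vertex_substitution(dist, p=2))        # ({0, 3}, 5)
```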

6.
Uneven energy consumption is an inherent problem in wireless sensor networks, characterized by multi-hop routing and a many-to-one traffic pattern. Such unbalanced energy dissipation can significantly reduce network lifetime. In this paper, we study the problem of prolonging network lifetime in large-scale wireless sensor networks where a mobile sink gathers data periodically along a predefined path and each sensor node uploads its data to the mobile sink over a multi-hop communication path. Using a greedy policy and dynamic programming, we propose a heuristic topology control algorithm with time complexity O(n(m + n log n)), where n and m are the number of nodes and edges in the network, respectively, and further discuss how to refine our algorithm to satisfy practical requirements such as distributed computing and transmission timeliness. Theoretical analysis and experimental results show that our algorithm is superior to several earlier algorithms for extending network lifetime.
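The O(m + n log n) factor in the stated bound is the cost of a single-source shortest path computation, repeated over candidate nodes to give the overall bound. A minimal binary-heap Dijkstra (O((n + m) log n), the usual practical stand-in for the Fibonacci-heap bound) is sketched below with an illustrative adjacency-list format; the paper's greedy/dynamic-programming layer sits on top of this primitive.

```python
import heapq

def dijkstra(adj, src):
    """adj: {u: [(v, w), ...]} with nonnegative weights; dist from src."""
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                       # stale heap entry, already improved
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

print(dijkstra({0: [(1, 2), (2, 5)], 1: [(2, 1)], 2: []}, 0))  # {0: 0, 1: 2, 2: 3}
```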

7.
We address the problem of determining all extreme supported solutions of the biobjective shortest path problem. A novel Dijkstra-like method generalizing Dijkstra's algorithm to this biobjective case is proposed. The algorithm runs in O(N(m + n log n)) time to solve the one-to-one and one-to-all biobjective shortest path problems, determining all extreme supported non-dominated points in the outcome space and one supported efficient path associated with each of them. Here n is the number of nodes, m is the number of arcs, and N is the number of extreme supported points in outcome space for the one-to-all biobjective shortest path problem. The memory space required by the algorithm is O(n + m) for the one-to-one problem and O(N + m) for the one-to-all problem. A computational experiment comparing the performance of the proposed methods and state-of-the-art methods is included.
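The paper's method is a single Dijkstra-like pass; as a hedged contrast, the classic dichotomic weighted-sum scheme below recovers the same extreme supported points for the one-to-one case by solving one scalarized Dijkstra per point (so roughly O(N) Dijkstra runs rather than one generalized pass). Tie-breaking at the lexicographic endpoints is only approximated here; all names and the example graph are illustrative.

```python
import heapq

def scalarized_dijkstra(adj, s, t, w1, w2):
    """Min w1*c1 + w2*c2 path from s to t; returns that path's (c1, c2)."""
    pq = [(0, 0, 0, s)]
    done = set()
    while pq:
        d, c1, c2, u = heapq.heappop(pq)
        if u in done:
            continue
        done.add(u)
        if u == t:
            return (c1, c2)
        for v, a, b in adj.get(u, []):         # arc (u, v) with costs (a, b)
            if v not in done:
                heapq.heappush(pq, (d + w1 * a + w2 * b, c1 + a, c2 + b, v))
    return None                                # t unreachable

def extreme_supported(adj, s, t):
    left = scalarized_dijkstra(adj, s, t, 1, 0)    # (near-)min first objective
    right = scalarized_dijkstra(adj, s, t, 0, 1)   # (near-)min second objective
    if left is None:
        return []
    points = {left, right}
    def recurse(p, q):
        w1, w2 = p[1] - q[1], q[0] - p[0]          # normal to segment pq
        r = scalarized_dijkstra(adj, s, t, w1, w2)
        if r not in (p, q) and w1 * r[0] + w2 * r[1] < w1 * p[0] + w2 * p[1]:
            points.add(r)                          # r lies strictly below pq
            recurse(p, r)
            recurse(r, q)
    if left != right:
        recurse(left, right)
    return sorted(points)

adj = {"s": [("a", 1, 5), ("b", 4, 1)], "a": [("t", 1, 5)], "b": [("t", 4, 1)]}
print(extreme_supported(adj, "s", "t"))            # [(2, 10), (8, 2)]
```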

8.
We present a parallel priority queue that supports the following operations in constant time: parallel insertion of a sequence of elements ordered according to key, parallel decrease key for a sequence of elements ordered according to key, deletion of the minimum key element, and deletion of an arbitrary element. Our data structure is the first to support multi-insertion and multi-decrease key in constant time. The priority queue can be implemented on the EREW PRAM and can perform any sequence of n operations in O(n) time and O(m log n) work, m being the total number of keys inserted and/or updated. A main application is a parallel implementation of Dijkstra's algorithm for the single-source shortest path problem, which runs in O(n) time and O(m log n) work on a CREW PRAM on graphs with n vertices and m edges. This is a logarithmic factor improvement in the running time compared with previous approaches.

9.
The well-known Goldbach Conjecture (GC) states that any sufficiently large even number can be represented as a sum of two odd primes. Although not yet proved, it has been checked for integers up to 10^14. Using two stronger versions of the conjecture, we offer a simple and fast method for recognition of a gray box group G known to be isomorphic to S_n (or A_n) with known n ≥ 20, i.e., for construction of an isomorphism from G to S_n (or A_n). Correctness and rigorous worst case complexity estimates rely heavily on the conjectures, and yield times of O([ρ + ν + μ] n log^2 n) or O([ρ + ν + μ] n log n / log log n), depending on which of the stronger versions of the GC is assumed to hold. Here, ρ is the complexity of generating a uniform random element of G, ν is the complexity of finding the order of a group element in G, and μ is the time necessary for group multiplication in G. A rigorous lower bound and a probabilistic approach to the time complexity of the algorithm are discussed in the Appendix.
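A few lines of Python make the assumed statement concrete: the sketch below verifies, by trial division, that every even n in a small range is a sum of two odd primes. The paper assumes stronger effective versions of this conjecture; exhaustive verification (to 10^14, as cited) covers vastly more than this illustration.

```python
def is_prime(k):
    if k < 2:
        return False
    i = 2
    while i * i <= k:           # trial division, fine for small k
        if k % i == 0:
            return False
        i += 1
    return True

def goldbach_pair(n):
    """For even n >= 6, return odd primes (p, q) with p + q = n, else None."""
    for p in range(3, n // 2 + 1, 2):
        if is_prime(p) and is_prime(n - p):
            return p, n - p
    return None

print(goldbach_pair(100))                                # (3, 97)
assert all(goldbach_pair(n) for n in range(6, 10000, 2))
```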

10.
Given a set S of n points in the plane, we consider the problem of partitioning S into two subsets such that the maximum of their diameters is minimized. We present a parallel algorithm to solve this problem that runs in O(log n) time on the CREW PRAM with O(n^2) processors.
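The objective is easy to state executably; the exponential brute force below (Python, illustration only) pins down exactly what is being minimized, whereas the paper achieves O(log n) parallel time with O(n^2) processors.

```python
from itertools import combinations
from math import dist, inf

def diameter(pts):
    return max((dist(a, b) for a, b in combinations(pts, 2)), default=0.0)

def min_max_diameter_partition(points):
    """Exhaustive search; assumes len(points) >= 2. Illustration only."""
    n = len(points)
    best = (inf, None)
    for mask in range(1, 2 ** (n - 1)):   # fix point n-1 in side A to
        side_a = [p for i, p in enumerate(points) if not (mask >> i) & 1]
        side_b = [p for i, p in enumerate(points) if (mask >> i) & 1]
        value = max(diameter(side_a), diameter(side_b))
        if value < best[0]:               # kill the symmetric duplicates
            best = (value, (side_a, side_b))
    return best

pts = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
value, (a, b) = min_max_diameter_partition(pts)
print(round(value, 3), a, b)              # splits the two clusters apart
```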

11.
We have developed a parallel algorithm for radial basis function (RBF) interpolation that exhibits O(N) complexity, requires O(N) storage, and scales excellently up to a thousand processes. The algorithm uses a GMRES iterative solver with a restricted additive Schwarz method (RASM) as a preconditioner and a fast matrix-vector algorithm. Previous fast RBF methods, which achieve at most O(N log N) complexity, were developed using multiquadric and polyharmonic basis functions. In contrast, the present method uses Gaussians with a small variance with respect to the domain, but with sufficient overlap. This is a common choice in particle methods for fluid simulation, our main target application. The fast decay of the Gaussian basis function allows rapid convergence of the iterative solver even when the subdomains in the RASM are very small. At the same time we show that the accuracy of the interpolation can achieve machine precision. The present method was implemented in parallel using the PETSc library (developer version). Numerical experiments demonstrate its capability in problems of RBF interpolation with more than 50 million data points, timing at 106 s (19 iterations for an error tolerance of 10^−15) on 1024 processors of a Blue Gene/L (700 MHz PowerPC processors). The parallel code is freely available under an open-source model.
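A dense single-process sketch of Gaussian RBF interpolation in NumPy: assemble the kernel matrix and solve it directly. The paper avoids exactly this O(N^2) storage and O(N^3) direct solve, using GMRES with a RASM preconditioner and a fast matrix-vector product instead; sigma below is an arbitrary illustrative variance (the paper's point is that it is small relative to the domain while the Gaussians still overlap sufficiently).

```python
import numpy as np

def gaussian_rbf_interpolate(centers, values, queries, sigma=0.1):
    def kernel(a, b):                        # pairwise Gaussian kernel matrix
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    weights = np.linalg.solve(kernel(centers, centers), values)
    return kernel(queries, centers) @ weights

rng = np.random.default_rng(0)
x = rng.random((200, 2))                     # scattered 2-D centers
f = np.sin(4 * x[:, 0]) * np.cos(4 * x[:, 1])
q = rng.random((5, 2))
print(gaussian_rbf_interpolate(x, f, q))
```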

12.
Diagnosis of reliability is an important topic for interconnection networks. Under the classical PMC model, Dahbura and Masson [5] proposed a polynomial time algorithm with time complexity O(N^2.5) to identify all faulty nodes in an N-node network. This paper addresses the fault diagnosis of so-called bijective connection (BC) graphs, including hypercubes, twisted cubes, locally twisted cubes, crossed cubes, and Möbius cubes. Utilizing a helpful structure proposed by Hsu and Tan [20], called the extending star by Lin et al. [24], and noting the existence of a structured Hamiltonian path within any BC graph, we present a fast diagnostic algorithm that identifies all faulty nodes in O(N) time, where N = 2^n, n ≥ 4, stands for the total number of nodes in the n-dimensional BC graph. As a result, this algorithm is significantly superior to the Dahbura–Masson algorithm when applied to BC graphs.

13.
Congruence closure algorithms for deduction in ground equational theories are ubiquitous in many (semi-)decision procedures used for verification and automated deduction. In many of these applications one needs an incremental algorithm that is moreover capable of recovering, among the thousands of input equations, the small subset that explains the equivalence of a given pair of terms. In this paper we present an algorithm satisfying all these requirements. First, building on ideas from abstract congruence closure algorithms, we present a very simple and clean incremental congruence closure algorithm and show that it runs in the best known time, O(n log n). After that, we introduce a proof-producing union-find data structure that is then used for extending our congruence closure algorithm, without increasing the overall O(n log n) time, in order to produce a k-step explanation for a given equation in almost optimal time (quasi-linear in k). Finally, we show that the previous algorithms can be smoothly extended, while still obtaining the same asymptotic time bounds, in order to support the interpreted function symbols successor and predecessor, which have been shown to be very useful in applications such as microprocessor verification.
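The proof-producing union-find can be sketched compactly: alongside the usual forest, keep a second "proof forest" whose edges are labeled with the input equations that caused each merge; explaining a = b then amounts to reading off the labels on the tree path between a and b. The Python sketch below (constants only, naive find, no congruence handling over terms) is an assumption-laden illustration of that idea, not the paper's algorithm.

```python
class ProofUnionFind:
    def __init__(self):
        self.parent = {}    # standard union-find forest
        self.proof = {}     # proof forest: node -> (neighbor, input equation)

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, a, b, reason):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        self.parent[ra] = rb
        self._reroot(a)                 # make a the root of its proof tree,
        self.proof[a] = (b, reason)     # then record the new edge a -- b

    def _reroot(self, x):
        path = []                       # reverse edges on x's path to root
        while x in self.proof:
            nxt, reason = self.proof[x]
            path.append((x, nxt, reason))
            x = nxt
        for u, v, reason in path:
            del self.proof[u]
        for u, v, reason in path:
            self.proof[v] = (u, reason)

    def _path(self, x):
        edges = []
        while x in self.proof:
            nxt, reason = self.proof[x]
            edges.append((x, nxt, reason))
            x = nxt
        return edges

    def explain(self, a, b):
        pa, pb = self._path(a), self._path(b)
        while pa and pb and pa[-1] == pb[-1]:   # drop shared ancestry
            pa.pop(); pb.pop()
        return [reason for _, _, reason in pa + pb]

uf = ProofUnionFind()
uf.union("a", "b", "eq1")
uf.union("c", "d", "eq2")
uf.union("b", "c", "eq3")
print(uf.explain("a", "d"))             # ['eq1', 'eq3', 'eq2']
```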

14.
Maximal outerplanar graphs constitute an important class of graphs, often encountered in various applications, e.g., computational geometry, robotics, etc. In this paper, we propose a parallel algorithm for testing the isomorphism of maximal outerplanar graphs. Given the ordered adjacency lists of the two graphs, the proposed algorithm tests their isomorphism in O(log N) time using N processors, for graphs with N nodes, on an EREW shared memory model as well as on a hypercube architecture. When the adjacency matrices of the graphs are given, this algorithm can be redesigned for N^2 processors to run in O(log N) time.

15.
In this paper, a quadtree-based mesh generation method is described to create guaranteed-quality, geometry-adapted all-quadrilateral (all-quad) meshes with feature preservation for arbitrary planar domains. Given a point cloud, our method generates all-quad meshes with these points as vertices and all angles within [45°, 135°]. For given planar curves, quadtree-based spatial decomposition is governed by the curvature of the boundaries and narrow regions. 2-refinement templates are chosen for local mesh refinement without creating any hanging nodes. A buffer zone is created by removing elements around the boundary. To guarantee the mesh quality, the angles facing the boundary are improved via template implementation, and two buffer layers are inserted in the buffer zone. It is proved that all the elements of the final mesh are quads with angles within [45° − ε, 135° + ε] (ε ≤ 5°), with the exception of badly shaped elements that may be required by sharp angles in the input geometry. We also prove that the scaled Jacobians defined by two edge vectors are in the range [sin(45° − ε), sin 90°], or [0.64, 1.0]. Furthermore, sharp features and narrow regions are detected and preserved automatically. Boundary layer meshes are generated by splitting elements of the second buffer layer. We have applied our algorithm to a set of complicated geometries, including the Lake Superior map and an air foil with multiple components.
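The quality metric in question is easy to state in code: at each corner of a quad, the scaled Jacobian of the two incident edge vectors is their normalized cross product, i.e. the sine of the corner angle, so angles in [45° − ε, 135° + ε] translate into scaled Jacobians of at least sin(45° − ε) ≈ 0.64 for ε = 5°. A small Python sketch:

```python
from math import hypot

def scaled_jacobian(quad):
    """quad: four (x, y) corners in counterclockwise order; min corner quality."""
    worst = 1.0
    for i in range(4):
        o = quad[i]
        a = quad[(i + 1) % 4]                     # next corner
        b = quad[(i - 1) % 4]                     # previous corner
        e1 = (a[0] - o[0], a[1] - o[1])           # the two edge vectors at o
        e2 = (b[0] - o[0], b[1] - o[1])
        cross = e1[0] * e2[1] - e1[1] * e2[0]     # = |e1||e2| sin(angle at o)
        worst = min(worst, cross / (hypot(*e1) * hypot(*e2)))
    return worst

print(scaled_jacobian([(0, 0), (1, 0), (1, 1), (0, 1)]))    # unit square: 1.0
print(scaled_jacobian([(0, 0), (1, 0), (1.5, 1), (0, 1)]))  # skewed: ~0.894
```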

16.
The P4-tidy graphs were introduced by I. Rusu to generalize some already known classes of graphs with few induced P4s (cographs, P4-sparse graphs, P4-lite graphs). Here, we propose an extension of R. Lin and S. Olariu's work on cographs (1994, J. Parallel Distributed Computing 22, 26–36), using the modular decomposition. As an application, we show how to obtain a maximum matching parallel algorithm for the family of P4-tidy graphs (represented by a parse tree) in O(log n) time with O(n/log n) processors in the EREW-PRAM model, for n-vertex graphs.

17.
We prove new lower bounds for nearest neighbor search in the Hamming cube. Our lower bounds are for randomized, two-sided error algorithms in Yao's cell probe model. Our bounds are in the form of a tradeoff among the number of cells, the size of a cell, and the search time. For example, suppose we are searching among n points in the d-dimensional cube, using poly(n, d) cells, each containing poly(d, log n) bits. We get a lower bound of Ω(d/log n) on the search time, a significant improvement over the recent bound of Ω(log d) of Borodin et al. This should be contrasted with the upper bound of O(log log d) for approximate search (and O(1) for a decision version of the problem; our lower bounds hold in that case). By previous results, the bounds for the cube imply similar bounds for nearest neighbor search in high dimensional Euclidean space, and for other geometric problems.

18.
Computations of irregular primes and associated cyclotomic invariants were extended to all primes up to 12 million using multisectioning/convolution methods and a novel approach which originated in the study of Stickelberger codes (Shokrollahi, 1996). The latter idea reduces the problem to that of finding zeros of a polynomial over F_p of degree < (p − 1)/2 among the quadratic nonresidues mod p. Use of fast polynomial gcd-algorithms gives an O(p log^2 p log log p) algorithm for this task. A more efficient algorithm, with comparable asymptotic running time, can be obtained by using Schönhage–Strassen integer multiplication techniques and fast multiple polynomial evaluation algorithms; this approach is particularly efficient when run on primes p for which p − 1 has small prime factors. We also give some improvements on previous implementations for verifying the Kummer–Vandiver conjecture and for computing the cyclotomic invariants of a prime.
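For context, the defining test at a single prime can be written directly (and impractically) in exact arithmetic: p is irregular iff p divides the numerator of a Bernoulli number B_k for some even k ≤ p − 3. The sketch below uses the standard recurrence over rationals, feasible only for tiny p; the paper's methods perform the equivalent test mod p, via fast polynomial arithmetic, for all p up to 12 million.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n):
    """B_0 .. B_n via sum_{j<=m} C(m+1, j) * B_j = 0 (so B_1 = -1/2)."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        s = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-s / (m + 1))
    return B

def is_irregular(p):
    B = bernoulli_numbers(p - 3)
    return any(B[k].numerator % p == 0 for k in range(2, p - 2, 2))

print([p for p in [31, 37, 41, 59, 67] if is_irregular(p)])  # [37, 59, 67]
```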

19.
In this paper, we present a parallel sorting algorithm using the technique of multi-way merge. This algorithm, when implemented on a t-dimensional mesh having n^t nodes (t > 2), sorts n^t elements in O((t^2 − 3t + 2)n) time, thus offering a better order of time complexity than the [((t^2 − t)n log n)/2 + O(nt)]-time algorithm of P. F. Corbett and I. D. Scherson (1992, IEEE Trans. Parallel Distrib. Systems 3, 626–632). Further, the proposed algorithm can also be implemented on a Multi-Mesh network (1999, D. Das, M. De, and B. P. Sinha, IEEE Trans. Comput. 48, 536–551) to sort N elements in 54N^(1/4) + o(N^(1/4)) steps, which is an improvement over the 58N^(1/4) + o(N^(1/4)) steps needed by the algorithm in (1997, M. De, D. Das, M. Ghosh, and B. P. Sinha, IEEE Trans. Comput. 46, 1132–1137).
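The sequential kernel of the method, a k-way merge of sorted runs via a heap, looks as follows; on the t-dimensional mesh the runs live on submeshes and merging is realized by data movement along successive mesh dimensions rather than a shared heap. (Python's heapq.merge does the same job; it is written out here to show the mechanism.)

```python
import heapq

def multiway_merge(runs):
    """Merge k sorted lists by repeatedly popping the smallest head."""
    heap = [(run[0], i, 0) for i, run in enumerate(runs) if run]
    heapq.heapify(heap)
    out = []
    while heap:
        val, i, j = heapq.heappop(heap)
        out.append(val)
        if j + 1 < len(runs[i]):                       # advance within run i
            heapq.heappush(heap, (runs[i][j + 1], i, j + 1))
    return out

print(multiway_merge([[1, 4, 9], [2, 3, 8], [5, 6, 7]]))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
```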

20.
In this paper, we present a novel hexagon-based mesh generation method which creates all-quadrilateral (all-quad) meshes with guaranteed angle bounds and feature preservation for arbitrary planar domains. Given any planar curves, an adaptive hexagon-tree structure is constructed by using the curvature of the boundaries and narrow regions. Then a buffer zone and a hexagonal core mesh are created by removing elements outside or around the boundary. To guarantee the mesh quality, boundary edges of the core mesh are adjusted to improve their formed angles facing the boundary, and two layers of quad elements are inserted in the buffer zone. For any curve with sharp features, a corresponding smooth curve is first constructed and meshed, and then another layer of elements is inserted to match the smooth curve with the original one. It is proved that for any planar smooth curve all the element angles are within [60° − ε, 120° + ε] (ε ≤ 5°). We also prove that the scaled Jacobians defined by two edge vectors are in the range [sin(60° − ε), sin 90°], or [0.82, 1.0]. The same angle range can be guaranteed for curves with sharp features, with the exception of small angles in the input curve. Furthermore, an approach is introduced to match the generated interior and exterior meshes with a relaxed angle range, [30°, 150°]. We have applied our algorithm to a set of complicated geometries, including the China map, the Lake Superior map, and a three-component air foil with sharp features. In addition, all the elements in the final mesh are grouped into five types, and most elements only need a few flops to construct the stiffness matrix for finite element analysis. This significantly reduces the computational time and the required memory during stiffness matrix construction.
