Similar literature
1.
We substantially improve the known algorithms for approximating all the complex zeros of an nth degree polynomial p(x). Our new algorithms save both Boolean and arithmetic sequential time, versus the previous best algorithms of Schönhage [1], Pan [2], and Neff and Reif [3]. In parallel (NC) implementation, we dramatically decrease the number of processors, versus the parallel algorithm of Neff [4], which was the only NC algorithm known for this problem so far. Specifically, under the simple normalization assumption that the variable x has been scaled so as to confine the zeros of p(x) to the unit disc {x : |x| ≤ 1}, our algorithms (which promise to be practically effective) approximate all the zeros of p(x) within the absolute error bound $2^{-b}$, by using order of n arithmetic operations and order of $(b + n)n^2$ Boolean (bitwise) operations (in both cases up to within polylogarithmic factors). The algorithms allow their optimal (work preserving) NC parallelization, so that they can be implemented by using polylogarithmic time and the orders of n arithmetic processors or $(b + n)n^2$ Boolean processors. All the cited bounds on the computational complexity are within polylogarithmic factors from the optimum (in terms of n and b) under both arithmetic and Boolean models of computation (in the Boolean case, under the additional (realistic) assumption that n = O(b)).

2.
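Newton's method for simultaneously finding all zeros of a polynomial of degree n, together with a modification of it, is discussed. A sufficient condition on the initial values that guarantees convergence of both methods is given, and convergence is proved. The results computed for the numerical examples are satisfactory.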

3.
《Automatica》2014,50(12):3030-3037
We present an elimination theory-based method for solving equality-constrained multivariable polynomial least-squares problems in system identification. While most algorithms in elimination theory rely upon Groebner bases and symbolic multivariable polynomial division algorithms, we present an algorithm which is based on computing the nullspace of a large sparse matrix and the zeros of a scalar, univariate polynomial.

4.
This paper describes the parallel solution of a class of large sparse systems of linear equations produced by an oil reservoir simulator. Specifically, we focus on the implementation of a conjugate gradient algorithm for a transputer-based machine. After discussing communication harnesses, we present strategies for decomposing the algorithm on a transputer array, and report the results of measurements of speed-ups for some practical reservoir problems. We then address the problems of preconditioning by first implementing distributed forms of three standard iterative algorithms, namely Jacobi, Gauss-Seidel and Successive Over-relaxation, and determining their convergence and speed-up properties. On the basis of these measurements, we suggest that a Jacobi preconditioned conjugate gradient (JPCG) algorithm appears likely to be the most cost-effective for the class of problems under consideration. Finally we implement the JPCG algorithm and present measurements in support of our claim.
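
The Jacobi preconditioned conjugate gradient idea mentioned in this abstract can be illustrated with a small sequential sketch. This is not the authors' transputer implementation; it assumes a dense, symmetric positive definite matrix stored as a list of lists, and the function name `jacobi_pcg` and its parameters are hypothetical.

```python
def jacobi_pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A (list of lists).

    Illustrative sketch only: dense storage, sequential execution.
    """
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    # Jacobi preconditioner: M = diag(A), so applying M^-1 is a scaling.
    inv_diag = [1.0 / A[i][i] for i in range(n)]

    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)

    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:                # stop on small residual norm
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                   # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz                        # direction update factor
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x
```

For example, `jacobi_pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])` approximates the solution (1/11, 7/11). The preconditioner costs only one multiplication per unknown and needs no communication between unknowns, which is one plausible reason it suits a distributed setting.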

5.
Logistic Regression, AdaBoost and Bregman Distances   (total citations: 8; self-citations: 0; citations by others: 8)
Michael Collins, Robert E. Schapire, Yoram Singer 《Machine Learning》2002,48(1-3):253-285
We give a unified account of boosting and logistic regression in which each learning problem is cast in terms of optimization of Bregman distances. The striking similarity of the two problems in this framework allows us to design and analyze algorithms for both simultaneously, and to easily adapt algorithms designed for one problem to the other. For both problems, we give new algorithms and explain their potential advantages over existing methods. These algorithms are iterative and can be divided into two types based on whether the parameters are updated sequentially (one at a time) or in parallel (all at once). We also describe a parameterized family of algorithms that includes both a sequential- and a parallel-update algorithm as special cases, thus showing how the sequential and parallel approaches can themselves be unified. For all of the algorithms, we give convergence proofs using a general formalization of the auxiliary-function proof technique. As one of our sequential-update algorithms is equivalent to AdaBoost, this provides the first general proof of convergence for AdaBoost. We show that all of our algorithms generalize easily to the multiclass case, and we contrast the new algorithms with the iterative scaling algorithm. We conclude with a few experimental results with synthetic data that highlight the behavior of the old and newly proposed algorithms in different settings.

6.
A c-vertex-ranking of a graph G for a positive integer c is a labeling of the vertices of G with integers such that, for any label i, deletion of all vertices with labels >i leaves connected components, each having at most c vertices with label i. A c-vertex-ranking is optimal if the number of labels used is as small as possible. We present sequential and parallel algorithms to find an optimal c-vertex-ranking of a partial k-tree, that is, a graph of treewidth bounded by a fixed integer k. The sequential algorithm takes polynomial time for any positive integer c. The parallel algorithm takes O(log n) parallel time using a polynomial number of processors on the common CRCW PRAM, where n is the number of vertices in G.
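
The definition of a c-vertex-ranking given in this abstract can be made concrete with a small checker. The sketch below only verifies a given labeling; it does not find an optimal one and is not the algorithm of the paper. The adjacency-dictionary representation and the name `is_c_vertex_ranking` are assumptions made for illustration.

```python
def is_c_vertex_ranking(adj, label, c):
    """Check the c-vertex-ranking condition for a labeling.

    adj:   dict mapping each vertex to an iterable of its neighbours
    label: dict mapping each vertex to an integer label
    c:     positive integer bound
    """
    for i in set(label.values()):
        # Delete all vertices whose label is greater than i.
        kept = {v for v in adj if label[v] <= i}
        seen = set()
        for start in kept:
            if start in seen:
                continue
            # Explore one connected component of the remaining graph and
            # count how many of its vertices carry label i.
            stack = [start]
            seen.add(start)
            count = 0
            while stack:
                v = stack.pop()
                if label[v] == i:
                    count += 1
                for w in adj[v]:
                    if w in kept and w not in seen:
                        seen.add(w)
                        stack.append(w)
            if count > c:
                return False
    return True
```

On the path a-b-c, the labeling a=1, b=2, c=1 satisfies the condition for c = 1, while labeling all three vertices 1 only satisfies it for c ≥ 3.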

7.
The implementation and evaluation of the performance on the ICL DAP of two algorithms for the parallel computation of eigenvalues and eigenvectors of moderately large real symmetric matrices of order N, where 64 < N ≤ 256, is reported. The first of the algorithms is a modified form of a Parallel Orthogonal Transformation algorithm proposed by Clint et al., which has already been implemented on the DAP for matrices of order N, where N < 65. The second, which has also been implemented on the DAP for matrices of order N, where N < 65, is Jacobi's algorithm, in the modified form proposed by Modi and Pryce. A comparison of the efficiency of the two algorithms for the solution of a variety of large matrices is given.

8.
Initial conditions that provide guaranteed and fast convergence of the Weierstrass-like cubically convergent iterative method for the simultaneous determination of all simple zeros of a polynomial are considered. It is proved that this method is convergent under suitable conditions stated in the spirit of Smale's point estimation theory. The proposed convergence conditions are computationally verifiable since they depend only on initial approximations and the degree of a given polynomial, which is of practical importance.
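
To illustrate what "simultaneous determination of all zeros" means, the sketch below implements the classical Weierstrass (Durand-Kerner) iteration, which is only quadratically convergent. It is not the cubically convergent Weierstrass-like method studied in the paper, and it does not check the convergence conditions the abstract refers to. The starting points, the iteration cap, and the name `weierstrass_roots` are assumptions.

```python
def weierstrass_roots(coeffs, iterations=200, tol=1e-12):
    """Approximate all zeros of a polynomial at once.

    coeffs: [a_n, ..., a_1, a_0] with a_n != 0 (highest degree first).
    Classical total-step Weierstrass (Durand-Kerner) iteration.
    """
    n = len(coeffs) - 1
    monic = [c / coeffs[0] for c in coeffs]       # normalise to leading 1

    def p(z):
        acc = 0
        for c in monic:                           # Horner evaluation
            acc = acc * z + c
        return acc

    # Conventional distinct complex starting points (an assumption here).
    zs = [(0.4 + 0.9j) ** k for k in range(n)]

    for _ in range(iterations):
        new = []
        for i, zi in enumerate(zs):
            denom = 1
            for j, zj in enumerate(zs):
                if j != i:
                    denom *= (zi - zj)
            # Weierstrass correction: p(z_i) / prod_{j != i} (z_i - z_j)
            new.append(zi - p(zi) / denom)
        delta = max(abs(a - b) for a, b in zip(new, zs))
        zs = new
        if delta < tol:                           # all approximations settled
            break
    return zs
```

Calling `weierstrass_roots([1, -6, 11, -6])` approximates the zeros 1, 2 and 3 of x^3 - 6x^2 + 11x - 6. All approximations are updated from the previous sweep, so the n corrections within one sweep are independent of each other.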

9.
We propose a model of parallel computation, the YPRAM, that allows general parallel algorithms to be designed for a wide class of parallel models. The basic model captures locality among processors, which is measured as a function of two parameters: latency and bandwidth.

We design YPRAM algorithms for solving several fundamental problems: parallel prefix, sorting, sorting numbers from a bounded range, and list ranking. We show that our model predicts, reasonably accurately, the actual known performances of several basic parallel models (PRAM, hypercube, mesh and tree) when solving these problems.
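
Parallel prefix, the first problem listed above, can be sketched as a round-based scan in which every position is updated from the previous round's values. This is a generic PRAM-style doubling scan simulated sequentially, not the YPRAM algorithm of the paper, and it ignores the latency and bandwidth parameters of that model. The name `parallel_prefix` is hypothetical.

```python
def parallel_prefix(values, op=lambda a, b: a + b):
    """Inclusive prefix computation by doubling, simulated round by round.

    Returns (prefixes, rounds). Each round stands for one parallel step in
    which all positions are updated at once from the previous array.
    """
    x = list(values)
    n = len(x)
    step = 1
    rounds = 0
    while step < n:
        # Position i combines the value 'step' places to its left with its own.
        x = [x[i] if i < step else op(x[i - step], x[i]) for i in range(n)]
        step *= 2
        rounds += 1
    return x, rounds
```

For the input `[1, 2, 3, 4, 5]` the sketch returns the prefix sums `[1, 3, 6, 10, 15]` after 3 rounds, matching the ceil(log2 n) round count of the doubling scheme.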


10.
A parallel two-list algorithm for the knapsack problem   (total citations: 10; self-citations: 0; citations by others: 10)
An n-element knapsack problem has $2^n$ possible solutions to search over, so an exhaustive search needs $2^n$ trials. Because of this exponential running time, the problem is considered to be very hard. In the past decade, much effort has been made to find techniques that could lead to practical algorithms with reasonable running time. In 1994, Chang et al. proposed a brilliant parallel algorithm, which needs $O(2^{n/8})$ processors to solve the knapsack problem in $O(2^{n/2})$ time; that is, the cost of Chang et al.'s parallel algorithm is $O(2^{5n/8})$. In this paper, we propose a parallel algorithm that improves on Chang et al.'s by reducing the time complexity to $O(2^{3n/8})$ with the same $O(2^{n/8})$ processors. Thus, the proposed parallel algorithm has a cost of $O(2^{n/2})$. It is an improvement over previous literature. We believe that the proposed parallel algorithm is pragmatically feasible now that multiprocessor systems are becoming more and more popular.
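
The "two-list" idea named in the title can be sketched sequentially for the subset-sum form of the knapsack problem: split the items into two halves, list all subset sums of each half, sort one list ascending and the other descending, and scan both with two pointers. This is a sequential illustration under the assumption that the knapsack problem meant here is the subset-sum variant; it is not the parallel algorithm of the paper or of Chang et al. The function names are hypothetical.

```python
from itertools import combinations


def subset_sums(items):
    """All 2^len(items) subset sums of a list of weights."""
    sums = []
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            sums.append(sum(combo))
    return sums


def two_list_subset_sum(weights, target):
    """Return True if some subset of weights sums exactly to target."""
    half = len(weights) // 2
    # Two lists of about 2^(n/2) sums each instead of one list of 2^n sums.
    A = sorted(subset_sums(weights[:half]))                 # ascending
    B = sorted(subset_sums(weights[half:]), reverse=True)   # descending
    i = j = 0
    while i < len(A) and j < len(B):
        s = A[i] + B[j]
        if s == target:
            return True
        if s < target:
            i += 1      # need a larger sum: advance in the ascending list
        else:
            j += 1      # need a smaller sum: advance in the descending list
    return False
```

With weights `[3, 34, 4, 12, 5, 2]` the sketch finds a subset summing to 9 (4 + 5) and reports that no subset sums to 30. The cost figures quoted in the abstract follow from multiplying processors by time: 2^(n/8) processors times 2^(3n/8) time gives 2^(n/2).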

11.
J. W. Demmel 《Computing》1987,38(1):43-57
We compare three methods for refining estimates of invariant subspaces, due to Chatelin, Dongarra/Moler/Wilkinson, and Stewart. Even though these methods all apparently solve different equations, we show by changing variables that they all solve the same equation, the Riccati equation. The benefit of this point of view is threefold. First, the same convergence theory applies to all three methods, yielding a single criterion under which the last two methods converge linearly, and a slightly stronger criterion under which the first algorithm converges quadratically. Second, it suggests a hybrid algorithm combining advantages of all three. Third, it leads to algorithms (and convergence criteria) for the generalized eigenvalue problem. These techniques are compared to techniques used in the control systems community.

12.
We consider the problem of inferring the evolutionary tree of a set of n species. We propose a quartet reconstruction method which specifically produces trees whose edges have strong combinatorial evidence. Let Q be a set of resolved quartets defined on the studied species; the method computes the unique maximum subset Q* of Q which is equivalent to a tree and outputs the corresponding tree as an estimate of the species’ phylogeny. We use a characterization of the subset Q* due to Bandelt and Dress (Adv. Appl. Math. 7 (1986) 309–343) to provide an $O(n^4)$ incremental algorithm for this variant of the NP-hard quartet consistency problem. Moreover, when choosing the resolution of the quartets by the four-point method (FPM) and considering the Cavender–Farris model of evolution, we show that the convergence rate of the Q* method is at worst polynomial when the maximum evolutive distance between two species is bounded. We complete these theoretical results by an experimental study on real and simulated data sets. The results show that (i) as expected, the strong combinatorial constraints it imposes on each edge lead the Q* method to propose very few incorrect edges; (ii) more surprisingly, the method infers trees with a relatively high degree of resolution.

13.
We present three parallel sorting algorithms suitable for implementation on tightly coupled multiprocessors and compare their performance on the Denelcor HEP. Two of the algorithms implemented, parallel Shellsort and quickmerge, are new. Shellsort is amenable to parallelization; however, since Shellsort has higher complexity than quicksort, parallel Shellsort is inferior to parallel quicksort. A second new parallel algorithm, called quickmerge, is based upon both quicksort and mergesort. Our implementation of quickmerge achieves significantly higher speedup than our implementation of parallel quicksort.

14.
We examine the parallel execution of a class of stochastic algorithms called Markov chain Monte-Carlo (MCMC) algorithms. We focus on MCMC algorithms in the context of image processing, using Markov random field models. Our parallelisation approach is based on several, concurrently running, instances of the same stochastic algorithm that deal with the whole data set. Firstly we show that the speed-up of the parallel algorithm is limited because of the statistical properties of the MCMC algorithm. We examine coupled MCMC as a remedy for this problem. Secondly, we exploit the parallel execution to monitor the convergence of the stochastic algorithms in a statistically reliable manner. This new convergence measure for MCMC algorithms performs well, and is an improvement on known convergence measures. We also link our findings with recent work in the statistical theory of MCMC.

15.
This paper describes several parallel algorithms for image edge relaxation on array processors with different numbers of processing elements (PEs) connected by a mesh or hypercube network. The time complexity of Prager's original edge relaxation scheme is $O(N^2)$ per iteration using floating-point operations on a sequential machine, where $N^2$ is the number of pixels in the image. Modifications to the scheme are made so that no multiplications are employed and only integer operations are required. Moreover, with parallel processing, the time complexity per iteration is reduced to some constant value. A time complexity analysis on two parallel algorithms is performed. Although the algorithm on an array processor with $4N^2$ PEs achieves a higher degree of parallelism, the algorithm with $N^2$ PEs is preferred. Further modifications on the latter algorithm are made to accommodate fewer PEs.

16.
A parallel randomized support vector machine (PRSVM) and a parallel randomized support vector regression (PRSVR) algorithm based on a randomized sampling technique are proposed in this paper. The proposed PRSVM and PRSVR have four major advantages over previous methods. (1) We prove that the proposed algorithms achieve an average convergence rate that is, to the best of our knowledge, the fastest bounded convergence rate so far among all SVM decomposition training algorithms. The fast average convergence bound is achieved by a unique priority based sampling mechanism. (2) Unlike previous work (Provably fast training algorithm for support vector machines, 2001) the proposed algorithms work for general linear-nonseparable SVM and general non-linear SVR problems. This improvement is achieved by modeling new LP-type problems based on Karush–Kuhn–Tucker optimality conditions. (3) The proposed algorithms are the first parallel version of randomized sampling algorithms for SVM and SVR. Both the analytical convergence bound and the numerical results in a real application show that the proposed algorithm has good scalability. (4) We present demonstrations of the algorithms based on both synthetic data and data obtained from a real-world application. Performance comparisons with SVMlight show that the proposed algorithms may be efficiently implemented.

17.
In this work we show a portable sequential and a portable parallel algorithm for solving the inverse eigenproblem for real symmetric Toeplitz matrices. Both algorithms are based on Broyden's method for solving nonlinear systems. We reduced the computational cost for some problem sizes, and furthermore we managed to reduce spatial cost considerably, compared in both cases with parallel algorithms proposed by other authors and by us, although quasi‐Newton methods (such as Broyden's) do not always reach convergence in all the test cases. We have implemented the parallel algorithm using the parallel numerical linear algebra library ScaLAPACK based on the MPI environment. Experimental results have been obtained using two different architectures: a shared memory multiprocessor, the SGI PowerChallenge, and a cluster of Pentium II PCs connected through a Myrinet network. The algorithms obtained are scalable in all the cases. Copyright © 2004 John Wiley & Sons, Ltd.

18.
Our approach combines the method of inexact steepest descent with the method of contractor directions to obtain an algorithm for solving systems of linear equations. In order to enhance the scope of applicability, we consider an iterative method with variable step-size iterations. We prove the convergence and give an error estimate for our method.

The algorithm is well-suited for parallel computation. In fact, for systems with m equations and n unknowns, each iteration may be computed in parallel time O(log m + log n), on an EREW PRAM with O(mn) processors.


19.
In this paper, sequential and parallel algorithms using derivatives for solving unconstrained one-dimensional global optimization problems are described. Sufficient conditions of convergence to all global minimizers are established for both methods. Conditions under which the parallel algorithm guarantees significant speed-up in comparison to the sequential version of the method are presented. The sequential method is numerically compared with the algorithms of Breiman and Cutler, Pijavskii, and Strongin on a set of 20 test functions taken from the literature. We also present results of numerical experiments illustrating the performance of the parallel method. All experiments have been executed on the parallel computer ALLIANT FX/80.

20.
The problem of finding a rectilinear minimum bend path (RMBP) between two designated points inside a rectilinear polygon has applications in robotics and motion planning. In this paper, we present efficient algorithms to solve the query version of the RMBP problem for special classes of rectilinear polygons given their visibility graphs. Specifically, given an unweighted graph G = (V, E), with |V| = N and |E| = M, we present algorithms that preprocess G in linear space and time such that shortest distance queries (queries asking for the distance between any pair of nodes in the graph) can be answered in constant time and space. For the case of a chordal graph G, our algorithms give a distance which is at most one away from the actual shortest distance. When G is a K-chordal graph, our algorithm produces an exact shortest distance in O(K) time. We also present a non-trivial parallel implementation of the sequential preprocessing algorithm for the CREW-PRAM model which runs in $O(\log^2 N)$ time using O(N + M) processors. After the preprocessing, we can answer the queries in constant time using a single processor.
