Similar documents
20 similar documents found.
1.
We examine the underlying structure of popular algorithms for variational methods used in image processing. We focus here on operator splittings and Bregman methods based on a unified approach via fixed point iterations and averaged operators. In particular, the recently proposed alternating split Bregman method can be interpreted from different points of view: as a Bregman method, as an augmented Lagrangian method, and as a Douglas–Rachford splitting algorithm, a classical operator splitting method. We also study similarities between this method and the forward–backward splitting method when applied to two frequently used models for image denoising, which employ a Besov-norm and a total variation regularization term, respectively. In the first setting, we show that for a discretization based on Parseval frames the gradient descent reprojection and the alternating split Bregman algorithm are equivalent and turn out to be a frame shrinkage method. For the total variation regularizer, we also present a numerical comparison with multistep methods.
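The frame-shrinkage observation above can be sketched in a few lines. The following is my own illustration, not the paper's code: assuming an analysis operator `W` with orthonormal rows (a Parseval frame), the Besov-type model min_u ½‖u−f‖² + λ‖Wu‖₁ is minimized exactly by soft-thresholding the frame coefficients of f.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal map of t*||.||_1: componentwise shrinkage.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def frame_shrinkage_denoise(f, W, lam):
    """Solve min_u 0.5*||u - f||^2 + lam*||W u||_1 for W with orthonormal
    rows (Parseval frame): u = W^T shrink(W f, lam)."""
    c = W @ f                      # analysis: frame coefficients of the data
    return W.T @ soft_threshold(c, lam)   # shrink, then synthesize

# Toy example: the 2-point Haar transform as an orthonormal "frame".
W = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
f = np.array([3.0, 1.0])
u = frame_shrinkage_denoise(f, W, lam=0.5)
```

Since the frame is orthonormal here, this one-step shrinkage is the exact minimizer, which is the equivalence the abstract refers to.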

2.
We present an analysis of sets of matrices with rank less than or equal to a specified number s. We provide a simple formula for the normal cone to such sets, and use this to show that these sets are prox-regular at all points with rank exactly equal to s. The normal cone formula appears to be new. This allows for easy application of prior results guaranteeing local linear convergence of the fundamental alternating projection algorithm between sets, one of which is a rank constraint set. We apply this to show local linear convergence of another fundamental algorithm, approximate steepest descent. Our results apply not only to linear systems with rank constraints, as has been treated extensively in the literature, but also to nonconvex systems with rank constraints.
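A sketch of the alternating projection scheme the abstract refers to, on a toy matrix-completion instance of my own (not the authors' experiments): the projection onto the rank-≤s set is SVD truncation (Eckart–Young), alternated with projection onto the affine set of matrices with prescribed entries.

```python
import numpy as np

def project_rank(X, s):
    # Nearest matrix of rank <= s (Eckart-Young): truncate the SVD.
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    d[s:] = 0.0
    return (U * d) @ Vt

def alternating_projections(X0, s, mask, values, iters=200):
    """Alternate between the rank-<=s set and the affine set of matrices
    agreeing with `values` on `mask` (a toy matrix-completion instance)."""
    X = X0.copy()
    for _ in range(iters):
        X = project_rank(X, s)      # projection onto the rank constraint
        X[mask] = values[mask]      # projection onto the affine set
    return X

rng = np.random.default_rng(0)
M = np.outer(rng.standard_normal(6), rng.standard_normal(5))  # true rank-1 matrix
mask = rng.random(M.shape) < 0.7                              # observed entries
X = alternating_projections(np.where(mask, M, 0.0), s=1, mask=mask, values=M)
```

The local linear convergence result in the abstract applies to exactly this kind of iteration near a point where the two sets intersect transversally.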

3.
Algorithms for accelerated convergence of adaptive PCA
We derive and discuss adaptive algorithms for principal component analysis (PCA) that are shown to converge faster than the traditional PCA algorithms due to Oja and Karhunen (1985), Sanger (1989), and Xu (1993). It is well known that traditional PCA algorithms that are derived by using gradient descent on an objective function are slow to converge. Furthermore, the convergence of these algorithms depends on appropriate choices of the gain sequences. Since online applications demand faster convergence and an automatic selection of gains, we present new adaptive algorithms to solve these problems. We first present an unconstrained objective function, which can be minimized to obtain the principal components. We derive adaptive algorithms from this objective function by using: (1) gradient descent; (2) steepest descent; (3) conjugate direction; and (4) Newton-Raphson methods. Although gradient descent produces Xu's LMSER algorithm, the steepest descent, conjugate direction, and Newton-Raphson methods produce new adaptive algorithms for PCA. We also provide a discussion on the landscape of the objective function, and present a global convergence proof of the adaptive gradient descent PCA algorithm using stochastic approximation theory. Extensive experiments with stationary and nonstationary multidimensional Gaussian sequences show faster convergence of the new algorithms over the traditional gradient descent methods. We also compare the steepest descent adaptive algorithm with state-of-the-art methods on stationary and nonstationary sequences.
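For context, the classical gradient-descent baseline that the new algorithms are compared against can be sketched with Oja's rule for the first principal component (a standard textbook variant, not one of the paper's new methods; learning rate and data are my own illustrative choices):

```python
import numpy as np

def oja_pca(X, lr=0.001, epochs=50, seed=0):
    """Adaptive estimation of the first principal component via Oja's rule:
    a Hebbian term plus an implicit normalization that keeps ||w|| ~ 1."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        for x in X:
            y = w @ x
            w += lr * y * (x - y * w)   # Oja update
    return w / np.linalg.norm(w)

# Anisotropic Gaussian data: the dominant direction is the first axis.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 3)) * np.array([5.0, 1.0, 0.5])
w = oja_pca(X)
```

The slow convergence of exactly this kind of fixed-gain update is what motivates the paper's steepest descent, conjugate direction, and Newton-Raphson variants.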

4.
This article proposes an efficient and simple algebraic method for computing a Gröbner basis generating the alternating Galoisian ideal of a univariate separable polynomial. We call this method "the descent of the Vandermonde determinants".

5.
We generalize discrete variational models involving the infimal convolution (IC) of first and second order differences and the total generalized variation (TGV) to manifold-valued images. We propose both extrinsic and intrinsic approaches. The extrinsic models are based on embedding the manifold into a Euclidean space of higher dimension with manifold constraints. An alternating direction method of multipliers can be employed for finding the minimizers. However, the components within the extrinsic IC or TGV decompositions live in the embedding space, which makes their interpretation difficult. Therefore, we investigate two intrinsic approaches: for Lie groups, we employ the group action within the models; for more general manifolds, our IC model is based on recently developed absolute second order differences on manifolds, while our TGV approach uses an approximation of the parallel transport by the pole ladder. For computing the minimizers of the intrinsic models, we apply gradient descent algorithms. Numerical examples demonstrate that our approaches work well for certain manifolds.

6.
We consider the use of a curvature-adaptive step size in gradient-based iterative methods, including quasi-Newton methods, for minimizing self-concordant functions, extending an approach first proposed for Newton's method by Nesterov. This step size has a simple expression that can be computed analytically; hence, line searches are not needed. We show that using this step size in the BFGS method (and quasi-Newton methods in the Broyden convex class other than the DFP method) results in superlinear convergence for strongly convex self-concordant functions. We present numerical experiments comparing gradient descent and BFGS methods using the curvature-adaptive step size to traditional methods on deterministic logistic regression problems, and to versions of stochastic gradient descent on stochastic optimization problems.
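Nesterov's curvature-adaptive step for self-concordant functions, which the abstract builds on, can be sketched as damped Newton with step 1/(1+λ), where λ is the Newton decrement; no line search is needed. The objective below is my own toy example, not from the paper:

```python
import numpy as np

def damped_newton(grad, hess, x0, iters=30):
    """Damped Newton with the curvature-adaptive step 1/(1+lam), lam being
    the Newton decrement sqrt(g^T H^{-1} g); for self-concordant objectives
    this analytic step replaces a line search."""
    x = x0.astype(float)
    for _ in range(iters):
        g, H = grad(x), hess(x)
        d = np.linalg.solve(H, g)        # Newton direction
        lam = np.sqrt(g @ d)             # Newton decrement
        x = x - d / (1.0 + lam)          # curvature-adaptive step
    return x

# Self-concordant example: f(x) = sum(x_i - log x_i), minimized at x = 1.
grad = lambda x: 1.0 - 1.0 / x
hess = lambda x: np.diag(1.0 / x ** 2)
x_star = damped_newton(grad, hess, np.full(3, 5.0))
```

The paper's contribution is to plug this same step size into BFGS-type directions rather than the exact Newton direction used here.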

7.
In this paper, based on the numerical efficiency of the Hestenes–Stiefel (HS) method, a new modified HS algorithm is proposed for unconstrained optimization. The new direction is independent of the line search and satisfies the sufficient descent condition. Motivated by theoretical and numerical features of the three-term conjugate gradient (CG) methods proposed by Narushima et al., and similar to the approach of Dai and Kou, the new direction is computed by minimizing the distance between the CG direction and the direction of the three-term CG methods proposed by Narushima et al. Under some mild conditions, we establish global convergence of the new method for general functions when the standard Wolfe line search is used. Numerical experiments on some test problems from the CUTEst collection are given to show the efficiency of the proposed method.
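For reference, the classical Hestenes–Stiefel update that the proposed method modifies looks as follows on a strongly convex quadratic, where the exact line search is available in closed form (a textbook sketch, not the paper's modified algorithm):

```python
import numpy as np

def hs_cg_quadratic(A, b, x0, iters=50, tol=1e-10):
    """Conjugate gradient for f(x) = 0.5 x^T A x - b^T x with the
    Hestenes-Stiefel beta: beta = g_{k+1}^T y_k / (d_k^T y_k),
    y_k = g_{k+1} - g_k."""
    x = x0.astype(float)
    g = A @ x - b
    d = -g
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        alpha = -(g @ d) / (d @ (A @ d))   # exact minimizer along d
        x = x + alpha * d
        g_new = A @ x - b
        y = g_new - g
        beta = (g_new @ y) / (d @ y)       # Hestenes-Stiefel formula
        d = -g_new + beta * d
        g = g_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = hs_cg_quadratic(A, b, np.zeros(2))
```

For general (non-quadratic) functions this direction need not be a descent direction, which is the defect the three-term modifications address.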

8.
We study the problem of minimizing the sum of a smooth convex function and a convex block-separable regularizer, and propose a new randomized coordinate descent method, which we call ALPHA. At every iteration our method updates a random subset of coordinates, following an arbitrary distribution. No coordinate descent methods capable of handling an arbitrary sampling have been studied in the literature before for this problem. ALPHA is a very flexible algorithm: in special cases, it reduces to deterministic and randomized methods such as gradient descent, coordinate descent, parallel coordinate descent and distributed coordinate descent, both in nonaccelerated and accelerated variants. The variants with arbitrary (or importance) sampling are new. We provide a complexity analysis of ALPHA, from which we deduce as a direct corollary complexity bounds for its many variants, all matching or improving the best known bounds.
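A minimal sketch of one special case ALPHA covers, randomized coordinate descent with importance (nonuniform) sampling; the quadratic instance and the probabilities proportional to coordinate Lipschitz constants are my own illustration:

```python
import numpy as np

def rcd(A, b, x0, iters=2000, seed=0):
    """Randomized coordinate descent for f(x) = 0.5 x^T A x - b^T x.
    Importance sampling: coordinate i is drawn with probability
    proportional to its Lipschitz constant L_i = A_ii, then minimized
    exactly in that coordinate."""
    rng = np.random.default_rng(seed)
    L = np.diag(A).copy()
    p = L / L.sum()                   # importance-sampling distribution
    x = x0.astype(float)
    for _ in range(iters):
        i = rng.choice(len(x), p=p)
        g_i = A[i] @ x - b[i]         # i-th partial derivative
        x[i] -= g_i / L[i]            # exact coordinate minimization
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x = rcd(A, b, np.zeros(2))
```

ALPHA generalizes this to arbitrary random subsets of coordinates per iteration, with matching complexity bounds.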

9.
Matrix factorization has been widely utilized as a latent factor model for solving the recommender system problem using collaborative filtering. For a recommender system, all the ratings in the rating matrix are bounded within a pre-determined range. In this paper, we propose a new improved matrix factorization approach for such a rating matrix, called Bounded Matrix Factorization (BMF), which imposes a lower and an upper bound on every estimated missing element of the rating matrix. We present an efficient algorithm to solve BMF based on the block coordinate descent method. We show that our algorithm is scalable for large matrices with missing elements on multicore systems with low memory. We present substantial experimental results illustrating that the proposed method outperforms state-of-the-art algorithms for recommender systems such as stochastic gradient descent, alternating least squares with regularization, SVD++ and Bias-SVD on real-world datasets such as Jester, Movielens, Book crossing, Online dating and Netflix.
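A simplified matrix-factorization sketch in the spirit of the setup above; note this is plain SGD that only clips predictions to the rating bounds afterwards, whereas the paper's BMF enforces the bounds inside the block coordinate descent. The toy ratings, rank, and learning rate are my own choices:

```python
import numpy as np

def mf_sgd(R, mask, k=2, lr=0.05, epochs=300, lo=1.0, hi=5.0, seed=0):
    """Rank-k factorization R ~ W H^T fit by SGD on observed entries;
    predictions are clipped to [lo, hi] only after training (a
    simplification of BMF, which keeps the bounds during optimization)."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    W = 0.1 * rng.standard_normal((m, k))
    H = 0.1 * rng.standard_normal((n, k))
    rows, cols = np.nonzero(mask)
    for _ in range(epochs):
        for i, j in zip(rows, cols):
            e = R[i, j] - W[i] @ H[j]
            # simultaneous update: the RHS uses the pre-update factors
            W[i], H[j] = W[i] + lr * e * H[j], H[j] + lr * e * W[i]
    return np.clip(W @ H.T, lo, hi)

R = np.array([[5.0, 4.0, 1.0], [4.0, 5.0, 1.0], [1.0, 1.0, 5.0]])
mask = np.ones_like(R, dtype=bool)
P = mf_sgd(R, mask)
```

The clip step illustrates why bounding matters: without it, a low-rank fit can predict ratings outside the valid range.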

10.
The k-error linear complexity is one of the important measures of the security of stream ciphers. Based on cube theory and a reverse derivation of the Games–Chan algorithm, we propose a method for constructing 2^n-periodic sequences with a given k-error linear complexity profile. First, using the standard cube decomposition algorithm, we classify the 2^n-periodic sequences whose k-error linear complexity has first descent point k = 2, second descent point k′ = 6, and third descent point k″ = 10. We then discuss, for each class, the relations among the linear complexity parameters at the descent points, and finally give counting formulas and construction procedures for the sequences under each parameter relation. In fact, the method can also be used to construct 2^n-periodic sequences with more descent points.
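The Games–Chan recursion, which the construction above runs in reverse, computes the linear complexity of a 2^n-periodic binary sequence by repeated halving; a standard implementation (not the paper's construction itself) is:

```python
def games_chan(s):
    """Games-Chan algorithm: linear complexity of one period of a binary
    sequence whose period length is a power of two."""
    s = list(s)
    c = 0
    while len(s) > 1:
        half = len(s) // 2
        left, right = s[:half], s[half:]
        if left == right:
            s = left                       # complexity unchanged, halve
        else:
            c += half                      # complexity gains 2^(n-1)
            s = [a ^ b for a, b in zip(left, right)]
    return c + s[0]

# The all-ones sequence has linear complexity 1; a single 1 followed by
# zeros (period 8) attains the maximum complexity 8.
```

Each "descent point" in the abstract corresponds to the smallest number of period bit changes that lowers the value this recursion returns.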

11.

Learning the parameters of a probabilistic model is a necessary step in machine learning tasks. We present a method to improve learning from small datasets by using monotonicity conditions. Monotonicity simplifies the learning, and it is often required by users. We present an algorithm for Bayesian network parameter learning. The algorithm and the monotonicity conditions are described, and it is shown that with the monotonicity conditions we can better fit the underlying data. Our algorithm is tested on artificial and empirical datasets. We use different methods satisfying the monotonicity conditions: the proposed gradient descent, isotonic regression EM, and non-linear optimization. We also provide results of unrestricted EM and gradient descent methods. Learned models are compared with respect to their ability to fit data in terms of log-likelihood and their fit of the parameters of the generating model. Our proposed method outperforms other methods for small datasets, and provides better or comparable results for larger ones.
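One of the monotonicity-respecting building blocks mentioned above, isotonic regression, can be computed by the classical pool-adjacent-violators algorithm (PAVA); this is a generic sketch, not the paper's isotonic-regression-EM variant:

```python
def pava(y):
    """Pool-adjacent-violators: least-squares fit to y under a
    nondecreasing constraint, using a stack of (mean, weight) blocks."""
    levels = []                      # merged blocks seen so far
    for v in y:
        mean, w = float(v), 1
        # merge backwards while the nondecreasing order is violated
        while levels and levels[-1][0] > mean:
            m2, w2 = levels.pop()
            mean, w = (mean * w + m2 * w2) / (w + w2), w + w2
        levels.append((mean, w))
    out = []
    for mean, w in levels:
        out.extend([mean] * w)
    return out
```

Projecting estimated conditional probabilities through a step like this is one simple way to enforce the monotonicity conditions the abstract describes.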

12.
The Mumford–Shah model has been one of the most influential models in image segmentation and denoising. The optimization of the multiphase Mumford–Shah energy functional has been performed using level set methods that optimize the Mumford–Shah energy by evolving the level sets via gradient descent. These methods are very slow and prone to getting stuck in local optima due to the use of gradient descent. After the reformulation of the 2-phase Mumford–Shah functional on a graph, several groups investigated the hierarchical extension of the graph representation to multiple classes. The discrete hierarchical approaches are more effective than the hierarchical (or direct) multiphase formulations using level sets. However, they provide approximate solutions and can diverge away from the optimal solution. In this paper, we present a discrete alternating optimization for the discretized Vese–Chan approximation of the piecewise constant multiphase Mumford–Shah functional that directly minimizes the multiphase functional without recursive bisection on the labels. Our approach handles the nonsubmodularity of the multiphase energy function and provides a global optimum if the image estimation data term is known a priori.

13.
There are many image fusion processes to produce a high-resolution multispectral (MS) image from low-resolution MS and high-resolution panchromatic (PAN) images. But the most significant problems are colour distortion and fusion quality. Previously, we reported a fusion process that produced a 1 m resolution IKONOS fused image with minimal spectral distortion. However, block distortion appeared at the edges of the curved sections of the fused image, which was reduced by performing a wavelet transformation as a post-process. Here, we propose an image fusion process using the steepest descent method with bi-linear interpolation, which can remove block distortion without using the wavelet transformation. Bi-linear interpolation provides the proper initial values of the fused image, and then the steepest descent method produces the optimum results of the fusion process. These results improve the spectral as well as the spatial quality of the 1 m resolution fused image when compared with other existing methods, and remove block distortion completely.
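Bilinear interpolation, used above to initialize the fusion iteration, can be sketched as follows; the align-corners convention and the 2× factor are my own choices for illustration:

```python
import numpy as np

def bilinear_upsample(img, factor=2):
    """Bilinear upsampling: each output pixel is a weighted average of its
    four nearest low-resolution neighbours (align-corners convention)."""
    h, w = img.shape
    H, W = h * factor, w * factor
    ys = np.linspace(0, h - 1, H)          # sample rows in input coords
    xs = np.linspace(0, w - 1, W)          # sample cols in input coords
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    return ((1 - wy) * (1 - wx) * img[np.ix_(y0, x0)]
            + (1 - wy) * wx * img[np.ix_(y0, x1)]
            + wy * (1 - wx) * img[np.ix_(y1, x0)]
            + wy * wx * img[np.ix_(y1, x1)])

img = np.array([[0.0, 1.0], [1.0, 2.0]])
up = bilinear_upsample(img)
```

Starting the steepest descent from such a smooth initializer, rather than from a blocky nearest-neighbour enlargement, is what avoids the block artifacts described above.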

14.
In this work we study the parallel coordinate descent method (PCDM) proposed by Richtárik and Takáč [Parallel coordinate descent methods for big data optimization, Math. Program. Ser. A (2015), pp. 1–52] for minimizing a regularized convex function. We adopt elements from the work of Lu and Xiao [On the complexity analysis of randomized block-coordinate descent methods, Math. Program. Ser. A 152(1–2) (2015), pp. 615–642], and combine them with several new insights, to obtain sharper iteration complexity results for PCDM than those presented in [Richtárik and Takáč, Parallel coordinate descent methods for big data optimization, Math. Program. Ser. A (2015), pp. 1–52]. Moreover, we show that PCDM is monotonic in expectation, which was not confirmed in [Richtárik and Takáč, Parallel coordinate descent methods for big data optimization, Math. Program. Ser. A (2015), pp. 1–52], and we also derive the first high probability iteration complexity result where the initial level set is unbounded.

15.
Many recent applications in machine learning and data fitting call for the algorithmic solution of structured smooth convex optimization problems. Although the gradient descent method is a natural choice for this task, it requires exact gradient computations and hence can be inefficient when the problem size is large or the gradient is difficult to evaluate. Therefore, there has been much interest in inexact gradient methods (IGMs), in which an efficiently computable approximate gradient is used to perform the update in each iteration. Currently, non-asymptotic linear convergence results for IGMs are typically established under the assumption that the objective function is strongly convex, which is not satisfied in many applications of interest; while linear convergence results that do not require the strong convexity assumption are usually asymptotic in nature. In this paper, we combine the best of these two types of results by developing a framework for analysing the non-asymptotic convergence rates of IGMs when they are applied to a class of structured convex optimization problems that includes least squares regression and logistic regression. We then demonstrate the power of our framework by proving, in a unified manner, new linear convergence results for three recently proposed algorithms: the incremental gradient method with increasing sample size [R.H. Byrd, G.M. Chin, J. Nocedal, and Y. Wu, Sample size selection in optimization methods for machine learning, Math. Program. Ser. B 134 (2012), pp. 127–155; M.P. Friedlander and M. Schmidt, Hybrid deterministic–stochastic methods for data fitting, SIAM J. Sci. Comput. 34 (2012), pp. A1380–A1405], the stochastic variance-reduced gradient (SVRG) method [R. Johnson and T. Zhang, Accelerating stochastic gradient descent using predictive variance reduction, Advances in Neural Information Processing Systems 26: Proceedings of the 2013 Conference, 2013, pp. 315–323], and the incremental aggregated gradient (IAG) method [D. Blatt, A.O. Hero, and H. Gauchman, A convergent incremental gradient method with a constant step size, SIAM J. Optim. 18 (2007), pp. 29–51]. We believe that our techniques will find further applications in the non-asymptotic convergence analysis of other first-order methods.
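Of the three algorithms analysed, SVRG is easy to sketch; the least-squares instance, step size, and inner-loop length below are my own illustrative choices:

```python
import numpy as np

def svrg_least_squares(A, b, lr=0.02, epochs=30, seed=0):
    """SVRG for min_x (1/2n)||Ax - b||^2: each epoch snapshots the full
    gradient, and inner steps use the variance-reduced estimate
    g_i(x) - g_i(x_snap) + full_grad(x_snap)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    grad_i = lambda v, i: A[i] * (A[i] @ v - b[i])   # per-sample gradient
    for _ in range(epochs):
        x_snap = x.copy()
        mu = A.T @ (A @ x_snap - b) / n              # full gradient at snapshot
        for _ in range(2 * n):                       # inner loop
            i = rng.integers(n)
            g = grad_i(x, i) - grad_i(x_snap, i) + mu
            x = x - lr * g
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((50, 3))
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true
x = svrg_least_squares(A, b)
```

The correction term makes the stochastic gradient estimate unbiased with variance that vanishes at the solution, which is what enables the linear rates the paper establishes.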

16.
Alternating minimization and Boltzmann machine learning
Training a Boltzmann machine with hidden units is appropriately treated in information geometry using the information divergence and the technique of alternating minimization. The resulting algorithm is shown to be closely related to gradient descent Boltzmann machine learning rules, and the close relationship of both to the EM algorithm is described. An iterative proportional fitting procedure for training machines without hidden units is described and incorporated into the alternating minimization algorithm.

17.
The conjugate gradient method is an effective method for large-scale unconstrained optimization problems. Recent research has proposed conjugate gradient methods based on secant conditions to establish fast convergence of the methods. However, these methods do not always generate a descent search direction. In contrast, Y. Narushima, H. Yabe, and J.A. Ford [A three-term conjugate gradient method with sufficient descent property for unconstrained optimization, SIAM J. Optim. 21 (2011), pp. 212–230] proposed a three-term conjugate gradient method which always satisfies the sufficient descent condition. This paper makes use of both ideas to propose descent three-term conjugate gradient methods based on particular secant conditions, and then shows their global convergence properties. Finally, numerical results are given.

18.
We consider mixed variational inequalities involving a non-strictly monotone, differentiable cost mapping and a convex nondifferentiable function. We apply the Tikhonov–Browder regularization technique to these problems. We use uniformly monotone auxiliary functions for constructing regularized problems and apply the gap function approach to the perturbed uniformly monotone variational inequalities. Then we propose a combined regularization and descent method for the initial monotone problems and establish convergence of its iteration sequence. Keywords: variational inequalities, nonsmooth functions, descent methods

19.
We consider the monotone composite variational inequality (CVI) where the underlying mapping is formed as the sum of two monotone mappings. We combine the forward–backward and descent direction ideas together, and thus present the unified algorithmic framework of forward–backward-based descent methods for solving the CVI. A new iterate of such a method is generated in a prediction–correction fashion, where the predictor is yielded by the forward–backward method and then corrected by a descent step. We derive some implementable forward–backward-based descent algorithms for some concrete cases of the CVI, and verify their numerical efficiency via preliminary numerical experiments.
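The forward–backward predictor described above can be sketched for a toy composite VI where the nonsmooth part is the normal cone of the nonnegative orthant, so the backward (resolvent) step is a projection; the paper's additional descent correction step is omitted in this sketch:

```python
import numpy as np

def forward_backward_vi(F, proj, x0, step=0.1, iters=500):
    """Forward-backward iteration for the VI 0 in F(x) + N_C(x):
    a forward (explicit) step on the single-valued monotone part F,
    then a backward (resolvent) step, here the projection onto C."""
    x = x0.astype(float)
    for _ in range(iters):
        x = proj(x - step * F(x))
    return x

# Toy CVI: F(x) = Ax + q with A positive definite, C = nonnegative orthant.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
q = np.array([-1.0, 1.0])
F = lambda x: A @ x + q
proj = lambda z: np.maximum(z, 0.0)   # resolvent of the normal cone of R^n_+
x = forward_backward_vi(F, proj, np.zeros(2))
```

On this instance the iteration solves the complementarity problem x ≥ 0, F(x) ≥ 0, x·F(x) = 0; the paper's correction step is what extends convergence guarantees beyond such well-behaved cases.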

20.
Werfel J, Xie X, Seung HS. Neural Computation, 2005, 17(12): 2699–2718
Gradient-following learning methods can encounter problems of implementation in many applications, and stochastic variants are sometimes used to overcome these difficulties. We analyze three online training methods used with a linear perceptron: direct gradient descent, node perturbation, and weight perturbation. Learning speed is defined as the rate of exponential decay in the learning curves. When the scalar parameter that controls the size of weight updates is chosen to maximize learning speed, node perturbation is slower than direct gradient descent by a factor equal to the number of output units; weight perturbation is slower still by an additional factor equal to the number of input units. Parallel perturbation allows faster learning than sequential perturbation, by a factor that does not depend on network size. We also characterize how uncertainty in quantities used in the stochastic updates affects the learning curves. This study suggests that in practice, weight perturbation may be slow for large networks, and node perturbation can have performance comparable to that of direct gradient descent when there are few output units. However, these statements depend on the specifics of the learning problem, such as the input distribution and the target function, and are not universally applicable.
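The weight-perturbation update analysed above can be sketched for a linear perceptron with squared error; the step size, perturbation scale, and data are my own illustrative choices:

```python
import numpy as np

def weight_perturbation_step(w, x_batch, y_batch, lr=0.1, eps=1e-4, rng=None):
    """One weight-perturbation update: estimate the directional derivative
    of the error from a single perturbed evaluation and move against it,
    a stochastic substitute for the true gradient."""
    rng = rng or np.random.default_rng(0)
    err = lambda v: 0.5 * np.mean((x_batch @ v - y_batch) ** 2)
    z = rng.standard_normal(w.shape)              # random direction
    g_est = (err(w + eps * z) - err(w)) / eps     # finite-difference slope
    return w - lr * g_est * z

rng = np.random.default_rng(3)
X = rng.standard_normal((100, 4))
w_true = np.array([1.0, -1.0, 0.5, 2.0])
y = X @ w_true
w = np.zeros(4)
for _ in range(5000):
    w = weight_perturbation_step(w, X, y, rng=rng)
```

Because each update only moves along one random direction, the effective progress per step shrinks with the number of weights, which is exactly the slowdown factor the analysis above quantifies.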
