共查询到20条相似文献,搜索用时 31 毫秒
1.
ABSTRACTMachine learning (ML) problems are often posed as highly nonlinear and nonconvex unconstrained optimization problems. Methods for solving ML problems based on stochastic gradient descent are easily scaled for very large problems but may involve fine-tuning many hyper-parameters. Quasi-Newton approaches based on the limited-memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) update typically do not require manually tuning hyper-parameters but suffer from approximating a potentially indefinite Hessian with a positive-definite matrix. Hessian-free methods leverage the ability to perform Hessian-vector multiplication without needing the entire Hessian matrix, but each iteration's complexity is significantly greater than quasi-Newton methods. In this paper we propose an alternative approach for solving ML problems based on a quasi-Newton trust-region framework for solving large-scale optimization problems that allow for indefinite Hessian approximations. Numerical experiments on a standard testing data set show that with a fixed computational time budget, the proposed methods achieve better results than the traditional limited-memory BFGS and the Hessian-free methods. 相似文献
2.
Summary The numerical solution of nonlinear problems is usually connected with Newton’s method. Due to its computational cost, variants
(so-called inexact and quasi–Newton methods) have been developed in which the arising inverse of the Jacobian is replaced
by an approximation. In this article we present a new approach which is based on Broyden updates. This method does not require
to store the update history since the updates are explicitly added to the matrix. In addition to updating the inverse we introduce
a method which constructs updates of the LU decomposition. To this end, we present an algorithm for the efficient multiplication of hierarchical and semi-separable matrices.
Since an approximate LU decomposition of finite element stiffness matrices can be efficiently computed in the set of hierarchical matrices, the complexity
of the proposed method scales almost linearly. Numerical examples demonstrate the effectiveness of this new approach.
This work was supported by the DFG priority program SPP 1146 “Modellierung inkrementeller Umformverfahren”. 相似文献
3.
《国际计算机数学杂志》2012,89(12):2608-2617
In this paper, we investigate a symmetric rank-one (SR1) quasi-Newton (QN) formula in which the Hessian of the objective function has some special structure. Instead of approximating the whole Hessian via the SR1 formula, we consider an approach which only approximates part of the Hessian matrix that is not easily acquired. Although the SR1 update possesses desirable features, it is unstable in the sense that, it may not retain positive definiteness and may become undefined. Therefore, we describe some safeguards to overcome these difficulties. Since the structured SR1 method provides a more accurate Hessian approximation, therefore the proposed method reduces significantly the computational efforts needed in solving a problem. The results of a series of experiments on a typical set of standard unconstrained optimization problems are reported, which show that the structured SR1 method exhibits a clear improvement in numerical performance over some existing QN algorithms. 相似文献
4.
A 2D implicit compact scheme solver has been implemented for the vorticity-velocity formulation in the case of nonreacting, multicomponent, axisymmetric, low Mach number flows. To stabilize the discrete boundary value problem, two sets of boundary closures are introduced to couple the velocity and vorticity fields. A Newton solver is used for solving steady-state and time-dependent equations. In this solver, the Jacobian matrix is formulated and stored in component form. To solve the system of linearized equations within each iteration of Newton’s method, preconditioned Bi-CGSTAB is used in combination with a matrix-vector product computed in component form. The almost dense Jacobian matrix is approximated by a partial Jacobian. For the preconditioner equation, the partial Jacobian is approximately factored using several methods. In a detailed study of several preconditioning techniques, a promising method based on ILUT preconditioning in combination with reordering and double scaling using the MC64 algorithm by Duff and Koster is selected. To validate the implicit compact scheme solver, several nonreacting model problems have been considered. At least third order accuracy in space is recovered on nonuniform grids. A comparison of the results of the implicit compact scheme solver with the results of a traditional implicit low order solver shows an order of magnitude reduction of computer memory and time using the compact scheme solver in the case of time-dependent mixing problems. 相似文献
5.
Joanna Karpińska Krzysztof Tchoń Mariusz Janiak 《Journal of Intelligent and Robotic Systems》2012,68(3-4):211-224
This paper addresses the approximation problem of Jacobian inverse kinematics algorithms for redundant robotic manipulators. Specifically, we focus on the approximation of the Jacobian pseudo inverse by the extended Jacobian algorithm. The algorithms are defined as certain dynamic systems driven by the task space error, and identified with vector field distributions. The distribution corresponding to the Jacobian pseudo inverse is non-integrable, while that associated with the extended Jacobian is integrable. Two methods of devising the approximating extended Jacobian algorithm are examined. The first method is referred to as differential geometric, and relies on the approximation of a non-integrable distribution (in fact: a codistribution) by an integrable one. As an alternative, the approximation problem has been formulated as the minimization of an approximation error functional, and solved using the methods of the calculus of variations. Performance of the obtained extended Jacobian inverse kinematics algorithms has been compared by means of computer simulations involving the kinematics model of the 7 dof industrial manipulator POLYCRANK. It is concluded that the differential geometric method offers a rapid, while the variational method a systematic tool for solving inverse kinematic problems. 相似文献
6.
In this paper we present an extensive computational experience with several Newton-like methods, namely Newton’s method, the ABS Huang method, the ABS row update method and six Quasi-Newton methods. The methods are first tested on 31 families of problems with dimensionsn=10, 50, 100 and two starting points. Newton’s method appears to be the best in terms of number of solved problems, followed closely by the ABS Huang method. Broyden’s “bad” method and Greenstadt’s second method show a very poor performance. The other four Quasi-Newton methods perform similarly, strongly suggesting that Greenstadt’s first method and Martínez’ column update method are locally and superlinearly convergent, a result that has yet to be proven theoretically. Thomas’ method appears to be marginally more robust and fast and provides moreover a better approximation to the Jacobian. An interesting and somewhat unexpected observation is that the number of iterations for satisfying the convergence test increases very little with the dimension of the problem. In a second set of experiments we look at the structure of the regions of convergence/nonconvergence by starting the methods from all nodes of a regular grid and assigning to each node a number according to the outcome of the iteration. The obtained regions have clearly a fractal type structure, which, on the two tested problems, is much simpler for Newton’s method than for the other methods. Newton’s method also is the one with the smallest nonconvergence region. Among the Quasi-Newton methods Thomas’ method shows a definitely smaller nonconvergence region. 相似文献
7.
Shape design optimization of stationary fluid-structure interaction problems with large displacements and turbulence 总被引:1,自引:1,他引:0
This paper presents general and efficient methods for analysis and gradient based shape optimization of systems characterized as strongly coupled stationary fluid-structure interaction (FSI) problems. The incompressible fluid flow can be laminar or turbulent and is described using the Reynolds-averaged Navier-Stokes equations (RANS) together with the algebraic Baldwin–Lomax turbulence model. The structure may exhibit large displacements due to the interaction with the fluid domain, resulting in geometrically nonlinear structural behaviour and nonlinear interface coupling conditions. The problem is discretized using Galerkin and Streamline-Upwind/Petrov–Galerkin finite element methods, and the resulting nonlinear equations are solved using Newtons method. Due to the large displacements of the structure, an efficient update algorithm for the fluid mesh must be applied, leading to the use of an approximate Jacobian matrix in the solution routine. Expressions for Design Sensitivity Analysis (DSA) are derived using the direct differentiation approach, and the use of an inexact Jacobian matrix in the analysis leads to an iterative but very efficient scheme for DSA. The potential of gradient based shape optimization of fluid flow and FSI problems is illustrated by several examples. 相似文献
8.
This paper develops an algorithm for iterative learning control on the basis of the quasi-Newton method for nonlinear systems. The new quasi-Newton iterative learning control scheme using the rank-one update to derive the recurrent formula has numerous benefits, which include the approximate treatment for the inverse of the system’s Jacobian matrix. The rank-one update-based ILC also has the advantage of extension for convergence domain and hence guaranteeing the choice of initial value. The algorithm is expressed as a very general norm optimization problem in a Banach space and, in principle, can be used for both continuous and discrete time systems. Furthermore, a detailed convergence analysis is given, and it guarantees theoretically that the proposed algorithm converges at a superlinear rate. Initial conditions which the algorithm requires are also established. The simulations illustrate the theoretical results. 相似文献
9.
J. M. Martínez 《Calcolo》1995,32(3-4):107-135
When the Jacobian of a nonlinear system of equation is fully available, the main drawback for the application of Newton's method is the linear algebra work associated with its basic iteration. In many cases, quasi-Newton methods «with cheap linear algebra» can be applied. The availability of the derivatives leads us to define new quasi-Newton methods where true Jacobians are used in an efficient way. In this paper, we investigate quasi-Newton methods for solving nonlinear simultaneous equations assuming that true derivatives of the function can be used. We prove local convergence theorems and we compare several of the most promising alternatives. 相似文献
10.
New results on recurrent network training: unifying the algorithmsand accelerating convergence 总被引:1,自引:0,他引:1
How to efficiently train recurrent networks remains a challenging and active research topic. Most of the proposed training approaches are based on computational ways to efficiently obtain the gradient of the error function, and can be generally grouped into five major groups. In this study we present a derivation that unifies these approaches. We demonstrate that the approaches are only five different ways of solving a particular matrix equation. The second goal of this paper is develop a new algorithm based on the insights gained from the novel formulation. The new algorithm, which is based on approximating the error gradient, has lower computational complexity in computing the weight update than the competing techniques for most typical problems. In addition, it reaches the error minimum in a much smaller number of iterations. A desirable characteristic of recurrent network training algorithms is to be able to update the weights in an online fashion. We have also developed an online version of the proposed algorithm, that is based on updating the error gradient approximation in a recursive manner. 相似文献
11.
Optimization in Simulation is an important problem often encountered in system behavior investigation; however, the existing methods such as response surface methodology and stochastic approximation method are inefficient. This paper presents a modification of a quasi-Newton method, in which the parameters are determined from some numerical experiments. To demonstrate the validity of the devised method, two examples resembling the M/M/1 queueing problem are solved. The closeness of the converged solutions to the optimal solutions and a comparison with two stochastic approximation methods indicate that the modified quasi-Newton method as devised in this paper is a robust and efficient method for solving optimization problems in simulation. 相似文献
12.
Victor Y. Pan 《Computers & Mathematics with Applications》2011,62(4):1685-1706
Hensel’s symbolic lifting is a highly effective method for the solution of a general or structured (e.g. Toeplitz or Hankel) linear system of equations with integer or rational coefficients of bounded length. It can handle ill conditioned inputs, for which numerical methods become costly. Lifting amounts to recursive multiplications by vectors of the input coefficient matrices and its precomputed inverse modulo a fixed integer s. Such multiplications only involve small numbers of data movements and arithmetic operations with bounded precision. The known methods for precomputation of the inverse are more costly, however; in particular they involve more data movements. As our remedy for this bottleneck stage we create an auxiliary matrix sharing its inverse modulo s with the input matrix, and we readily compute this inverse by applying numerical iterative refinement, which is a numerical counterpart of lifting. In the case of general unstructured as well as Toeplitz, Hankel, and other popular structured inputs our hybrid algorithms involve a small number of data movements and optimal number of Boolean (that is bitwise) operations (up to a logarithmic factor). We extend the algorithms to nearly optimal computation of polynomial greatest common divisors (gcds), least common multiples (lcms) and Padé approximations, as well as the Berlekamp-Massey reconstruction of linear recurrences. We also cover Newton’s lifting for matrix inversion, specialize it to the case of structured input, and combine it with Hensel’s to enhance the overall efficiency. Our initialization techniques work for Newton’s lifting as well. Furthermore we extend all our lifting algorithms to allow their initialization modulo powers of two, thus implementing them in the binary base. 相似文献
13.
Jongrae Kim Bates D.G. Postlethwaite I. 《Automatic Control, IEEE Transactions on》2008,53(8):1937-1941
Stochastic differential equations are now commonly used to model biomolecular networks in systems biology, and much research has been devoted to the development of methods to analyse their stability properties. Stability analysis of such systems may be performed using the Laplace transform, which requires the calculation of the exponential matrix involving time symbolically. However, the calculation of the symbolic exponential matrix is not feasible for problems of even moderate size, as the required computation time increases exponentially with the matrix order. To address this issue, we present a novel method for approximating the Laplace transform which does not require the exponential matrix to be calculated explicitly. The calculation time associated with the proposed method does not increase exponentially with the size of the system, and the approximation error is shown to be of the same order as existing methods. Using this approximation method, we show how a straightforward application of the generalized Nyquist stability criterion provides necessary and sufficient conditions for the stability of stochastic biomolecular networks. The usefulness and computational efficiency of the proposed method is illustrated through its application to the problem of analysing a model for limit-cycle oscillations in cAMP during aggregation of Dictyostelium cells. 相似文献
14.
15.
A hybrid method for locating multipole equilibrium configurations has been proposed recently. The hybrid method combined the efficiency of a quasi-Newton method capable of locating stable and unstable equilibrium solutions with a robust homotopy method capable of tracking equilibrium paths with turning points and exploiting sparsity of the Jacobian matrix at the same time. A quasi-Newton method in conjunction with a deflation technique is proposed here as an alternative to the hybrid method. The proposed method not only exploits sparsity and symmetry, but also represents an improvement in efficiency. 相似文献
16.
Dr. J. Bräuninger 《Computing》1980,25(2):155-162
A quasi-Newton method for unconstrained minimization is presented, which uses a Cholesky factorization of an approximation to the Hessian matrix. In each step a new row and column of this approximation matrix is determined and its Cholesky factorization is updated. This reduces storage requirements and simplifies the calculation of the search direction. Precautions are taken to hold the approximation matrix positive definite. It is shown that under usual conditions the method converges superlinearly or evenn-step quadratic. 相似文献
17.
This paper studies the problem of pricing multi-asset American-style options in the Black–Scholes–Merton framework. The value
function of an option contract is known to satisfy a partial differential variational inequality (PDVI) when early exercise
is permitted. We develop a computational method for the valuation of multi-asset American-style options based on approximating
the PDVI by a non-linear penalized PDE with a penalty term with continuous Jacobian. We convert the non-linear PDE to a variational
(weak) form, discretize the weak formulation spatially by a Galerkin finite element method to obtain a system of ODEs, and
integrate the resulting system of ODEs in time with an adaptive variable order and variable step size solver SUNDIALS. Numerical
results demonstrate that employing a penalty term with continuous Jacobian in contrast to the penalty terms with discontinuous
Jacobians in use in the literature improves computational performance of the adaptive temporal integrator. In our framework
we are able to price American-style options with payoffs dependent on up to six assets on a PC. This is in contrast to the
existing literature on the pricing of American options by PDE methods, that has so far been limited to at most three-dimensional
problems. Our results open avenues for further applications to multi-dimensional problems, such as pricing convertible bonds
in multi-factor models, that will be explored in future work.
This research was supported by the National Science Foundation under grants DMI–0422937 and DMI–0422985. 相似文献
18.
19.
《Automatic Control, IEEE Transactions on》2009,54(6):1216-1229
20.
In this paper, we propose a regular perturbation method to obtain approximate analytic solutions of exterior and interior Dirichlet problems for Laplace’s equation in planar domains. This method, starting from a geometrical perturbation of these planar domains, reduces our problems to a family of classical Dirichlet problems for Laplace’s equation in a circle. Numerical examples are given and comparisons are made with the solutions obtained by other approximation methods. 相似文献