Nonlinear Conjugate Gradient Method
In numerical optimization, the nonlinear conjugate gradient method generalizes the conjugate gradient method to nonlinear optimization. For a quadratic function f(x) = \|Ax - b\|^2, the minimum of f is obtained when the gradient is 0: \nabla_x f = 2A^T(Ax - b) = 0. Whereas linear conjugate gradient seeks a solution to the linear equation A^T A x = A^T b, the nonlinear conjugate gradient method is generally used to find the local minimum of a nonlinear function using its gradient \nabla_x f alone. It works when the function is approximately quadratic near the minimum, which is the case when the function is twice differentiable at the minimum and the second derivative is non-singular there. Given a function f(x) of N variables to minimize, its gradient \nabla_x f indicates the direction of maximum increase. One simply starts in the opposite (steepest descent) direction: \Delta x_0 = -\nabla_x f(x_0) with an adjustable s ...
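To make the iteration concrete, here is a minimal Python sketch of the Fletcher–Reeves variant with a simple backtracking (Armijo) line search; the test matrix A, vector b, tolerances, and step-halving rule are illustrative choices, not prescribed by the method.

import numpy as np

def nonlinear_cg(f, grad, x0, tol=1e-8, max_iter=200):
    # Fletcher-Reeves nonlinear conjugate gradient with a backtracking
    # (Armijo) line search standing in for a full Wolfe-condition search.
    x = x0.astype(float)
    g = grad(x)
    d = -g                                   # steepest-descent start
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        if g @ d >= 0:                       # safeguard: restart downhill
            d = -g
        alpha, fx = 1.0, f(x)
        while f(x + alpha * d) > fx + 1e-4 * alpha * (g @ d) and alpha > 1e-12:
            alpha *= 0.5                     # halve until sufficient decrease
        x_new = x + alpha * d
        g_new = grad(x_new)
        beta = (g_new @ g_new) / (g @ g)     # Fletcher-Reeves coefficient
        d = -g_new + beta * d                # new conjugate direction
        x, g = x_new, g_new
    return x

# Test on the quadratic f(x) = ||Ax - b||^2 discussed above.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
f = lambda x: np.sum((A @ x - b) ** 2)
grad = lambda x: 2 * A.T @ (A @ x - b)
print(nonlinear_cg(f, grad, np.zeros(2)))    # approaches A^{-1} b = [0.2, 0.4]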

Numerical Optimization
Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries. In the more general approach, an optimization problem consists of maximizing or minimizing a real function by systematically choosing input values from within an allowed set and computing the value of the function. The generalization of optimization theory and techniques to other formulations constitutes a large area of applied mathematics. More generally, optimization includes finding "best available" values of some objective function given a define ...
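As a toy illustration of "systematically choosing input values from within an allowed set and computing the value of the function", the following brute-force Python sketch minimizes a made-up one-dimensional objective over a hypothetical candidate grid.

import numpy as np

candidates = np.linspace(-5.0, 5.0, 1001)   # hypothetical allowed set of inputs
objective = lambda x: (x - 3.0) ** 2 + 1.0  # hypothetical real-valued criterion

values = objective(candidates)              # compute the value at each input
best = candidates[np.argmin(values)]        # select the best element
print(best, objective(best))                # approximately 3.0 and 1.0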


L-BFGS
Limited-memory BFGS (L-BFGS or LM-BFGS) is an optimization algorithm in the family of quasi-Newton methods that approximates the Broyden–Fletcher–Goldfarb–Shanno algorithm (BFGS) using a limited amount of computer memory. It is a popular algorithm for parameter estimation in machine learning. The algorithm's target problem is to minimize f(\mathbf{x}) over unconstrained values of the real vector \mathbf{x}, where f is a differentiable scalar function. Like the original BFGS, L-BFGS uses an estimate of the inverse Hessian matrix to steer its search through variable space, but where BFGS stores a dense n \times n approximation to the inverse Hessian (n being the number of variables in the problem), L-BFGS stores only a few vectors that represent the approximation implicitly. Due to its resulting linear memory requirement, the L-BFGS method is particularly well suited for optimization problems with many variables. Instead of the inverse Hessian H_k, L-BFGS maintains a history of t ...
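The implicit representation can be sketched with the standard two-loop recursion; the function below is a minimal illustration, assuming the caller maintains the m most recent (s, y) difference pairs.

import numpy as np

def lbfgs_direction(g, s_hist, y_hist):
    # Two-loop recursion: apply the implicitly stored inverse-Hessian
    # approximation to the gradient g. The history holds pairs
    # s_i = x_{i+1} - x_i and y_i = grad_{i+1} - grad_i, most recent last.
    q = g.copy()
    rhos = [1.0 / (y @ s) for s, y in zip(s_hist, y_hist)]
    alphas = []
    for s, y, rho in reversed(list(zip(s_hist, y_hist, rhos))):
        a = rho * (s @ q)
        q -= a * y
        alphas.append(a)
    # Scale by gamma * I, a common choice of initial inverse-Hessian guess.
    if s_hist:
        gamma = (s_hist[-1] @ y_hist[-1]) / (y_hist[-1] @ y_hist[-1])
    else:
        gamma = 1.0
    r = gamma * q
    for (s, y, rho), a in zip(zip(s_hist, y_hist, rhos), reversed(alphas)):
        b = rho * (y @ r)
        r += (a - b) * s
    return -r    # search direction; memory cost is O(m*n), not O(n^2)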


Wolfe Conditions
In the unconstrained minimization problem, the Wolfe conditions are a set of inequalities for performing inexact line search, especially in quasi-Newton methods, first published by Philip Wolfe in 1969. In these methods the idea is to find \min_{\mathbf{x}} f(\mathbf{x}) for some smooth f\colon \mathbb{R}^n \to \mathbb{R}. Each step often involves approximately solving the subproblem \min_{\alpha} f(\mathbf{x}_k + \alpha \mathbf{p}_k), where \mathbf{x}_k is the current best guess, \mathbf{p}_k \in \mathbb{R}^n is a search direction, and \alpha \in \mathbb{R} is the step length. The inexact line searches provide an efficient way of computing an acceptable step length \alpha that reduces the objective function 'sufficiently', rather than minimizing the objective function over \alpha \in \mathbb{R}^+ exactly. A line search algorithm can use Wolfe conditions as a requirement for any guessed \alpha, before finding a new search direction \mathbf{p}_k. Armijo rule and curvature A step length \alpha_k is said to satisfy the ' ...
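A minimal Python sketch of checking both inequalities for a trial step follows; the constants c1 = 1e-4 and c2 = 0.9 are customary but adjustable choices, and the test problem is made up.

import numpy as np

def satisfies_wolfe(f, grad, x, p, alpha, c1=1e-4, c2=0.9):
    # Weak Wolfe conditions for a trial step length alpha, with constants
    # 0 < c1 < c2 < 1.
    g0 = grad(x) @ p                                      # slope at x along p
    armijo = f(x + alpha * p) <= f(x) + c1 * alpha * g0   # sufficient decrease
    curvature = grad(x + alpha * p) @ p >= c2 * g0        # curvature condition
    return armijo and curvature

# Example on f(x) = ||x||^2 with a steepest-descent direction.
f = lambda x: x @ x
grad = lambda x: 2 * x
x = np.array([1.0, 1.0])
p = -grad(x)
print(satisfies_wolfe(f, grad, x, p, alpha=0.25))   # True for this step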



Nelder–Mead Method
The Nelder–Mead method (also downhill simplex method, amoeba method, or polytope method) is a numerical method used to find the minimum or maximum of an objective function in a multidimensional space. It is a direct search method (based on function comparison) and is often applied to nonlinear optimization problems for which derivatives may not be known. However, the Nelder–Mead technique is a heuristic search method that can converge to non-stationary points on problems that can be solved by alternative methods (Yu, Wen Ci. 1979. "Positive basis and a class of direct search techniques". Scientia Sinica [Zhongguo Kexue]: 53–68; Yu, Wen Ci. 1979. "The convergent property of the simplex evolutionary technique". Scientia Sinica [Zhongguo Kexue]: 69–77). The Nelder–Mead technique was proposed by John Nelder and Roger Mead in 1965, as a development of the method of Spendley et al. Overview The method uses the ...
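As a usage sketch (not the original 1965 formulation), SciPy's implementation can be invoked as follows on the Rosenbrock test function; the starting point is an arbitrary illustrative choice.

import numpy as np
from scipy.optimize import minimize

# Derivative-free minimization of the Rosenbrock function, a standard
# nonlinear test problem, using SciPy's Nelder-Mead implementation.
rosenbrock = lambda x: (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2

result = minimize(rosenbrock, x0=np.array([-1.2, 1.0]), method="Nelder-Mead")
print(result.x)    # close to the true minimizer (1, 1)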



Conjugate Gradient Method
In mathematics, the conjugate gradient method is an algorithm for the numerical solution of particular systems of linear equations, namely those whose matrix is positive-definite. The conjugate gradient method is often implemented as an iterative algorithm, applicable to sparse systems that are too large to be handled by a direct implementation or other direct methods such as the Cholesky decomposition. Large sparse systems often arise when numerically solving partial differential equations or optimization problems. The conjugate gradient method can also be used to solve unconstrained optimization problems such as energy minimization. It is commonly attributed to Magnus Hestenes and Eduard Stiefel, who programmed it on the Z4 and extensively researched it. The biconjugate gradient method provides a generalization to non-symmetric matrices. Various nonlinear conjugate gradient methods seek minima of nonlinear optimization problems. Description of the problem addressed by co ...
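A compact Python sketch of the Hestenes–Stiefel iteration for a symmetric positive-definite system follows; the small test matrix and tolerance are illustrative.

import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    # Solve Ax = b for symmetric positive-definite A. Only matrix-vector
    # products with A are needed, so A may be large and sparse.
    n = len(b)
    x = np.zeros(n) if x0 is None else x0.astype(float)
    r = b - A @ x                   # residual, equal to the negative gradient
    p = r.copy()                    # first search direction
    rs = r @ r
    for _ in range(max_iter or n):
        Ap = A @ p
        alpha = rs / (p @ Ap)       # exact step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p   # next A-conjugate direction
        rs = rs_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b))     # approximately [0.0909, 0.6364]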


Broyden–Fletcher–Goldfarb–Shanno Algorithm
In numerical optimization, the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm is an iterative method for solving unconstrained nonlinear optimization problems. Like the related Davidon–Fletcher–Powell method, BFGS determines the descent direction by preconditioning the gradient with curvature information. It does so by gradually improving an approximation to the Hessian matrix of the loss function, obtained only from gradient evaluations (or approximate gradient evaluations) via a generalized secant method. Since the updates of the BFGS curvature matrix do not require matrix inversion, its computational complexity is only \mathcal{O}(n^2), compared to \mathcal{O}(n^3) in Newton's method. Also in common use is L-BFGS, which is a limited-memory version of BFGS that is particularly suited to problems with very large numbers of variables (e.g., >1000). The BFGS-B variant handles simple box constraints. The algorithm is named after Charles George Broyden, Roger Fletcher, ...
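One BFGS update of the inverse-Hessian approximation can be sketched as below; the expanded rank-two form, the damped fixed step, and the tiny test objective are illustrative simplifications (a real implementation would pair the update with a Wolfe line search).

import numpy as np

def bfgs_update(H, s, y):
    # Rank-two BFGS update of the inverse-Hessian approximation H, written
    # in expanded form so one update costs O(n^2) and needs no inversion.
    # s = x_{k+1} - x_k, y = grad_{k+1} - grad_k; requires s @ y > 0, which
    # a Wolfe-condition line search guarantees.
    sy = s @ y
    Hy = H @ y
    return (H + ((sy + y @ Hy) / sy**2) * np.outer(s, s)
              - (np.outer(Hy, s) + np.outer(s, Hy)) / sy)

# Illustrative quasi-Newton loop on f(x) = x @ x (gradient 2x).
grad = lambda x: 2 * x
x, H = np.array([3.0, -2.0]), np.eye(2)
for _ in range(20):
    g = grad(x)
    if np.linalg.norm(g) < 1e-10:
        break
    s = -0.5 * (H @ g)             # damped quasi-Newton step (no line search)
    x_new = x + s
    y = grad(x_new) - g
    if s @ y > 1e-12:              # curvature guard keeps H positive-definite
        H = bfgs_update(H, s, y)
    x = x_new
print(x)                           # converges to the minimizer [0, 0]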

Double Integrator
In systems and control theory, the double integrator is a canonical example of a second-order control system. It models the dynamics of a simple mass in one-dimensional space under the effect of a time-varying force input \mathbf{u}. Differential equations The differential equations which represent a double integrator are: \ddot{q}(t) = u(t), y(t) = q(t), where both q(t), u(t) \in \mathbb{R}. Let us now represent this in state space form with the vector \mathbf{x} = \begin{bmatrix} q \\ \dot{q} \end{bmatrix}: \dot{\mathbf{x}}(t) = \frac{d\mathbf{x}}{dt} = \begin{bmatrix} \dot{q} \\ \ddot{q} \end{bmatrix}. In this representation, it is clear that the control input \mathbf{u} is the second derivative of the output \mathbf{y}. In the scalar form, the control input is the second derivative of the output q. State space representation The normalized state space model of a double integrator t ...
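A minimal forward-Euler simulation of this state-space model is sketched below; the time step, horizon, and constant unit force are illustrative choices.

import numpy as np

# Forward-Euler simulation of the double integrator in state-space form,
# with state x = [q, q_dot].
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([0.0, 1.0])
C = np.array([1.0, 0.0])

dt, steps = 0.01, 1000
x = np.zeros(2)
for _ in range(steps):
    u = 1.0                          # constant applied force
    x = x + dt * (A @ x + B * u)     # Euler step of x' = Ax + Bu
y = C @ x                            # output is the position q
print(y)                             # ~ u * t^2 / 2 = 50 at t = 10 s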



Feedback Control
Feedback occurs when outputs of a system are routed back as inputs as part of a chain of cause-and-effect that forms a circuit or loop. The system can then be said to ''feed back'' into itself. The notion of cause-and-effect has to be handled carefully when applied to feedback systems. History Self-regulating mechanisms have existed since antiquity, and the idea of feedback had started to enter economic theory in Britain by the 18th century, but it was not at that time recognized as a universal abstraction and so did not have a name. The first ever known artificial feedback device was a float valve, for maintaining water at a constant level, invented in 270 BC in Alexandria, Egypt. This device illustrated the principle of feedback: a low water level opens the valve, the rising water then provides feedback into the system, closing the valve when the required level is reached. This then reoccurs in a circular fashion as the water level fluctuates. Centrifugal governors were u ...
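The float valve's behaviour can be caricatured as an on/off feedback loop in a few lines of Python; the fill rate, draw-down, and target level are invented numbers, chosen only to show the level settling around the setpoint.

# Toy on/off feedback loop modeled on the float valve: a low water level
# opens the valve, and the rising water feeds back to close it again.
level, target, inflow, outflow = 0.0, 10.0, 0.5, 0.2
valve_open = True
for t in range(100):
    if valve_open:
        level += inflow            # open valve admits water
    level -= outflow               # constant draw-down (disturbance)
    valve_open = level < target    # feedback: the level itself sets the valve
print(round(level, 2))             # oscillates near the target level of 10.0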



Optimal Control
Optimal control theory is a branch of mathematical optimization that deals with finding a control for a dynamical system over a period of time such that an objective function is optimized. It has numerous applications in science, engineering and operations research. For example, the dynamical system might be a spacecraft with controls corresponding to rocket thrusters, and the objective might be to reach the moon with minimum fuel expenditure. Or the dynamical system could be a nation's economy, with the objective to minimize unemployment; the controls in this case could be fiscal and monetary policy. A dynamical system may also be introduced to embed operations research problems within the framework of optimal control theory. Optimal control is an extension of the calculus of variations, and is a mathematical optimization method for deriving control policies. The method is largely due to the work of Lev Pontryagin and Richard Bellman in the 1950s, after contributions to calc ...
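As a small worked instance, the discretized double integrator (a crude "spacecraft") can be driven to a target state with minimum control energy via a least-norm solve; the horizon, time step, and target below are illustrative assumptions, not part of the general theory.

import numpy as np

# Minimum-energy control of the discretized double integrator: choose the
# force sequence u_0..u_{N-1} of least squared norm that drives the state
# from rest at the origin to a hypothetical target (position 1, velocity 0).
dt, N = 0.1, 50
Ad = np.array([[1.0, dt], [0.0, 1.0]])       # discrete-time dynamics
Bd = np.array([0.5 * dt**2, dt])

# x_N = sum_k Ad^(N-1-k) Bd u_k, so stack the columns into a 2 x N map G.
G = np.column_stack([np.linalg.matrix_power(Ad, N - 1 - k) @ Bd
                     for k in range(N)])
target = np.array([1.0, 0.0])
u = np.linalg.pinv(G) @ target               # least-norm control sequence

# Verify by simulating forward.
x = np.zeros(2)
for k in range(N):
    x = Ad @ x + Bd * u[k]
print(np.round(x, 6))                        # approximately [1, 0]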


Quasi-Newton Method
Quasi-Newton methods are methods used to either find zeroes or local maxima and minima of functions, as an alternative to Newton's method. They can be used if the Jacobian or Hessian is unavailable or is too expensive to compute at every iteration. The "full" Newton's method requires the Jacobian in order to search for zeros, or the Hessian for finding extrema. Search for zeros: root finding Newton's method to find zeroes of a function g of multiple variables is given by x_{n+1} = x_n - [J_g(x_n)]^{-1} g(x_n), where [J_g(x_n)]^{-1} is the left inverse of the Jacobian matrix J_g(x_n) of g evaluated for x_n. Strictly speaking, any method that replaces the exact Jacobian J_g(x_n) with an approximation is a quasi-Newton method. For instance, the chord method (where J_g(x_n) is replaced by J_g(x_0) for all iterations) is a simple example. The methods given below for optimization refer to an important subclass of quasi-Newton methods, secant methods. Using methods developed to find extrema in order to fi ...
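The chord method mentioned above is easy to sketch: Newton's iteration with the Jacobian frozen at the starting point. The two-variable test system below is an invented example.

import numpy as np

def chord_method(g, J0, x0, tol=1e-10, max_iter=100):
    # Chord method: Newton iteration that reuses the Jacobian J0 evaluated
    # once at x0, the simplest quasi-Newton scheme.
    x = x0.astype(float)
    for _ in range(max_iter):
        step = np.linalg.solve(J0, g(x))   # frozen Jacobian, fresh residual
        x = x - step
        if np.linalg.norm(step) < tol:
            break
    return x

# Example: solve g(x, y) = (x^2 + y - 2, y^2 - 1) = 0 near (1, 1).
g = lambda v: np.array([v[0]**2 + v[1] - 2.0, v[1]**2 - 1.0])
x0 = np.array([1.5, 1.5])
J0 = np.array([[2 * x0[0], 1.0],           # exact Jacobian, at x0 only
               [0.0, 2 * x0[1]]])
print(chord_method(g, J0, x0))             # converges to (1, 1)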


Hessian Matrix
In mathematics, the Hessian matrix or Hessian is a square matrix of second-order partial derivatives of a scalar-valued function, or scalar field. It describes the local curvature of a function of many variables. The Hessian matrix was developed in the 19th century by the German mathematician Ludwig Otto Hesse and later named after him. Hesse originally used the term "functional determinants". Definitions and properties Suppose f : \R^n \to \R is a function taking as input a vector \mathbf{x} \in \R^n and outputting a scalar f(\mathbf{x}) \in \R. If all second-order partial derivatives of f exist, then the Hessian matrix \mathbf{H} of f is a square n \times n matrix, usually defined and arranged as follows: \mathbf{H}_f = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix}, or, by stating an equation for the coefficients using indices i and j, (\mathbf{H}_f)_{i,j} = \dfrac{\partial^2 f}{\partial x_i \, \partial x_j} ...
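The coefficient formula lends itself to a finite-difference approximation; the sketch below uses central differences with an arbitrary step h and a made-up test function.

import numpy as np

def numerical_hessian(f, x, h=1e-5):
    # Approximate (H_f)_{i,j} = d^2 f / (dx_i dx_j) at x by central
    # finite differences; accuracy degrades if h is too small or too large.
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i, e_j = np.zeros(n), np.zeros(n)
            e_i[i], e_j[j] = h, h
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * h * h)
    return H

# Example: f(x, y) = x^2 y has Hessian [[2y, 2x], [2x, 0]].
f = lambda v: v[0] ** 2 * v[1]
print(np.round(numerical_hessian(f, np.array([1.0, 2.0])), 4))
# approximately [[4, 2], [2, 0]]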