Augmented Lagrangian methods are a certain class of

algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...

s for solving constrained

optimization Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfi ...

problems. They have similarities to penalty methods in that they replace a constrained optimization problem by a series of unconstrained problems and add a penalty term to the objective; the difference is that the augmented Lagrangian method adds yet another term, designed to mimic a

Lagrange multiplier In mathematical optimization, the method of Lagrange multipliers is a strategy for finding the local maxima and minima of a function subject to equality constraints (i.e., subject to the condition that one or more equations have to be satisfied ...

. The augmented Lagrangian is related to, but not identical with the method of Lagrange multipliers. Viewed differently, the unconstrained objective is the

Lagrangian Lagrangian may refer to: Mathematics * Lagrangian function, used to solve constrained minimization problems in optimization theory; see Lagrange multiplier ** Lagrangian relaxation, the method of approximating a difficult constrained problem with ...

of the constrained problem, with an additional penalty term (the augmentation). The method was originally known as the method of multipliers, and was studied much in the 1970 and 1980s as a good alternative to penalty methods. It was first discussed by Magnus Hestenes, and by

Michael Powell Michael Latham Powell (30 September 1905 – 19 February 1990) was an English filmmaker, celebrated for his partnership with Emeric Pressburger. Through their production company The Archers, they together wrote, produced and directed a serie ...

in 1969. The method was studied by R. Tyrrell Rockafellar in relation to

Fenchel duality In mathematics, Fenchel's duality theorem is a result in the theory of convex functions named after Werner Fenchel. Let ''ƒ'' be a proper convex function on R''n'' and let ''g'' be a proper concave function on R''n''. Then, if regularity con ...

, particularly in relation to proximal-point methods, Moreau–Yosida regularization, and maximal monotone operators: These methods were used in structural optimization. The method was also studied by

Dimitri Bertsekas Dimitri Panteli Bertsekas (born 1942, Athens, el, Δημήτρης Παντελής Μπερτσεκάς) is an applied mathematician, electrical engineer, and computer scientist, a McAfee Professor at the Department of Electrical Engineeri ...

, notably in his 1982 book, together with extensions involving nonquadratic regularization functions, such as entropic regularization, which gives rise to the "exponential method of multipliers," a method that handles inequality constraints with a twice differentiable augmented Lagrangian function. Since the 1970s,

sequential quadratic programming Sequential quadratic programming (SQP) is an iterative method for constrained nonlinear optimization. SQP methods are used on mathematical problems for which the objective function and the constraints are twice continuously differentiable. SQP me ...

(SQP) and

interior point method Interior-point methods (also referred to as barrier methods or IPMs) are a certain class of algorithms that solve linear and nonlinear convex optimization problems. An interior point method was discovered by Soviet mathematician I. I. Dikin in 1 ...

s (IPM) have had increasing attention, in part because they more easily use

sparse matrix In numerical analysis and scientific computing, a sparse matrix or sparse array is a matrix in which most of the elements are zero. There is no strict definition regarding the proportion of zero-value elements for a matrix to qualify as sparse b ...

subroutine In computer programming, a function or subroutine is a sequence of program instructions that performs a specific task, packaged as a unit. This unit can then be used in programs wherever that particular task should be performed. Functions ma ...

s from numerical

software libraries In computer science, a library is a collection of non-volatile resources used by computer programs, often for software development. These may include configuration data, documentation, help data, message templates, pre-written code and subro ...

, and in part because IPMs have proven complexity results via the theory of self-concordant functions. The augmented Lagrangian method was rejuvenated by the optimization systems

LANCELOT Lancelot du Lac (French for Lancelot of the Lake), also written as Launcelot and other variants (such as early German ''Lanzelet'', early French ''Lanselos'', early Welsh ''Lanslod Lak'', Italian ''Lancillotto'', Spanish ''Lanzarote del Lago' ...

, ALGENCAN and

AMPL AMPL (A Mathematical Programming Language) is an algebraic modeling language to describe and solve high-complexity problems for large-scale mathematical computing (i.e., large-scale optimization and scheduling-type problems). It was developed b ...

, which allowed sparse matrix techniques to be used on seemingly dense but "partially separable" problems. The method is still useful for some problems., chapter 17 Around 2007, there was a resurgence of augmented Lagrangian methods in fields such as total-variation denoising and

compressed sensing Compressed sensing (also known as compressive sensing, compressive sampling, or sparse sampling) is a signal processing technique for efficiently acquiring and reconstructing a signal, by finding solutions to underdetermined linear systems. This ...

. In particular, a variant of the standard augmented Lagrangian method that uses partial updates (similar to the Gauss–Seidel method for solving linear equations) known as the

alternating direction method of multipliers Augmented Lagrangian methods are a certain class of algorithms for solving constrained optimization problems. They have similarities to penalty methods in that they replace a constrained optimization problem by a series of unconstrained problems ...

or ADMM gained some attention.

General method

Let us say we are solving the following constrained problem: :

\min f(\mathbf)

subject to :

c_i(\mathbf) = 0 ~\forall  i \in \mathcal,

where

\mathcal

denotes the indices for equality constraints. This problem can be solved as a series of unconstrained minimization problems. For reference, we first list the ''k''th step of the penalty method approach: :

\min \Phi_k (\mathbf) = f (\mathbf) + \mu_k ~ \sum_ ~ c_i(\mathbf)^2.

The penalty method solves this problem, then at the next iteration it re-solves the problem using a larger value of

\mu_k

(and using the old solution as the initial guess or "warm-start"). The augmented Lagrangian method uses the following unconstrained objective: :

\min \Phi_k (\mathbf) = f (\mathbf) + \frac ~ \sum_ ~ c_i(\mathbf)^2  + \sum_ ~ \lambda_i c_i(\mathbf)

and after each iteration, in addition to updating

\mu_k

, the variable

\lambda

is also updated according to the rule :

\lambda_i \leftarrow \lambda_i + \mu_k c_i(\mathbf_k)

where

\mathbf_k

is the solution to the unconstrained problem at the ''k''th step, i.e.

\mathbf_k=\text \Phi_k(\mathbf)

The variable

\lambda

is an estimate of the

, and the accuracy of this estimate improves at every step. The major advantage of the method is that unlike the penalty method, it is not necessary to take

\mu \rightarrow \infty

in order to solve the original constrained problem. Instead, because of the presence of the Lagrange multiplier term,

\mu

can stay much smaller, thus avoiding ill-conditioning. Nevertheless, it is common in practical implementations to project multipliers estimates in a large bounded set (safeguards), avoiding numerical instabilities and leading to a strong theoretical convergence. The method can be extended to handle inequality constraints. For a discussion of practical improvements, see.

Alternating direction method of multipliers

The alternating direction method of multipliers (ADMM) is a variant of the augmented Lagrangian scheme that uses partial updates for the dual variables. This method is often applied to solve problems such as :

\min_x f(x) + g(x).

This is equivalent to the constrained problem :

\min_ f(x) + g(y), \quad \text\quad  x = y.

Though this change may seem trivial, the problem can now be attacked using methods of constrained optimization (in particular, the augmented Lagrangian method), and the objective function is separable in ''x'' and ''y''. The dual update requires solving a proximity function in ''x'' and ''y'' at the same time; the ADMM technique allows this problem to be solved approximately by first solving for ''x'' with ''y'' fixed, and then solving for ''y'' with ''x'' fixed. Rather than iterate until convergence (like the Jacobi method), the algorithm proceeds directly to updating the dual variable and then repeating the process. This is not equivalent to the exact minimization, but it can still be shown that this method converges to the right answer under some assumptions. Because of this approximation, the algorithm is distinct from the pure augmented Lagrangian method. The ADMM can be viewed as an application of the Douglas-Rachford splitting algorithm, and the Douglas-Rachford algorithm is in turn an instance of the Proximal point algorithm; details can be found here. There are several modern software packages that solve Basis pursuit and variants and use the ADMM; such packages include YALL1 (2009), SpaRSA (2009) and SALSA (2009). There are also packages that use the ADMM to solve more general problems, some of which can exploit multiple computing cores SNAPVX (2015), parADMM (2016).

Stochastic optimization

Stochastic optimization considers the problem of minimizing a loss function with access to noisy samples of the (gradient of the) function. The goal is to have an estimate of the optimal parameter (minimizer) per new sample. ADMM is originally a batch method. However, with some modifications it can also be used for stochastic optimization. Since in a stochastic setting we only have access to noisy samples of gradient, we use an inexact approximation of the Lagrangian as

\hat_ = f_1(x_k)+\langle \nabla f(x_k,\zeta_),x \rangle+g(y)-z^T (Ax + By - c)+\frac \Vert Ax + By - c \Vert^2+\frac,

where

\eta_

is a time-varying step size. The alternating direction method of multipliers (ADMM) is a popular method for online and distributed optimization on a large scale, and is employed in many applications, e.g. ADMM is often applied to solve regularized problems, where the function optimization and regularization can be carried out locally, and then coordinated globally via constraints. Regularized optimization problems are especially relevant in the high dimensional regime since regularization is a natural mechanism to overcome ill-poseidness and to encourage parsimony in the optimal solution, e.g., sparsity and low rank. Due to the efficiency of ADMM in solving regularized problems, it has a good potential for stochastic optimization in high dimensions.

Alternative approaches

Sequential quadratic programming Sequential quadratic programming (SQP) is an iterative method for constrained nonlinear optimization. SQP methods are used on mathematical problems for which the objective function and the constraints are twice continuously differentiable. SQP me ...

* Sequential linear programming *

Sequential linear-quadratic programming Sequential linear-quadratic programming (SLQP) is an iterative method for nonlinear optimization problems where objective function and constraints are twice continuously differentiable. Similarly to sequential quadratic programming (SQP), SLQP pr ...

Software

Open source and non-free/commercial implementations of the augmented Lagrangian method: * Accord.NET (C# implementation of augmented Lagrangian optimizer) * ALGLIB (C# and C++ implementations of preconditioned augmented Lagrangian solver) *

PENNON A pennon, also known as a pennant or pendant, is a long narrow flag which is larger at the hoist than at the fly. It can have several shapes, such as triangular, tapering (square tail) or triangular swallowtail (forked tail), etc. In maritime ...

(GPL 3, commercial license available) *

(free "internal use" license, paid commercial options) *

MINOS In Greek mythology, Minos (; grc-gre, Μίνως, ) was a King of Crete, son of Zeus and Europa. Every nine years, he made King Aegeus pick seven young boys and seven young girls to be sent to Daedalus's creation, the labyrinth, to be eat ...

(also uses an augmented Lagrangian method for some types of problems). * The code for Apache 2.0 licensed

REASON Reason is the capacity of Consciousness, consciously applying logic by Logical consequence, drawing conclusions from new or existing information, with the aim of seeking the truth. It is closely associated with such characteristically human activ ...

is available online. * ALGENCAN (Fortran implementation of augmented Lagrangian method with safeguards). Available online. * NLOPT (C++ implementation of augmented Lagrangian optimizer, accessible from different programming languages) * PyProximal (Python implementation of augmented Lagrangian method).

References

Bibliography

* * * {{DEFAULTSORT:Augmented Lagrangian Method Optimization algorithms and methods

General method

Alternating direction method of multipliers

Stochastic optimization

Alternative approaches

Software

See also

References

Bibliography