Augmented Lagrangian methods are a certain class of
algorithm
In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...
s for solving
constrained optimization
Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfi ...
problems. They have similarities to
penalty methods in that they replace a constrained optimization problem by a series of unconstrained problems and add a penalty term to the
objective; the difference is that the augmented Lagrangian method adds yet another term, designed to mimic a
Lagrange multiplier
In mathematical optimization, the method of Lagrange multipliers is a strategy for finding the local maxima and minima of a function subject to equality constraints (i.e., subject to the condition that one or more equations have to be satisfied ...
. The augmented Lagrangian is related to, but not identical with the
method of Lagrange multipliers.
Viewed differently, the unconstrained objective is the
Lagrangian
Lagrangian may refer to:
Mathematics
* Lagrangian function, used to solve constrained minimization problems in optimization theory; see Lagrange multiplier
** Lagrangian relaxation, the method of approximating a difficult constrained problem with ...
of the constrained problem, with an additional penalty term (the augmentation).
The method was originally known as the method of multipliers, and was studied much in the 1970 and 1980s as a good alternative to penalty methods. It was first discussed by
Magnus Hestenes, and by
Michael Powell
Michael Latham Powell (30 September 1905 – 19 February 1990) was an English filmmaker, celebrated for his partnership with Emeric Pressburger. Through their production company The Archers, they together wrote, produced and directed a serie ...
in 1969. The method was studied by
R. Tyrrell Rockafellar in relation to
Fenchel duality In mathematics, Fenchel's duality theorem is a result in the theory of convex functions named after Werner Fenchel.
Let ''ƒ'' be a proper convex function on R''n'' and let ''g'' be a proper concave function on R''n''. Then, if regularity con ...
, particularly in relation to proximal-point methods,
Moreau–Yosida regularization, and
maximal monotone operators: These methods were used in
structural optimization. The method was also studied by
Dimitri Bertsekas
Dimitri Panteli Bertsekas (born 1942, Athens, el, Δημήτρης Παντελής Μπερτσεκάς) is an applied mathematician, electrical engineer, and computer scientist, a McAfee Professor at the Department of Electrical Engineeri ...
, notably in his 1982 book, together with extensions involving nonquadratic regularization functions, such as
entropic regularization, which gives rise to the "exponential method of multipliers," a method that handles inequality constraints with a twice differentiable augmented Lagrangian function.
Since the 1970s,
sequential quadratic programming
Sequential quadratic programming (SQP) is an iterative method for constrained nonlinear optimization. SQP methods are used on mathematical problems for which the objective function and the constraints are twice continuously differentiable.
SQP me ...
(SQP) and
interior point method
Interior-point methods (also referred to as barrier methods or IPMs) are a certain class of algorithms that solve linear and nonlinear convex optimization problems.
An interior point method was discovered by Soviet mathematician I. I. Dikin in 1 ...
s (IPM) have had increasing attention, in part because they more easily use
sparse matrix
In numerical analysis and scientific computing, a sparse matrix or sparse array is a matrix in which most of the elements are zero. There is no strict definition regarding the proportion of zero-value elements for a matrix to qualify as sparse b ...
subroutine
In computer programming, a function or subroutine is a sequence of program instructions that performs a specific task, packaged as a unit. This unit can then be used in programs wherever that particular task should be performed.
Functions ma ...
s from
numerical software libraries
In computer science, a library is a collection of non-volatile resources used by computer programs, often for software development. These may include configuration data, documentation, help data, message templates, pre-written code and subro ...
, and in part because IPMs have proven complexity results via the theory of
self-concordant functions. The augmented Lagrangian method was rejuvenated by the optimization systems
LANCELOT
Lancelot du Lac (French for Lancelot of the Lake), also written as Launcelot and other variants (such as early German ''Lanzelet'', early French ''Lanselos'', early Welsh ''Lanslod Lak'', Italian ''Lancillotto'', Spanish ''Lanzarote del Lago' ...
, ALGENCAN
and
AMPL
AMPL (A Mathematical Programming Language) is an algebraic modeling language to describe and solve high-complexity problems for large-scale mathematical computing (i.e., large-scale optimization and scheduling-type problems).
It was developed b ...
, which allowed sparse matrix techniques to be used on seemingly dense but "partially separable" problems. The method is still useful for some problems.
[, chapter 17]
Around 2007, there was a resurgence of augmented Lagrangian methods in fields such as
total-variation denoising and
compressed sensing
Compressed sensing (also known as compressive sensing, compressive sampling, or sparse sampling) is a signal processing technique for efficiently acquiring and reconstructing a signal, by finding solutions to underdetermined linear systems. This ...
. In particular, a variant of the standard augmented Lagrangian method that uses partial updates (similar to the
Gauss–Seidel method for solving linear equations) known as the
alternating direction method of multipliers Augmented Lagrangian methods are a certain class of algorithms for solving constrained optimization problems. They have similarities to penalty methods in that they replace a constrained optimization problem by a series of unconstrained problems ...
or ADMM gained some attention.
General method
Let us say we are solving the following constrained problem:
:
subject to
:
where
denotes the indices for equality constraints.
This problem can be solved as a series of unconstrained minimization problems. For reference, we first list the ''k''th step of the
penalty method approach:
:
The penalty method solves this problem, then at the next iteration it re-solves the problem using a larger value of
(and using the old solution as the initial guess or "warm-start").
The augmented Lagrangian method uses the following unconstrained objective:
:
and after each iteration, in addition to updating
, the variable
is also updated according to the rule
:
where
is the solution to the unconstrained problem at the ''k''th step, i.e.
The variable
is an estimate of the
Lagrange multiplier
In mathematical optimization, the method of Lagrange multipliers is a strategy for finding the local maxima and minima of a function subject to equality constraints (i.e., subject to the condition that one or more equations have to be satisfied ...
, and the accuracy of this estimate improves at every step. The major advantage of the method is that unlike the
penalty method, it is not necessary to take
in order to solve the original constrained problem. Instead, because of the presence of the Lagrange multiplier term,
can stay much smaller, thus avoiding ill-conditioning.
Nevertheless, it is common in practical implementations to project multipliers estimates in a large bounded set (safeguards), avoiding numerical instabilities and leading to a strong theoretical convergence.
The method can be extended to handle inequality constraints. For a discussion of practical improvements, see.
Alternating direction method of multipliers
The alternating direction method of multipliers (ADMM) is a variant of the augmented Lagrangian scheme that uses partial updates for the dual variables. This method is often applied to solve problems such as
:
This is equivalent to the constrained problem
:
Though this change may seem trivial, the problem can now be attacked using methods of constrained optimization (in particular, the augmented Lagrangian method), and the objective function is separable in ''x'' and ''y''. The dual update requires solving a proximity function in ''x'' and ''y'' at the same time; the ADMM technique allows this problem to be solved approximately by first solving for ''x'' with ''y'' fixed, and then solving for ''y'' with ''x'' fixed. Rather than iterate until convergence (like the
Jacobi method), the algorithm proceeds directly to updating the dual variable and then repeating the process. This is not equivalent to the exact minimization, but it can still be shown that this method converges to the right answer under some assumptions. Because of this approximation, the algorithm is distinct from the pure augmented Lagrangian method.
The ADMM can be viewed as an application of the
Douglas-Rachford splitting algorithm, and the Douglas-Rachford algorithm is in turn an instance of the
Proximal point algorithm; details can be found here. There are several modern software packages that solve
Basis pursuit and variants and use the ADMM; such packages include YALL1 (2009), SpaRSA (2009) and SALSA (2009). There are also packages that use the ADMM to solve more general problems, some of which can exploit multiple computing cores SNAPVX (2015), parADMM (2016).
Stochastic optimization
Stochastic optimization considers the problem of minimizing a loss function with access to noisy samples of the (gradient of the) function. The goal is to have an estimate of the optimal parameter (minimizer) per new sample. ADMM is originally a batch method. However, with some modifications it can also be used for stochastic optimization. Since in a stochastic setting we only have access to noisy samples of gradient, we use an inexact approximation of the Lagrangian as
where
is a time-varying step size.
The alternating direction method of multipliers (ADMM) is a popular method for online and distributed optimization on a large scale, and is employed in many applications, e.g.
ADMM is often applied to solve regularized problems, where the function optimization and regularization can be carried out locally, and then coordinated globally via constraints.
Regularized optimization problems are especially relevant in the high dimensional regime since regularization is a natural mechanism to overcome ill-poseidness and to encourage parsimony in the optimal solution, e.g., sparsity and low rank. Due to the efficiency of ADMM in solving regularized problems, it has a good potential for stochastic optimization in high dimensions.
Alternative approaches
*
Sequential quadratic programming
Sequential quadratic programming (SQP) is an iterative method for constrained nonlinear optimization. SQP methods are used on mathematical problems for which the objective function and the constraints are twice continuously differentiable.
SQP me ...
*
Sequential linear programming
*
Sequential linear-quadratic programming
Sequential linear-quadratic programming (SLQP) is an iterative method for nonlinear optimization problems where objective function and constraints are twice continuously differentiable. Similarly to sequential quadratic programming (SQP), SLQP pr ...
Software
Open source and non-free/commercial implementations of the augmented Lagrangian method:
*
Accord.NET (C# implementation of augmented Lagrangian optimizer)
*
ALGLIB (C# and C++ implementations of preconditioned augmented Lagrangian solver)
*
PENNON
A pennon, also known as a pennant or pendant, is a long narrow flag which is larger at the hoist than at the fly. It can have several shapes, such as triangular, tapering (square tail) or triangular swallowtail (forked tail), etc. In maritime ...
(GPL 3, commercial license available)
*
LANCELOT
Lancelot du Lac (French for Lancelot of the Lake), also written as Launcelot and other variants (such as early German ''Lanzelet'', early French ''Lanselos'', early Welsh ''Lanslod Lak'', Italian ''Lancillotto'', Spanish ''Lanzarote del Lago' ...
(free "internal use" license, paid commercial options)
*
MINOS
In Greek mythology, Minos (; grc-gre, Μίνως, ) was a King of Crete, son of Zeus and Europa. Every nine years, he made King Aegeus pick seven young boys and seven young girls to be sent to Daedalus's creation, the labyrinth, to be eat ...
(also uses an augmented Lagrangian method for some types of problems).
* The code for Apache 2.0 licensed
REASON
Reason is the capacity of Consciousness, consciously applying logic by Logical consequence, drawing conclusions from new or existing information, with the aim of seeking the truth. It is closely associated with such characteristically human activ ...
is available online.
* ALGENCAN (Fortran implementation of augmented Lagrangian method with safeguards). Available online.
* NLOPT (C++ implementation of augmented Lagrangian optimizer, accessible from different programming languages)
* PyProximal (Python implementation of augmented Lagrangian method).
See also
*
Barrier function
*
Interior-point method
*
Lagrange multiplier
In mathematical optimization, the method of Lagrange multipliers is a strategy for finding the local maxima and minima of a function subject to equality constraints (i.e., subject to the condition that one or more equations have to be satisfied ...
*
Penalty method
References
Bibliography
*
*
*
{{DEFAULTSORT:Augmented Lagrangian Method
Optimization algorithms and methods