Convex optimization

Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets (or, equivalently, maximizing concave functions over convex sets). Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is in general NP-hard. Convex optimization has applications in a wide range of disciplines, such as automatic control systems, estimation and signal processing, communications and networks, electronic circuit design, data analysis and modeling, finance, statistics (optimal experimental design), and structural optimization, where the approximation concept has proven to be efficient. With recent advancements in computing and optimization algorithms, convex programming is nearly as straightforward as linear programming.


Definition

A convex optimization problem is an optimization problem in which the objective function is a convex function and the feasible set is a convex set. A function f mapping some subset of \mathbb{R}^n into \mathbb{R} \cup \{\pm\infty\} is convex if its domain is convex and for all \theta \in [0, 1] and all x, y in its domain, the following condition holds: f(\theta x + (1 - \theta)y) \leq \theta f(x) + (1 - \theta) f(y). A set S is convex if for all members x, y \in S and all \theta \in [0, 1], we have that \theta x + (1 - \theta) y \in S.

Concretely, a convex optimization problem is the problem of finding some \mathbf{x}^\ast \in C attaining \inf \{ f(\mathbf{x}) : \mathbf{x} \in C \}, where the objective function f : \mathcal{D} \subseteq \mathbb{R}^n \to \mathbb{R} is convex, as is the feasible set C. If such a point exists, it is referred to as an ''optimal point'' or ''solution''; the set of all optimal points is called the ''optimal set''. If f is unbounded below over C or the infimum is not attained, then the optimization problem is said to be ''unbounded''. Otherwise, if C is the empty set, then the problem is said to be ''infeasible''.
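
The defining inequality can be spot-checked numerically. The following is a minimal sketch (the function names, sample sizes, and tolerance are my own illustrative choices, not from the source); passing such a sampled check is evidence of convexity, not a proof:

```python
import numpy as np

def is_convex_on_samples(f, xs, ys, thetas):
    """Check f(theta*x + (1-theta)*y) <= theta*f(x) + (1-theta)*f(y)
    on sampled point pairs. A pass suggests convexity; it does not prove it."""
    for x, y in zip(xs, ys):
        for t in thetas:
            lhs = f(t * x + (1 - t) * y)
            rhs = t * f(x) + (1 - t) * f(y)
            if lhs > rhs + 1e-12:  # small tolerance for floating-point error
                return False
    return True

rng = np.random.default_rng(0)
xs = rng.normal(size=(100, 3))
ys = rng.normal(size=(100, 3))
thetas = np.linspace(0, 1, 11)

print(is_convex_on_samples(lambda v: np.dot(v, v), xs, ys, thetas))   # True: ||v||^2 is convex
print(is_convex_on_samples(lambda v: -np.dot(v, v), xs, ys, thetas))  # False: -||v||^2 is concave
```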


Standard form

A convex optimization problem is in ''standard form'' if it is written as

\begin{align}
&\underset{\mathbf{x}}{\operatorname{minimize}} & & f(\mathbf{x}) \\
&\operatorname{subject\ to} & & g_i(\mathbf{x}) \leq 0, \quad i = 1, \dots, m \\
& & & h_i(\mathbf{x}) = 0, \quad i = 1, \dots, p,
\end{align}

where:
* \mathbf{x} \in \mathbb{R}^n is the optimization variable;
* the objective function f : \mathcal{D} \subseteq \mathbb{R}^n \to \mathbb{R} is a convex function;
* the inequality constraint functions g_i : \mathbb{R}^n \to \mathbb{R}, i = 1, \ldots, m, are convex functions;
* the equality constraint functions h_i : \mathbb{R}^n \to \mathbb{R}, i = 1, \ldots, p, are affine transformations, that is, of the form h_i(\mathbf{x}) = \mathbf{a}_i \cdot \mathbf{x} - b_i, where \mathbf{a}_i is a vector and b_i is a scalar.

This notation describes the problem of finding \mathbf{x} \in \mathbb{R}^n that minimizes f(\mathbf{x}) among all \mathbf{x} satisfying g_i(\mathbf{x}) \leq 0, i = 1, \ldots, m and h_i(\mathbf{x}) = 0, i = 1, \ldots, p. The function f is the objective function of the problem, and the functions g_i and h_i are the constraint functions. The feasible set C of the optimization problem consists of all points \mathbf{x} \in \mathcal{D} satisfying the constraints. This set is convex because \mathcal{D} is convex, the sublevel sets of convex functions are convex, affine sets are convex, and the intersection of convex sets is convex.

A solution to a convex optimization problem is any point \mathbf{x} \in C attaining \inf \{ f(\mathbf{x}) : \mathbf{x} \in C \}. In general, a convex optimization problem may have zero, one, or many solutions.

Many optimization problems can be equivalently formulated in this standard form. For example, the problem of maximizing a concave function f can be re-formulated equivalently as the problem of minimizing the convex function -f. The problem of maximizing a concave function over a convex set is commonly called a convex optimization problem.
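
As a minimal sketch of a standard-form instance (assuming the open-source CVXPY modeling library; the specific objective and constraints below are illustrative, not from the source):

```python
import cvxpy as cp
import numpy as np

# A small standard-form instance: minimize a convex quadratic f(x)
# subject to one convex inequality g(x) <= 0 and one affine equality h(x) = 0.
x = cp.Variable(2)
objective = cp.Minimize(cp.sum_squares(x - np.array([2.0, 1.0])))
constraints = [
    cp.norm(x, 2) <= 1.0,  # g(x) = ||x||_2 - 1, convex
    x[0] + x[1] == 0.5,    # h(x) affine
]
problem = cp.Problem(objective, constraints)
problem.solve()
print(problem.status, problem.value, x.value)
```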


Properties

The following are useful properties of convex optimization problems:
* every local minimum is a global minimum;
* the optimal set is convex;
* if the objective function is ''strictly'' convex, then the problem has at most one optimal point.

These results are used by the theory of convex minimization along with geometric notions from functional analysis (in Hilbert spaces) such as the Hilbert projection theorem, the separating hyperplane theorem, and Farkas' lemma.
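
The first property follows directly from the defining inequality; a short argument, written out here for completeness:

```latex
\textbf{Sketch (local implies global).}
Suppose $x$ is a local minimum of a convex $f$ over a convex set $C$,
and that $f(y) < f(x)$ for some $y \in C$. For every $\theta \in (0, 1]$,
\[
  f\bigl((1-\theta)x + \theta y\bigr) \le (1-\theta) f(x) + \theta f(y) < f(x),
\]
and the point $(1-\theta)x + \theta y$ lies in $C$ and tends to $x$ as $\theta \to 0^{+}$,
contradicting local minimality. Hence $x$ is a global minimum.
```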


Applications

The following problem classes are all convex optimization problems, or can be reduced to convex optimization problems via simple transformations:
* Least squares (a small instance is sketched after these lists)
* Linear programming
* Convex quadratic minimization with linear constraints
* Quadratic minimization with convex quadratic constraints
* Conic optimization
* Geometric programming
* Second-order cone programming
* Semidefinite programming
* Entropy maximization with appropriate constraints

Convex optimization has practical applications for the following:
* Portfolio optimization
* Worst-case risk analysis
* Optimal advertising
* Variations of statistical regression (including regularization and quantile regression)
* Model fitting (particularly multiclass classification)
* Electricity generation optimization
* Combinatorial optimization
* Non-probabilistic modelling of uncertainty
* Localization using wireless signals
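
As a small illustration of the least-squares item above (a sketch assuming the open-source CVXPY and NumPy libraries; the data is synthetic):

```python
import numpy as np
import cvxpy as cp

# Least squares as a convex program: minimize ||A x - b||_2^2.
rng = np.random.default_rng(1)
A = rng.normal(size=(20, 5))
b = rng.normal(size=20)

x = cp.Variable(5)
cp.Problem(cp.Minimize(cp.sum_squares(A @ x - b))).solve()

# Cross-check against the closed-form normal-equations solution.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x.value, x_ls, atol=1e-6))
```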


Lagrange multipliers

Consider a convex minimization problem given in standard form by a cost function f(x) and inequality constraints g_i(x) \leq 0 for 1 \leq i \leq m. Then the domain \mathcal{X} is:

\mathcal{X} = \{ x \in X : g_1(x), \ldots, g_m(x) \leq 0 \}.

The Lagrangian function for the problem is

L(x, \lambda_0, \lambda_1, \ldots, \lambda_m) = \lambda_0 f(x) + \lambda_1 g_1(x) + \cdots + \lambda_m g_m(x).

For each point x in X that minimizes f over X, there exist real numbers \lambda_0, \lambda_1, \ldots, \lambda_m, called Lagrange multipliers, that satisfy these conditions simultaneously:
1. x minimizes L(y, \lambda_0, \lambda_1, \ldots, \lambda_m) over all y \in X,
2. \lambda_0, \lambda_1, \ldots, \lambda_m \geq 0, with at least one \lambda_k > 0,
3. \lambda_1 g_1(x) = \cdots = \lambda_m g_m(x) = 0 (complementary slackness).

If there exists a "strictly feasible point", that is, a point z satisfying

g_1(z), \ldots, g_m(z) < 0,

then the statement above can be strengthened to require that \lambda_0 = 1. Conversely, if some x in X satisfies (1)–(3) for scalars \lambda_0, \ldots, \lambda_m with \lambda_0 = 1, then x is certain to minimize f over X.
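
A minimal worked instance of these conditions (my own illustrative example, not from the source), taking \lambda_0 = 1:

```latex
\textbf{Example.} Minimize $f(x) = x^2$ subject to $g_1(x) = 1 - x \le 0$:
\[
  L(x, 1, \lambda_1) = x^2 + \lambda_1 (1 - x), \qquad
  \frac{\partial L}{\partial x} = 2x - \lambda_1 = 0 \;\Longrightarrow\; x = \lambda_1 / 2 .
\]
Complementary slackness $\lambda_1 (1 - x) = 0$ with $\lambda_1 > 0$ forces $x = 1$,
hence $\lambda_1 = 2 \ge 0$; and indeed $x = 1$ minimizes $x^2$ over $\{x : x \ge 1\}$.
```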


Algorithms

Unconstrained convex optimization can be easily solved with gradient descent (a special case of steepest descent) or Newton's method, combined with line search for an appropriate step size; these can be mathematically proven to converge quickly, especially the latter method. (A sketch of gradient descent with backtracking line search is given at the end of this section.)

Convex optimization with linear equality constraints can also be solved using KKT matrix techniques if the objective function is a quadratic function (which generalizes to a variation of Newton's method that works even if the point of initialization does not satisfy the constraints), but can also generally be solved by eliminating the equality constraints with linear algebra or by solving the dual problem.

Finally, convex optimization with both linear equality constraints and convex inequality constraints can be solved by applying an unconstrained convex optimization technique to the objective function plus logarithmic barrier terms. (When the starting point is not feasible, that is, does not satisfy the constraints, this is preceded by so-called ''phase I'' methods, which either find a feasible point or show that none exists. Phase I methods generally consist of reducing the search in question to yet another convex optimization problem.)

Convex optimization problems can also be solved by the following contemporary methods:
* Bundle methods (Wolfe, Lemaréchal, Kiwiel),
* Subgradient projection methods (Polyak),
* Interior-point methods, which make use of self-concordant barrier functions and self-regular barrier functions,
* Cutting-plane methods,
* Ellipsoid method,
* Subgradient method,
* Dual subgradients and the drift-plus-penalty method.

Subgradient methods can be implemented simply and so are widely used (Bertsekas). Dual subgradient methods are subgradient methods applied to a dual problem. The drift-plus-penalty method is similar to the dual subgradient method, but takes a time average of the primal variables.
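
The sketch referenced above: gradient descent with backtracking (Armijo) line search for a smooth convex objective. The parameter values and the test problem are my own illustrative choices, not from the source:

```python
import numpy as np

def gradient_descent(f, grad, x0, alpha=0.3, beta=0.5, tol=1e-8, max_iter=1000):
    """Gradient descent with backtracking line search.
    alpha and beta are standard backtracking parameters
    (0 < alpha < 0.5, 0 < beta < 1)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:  # stationary point of a convex f is a global minimum
            break
        t = 1.0
        # Shrink the step until the Armijo sufficient-decrease condition holds.
        while f(x - t * g) > f(x) - alpha * t * np.dot(g, g):
            t *= beta
        x = x - t * g
    return x

# Example: minimize the convex quadratic f(x) = (1/2) x^T Q x - b^T x.
Q = np.array([[3.0, 0.5], [0.5, 1.0]])  # symmetric positive definite
b = np.array([1.0, -2.0])
f = lambda x: 0.5 * x @ Q @ x - b @ x
grad = lambda x: Q @ x - b
x_star = gradient_descent(f, grad, np.zeros(2))
print(x_star, np.linalg.solve(Q, b))    # the two should agree
```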


Implementations

Convex optimization and related algorithms have been implemented in a variety of open-source and commercial software programs.


Extensions

Extensions of convex optimization include the optimization of biconvex, pseudo-convex, and quasiconvex functions. Extensions of the theory of convex analysis and iterative methods for approximately solving non-convex minimization problems occur in the field of generalized convexity, also known as abstract convex analysis.


See also

* Duality
* Karush–Kuhn–Tucker conditions
* Optimization problem
* Proximal gradient method




References

* Hiriart-Urruty, Jean-Baptiste; Lemaréchal, Claude (2004). ''Fundamentals of Convex Analysis''. Berlin: Springer.
* Nesterov, Yurii (2004). ''Introductory Lectures on Convex Optimization''. Kluwer Academic Publishers.
* Schmit, L.A.; Fleury, C. (1980). "Structural synthesis by combining approximation concepts and dual methods". ''J. Amer. Inst. Aeronaut. Astronaut.'' 18, 1252–1260.


External links


* EE364a: Convex Optimization I and EE364b: Convex Optimization II, Stanford course homepages
* 6.253: Convex Analysis and Optimization, an MIT OCW course homepage
* Brian Borchers, An overview of software for convex optimization
* ''Convex Optimization'', book by Lieven Vandenberghe and Stephen P. Boyd