In mathematical optimization, the ellipsoid method is an iterative method for minimizing convex functions over convex sets. The ellipsoid method generates a sequence of ellipsoids whose volume uniformly decreases at every step, thus enclosing a minimizer of a convex function.
When specialized to solving feasible linear optimization problems with rational data, the ellipsoid method is an algorithm which finds an optimal solution in a number of steps that is polynomial in the input size.
History
The ellipsoid method has a long history. As an iterative method, a preliminary version was introduced by Naum Z. Shor. In 1972, an approximation algorithm for real convex minimization was studied by Arkadi Nemirovski and David B. Yudin (Judin).
As an algorithm for solving linear programming problems with rational data, the ellipsoid algorithm was studied by Leonid Khachiyan; Khachiyan's achievement was to prove the polynomial-time solvability of linear programs. This was a notable step from a theoretical perspective: the standard algorithm for solving linear problems at the time was the simplex algorithm, which has a run time that ''typically'' is linear in the size of the problem, but for which examples exist for which it is ''exponential'' in the size of the problem. As such, having an algorithm that is ''guaranteed'' to be polynomial for all cases was a theoretical breakthrough.
Khachiyan's work showed, for the first time, that there can be algorithms for solving linear programs whose runtime can be proven to be polynomial. In practice, however, the algorithm is fairly slow and of little practical interest, though it provided inspiration for later work that turned out to be of much greater practical use. Specifically, Karmarkar's algorithm, an interior-point method, is much faster than the ellipsoid method in practice. Karmarkar's algorithm is also faster in the worst case.
The ellipsoidal algorithm allows complexity theorists to achieve (worst-case) bounds that depend on the dimension of the problem and on the size of the data, but not on the number of rows, so it remained important in combinatorial optimization theory for many years. Only in the 21st century have interior-point algorithms with similar complexity properties appeared.
Description
A convex minimization problem consists of the following ingredients.
* A convex function <math>f_0(x): \mathbb{R}^n \to \mathbb{R}</math> to be minimized over the vector <math>x</math> (containing ''n'' variables);
* Convex inequality constraints of the form <math>f_i(x) \leqslant 0</math>, where the functions <math>f_i</math> are convex; these constraints define a convex set <math>Q</math>;
* Linear equality constraints of the form <math>h_i(x) = 0</math>.
We are also given an initial ellipsoid <math>\mathcal{E}^{(0)}</math> defined as
:<math>\mathcal{E}^{(0)} = \left\{z \in \mathbb{R}^n : \left(z - x_0\right)^T P_{(0)}^{-1} \left(z - x_0\right) \leqslant 1 \right\}</math>
containing a minimizer <math>x^*</math>, where <math>P_{(0)} \succ 0</math> and <math>x_0</math> is the center of <math>\mathcal{E}^{(0)}</math>.
Finally, we require the existence of a separation oracle for the convex set <math>Q</math>. Given a point <math>x \in \mathbb{R}^n</math>, the oracle should return one of two answers:
* "The point <math>x</math> is in <math>Q</math>", or
* "The point <math>x</math> is not in <math>Q</math>, and moreover, here is a hyperplane that separates <math>x</math> from <math>Q</math>", that is, a vector <math>c</math> such that <math>c \cdot x > c \cdot y</math> for all <math>y \in Q</math>.
The output of the ellipsoid method is either:
* Any point in the polytope <math>Q</math> (i.e., any feasible point), or
* A proof that <math>Q</math> is empty.
Inequality-constrained minimization of a function that is zero everywhere corresponds to the problem of simply identifying any feasible point. It turns out that any linear programming problem can be reduced to a linear feasibility problem (i.e. minimize the zero function subject to some linear inequality and equality constraints). One way to do this is by combining the primal and dual linear programs together into one program, and adding the additional (linear) constraint that the value of the primal solution is
no worse than the value of the dual solution.
Another way is to treat the objective of the linear program as an additional constraint, and use binary search to find the optimum value.
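The binary-search reduction can be sketched in a few lines. The `feasible` oracle below is a hypothetical stand-in for any procedure that decides linear feasibility (such as the ellipsoid method itself); the bracketing bounds are assumed given:

```python
def lp_min_via_feasibility(feasible, lo, hi, eps=1e-6):
    """Reduce LP optimization to feasibility checks by binary search.

    feasible(t) is a hypothetical oracle answering: "is there an x with
    Ax <= b and c.x <= t?"  lo/hi are assumed to bracket the optimum.
    """
    while hi - lo > eps:
        mid = (lo + hi) / 2.0
        if feasible(mid):
            hi = mid  # a point with objective <= mid exists: optimum <= mid
        else:
            lo = mid  # no such point: optimum > mid
    return hi
```

For instance, minimizing ''x'' subject to ''x'' ≥ 3 corresponds to the oracle `lambda t: t >= 3`, and the search converges to 3.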
Unconstrained minimization
At the ''k''-th iteration of the algorithm, we have a point <math>x^{(k)}</math> at the center of an ellipsoid
:<math>\mathcal{E}^{(k)} = \left\{x \in \mathbb{R}^n : \left(x - x^{(k)}\right)^T P_{(k)}^{-1} \left(x - x^{(k)}\right) \leqslant 1 \right\}.</math>
We query the cutting-plane oracle to obtain a vector <math>g^{(k)} \in \mathbb{R}^n</math> such that
:<math>g^{(k)T} \left(x^* - x^{(k)}\right) \leqslant 0.</math>
We therefore conclude that
:<math>x^* \in \mathcal{E}^{(k)} \cap \left\{z : g^{(k)T} \left(z - x^{(k)}\right) \leqslant 0 \right\}.</math>
We set <math>\mathcal{E}^{(k+1)}</math> to be the ellipsoid of minimal volume containing the half-ellipsoid described above and compute <math>x^{(k+1)}</math>. The update is given by
:<math>x^{(k+1)} = x^{(k)} - \frac{1}{n+1} P_{(k)} \tilde{g}^{(k)}</math>
:<math>P_{(k+1)} = \frac{n^2}{n^2-1} \left(P_{(k)} - \frac{2}{n+1} P_{(k)} \tilde{g}^{(k)} \tilde{g}^{(k)T} P_{(k)} \right)</math>
where
:<math>\tilde{g}^{(k)} = \left(\frac{1}{\sqrt{g^{(k)T} P_{(k)} g^{(k)}}}\right) g^{(k)}.</math>
The stopping criterion is given by the property that
:<math>\sqrt{g^{(k)T} P_{(k)} g^{(k)}} \leqslant \epsilon \quad \Rightarrow \quad f\left(x^{(k)}\right) - f\left(x^*\right) \leqslant \epsilon.</math>
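The iteration above can be sketched as follows. This is a minimal illustration rather than a tuned solver; the function names and the best-iterate bookkeeping are illustrative additions, and the update requires ''n'' ≥ 2:

```python
import numpy as np

def ellipsoid_minimize(f, subgrad, x0, P0, eps=1e-6, max_iter=1000):
    """Minimize a convex f by the ellipsoid update described above.

    subgrad(x) returns a subgradient of f at x; (x0, P0) describe an
    initial ellipsoid {z : (z - x0)^T P0^{-1} (z - x0) <= 1} that is
    assumed to contain a minimizer.  Requires n >= 2.
    """
    n = len(x0)
    x, P = np.asarray(x0, float), np.asarray(P0, float)
    best = x
    for _ in range(max_iter):
        if f(x) < f(best):
            best = x
        g = np.asarray(subgrad(x), float)
        gPg = g @ P @ g
        if gPg <= eps ** 2:          # stopping criterion: f(x) - f* <= sqrt(g^T P g)
            break
        Pg = P @ g / np.sqrt(gPg)    # P times the normalized subgradient
        x = x - Pg / (n + 1)                                             # new center
        P = n * n / (n * n - 1.0) * (P - 2.0 / (n + 1) * np.outer(Pg, Pg))  # new shape matrix
    return best if f(best) < f(x) else x
```

For example, minimizing ||''x'' − (1, 2)||² starting from the ball of radius 10 around the origin recovers the minimizer (1, 2) to high accuracy.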
Inequality-constrained minimization
At the ''k''-th iteration of the algorithm for constrained minimization, we have a point <math>x^{(k)}</math> at the center of an ellipsoid <math>\mathcal{E}^{(k)}</math> as before. We also must maintain a list of values <math>f_{\rm best}^{(k)}</math> recording the smallest objective value of feasible iterates so far. Depending on whether or not the point <math>x^{(k)}</math> is feasible, we perform one of two tasks:
*If <math>x^{(k)}</math> is feasible, perform essentially the same update as in the unconstrained case, by choosing a subgradient <math>g^{(k)} \in \partial f_0\left(x^{(k)}\right)</math> that satisfies
::<math>g^{(k)T} \left(x^* - x^{(k)}\right) + f_0\left(x^{(k)}\right) - f_{\rm best}^{(k)} \leqslant 0</math>
*If <math>x^{(k)}</math> is infeasible and violates the ''j''-th constraint, update the ellipsoid with a feasibility cut. Our feasibility cut may be a subgradient <math>h^{(k)} \in \partial f_j\left(x^{(k)}\right)</math> of <math>f_j</math> which must satisfy
::<math>h^{(k)T} \left(z - x^{(k)}\right) + f_j\left(x^{(k)}\right) \leqslant 0</math>
for all feasible ''z''.
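A minimal sketch of this constrained variant follows. For simplicity it uses plain central cuts rather than the deep cuts described above (both keep every optimal point inside the ellipsoid); all names are illustrative, and ''n'' ≥ 2 is required:

```python
import numpy as np

def ellipsoid_constrained(f0, g0, constraints, x0, P0, steps=500):
    """Minimize convex f0 subject to f_j(x) <= 0 for each (f_j, grad_j)
    in `constraints`, via the ellipsoid method with central cuts.

    g0(x) gives a subgradient of f0; (x0, P0) describe an initial
    ellipsoid assumed to contain an optimal point.  Requires n >= 2.
    """
    n = len(x0)
    x, P = np.asarray(x0, float), np.asarray(P0, float)
    best_x, best_val = None, float("inf")
    for _ in range(steps):
        g = None
        for fj, gradj in constraints:
            if fj(x) > 0:              # infeasible center: feasibility cut
                g = np.asarray(gradj(x), float)
                break
        if g is None:                  # feasible center: objective cut
            if f0(x) < best_val:
                best_x, best_val = x, f0(x)
            g = np.asarray(g0(x), float)
        gPg = g @ P @ g
        if gPg <= 0:                   # numerical degeneracy guard
            break
        Pg = P @ g / np.sqrt(gPg)
        x = x - Pg / (n + 1)
        P = n * n / (n * n - 1.0) * (P - 2.0 / (n + 1) * np.outer(Pg, Pg))
    return best_x, best_val
```

For instance, minimizing ''x''<sub>1</sub> + ''x''<sub>2</sub> over the unit disk drives the recorded best value toward −√2.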
Performance in convex programs
Theoretical run-time complexity guarantee
The run-time complexity guarantee of the ellipsoid method in the
real RAM model is given by the following theorem.
Consider a family of convex optimization problems of the form: minimize ''f''(''x'') s.t. ''x'' is in ''G'', where ''f'' is a convex function and ''G'' is a convex set (a subset of a Euclidean space ''R''<sup>''n''</sup>). Each problem ''p'' in the family is represented by a data-vector Data(''p''), e.g., the real-valued coefficients in matrices and vectors representing the function ''f'' and the feasible region ''G''. The ''size'' of a problem ''p'', Size(''p''), is defined as the number of elements (real numbers) in Data(''p''). The following assumptions are needed:
# ''G'' (the feasible region) is:
#* bounded;
#* of non-empty interior (so there is a strictly-feasible point).
# Given Data(''p''), one can compute using poly(Size(p)) arithmetic operations:
#* An ellipsoid that contains ''G'';
#* A lower bound MinVol(''p'') > 0 on the volume of ''G''.
# Given Data(''p'') and a point ''x'' in ''R''<sup>''n''</sup>, one can compute using poly(Size(''p'')) arithmetic operations:
#* A separation oracle for ''G'' (that is: either assert that ''x'' is in ''G'', or return a hyperplane separating ''x'' from ''G'');
#* A first-order oracle for ''f'' (that is: compute the value of ''f''(''x'') and a subgradient of ''f'' at ''x'').
Under these assumptions, the ellipsoid method is "R-polynomial". This means that there exists a polynomial Poly such that, for every problem-instance ''p'' and every approximation-ratio ''ε'' > 0, the method finds a solution ''x'' satisfying
:<math>f(x) - \min_{y \in G} f(y) \leqslant \varepsilon \cdot \left[\max_{y \in G} f(y) - \min_{y \in G} f(y)\right],</math>
using at most the following number of arithmetic operations on real numbers:
:<math>{\rm Poly}({\rm Size}(p)) \cdot \ln\left(\frac{V(p)}{\varepsilon}\right),</math>
where ''V''(''p'') is a data-dependent quantity. Intuitively, it means that the number of operations required for each additional digit of accuracy is polynomial in Size(''p''). In the case of the ellipsoid method, ''V''(''p'') is the ratio of the volume of the initial ellipsoid to the lower bound MinVol(''p'') on the volume of ''G''. The ellipsoid method requires at most <math>O\left(n \ln V(p) + n^2 \ln(1/\varepsilon)\right)</math> steps, and each step requires Poly(Size(''p'')) arithmetic operations.
Practical performance
The ellipsoid method is used on low-dimensional problems, such as planar location problems, where it is numerically stable. Nemirovski and Ben-Tal say that it is efficient if the number of variables is at most 20-30; this is so even if there are thousands of constraints, as the number of iterations does not depend on the number of constraints. However, in problems with many variables, the ellipsoid method is very inefficient, as the number of iterations grows as O(''n''<sup>2</sup>). Even on "small"-sized problems, it suffers from numerical instability and poor performance in practice.
Theoretical importance
The ellipsoid method is an important theoretical technique in combinatorial optimization. In computational complexity theory, the ellipsoid algorithm is attractive because its complexity depends on the number of columns and the digital size of the coefficients, but not on the number of rows. The ellipsoid method can be used to show that many algorithmic problems on convex sets are polynomial-time equivalent.
Performance in linear programs
Leonid Khachiyan applied the ellipsoid method to the special case of linear programming: minimize ''c''<sup>T</sup>''x'' s.t. ''Ax'' ≤ ''b'', where all coefficients in ''A'', ''b'', ''c'' are rational numbers. He showed that linear programs can be solved in polynomial time. Here is a sketch of Khachiyan's theorem.

Step 1: reducing optimization to search. The theorem of linear programming duality says that we can reduce the above minimization problem to the search problem: find ''x'', ''y'' s.t. ''Ax'' ≤ ''b''; ''A''<sup>T</sup>''y'' = ''c''; ''y'' ≤ 0; ''c''<sup>T</sup>''x'' = ''b''<sup>T</sup>''y''. The first problem is solvable if and only if the second problem is solvable; in case the problem is solvable, the ''x''-components of the solution to the second problem are an optimal solution to the first problem. Therefore, from now on, we can assume that we need to solve the following problem: find ''z'' ≥ 0 s.t. ''Rz'' ≤ ''r''. Multiplying all rational coefficients by the common denominator, we can assume that all coefficients are integers.
Step 2: reducing search to feasibility-check. The problem find ''z'' ≥ 0 s.t. ''Rz'' ≤ ''r'' can be reduced to the binary decision problem: "is there a ''z'' ≥ 0 such that ''Rz'' ≤ ''r''?". This can be done as follows. If the answer to the decision problem is "no", then the answer to the search problem is "None", and we are done. Otherwise, take the first inequality constraint ''R''<sub>1</sub>''z'' ≤ ''r''<sub>1</sub>; replace it with an equality ''R''<sub>1</sub>''z'' = ''r''<sub>1</sub>; and apply the decision problem again. If the answer is "yes", we keep the equality; if the answer is "no", it means that the inequality is redundant, and we can remove it. Then we proceed to the next inequality constraint. For each constraint, we either convert it to equality or remove it. Finally, we have only equality constraints, which can be solved by any method for solving a system of linear equations.
Step 3: the decision problem can be reduced to a different optimization problem. Define the ''residual function'' ''f''(''z'') := max[(''Rz'')<sub>1</sub> − ''r''<sub>1</sub>, (''Rz'')<sub>2</sub> − ''r''<sub>2</sub>, (''Rz'')<sub>3</sub> − ''r''<sub>3</sub>, ...]. Clearly, ''f''(''z'') ≤ 0 iff ''Rz'' ≤ ''r''. Therefore, to solve the decision problem, it is sufficient to solve the minimization problem: min<sub>''z''</sub> ''f''(''z''). The function ''f'' is convex (it is a maximum of linear functions). Denote the minimum value by ''f''*. Then the answer to the decision problem is "yes" iff ''f''* ≤ 0.
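The residual function of Step 3 is straightforward to compute; a small pure-Python sketch:

```python
def residual(R, r, z):
    """f(z) = max_i ((Rz)_i - r_i); f(z) <= 0 exactly when Rz <= r."""
    return max(sum(Rij * zj for Rij, zj in zip(row, z)) - ri
               for row, ri in zip(R, r))
```

For the system ''z''<sub>1</sub> ≤ 1, ''z''<sub>2</sub> ≤ 1, the point (0.5, 0.5) has residual −0.5 (feasible), while (2, 0) has residual 1 (infeasible).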
Step 4: In the optimization problem min<sub>''z''</sub> ''f''(''z''), we can assume that ''z'' is in a box of side-length 2<sup>''L''</sup>, where ''L'' is the bit length of the problem data. Thus, we have a bounded convex program, that can be solved up to any accuracy ''ε'' by the ellipsoid method, in time polynomial in ''L''.
Step 5: It can be proved that, if ''f''* > 0, then ''f''* > 2<sup>−poly(''L'')</sup>, for some polynomial. Therefore, we can pick the accuracy ''ε'' = 2<sup>−poly(''L'')</sup>. Then, the ''ε''-approximate solution found by the ellipsoid method will be positive iff ''f''* > 0, iff the decision problem is unsolvable.
Variants
The ellipsoid method has several variants, depending on what cuts exactly are used in each step.
Different cuts
In the central-cut ellipsoid method,
the cuts are always through the center of the current ellipsoid. The input is a rational number ''ε''>0, a convex body ''K'' given by a weak separation oracle, and a number ''R'' such that S(0,''R'') (the ball of radius R around the origin) contains ''K''. The output is one of the following:
* (a) A vector at a distance of at most ''ε'' from ''K'', or
* (b) A positive definite matrix ''A'' and a point ''a'' such that the ellipsoid ''E''(''A'',''a'') contains ''K'', and the volume of ''E''(''A'',''a'') is at most ''ε''.
The number of steps is ''N'' = O(''n''<sup>2</sup> log(''R''/''ε'')), the number of required accuracy digits is ''p'' := 8''N'', and the required accuracy of the separation oracle is ''d'' := 2<sup>−''p''</sup>.
In the deep-cut ellipsoid method,
the cuts remove more than half of the ellipsoid in each step. This makes it faster to discover that ''K'' is empty. However, when ''K'' is nonempty, there are examples in which the central-cut method finds a feasible point faster. The use of deep cuts does not change the order of magnitude of the run-time.
In the shallow-cut ellipsoid method, the cuts remove less than half of the ellipsoid in each step. This variant is not very useful in practice, but it has theoretical importance: it allows one to prove results that cannot be derived from other variants. The input is a rational number ''ε'' > 0, a convex body ''K'' given by a shallow separation oracle, and a number ''R'' such that S(0,''R'') contains ''K''. The output is a positive definite matrix ''A'' and a point ''a'' such that one of the following holds:
* (a) The ellipsoid ''E''(''A'',''a'') has been declared "tough" by the oracle, or
* (b) ''K'' is contained in ''E''(''A'',''a'') and the volume of ''E''(''A'',''a'') is at most ''ε''.
The number of steps ''N'' is again polynomial in ''n'' and log(''R''/''ε''), and the number of required accuracy digits is ''p'' := 8''N''.
Different ellipsoids
There is also a distinction between the circumscribed ellipsoid and the inscribed ellipsoid methods:
* In the circumscribed ellipsoid method, each iteration finds an ellipsoid of ''smallest'' volume that ''contains'' the remaining part of the previous ellipsoid. This method was developed by Yudin and Nemirovskii.
* In the inscribed ellipsoid method, each iteration finds an ellipsoid of ''largest'' volume that is ''contained in'' the remaining part of the previous ellipsoid. This method was developed by Tarasov, Khachiyan and Erlikh.
The methods differ in their runtime complexity (below, ''n'' is the number of variables and ''ε'' is the accuracy):
* The circumscribed method requires O(''n''<sup>2</sup> log(1/''ε'')) iterations, where each iteration consists of finding a separating hyperplane and finding a new circumscribed ellipsoid. Finding a circumscribed ellipsoid requires O(''n''<sup>2</sup>) time.
* The inscribed method requires O(''n'' log(1/''ε'')) iterations, where each iteration consists of finding a separating hyperplane and finding a new inscribed ellipsoid. Finding an inscribed ellipsoid is substantially more expensive, requiring time polynomial in ''n'' of higher degree.
The relative efficiency of the methods depends on the time required for finding a separating hyperplane, which depends on the application: when separation is cheap, the cheaper per-iteration update of the circumscribed method makes it more efficient, but when separation is expensive, the smaller iteration count of the inscribed method makes it more efficient.
Related methods
* The center-of-gravity method is a conceptually simpler method that requires fewer steps. However, each step is computationally expensive, as it requires computing the center of gravity of the current feasible polytope.
* Interior point methods also allow solving convex optimization problems in polynomial time, but their practical performance is much better than that of the ellipsoid method.
Notes
Further reading
* Dmitris Alevras and Manfred W. Padberg, ''Linear Optimization and Extensions: Problems and Solutions'', Universitext, Springer-Verlag, 2001. (Problems from Padberg with solutions.)
* V. Chandru and M. R. Rao, Linear Programming, Chapter 31 in ''Algorithms and Theory of Computation Handbook'', edited by M. J. Atallah, CRC Press 1999, 31-1 to 31-37.
* V. Chandru and M. R. Rao, Integer Programming, Chapter 32 in ''Algorithms and Theory of Computation Handbook'', edited by M. J. Atallah, CRC Press 1999, 32-1 to 32-45.
*
George B. Dantzig and Mukund N. Thapa. 1997. ''Linear programming 1: Introduction''. Springer-Verlag.
*
George B. Dantzig and Mukund N. Thapa. 2003. ''Linear Programming 2: Theory and Extensions''. Springer-Verlag.
*
L. Lovász: ''An Algorithmic Theory of Numbers, Graphs, and Convexity'', CBMS-NSF Regional Conference Series in Applied Mathematics 50, SIAM, Philadelphia, Pennsylvania, 1986
* Katta G. Murty, ''Linear Programming'', Wiley, 1983.
*
M. Padberg, ''Linear Optimization and Extensions'', Second Edition, Springer-Verlag, 1999.
*
Christos H. Papadimitriou and Kenneth Steiglitz, ''Combinatorial Optimization: Algorithms and Complexity'', Corrected republication with a new preface, Dover.
*
Alexander Schrijver, ''Theory of Linear and Integer Programming''. John Wiley & Sons, 1998.
External links
EE364b, a Stanford course homepage