Maximum Variance Unfolding (MVU), also known as Semidefinite Embedding (SDE), is an
algorithm
In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algo ...
in
computer science
Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
that uses
semidefinite programming
Semidefinite programming (SDP) is a subfield of mathematical programming concerned with the optimization of a linear objective function (a user-specified function that the user wants to minimize or maximize)
over the intersection of the cone of po ...
to perform
non-linear dimensionality reduction of high-dimensional
vector
Vector most often refers to:
* Euclidean vector, a quantity with a magnitude and a direction
* Disease vector, an agent that carries and transmits an infectious pathogen into another living organism
Vector may also refer to:
Mathematics a ...
ial input data.
It is motivated by the observation that
kernel Principal Component Analysis
In the field of multivariate statistics, kernel principal component analysis (kernel PCA)
is an extension of principal component analysis (PCA) using techniques of kernel methods. Using a kernel, the originally linear operations of PCA are performe ...
(kPCA) does not reduce the data dimensionality, as it leverages the
Kernel trick
In machine learning, kernel machines are a class of algorithms for pattern analysis, whose best known member is the support-vector machine (SVM). These methods involve using linear classifiers to solve nonlinear problems. The general task of pa ...
to non-linearly map the original data into an
inner-product space
In mathematics, an inner product space (or, rarely, a Hausdorff pre-Hilbert space) is a real vector space or a complex vector space with an operation called an inner product. The inner product of two vectors in the space is a scalar, ofte ...
.
Algorithm
MVU creates a mapping from the high dimensional input vectors to some low dimensional
Euclidean vector
In mathematics, physics, and engineering, a Euclidean vector or simply a vector (sometimes called a geometric vector or spatial vector) is a geometric object that has magnitude (or length) and direction. Euclidean vectors can be added and scal ...
space in the following steps:
# A
neighbourhood
A neighbourhood (Commonwealth English) or neighborhood (American English) is a geographically localized community within a larger town, city, suburb or rural area, sometimes consisting of a single street and the buildings lining it. Neighbourh ...
graph is created. Each input is connected with its k-nearest input vectors (according to
Euclidean distance
In mathematics, the Euclidean distance between two points in Euclidean space is the length of the line segment between them. It can be calculated from the Cartesian coordinates of the points using the Pythagorean theorem, and therefore is o ...
metric) and all k-nearest neighbors are connected with each other. If the data is sampled well enough, the resulting graph is a discrete approximation of the underlying manifold.
# The neighbourhood graph is "unfolded" with the help of semidefinite programming. Instead of learning the output vectors directly, the semidefinite programming aims to find an inner product matrix that maximizes the pairwise distances between any two inputs that are not connected in the neighbourhood graph while preserving the nearest neighbors distances.
# The low-dimensional embedding is finally obtained by application of
multidimensional scaling
Multidimensional scaling (MDS) is a means of visualizing the level of similarity of individual cases of a data set. MDS is used to translate distances between each pair of n objects in a set into a configuration of n points mapped into an ...
on the learned inner product matrix.
The steps of applying semidefinite programming followed by a linear dimensionality reduction step to recover a low-dimensional embedding into a Euclidean space were first proposed by
Linial, London, and Rabinovich.
Optimization formulation
Let
be the original input and
be the embedding. If
are two neighbors, then the local isometry constraint that needs to be satisfied is:
:
Let
be the
Gram matrices of
and
(i.e.:
). We can express the above constraint for every neighbor points
in term of
:
:
In addition, we also want to constrain the embedding
to center at the origin:
As described above, except the distances of neighbor points are preserved, the algorithm aims to maximize the pairwise distance of every pair of points. The objective function to be maximized is:
Intuitively, maximizing the function above is equivalent to pulling the points as far away from each other as possible and therefore "unfold" the manifold. The local isometry constraint
Let
where
prevents the objective function from diverging (going to infinity).
Since the graph has N points, the distance between any two points
. We can then bound the objective function as follows:
:
The objective function can be rewritten purely in the form of the Gram matrix:
:
Finally, the optimization can be formulated as:
After the Gram matrix
is learned by semidefinite programming, the output
can be obtained via
Cholesky decomposition
In linear algebra, the Cholesky decomposition or Cholesky factorization (pronounced ) is a decomposition of a Hermitian, positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose, which is useful for eff ...
.
In particular, the Gram matrix can be written as
where
is the i-th element of eigenvector
of the eigenvalue
.
It follows that the
-th element of the output
is
.
See also
*
Locally linear embedding
*
Isometry (disambiguation)
*
Local Tangent Space Alignment
*
Riemannian manifold
In differential geometry, a Riemannian manifold is a geometric space on which many geometric notions such as distance, angles, length, volume, and curvature are defined. Euclidean space, the N-sphere, n-sphere, hyperbolic space, and smooth surf ...
*
Energy minimization
In the field of computational chemistry, energy minimization (also called energy optimization, geometry minimization, or geometry optimization) is the process of finding an arrangement in space of a collection of atoms where, according to some com ...
Notes
References
*
*
*
*
* {{cite journal, last=Lawrence, first=Neil D, title=A unifying probabilistic perspective for spectral dimensionality reduction: insights and new models, year=2012, pages=1612, volume=13, issue=May, journal=
Journal of Machine Learning Research
The ''Journal of Machine Learning Research'' is a peer-reviewed open access scientific journal covering machine learning. It was established in 2000 and the first editor-in-chief was Leslie Kaelbling. The current editors-in-chief are Francis Bac ...
, url=http://www.jmlr.org/papers/v13/lawrence12a.html, bibcode=2010arXiv1010.4830L, arxiv=1010.4830
Additional material
Kilian Q. Weinberger's MVU Matlab code
Computational statistics
Dimension reduction
Optimization algorithms and methods