Stress Majorization

Stress majorization is an optimization strategy used in multidimensional scaling (MDS) where, for a set of ''n'' ''m''-dimensional data items, a configuration ''X'' of ''n'' points in ''r'' (\ll m)-dimensional space is sought that minimizes the so-called ''stress'' function \sigma(X). Usually ''r'' is 2 or 3, i.e. the (n\times r) matrix ''X'' lists points in 2- or 3-dimensional Euclidean space, so that the result may be visualised (i.e. an MDS plot). The function \sigma is a cost or loss function that measures the squared differences between ideal (m-dimensional) distances and actual distances in ''r''-dimensional space. It is defined as:

: \sigma(X)=\sum_{i<j\le n}w_{ij}(d_{ij}(X)-\delta_{ij})^2

where w_{ij}\ge 0 is a weight for the measurement between a pair of points (i,j), d_{ij}(X) is the Euclidean distance between i and j, and \delta_{ij} is the ideal distance between the points (their separation) in the m-dimensional data space. Note that w_{ij} can be used to specify a degree of confidence in the similarity between points (e.g. 0 can be specified if there is no information for a particular pair).

A configuration X which minimizes \sigma(X) gives a plot in which points that are close together correspond to points that are also close together in the original m-dimensional data space. There are many ways that \sigma(X) could be minimized. For example, Kruskal recommended an iterative steepest descent approach. However, a significantly better method (in terms of guarantees on, and rate of, convergence) for minimizing stress was introduced by Jan de Leeuw. De Leeuw's ''iterative majorization'' method at each step minimizes a simple convex function which both bounds \sigma from above and touches the surface of \sigma at a point Z, called the ''supporting point''. In convex analysis such a function is called a ''majorizing'' function. This iterative majorization process is also referred to as the SMACOF algorithm ("Scaling by MAjorizing a COmplicated Function").
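For concreteness, the following is a minimal NumPy sketch of the stress function \sigma(X) as defined above (the function and variable names are illustrative, not taken from any particular library):

```python
import numpy as np

def stress(X, delta, W):
    """sigma(X): weighted squared mismatch between the configuration's
    pairwise distances d_ij(X) and the ideal distances delta_ij."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # d_ij(X)
    return np.sum(W * (D - delta) ** 2) / 2  # each unordered pair counted once

# Tiny example: three items whose ideal separations are all 1.
delta = np.array([[0., 1., 1.],
                  [1., 0., 1.],
                  [1., 1., 0.]])
W = np.ones((3, 3)) - np.eye(3)               # equal confidence in every pair
X = np.array([[0., 0.], [1., 0.], [0., 1.]])  # candidate 2-D configuration
print(stress(X, delta, W))                    # only the (1,2) pair is mismatched
```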


The SMACOF algorithm

The stress function \sigma can be expanded as follows:

: \sigma(X)=\sum_{i<j\le n}w_{ij}(d_{ij}(X)-\delta_{ij})^2 =\sum_{i<j}w_{ij}\delta_{ij}^2 + \sum_{i<j}w_{ij}d_{ij}^2(X)-2\sum_{i<j}w_{ij}\delta_{ij}d_{ij}(X)

Note that the first term is a constant C and the second term is quadratic in X (i.e. for the Hessian matrix V the second term is equivalent to \operatorname{tr}\, X'VX) and therefore relatively easily solved. The third term is bounded by:

: \sum_{i<j}w_{ij}\delta_{ij}d_{ij}(X)=\operatorname{tr}\, X'B(X)X \ge \operatorname{tr}\, X'B(Z)Z

where B(Z) has:

: b_{ij}=-\frac{w_{ij}\delta_{ij}}{d_{ij}(Z)} for d_{ij}(Z)\ne 0, i \ne j

and b_{ij}=0 for d_{ij}(Z)=0, i\ne j, and b_{ii}=-\sum_{j=1, j\ne i}^n b_{ij}.

Proof of this inequality is by the Cauchy–Schwarz inequality; see Borg (pp. 152–153). Thus, we have a simple quadratic function \tau(X,Z) that majorizes stress:

: \sigma(X)=C+\operatorname{tr}\, X'VX - 2 \operatorname{tr}\, X'B(X)X
: \le C+\operatorname{tr}\, X'VX - 2 \operatorname{tr}\, X'B(Z)Z = \tau(X,Z)

The iterative minimization procedure is then:
* at the k^{\text{th}} step we set Z\leftarrow X^{k-1}
* X^k\leftarrow \min_X \tau(X,Z)
* stop if \sigma(X^{k-1})-\sigma(X^{k})<\epsilon, otherwise repeat.

This algorithm has been shown to decrease stress monotonically (see de Leeuw).
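In the standard SMACOF update, the minimizing step X^k\leftarrow \min_X \tau(X,Z) has a closed form, the Guttman transform X^k = V^+B(Z)Z, where V has v_{ij}=-w_{ij} for i\ne j and v_{ii}=\sum_{j\ne i}w_{ij}, and V^+ is its Moore–Penrose pseudoinverse. The following NumPy sketch of the full iteration is an illustration under these assumptions rather than a reference implementation; the names and defaults are illustrative.

```python
import numpy as np

def smacof(delta, W=None, r=2, max_iter=300, eps=1e-6, seed=0):
    """Minimal SMACOF sketch: iteratively majorize and minimize stress.

    delta : (n, n) ideal distances delta_ij
    W     : (n, n) weights w_ij (default: 1 for every pair, 0 on the diagonal)
    r     : dimensionality of the output configuration X
    """
    n = delta.shape[0]
    if W is None:
        W = np.ones((n, n)) - np.eye(n)

    # V: v_ij = -w_ij for i != j, v_ii = sum_{j != i} w_ij.
    # V is singular (its rows sum to zero), so use the pseudoinverse.
    V = -W.copy()
    np.fill_diagonal(V, W.sum(axis=1))
    V_pinv = np.linalg.pinv(V)

    def dists(X):
        return np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    def stress(X):
        return np.sum(W * (dists(X) - delta) ** 2) / 2  # each pair counted once

    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, r))          # random starting configuration
    prev = stress(X)

    for _ in range(max_iter):
        Z, D = X, dists(X)                   # supporting point Z and d_ij(Z)
        # B(Z): b_ij = -w_ij * delta_ij / d_ij(Z) if d_ij(Z) != 0, else 0;
        #       b_ii = -sum_{j != i} b_ij.
        B = np.divide(-W * delta, D, out=np.zeros_like(D), where=D > 0)
        np.fill_diagonal(B, 0.0)
        np.fill_diagonal(B, -B.sum(axis=1))
        X = V_pinv @ B @ Z                   # Guttman transform: minimizes tau(X, Z)
        cur = stress(X)
        if prev - cur < eps:                 # stop once stress barely decreases
            break
        prev = cur
    return X
```

Called with the delta and W from the introduction's example, smacof(delta, W) should recover, up to rotation, reflection and translation, a near-equilateral triangle, i.e. the configuration with zero stress.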


Use in graph drawing

Stress majorization and algorithms similar to SMACOF also have application in the field of graph drawing. That is, one can find a reasonably aesthetically appealing layout for a network or graph by minimizing a stress function over the positions of the nodes in the graph. In this case, the \delta_{ij} are usually set to the graph-theoretic distances between nodes ''i'' and ''j'' and the weights w_{ij} are taken to be \delta_{ij}^{-\alpha}. Here, \alpha is chosen as a trade-off between preserving long- or short-range ideal distances. Good results have been shown for \alpha=2.
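As a sketch of this use (assuming the smacof routine from the previous section, a connected unweighted graph given as an adjacency list over nodes 0..n-1, and \alpha=2), the graph-theoretic distances can be obtained by breadth-first search and passed to the same stress-majorization routine:

```python
from collections import deque
import numpy as np

def graph_layout(adj, alpha=2, r=2):
    """Sketch: stress-majorization layout of an unweighted graph.

    adj : adjacency list mapping node index -> list of neighbour indices,
          e.g. the path 0-1-2 is {0: [1], 1: [0, 2], 2: [1]}
    """
    n = len(adj)
    delta = np.zeros((n, n))
    for s in range(n):                       # BFS gives graph-theoretic distances
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for t, d in dist.items():
            delta[s, t] = d

    # Weights w_ij = delta_ij^(-alpha); identical or unreachable pairs get weight 0.
    W = np.zeros_like(delta)
    mask = delta > 0
    W[mask] = delta[mask] ** (-float(alpha))
    return smacof(delta, W=W, r=r)           # node positions for drawing the graph
```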

