Distance geometry is the branch of mathematics concerned with characterizing and studying sets of points based ''only'' on given values of the

distance Distance is a numerical or occasionally qualitative measurement of how far apart objects or points are. In physics or everyday usage, distance may refer to a physical length or an estimation based on other criteria (e.g. "two counties over"). ...

s between pairs of points. More abstractly, it is the study of semimetric spaces and the isometric transformations between them. In this view, it can be considered as a subject within

general topology In mathematics, general topology is the branch of topology that deals with the basic set-theoretic definitions and constructions used in topology. It is the foundation of most other branches of topology, including differential topology, geometr ...

. Historically, the first result in distance geometry is Heron's formula in 1st century AD. The modern theory began in 19th century with work by

Arthur Cayley Arthur Cayley (; 16 August 1821 – 26 January 1895) was a prolific British mathematician who worked mostly on algebra. He helped found the modern British school of pure mathematics. As a child, Cayley enjoyed solving complex maths problems ...

, followed by more extensive developments in the 20th century by Karl Menger and others. Distance geometry problems arise whenever one needs to infer the shape of a configuration of points (

relative position In geometry, a position or position vector, also known as location vector or radius vector, is a Euclidean vector that represents the position of a point ''P'' in space in relation to an arbitrary reference origin ''O''. Usually denoted x, r, or ...

s) from the distances between them, such as in

biology Biology is the scientific study of life. It is a natural science with a broad scope but has several unifying themes that tie it together as a single, coherent field. For instance, all organisms are made up of cells that process hereditary ...

, sensor network, surveying, navigation,

cartography Cartography (; from grc, χάρτης , "papyrus, sheet of paper, map"; and , "write") is the study and practice of making and using maps. Combining science, aesthetics and technique, cartography builds on the premise that reality (or an im ...

, and physics.

Introduction and definitions

The concepts of distance geometry will first be explained by describing two particular problems. Hyperbolic_Navigation

First problem: hyperbolic navigation

Consider three ground radio stations A, B, C, whose locations are known. A radio receiver is at an unknown location. The times it takes for a radio signal to travel from the stations to the receiver,

t_A,t_B,t_C

, are unknown, but the time differences,

t_A-t_B

and

t_A-t_C

, are known. From them, one knows the distance differences

c(t_A-t_B)

and

c(t_A-t_C)

, from which the position of the receiver can be found.

Second problem:
dimension reduction Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally ...

data analysis Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making. Data analysis has multiple facets and approaches, enc ...

, one is often given a list of data represented as vectors

\mathbf = (x_1, \ldots, x_n)\in \mathbb^n

, and one needs to find out whether they lie within a low-dimensional affine subspace. A low-dimensional representation of data has many advantages, such as saving storage space, computation time, and giving better insight into data.

Definitions

Now we formalize some definitions that naturally arise from considering our problems.

Semimetric space

Given a list of points on

R = \

n \ge 0

, we can arbitrarily specify the distances between pairs of points by a list of

d_> 0

0 \le i < j \le n

. This defines a semimetric space: a metric space without triangle inequality. Explicitly, we define a semimetric space as a nonempty set

R

equipped with a semimetric

d: R\times R \to''a fortiori'' a semimetric space. In particular, \mathbb^k, the k -dimensional Euclidean space, is the Canonical form, canonical metric space in distance geometry.

The triangle inequality is omitted in the definition, because we do not want to enforce more constraints on the distances d_than the mere requirement that they be positive.

In practice, semimetric spaces naturally arise from inaccurate measurements. For example, given three points A, B, C on a line, with d_ = 1, d_ = 1, d_ = 2, an inaccurate measurement could give d_ = 0.99, d_ = 0.98, d_ = 2.00, violating the triangle inequality.

Isometric embedding

Given two semimetric spaces,

(R, d), (R', d')

, an isometric embedding from

R

R'

is a map

f: R \to R'

that preserves the semimetric, that is, for all

x, y\in R

d(x, y) = d'(f(x), f(y))

. For example, given the finite semimetric space

(R, d)

defined above, an isometric embedding into is defined by points

A_0, A_1,\ldots, A_n \in \mathbb R^k

, such that

d(A_i, A_j) = d_

for all

0 \le i < j \le n

Affine independence

Given the points

A_0, A_1,\ldots, A_n \in \mathbb R^k

, they are defined to be affinely independent,

iff In logic and related fields such as mathematics and philosophy, "if and only if" (shortened as "iff") is a biconditional logical connective between statements, where either both statements are true or both are false. The connective is bicond ...

they cannot fit inside a single

l

-dimensional affine subspace of

\mathbb^k

, for any

\ell < n

, iff the

n

''-'' simplex they span,

v_n

, has positive

n

-volume, that is,

\operatorname_n(v_n) > 0

. In general, when

k\ge n

, they are affinely independent, since a

generic Generic or generics may refer to: In business * Generic term, a common name used for a range or class of similar things not protected by trademark * Generic brand, a brand for a product that does not have an associated brand or trademark, other ...

''n''-simplex is nondegenerate. For example, 3 points in the plane, in general, are not collinear, because the triangle they span does not degenerate into a line segment. Similarly, 4 points in space, in general, are not coplanar, because the tetrahedron they span does not degenerate into a flat triangle. When

n > k

, they must be affinely dependent. This can be seen by noting that any

n

-simplex that can fit inside

\mathbb^k

must be "flat".

Cayley–Menger determinants

Cayley–Menger determinants, named after Arthur Cayley and Karl Menger, are determinants of matrices of distances between sets of points. Let

A_0, A_1,\ldots, A_n

be ''n'' + 1 points in a semimetric space, their Cayley–Menger determinant is defined by :

\operatorname(A_0, \cdots, A_n) = \begin 
0 & d_^2 & d_^2 & \cdots & d_^2 & 1 \\
d_^2 & 0 & d_^2 & \cdots & d_^2 & 1 \\
d_^2 & d_^2 & 0 & \cdots & d_^2 & 1 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
d_^2 & d_^2 & d_^2 & \cdots & 0 & 1 \\
1 & 1 & 1 & \cdots & 1 & 0
\end

A_0, A_1,\ldots, A_n \in \mathbb R^k

, then they make up the vertices of a possibly degenerate ''n''-simplex

v_n

\mathbb^k

. It can be shown that the ''n''-dimensional volume of the simplex

v_n

satisfies :

\operatorname_n(v_n)^2 = \frac \operatorname(A_0, \ldots, A_n).

Note that, for the case of

n=0

, we have

\operatorname_0(v_0) = 1

, meaning the "0-dimensional volume" of a 0-simplex is 1, that is, there is 1 point in a 0-simplex.

A_0, A_1,\ldots, A_n

are affinely independent iff

\operatorname_n(v_n) > 0

, that is,

(-1)^ \operatorname(A_0, \ldots, A_n) > 0

. Thus Cayley–Menger determinants give a computational way to prove affine independence. If

k < n

, then the points must be affinely dependent, thus

\operatorname(A_0, \ldots, A_n) = 0

. Cayley's 1841 paper studied the special case of

k = 3, n = 4

, that is, any five points

A_0, \ldots, A_4

in 3-dimensional space must have

\operatorname(A_0, \ldots, A_4) = 0

History

The first result in distance geometry is Heron's formula, from 1st century AD, which gives the area of a triangle from the distances between its 3 vertices.

Brahmagupta's formula In Euclidean geometry, Brahmagupta's formula is used to find the area of any cyclic quadrilateral (one that can be inscribed in a circle) given the lengths of the sides; its generalized version ( Bretschneider's formula) can be used with non-cycli ...

, from 7th century AD, generalizes it to

cyclic quadrilateral In Euclidean geometry, a cyclic quadrilateral or inscribed quadrilateral is a quadrilateral whose vertices all lie on a single circle. This circle is called the ''circumcircle'' or ''circumscribed circle'', and the vertices are said to be ''c ...

Tartaglia Tartaglia may refer to: *Tartaglia (commedia dell'arte), Commedia dell'arte stock character *Angelo Tartaglia (1350 or 1370–1421), Italian condottiero * Niccolò Fontana Tartaglia (1499/1500–1557), Venetian mathematician and engineer *Ivo Tarta ...

, from 16th century AD, generalized it to give the volume of tetrahedron from the distances between its 4 vertices. The modern theory of distance geometry began with

and Karl Menger. Cayley published the Cayley determinant in 1841, which is a special case of the general Cayley–Menger determinant. Menger proved in 1928 a characterization theorem of all semimetric spaces that are isometrically embeddable in the ''n''-dimensional Euclidean space

\mathbb^n

. In 1931, Menger used distance relations to give an axiomatic treatment of Euclidean geometry.

Leonard Blumenthal Leonard Mascot Blumenthal (February 27, 1901 – August 1984) was a Jewish American mathematician. He received his Ph.D. in 1927 from Johns Hopkins University, under the supervision of Frank Morley; his dissertation was titled ''Lagrange Re ...

's book gives a general overview for distance geometry at the graduate level, a large part of which is treated in English for the first time when it was published.

Menger characterization theorem

Menger proved the following

characterization theorem In mathematics, a characterization of an object is a set of conditions that, while different from the definition of the object, is logically equivalent to it. To say that "Property ''P'' characterizes object ''X''" is to say that not only does ''X ...

of semimetric spaces:

A semimetric space $(R, d)$ is isometrically embeddable in the $n$ -dimensional Euclidean space $\mathbb^n$ , but not in $\mathbb^m$ for any $0 \le m < n$ , if and only if: # $R$ contains an $(n+1)$ -point subset $S$ that is isometric with an affinely independent $(n+1)$ -point subset of $\mathbb^n$ ; # any $(n+3)$ -point subset $S'$ , obtained by adding any two additional points of $R$ to $S$ , is congruent to an $(n+3)$ -point subset of $\mathbb^n$ .

A proof of this theorem in a slightly weakened form (for metric spaces instead of semimetric spaces) is in.

Characterization via Cayley–Menger determinants

The following results are proved in Blumethal's book.

Embedding $n+1$ points in $\mathbb^n$

Given a semimetric space

(S,d)

, with

S = \

, and

d(P_i, P_j) = d_\ge 0

0 \le i < j \le n

, an isometric embedding of

(S, d)

into

\mathbb^n

is defined by

A_0, A_1,\ldots, A_n \in \mathbb R^n

, such that

d(A_i, A_j) = d_

for all

0 \le i < j \le n

. Again, one asks whether such an isometric embedding exists for

(S,d)

. A necessary condition is easy to see: for all

k = 1, \ldots, n

, let

v_k

be the ''k''-simplex formed by

A_0, A_1,\ldots, A_k

, then :

(-1)^ \operatorname(P_0, \ldots, P_k) = (-1)^ \operatorname(A_0, \ldots, A_k) = 2^k (k!)^k \operatorname_k(v_k)^2 \ge 0

The converse also holds. That is, if for all

k = 1, \ldots, n

, :

(-1)^\operatorname(P_0, \ldots, P_k) \ge 0,

then such an embedding exists. Further, such embedding is unique up to isometry in

\mathbb^n

. That is, given any two isometric embeddings defined by

A_0, A_1,\ldots, A_n

, and

A'_0, A'_1,\ldots, A'_n

, there exists a (not necessarily unique) isometry

T :  \mathbb R^n \to \mathbb R^n

, such that

T(A_k) = A'_k

for all

k = 0, \ldots, n

. Such

T

is unique if and only if

\operatorname(P_0, \ldots, P_n) \neq 0

, that is,

A_0, A_1,\ldots, A_n

are affinely independent.

Embedding $n+2$ and $n+3$ points

n+2

points

P_0, \ldots, P_

can be embedded in

\mathbb^n

A_0, \ldots, A_

, then other than the conditions above, an additional necessary condition is that the

(n+1)

-simplex formed by

A_0, A_1,\ldots, A_

, must have no

(n+1)

-dimensional volume. That is,

\operatorname(P_0, \ldots, P_n, P_) = 0

. The converse also holds. That is, if for all

k = 1, \ldots, n

, :

(-1)^ \operatorname(P_0, \ldots, P_k) \ge 0,

and :

\operatorname(P_0, \ldots, P_n, P_) = 0,

then such an embedding exists. For embedding

n+3

points in

\mathbb^n

, the necessary and sufficient conditions are similar: # For all

k = 1, \ldots, n

(-1)^ \operatorname(P_0, \ldots, P_k) \ge 0

; #

\operatorname(P_0, \ldots, P_n, P_) = 0;

\operatorname(P_0, \ldots, P_n, P_) = 0;

\operatorname(P_0, \ldots, P_n, P_,  P_) = 0.

Embedding arbitrarily many points

The

n+3

case turns out to be sufficient in general. In general, given a semimetric space

(R, d)

, it can be isometrically embedded in

\mathbb^n

if and only if there exists

P_0, \ldots, P_n\in R

, such that, for all

k = 1, \ldots, n

(-1)^ \operatorname(P_0, \ldots, P_k) \ge 0

, and for any

P_, P_ \in R

, #

\operatorname(P_0, \ldots, P_n, P_) = 0;

\operatorname(P_0, \ldots, P_n, P_) = 0;

\operatorname(P_0, \ldots, P_n, P_,  P_) = 0.

And such embedding is unique up to isometry in

\mathbb^n

. Further, if

\operatorname(P_0, \ldots, P_n) \neq 0

, then it cannot be isometrically embedded in any

\mathbb^m, m < n

. And such embedding is unique up to unique isometry in

\mathbb^n

. Thus, Cayley–Menger determinants give a concrete way to calculate whether a semimetric space can be embedded in

\mathbb^n

, for some finite

n

, and if so, what is the minimal

n

Applications

There are many applications of distance geometry. In telecommunication networks such as

GPS The Global Positioning System (GPS), originally Navstar GPS, is a satellite-based radionavigation system owned by the United States government and operated by the United States Space Force. It is one of the global navigation satellite sy ...

, the positions of some sensors are known (which are called anchors) and some of the distances between sensors are also known: the problem is to identify the positions for all sensors. Hyperbolic navigation is one pre-GPS technology that uses distance geometry for locating ships based on the time it takes for signals to reach anchors. There are many applications in chemistry. Techniques such as

NMR Nuclear magnetic resonance (NMR) is a physical phenomenon in which nuclei in a strong constant magnetic field are perturbed by a weak oscillating magnetic field (in the near field) and respond by producing an electromagnetic signal with a ...

can measure distances between pairs of atoms of a given molecule, and the problem is to infer the 3-dimensional shape of the molecule from those distances. Some software packages for applications are:
DGSOL
Solves large distance geometry problems in macromolecular modeling.
Xplor-NIH
Based on

X-PLOR X-PLOR is a computer software package for computational structural biology originally developed by Axel T. Brunger at Yale University. It was first published in 1987 as an offshoot of CHARMM - a similar program that ran on supercomputers made by C ...

, to determine the structure of molecules based on data from NMR experiments. It solves distance geometry problems with heuristic methods (such as

simulated annealing Simulated annealing (SA) is a probabilistic technique for approximating the global optimum of a given function. Specifically, it is a metaheuristic to approximate global optimization in a large search space for an optimization problem. It ...

) and local search methods (such as conjugate gradient minimization).
TINKER
Molecular modeling and design. It can solve distance geometry problems.
SNLSDPclique
MATLAB code for locating sensors in a sensor network based on the distances between the sensors.

References

{{DEFAULTSORT:Distance Geometry Metric geometry Determinants

Introduction and definitions

First problem: hyperbolic navigation

Second problem: dimension reduction Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally ...

Definitions

Semimetric space

Isometric embedding

Affine independence

Cayley–Menger determinants

History

Menger characterization theorem

Characterization via Cayley–Menger determinants

Embedding n+1 points in \mathbb^n

Embedding n+2 and n+3 points

Embedding arbitrarily many points

Applications

See also

References

Second problem:
dimension reduction Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally ...

Embedding $n+1$ points in $\mathbb^n$

Embedding $n+2$ and $n+3$ points