Łukaszyk–Karmowski metric

In mathematics, the Łukaszyk–Karmowski metric is a function defining a distance between two random variables or two random vectors. It is not a metric in the strict sense, as it does not satisfy the identity of indiscernibles condition: for two identical arguments its value is greater than zero. The concept is named after Szymon Łukaszyk and Wojciech Karmowski.


Continuous random variables

The Łukaszyk–Karmowski metric ''D'' between two continuous independent random variables ''X'' and ''Y'' is defined as:

:D(X, Y) = \int_{-\infty}^\infty \int_{-\infty}^\infty |x-y|\, f(x)g(y) \, dx\, dy,

where ''f''(''x'') and ''g''(''y'') are the probability density functions of ''X'' and ''Y'' respectively.

One may easily show that such ''metrics'' do not satisfy the identity of indiscernibles condition required of a metric on a metric space. In fact, they satisfy this condition if and only if both arguments ''X'' and ''Y'' are certain events described by Dirac delta density probability distribution functions. In that case:

:D_{\delta\delta}(X, Y) = \int_{-\infty}^\infty \int_{-\infty}^\infty |x-y|\, \delta(x-\mu_x)\delta(y-\mu_y) \, dx\, dy = |\mu_x-\mu_y|,

the Łukaszyk–Karmowski metric simply reduces to the metric between the expected values \mu_x, \mu_y of the variables ''X'' and ''Y'', and obviously:

:D_{\delta\delta}(X, X) = |\mu_x-\mu_x| = 0.

For all other real cases, however:

:D(X, X) > 0. \,

The Łukaszyk–Karmowski metric satisfies the remaining non-negativity and symmetry conditions of a metric directly from its definition (symmetry of the modulus), as well as the subadditivity/triangle inequality condition. For a third variable ''Z'' with density ''h''(''z''):

:\begin{align}
D(X, Z) &= \int_{-\infty}^\infty \int_{-\infty}^\infty |x-z|\, f(x)h(z) \, dx\, dz = \int_{-\infty}^\infty \int_{-\infty}^\infty |x-z|\, f(x)h(z) \, dx\, dz \int_{-\infty}^\infty g(y)\, dy \\
&= \int_{-\infty}^\infty \int_{-\infty}^\infty \int_{-\infty}^\infty |(x-y)+(y-z)|\, f(x)g(y)h(z) \, dx\, dy\, dz \\
&\le \int_{-\infty}^\infty \int_{-\infty}^\infty \int_{-\infty}^\infty \bigl(|x-y|+|y-z|\bigr) f(x)g(y)h(z) \, dx\, dy\, dz \\
&= \int_{-\infty}^\infty \int_{-\infty}^\infty \int_{-\infty}^\infty |x-y|\, f(x)g(y)h(z) \, dx\, dy\, dz + \int_{-\infty}^\infty \int_{-\infty}^\infty \int_{-\infty}^\infty |y-z|\, f(x)g(y)h(z) \, dx\, dy\, dz \\
&= \int_{-\infty}^\infty \int_{-\infty}^\infty |x-y|\, f(x)g(y) \, dx\, dy + \int_{-\infty}^\infty \int_{-\infty}^\infty |y-z|\, g(y)h(z) \, dy\, dz \\
&= D(X, Y) + D(Y, Z).
\end{align}

Thus:

:D(X, Z) \le D(X, Y)+D(Y, Z). \,

In the case where ''X'' and ''Y'' are dependent on each other, having a joint probability density function ''f''(''x'', ''y''), the L–K metric has the following form:

:D(X, Y) = \int_{-\infty}^\infty \int_{-\infty}^\infty |x-y|\, f(x, y) \, dx\, dy.
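As an illustration, the definition and the positivity of ''D''(''X'', ''X'') can be checked numerically. The following sketch is not part of the original article; the density choices, integration cutoff and helper name are assumptions made for the example.

<syntaxhighlight lang="python">
# Numerical evaluation of D(X, Y) = double integral of |x - y| f(x) g(y)
# by quadrature over a truncated range (the normal densities decay fast
# enough for the cutoff at +/-20 to be negligible here).
import numpy as np
from scipy import integrate
from scipy.stats import norm

def lk_distance(f, g, lo=-20.0, hi=20.0):
    """L-K distance between independent variables with densities f and g."""
    val, _ = integrate.dblquad(lambda y, x: abs(x - y) * f(x) * g(y),
                               lo, hi, lo, hi)
    return val

f = norm(loc=0.0, scale=1.0).pdf   # X ~ N(0, 1)
g = norm(loc=3.0, scale=1.0).pdf   # Y ~ N(3, 1)

print(lk_distance(f, g))  # slightly above |mu_x - mu_y| = 3 (overlap term)
print(lk_distance(f, f))  # D(X, X) = 2*sigma/sqrt(pi) ~ 1.128, not zero
</syntaxhighlight>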


Example: two continuous random variables with normal distributions (NN)

If both random variables ''X'' and ''Y'' have normal distributions with the same standard deviation ''σ'', and if moreover ''X'' and ''Y'' are independent, then ''D''(''X'', ''Y'') is given by:

:D_{NN}(X, Y) = \mu_{xy} + \frac{2\sigma}{\sqrt{\pi}} \exp\left(-\frac{\mu_{xy}^2}{4\sigma^2}\right) - \mu_{xy} \operatorname{erfc}\left(\frac{\mu_{xy}}{2\sigma}\right),

where:

:\mu_{xy} = |\mu_x-\mu_y|,

erfc(''x'') is the complementary error function, and the subscripts NN indicate the type of the L–K metric. In this case, the lowest possible value of the function D_{NN}(X, Y) is given by:

:\lim_{\mu_{xy} \to 0} D_{NN}(X, Y) = D_{NN}(X, X) = \frac{2\sigma}{\sqrt{\pi}}.
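A minimal sketch (assumed function name, not from the article) evaluating the closed form above and cross-checking it against a Monte Carlo estimate of E|X − Y|:

<syntaxhighlight lang="python">
import numpy as np
from scipy.special import erfc

def d_nn(mu_x, mu_y, sigma):
    """Closed-form D_NN for independent normals with common sigma."""
    m = abs(mu_x - mu_y)
    return (m
            + 2.0 * sigma / np.sqrt(np.pi) * np.exp(-m**2 / (4.0 * sigma**2))
            - m * erfc(m / (2.0 * sigma)))

rng = np.random.default_rng(0)
x = rng.normal(1.0, 0.5, 1_000_000)
y = rng.normal(1.8, 0.5, 1_000_000)
print(d_nn(1.0, 1.8, 0.5))   # closed form, ~0.891
print(np.abs(x - y).mean())  # Monte Carlo estimate, should agree closely
print(d_nn(0.0, 0.0, 0.5), 2 * 0.5 / np.sqrt(np.pi))  # minimum 2*sigma/sqrt(pi)
</syntaxhighlight>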


Example: two continuous random variables with uniform distributions (RR)

When both random variables ''X'' and ''Y'' have uniform distributions (''R'') of the same standard deviation ''σ'', ''D''(''X'', ''Y'') is given by:

:D_{RR}(X, Y) = \begin{cases} \dfrac{24\sqrt{3}\,\sigma^3 + 6\sqrt{3}\,\sigma\mu_{xy}^2 - \mu_{xy}^3}{36\sigma^2}, & \mu_{xy} < 2\sqrt{3}\sigma, \\ \mu_{xy}, & \mu_{xy} \ge 2\sqrt{3}\sigma. \end{cases}

The minimal value of this kind of L–K metric is:

:D_{RR}(X, X) = \frac{2\sigma}{\sqrt{3}}.
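Again as a sketch (assumed names; the sampling check is not from the article), the piecewise formula can be verified against uniform samples whose support width 2√3 ''σ'' gives standard deviation ''σ'':

<syntaxhighlight lang="python">
import numpy as np

def d_rr(mu_x, mu_y, sigma):
    """Closed-form D_RR for independent uniforms with common sigma."""
    d = abs(mu_x - mu_y)
    a = 2.0 * np.sqrt(3.0) * sigma      # support width giving std sigma
    if d >= a:
        return d                        # supports no longer overlap
    return (24 * np.sqrt(3) * sigma**3
            + 6 * np.sqrt(3) * sigma * d**2 - d**3) / (36 * sigma**2)

rng = np.random.default_rng(1)
sigma = 1.0
half = np.sqrt(3.0) * sigma             # half-width of each support
x = rng.uniform(2.0 - half, 2.0 + half, 1_000_000)
y = rng.uniform(3.0 - half, 3.0 + half, 1_000_000)
print(d_rr(2.0, 3.0, sigma))            # closed form, ~1.4156
print(np.abs(x - y).mean())             # sampled estimate
print(d_rr(0.0, 0.0, sigma), 2 * sigma / np.sqrt(3))  # minimum 2*sigma/sqrt(3)
</syntaxhighlight>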


Discrete random variables

In case the random variables ''X'' and ''Y'' are characterized by discrete probability distributions, the Łukaszyk–Karmowski metric ''D'' is defined as:

:D(X, Y) = \sum_i \sum_j |x_i-y_j|\, P(X=x_i)P(Y=y_j). \,

For example, for two discrete Poisson-distributed random variables ''X'' and ''Y'' the equation above transforms into:

:D_{PP}(X, Y) = \sum_{x=0}^{\infty} \sum_{y=0}^{\infty} |x-y|\, \frac{e^{-\lambda_x}\lambda_x^x}{x!}\, \frac{e^{-\lambda_y}\lambda_y^y}{y!}.
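A sketch of the discrete case (the truncation cutoff n is an assumption; the exact value is an infinite sum, but the Poisson tails make the truncation error negligible):

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import poisson

def d_pp(lam_x, lam_y, n=200):
    """Truncated double sum of |i - j| P(X=i) P(Y=j) for Poisson X, Y."""
    k = np.arange(n + 1)
    px = poisson(lam_x).pmf(k)
    py = poisson(lam_y).pmf(k)
    diff = np.abs(k[:, None] - k[None, :])   # |x - y| on an (n+1)x(n+1) grid
    return float(np.sum(diff * px[:, None] * py[None, :]))

print(d_pp(4.0, 9.0))   # roughly |lam_x - lam_y| = 5 plus an overlap excess
print(d_pp(4.0, 4.0))   # strictly positive even for identical distributions
</syntaxhighlight>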


Random vectors

The Łukaszyk–Karmowski metric of random variables may be easily extended into a metric ''D''(X, Y) of random vectors X, Y by substituting |x−y| with any metric operator ''d''(x, y):

:D(\mathbf{X}, \mathbf{Y}) = \int_\Omega \int_\Omega d(\mathbf{x}, \mathbf{y}) F(\mathbf{x}) G(\mathbf{y}) \, d\Omega_x \, d\Omega_y.

For example, substituting ''d''(x, y) with the Euclidean metric and assuming two-dimensionality of the random vectors X, Y would yield:

:D(\mathbf{X}, \mathbf{Y}) = \int_\Omega \int_\Omega \sqrt{(x_1-y_1)^2+(x_2-y_2)^2}\, F(x_1, x_2) G(y_1, y_2) \, dx_1\, dx_2\, dy_1\, dy_2.

This form of the L–K metric is also greater than zero for the same vectors being measured (with the exception of two vectors having Dirac delta coefficients) and satisfies the non-negativity and symmetry conditions of a metric. The proofs are analogous to the ones provided for the L–K metric of random variables discussed above.

In case the random vectors X and Y are dependent on each other, sharing a common joint probability distribution ''F''(X, Y), the L–K metric has the form:

:D(\mathbf{X}, \mathbf{Y}) = \int_\Omega \int_\Omega d(\mathbf{x}, \mathbf{y}) F(\mathbf{x}, \mathbf{y}) \, d\Omega_x \, d\Omega_y.
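Closed forms are rarely available for vector arguments, but the defining integral is straightforward to estimate by Monte Carlo. A sketch with assumed bivariate normal densities:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
X = rng.normal([0.0, 0.0], 1.0, size=(n, 2))   # samples from density F
Y = rng.normal([3.0, 4.0], 1.0, size=(n, 2))   # samples from density G

# D(X, Y) = E d(X, Y) with d the Euclidean metric, estimated from pairs
print(np.linalg.norm(X - Y, axis=1).mean())    # above sqrt(3^2 + 4^2) = 5

X2 = rng.normal([0.0, 0.0], 1.0, size=(n, 2))  # an independent copy of X
print(np.linalg.norm(X - X2, axis=1).mean())   # D(X, X) > 0
</syntaxhighlight>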


Random vectors – the Euclidean form

If the random vectors X and Y are not only mutually independent but also all components of each vector are mutually independent, the Łukaszyk–Karmowski metric for random vectors is defined as:

:D_{**}^{E}(\mathbf{X}, \mathbf{Y}) = \left( \sum_{i=1}^n D_{**}(X_i, Y_i)^2 \right)^{1/2},

where:

:D_{**}(X_i, Y_i)\,

is a particular form of L–K metric of random variables, chosen in dependence of the distributions of the particular components X_i and Y_i of the vectors X, Y.

Such a form of L–K metric also shares the common properties of all L–K metrics.

* It does not satisfy the identity of indiscernibles condition:

:\forall\, \mathbf{X}, \mathbf{Y}:\ D_{**}^{E}(\mathbf{X}, \mathbf{Y}) = 0 \ \nLeftrightarrow \ \mathbf{X} = \mathbf{Y}, \,

:since:

:D_{**}^{E}(\mathbf{X}, \mathbf{X}) = 0 \Leftrightarrow \forall\, i:\ D_{**}(X_i, X_i) = 0,

:but from the properties of the L–K metric for random variables it follows that:

:\exists\, X_i:\ D_{**}(X_i, X_i) > 0.

* It is non-negative and symmetric, since the particular components are also non-negative and symmetric:

:\forall\, i:\ D_{**}(X_i, Y_i) \ge 0, \,
:\forall\, i:\ D_{**}(X_i, Y_i) = D_{**}(Y_i, X_i).

* It satisfies the triangle inequality (see the sketch after this list):

:\forall\, \mathbf{X}, \mathbf{Y}, \mathbf{Z}:\ D_{**}^{E}(\mathbf{X}, \mathbf{Z}) \le D_{**}^{E}(\mathbf{X}, \mathbf{Y}) + D_{**}^{E}(\mathbf{Y}, \mathbf{Z}),

:since (cf. the Minkowski inequality):

:\begin{align}
& \left( \sum_{i=1}^n D_{**}(X_i, Y_i)^2 \right)^{1/2} + \left( \sum_{i=1}^n D_{**}(Y_i, Z_i)^2 \right)^{1/2} \ge \\
& \ge \left( \sum_{i=1}^n \bigl(D_{**}(X_i, Y_i) + D_{**}(Y_i, Z_i)\bigr)^2 \right)^{1/2} \ge \\
& \ge \left( \sum_{i=1}^n D_{**}(X_i, Z_i)^2 \right)^{1/2}.
\end{align}
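A sketch of the component-wise Euclidean form, reusing the NN closed form so the snippet stays self-contained (names are assumptions):

<syntaxhighlight lang="python">
import numpy as np
from scipy.special import erfc

def d_nn(mu_x, mu_y, sigma):
    m = abs(mu_x - mu_y)
    return (m + 2 * sigma / np.sqrt(np.pi) * np.exp(-m**2 / (4 * sigma**2))
            - m * erfc(m / (2 * sigma)))

def d_euclidean(mu_x_vec, mu_y_vec, sigma):
    """(sum_i D_NN(X_i, Y_i)^2)^(1/2) for independent normal components."""
    comps = [d_nn(a, b, sigma) for a, b in zip(mu_x_vec, mu_y_vec)]
    return float(np.sqrt(np.sum(np.square(comps))))

print(d_euclidean([0.0, 0.0], [3.0, 4.0], 1.0))  # somewhat above 5
print(d_euclidean([1.0, 2.0], [1.0, 2.0], 1.0))  # nonzero for identical vectors
</syntaxhighlight>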


Physical interpretation

The Łukaszyk–Karmowski metric may be considered as a distance between quantum mechanical particles described by wavefunctions ''ψ'', where the probability ''dP'' that a given particle is present in a given volume of space ''dV'' amounts to:

:dP = |\psi(x, y, z)|^2 \, dV. \,


A quantum particle in a box

For example, the wavefunction of a quantum particle (''X'') in a box of length ''L'' has the form:

:\psi_m(x) = \sqrt{\frac{2}{L}} \sin\frac{m\pi x}{L}. \,

In this case, the L–K metric between this particle and any point \xi \in (0, L)\, of the box amounts to:

:D(X, \xi) = \int_0^L |x-\xi|\, |\psi_m(x)|^2 \, dx = \frac{\xi^2}{L} - \xi + L\left(\frac{1}{2} - \frac{1-\cos\dfrac{2m\pi\xi}{L}}{2m^2\pi^2}\right).

From the properties of the L–K metric it follows that the sum of the distance between the edge of the box (''ξ'' = 0 or ''ξ'' = ''L'') and any given point, plus the L–K metric between this point and the particle ''X'', is greater than the L–K metric between the edge of the box and the particle. For example, for a quantum particle ''X'' at energy level ''m'' = 2 and point ''ξ'' = 0.2''L'':

:d(0, 0.2L) + D(0.2L, X) \approx 0.2L + 0.3171L = 0.5171L > D(0, X) = 0.5L = d(0, 0.5L). \,

Obviously, the L–K metric between the particle and the edge of the box (''D''(0, ''X'') or ''D''(''L'', ''X'')) amounts to 0.5''L'' and is independent of the particle's energy level.
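The closed form can be checked against direct quadrature of the defining integral; a sketch with ''L'' = 1 (names are assumptions):

<syntaxhighlight lang="python">
import numpy as np
from scipy import integrate

L = 1.0

def d_closed(xi, m):
    """D(X, xi) for a particle at level m in a box of length L."""
    return (xi**2 / L - xi
            + L * (0.5 - (1 - np.cos(2 * m * np.pi * xi / L))
                   / (2 * m**2 * np.pi**2)))

def d_quad(xi, m):
    psi2 = lambda x: (2 / L) * np.sin(m * np.pi * x / L) ** 2
    val, _ = integrate.quad(lambda x: abs(x - xi) * psi2(x), 0, L, points=[xi])
    return val

print(d_closed(0.2 * L, 2), d_quad(0.2 * L, 2))  # ~0.3171, as in the text
print(d_closed(0.0, 5))                          # 0.5 L at the edge, any m
</syntaxhighlight>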


Two quantum particles in a box

A distance between two particles bouncing in a one-dimensional box of length ''L'', having time-independent wavefunctions:

:\psi_m(x) = \sqrt{\frac{2}{L}} \sin\frac{m\pi x}{L}, \,
:\psi_n(y) = \sqrt{\frac{2}{L}} \sin\frac{n\pi y}{L}, \,

may be defined in terms of the Łukaszyk–Karmowski metric of independent random variables as:

:\begin{align}
D(X, Y) &= \int_0^L \int_0^L |x-y|\, |\psi_m(x)|^2 |\psi_n(y)|^2 \, dx\, dy \\
&= \begin{cases} L\left(\dfrac{1}{3} - \dfrac{5}{4m^2\pi^2}\right), & m = n, \\ L\left(\dfrac{1}{3} - \dfrac{1}{2m^2\pi^2} - \dfrac{1}{2n^2\pi^2}\right), & m \neq n. \end{cases}
\end{align}

The distance between the particles ''X'' and ''Y'' is minimal for ''m'' = 1 and ''n'' = 1, that is, for the minimum energy levels of these particles, and amounts to:

:\min(D(X, Y)) = L\left(\frac{1}{3} - \frac{5}{4\pi^2}\right) \approx 0.2067L. \,

According to the properties of this function, the minimum distance is nonzero. For greater energy levels ''m'', ''n'' it approaches ''L''/3.
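As before, the case formulas can be cross-checked by double quadrature; a sketch with ''L'' = 1:

<syntaxhighlight lang="python">
import numpy as np
from scipy import integrate

L = 1.0

def d_closed(m, n):
    """Closed-form D(X, Y) for two particles at levels m and n."""
    if m == n:
        return L * (1.0 / 3.0 - 5.0 / (4.0 * m**2 * np.pi**2))
    return L * (1.0 / 3.0 - 1.0 / (2.0 * m**2 * np.pi**2)
                          - 1.0 / (2.0 * n**2 * np.pi**2))

def d_quad(m, n):
    p = lambda x, k: (2 / L) * np.sin(k * np.pi * x / L) ** 2
    val, _ = integrate.dblquad(lambda y, x: abs(x - y) * p(x, m) * p(y, n),
                               0, L, 0, L)
    return val

print(d_closed(1, 1), d_quad(1, 1))  # ~0.2067 L, the minimum
print(d_closed(7, 9))                # tends to L/3 for high energy levels
</syntaxhighlight>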


Popular explanation

Suppose we have to measure the distance between point ''µx'' and point ''µy'', which are collinear with some point ''0''. Suppose further that we assigned this task to two independent and large groups of surveyors equipped with tape measures, wherein each surveyor of the first group will measure the distance between ''0'' and ''µx'' and each surveyor of the second group will measure the distance between ''0'' and ''µy''. The two sets of obtained observations ''xi'', ''yj'' may then be considered as random variables ''X'' and ''Y'' having normal distributions of the same variance ''σ''² distributed over the "factual locations" of the points ''µx'', ''µy''. Calculating the arithmetic mean over all pairs |''xi'' − ''yj''|, we should then obtain the value of the L–K metric ''DNN''(''X'', ''Y''). Its characteristic curvilinearity arises from the symmetry of the modulus and the overlapping of the distributions ''f''(''x''), ''g''(''y'') when their means approach each other.

An interesting experiment whose results coincide with the properties of the L–K metric was performed in 1967 by Robert Moyer and Thomas Landauer, who measured the precise time an adult took to decide which of two Arabic digits was the largest. When the two digits were numerically distant, such as 2 and 9, subjects responded quickly and accurately. But their response time slowed by more than 100 milliseconds when the digits were closer, such as 5 and 6, and subjects then erred as often as once in every ten trials. The distance effect was present both among highly intelligent persons and among those who were trained to escape it.
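A small simulation sketch of the surveyor picture (group sizes and ''σ'' are assumptions): the mean of all pairwise |''xi'' − ''yj''| reproduces ''DNN'', including the curvilinear behaviour as the means approach each other.

<syntaxhighlight lang="python">
import numpy as np
from scipy.special import erfc

def d_nn(mu_xy, sigma):
    return (mu_xy
            + 2 * sigma / np.sqrt(np.pi) * np.exp(-mu_xy**2 / (4 * sigma**2))
            - mu_xy * erfc(mu_xy / (2 * sigma)))

rng = np.random.default_rng(3)
sigma = 0.5
for mu_xy in (0.0, 0.3, 1.0, 3.0):
    x = rng.normal(0.0, sigma, 2000)       # first group's measurements
    y = rng.normal(mu_xy, sigma, 2000)     # second group's measurements
    pairwise = np.abs(x[:, None] - y[None, :]).mean()  # mean over all pairs
    print(mu_xy, pairwise, d_nn(mu_xy, sigma))
</syntaxhighlight>

For well-separated means the pairwise mean approaches |''µx'' − ''µy''|, while for coinciding means it levels off at 2''σ''/√π rather than falling to zero.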


Practical applications

A Łukaszyk–Karmowski metric may be used instead of a metric operator (commonly the Euclidean distance) in various numerical methods, and in particular in approximation algorithms such as radial basis function networks, inverse distance weighting or Kohonen self-organizing maps. This approach is physically based, allowing the real uncertainty in the location of the sample points to be considered.

The Łukaszyk–Karmowski metric is the only metric that can be used in the context of observer-dependent measurements (Proietti et al. 2019). It is zero only for two measurements having the same spatiotemporal coordinates for a given observer.
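As an illustration of the numerical-methods use, here is a hypothetical sketch (not from the article) of Shepard-style inverse distance weighting with the NN-form L–K metric in place of the Euclidean distance, so that a positional uncertainty ''σ'' of the sample points enters the weights:

<syntaxhighlight lang="python">
import numpy as np
from scipy.special import erfc

def d_nn(a, b, sigma):
    m = abs(a - b)
    return (m + 2 * sigma / np.sqrt(np.pi) * np.exp(-m**2 / (4 * sigma**2))
            - m * erfc(m / (2 * sigma)))

def idw(query, xs, vals, sigma, power=2.0):
    """Inverse distance weighting; D_NN never vanishes, so no special case
    is needed when the query coincides with a sample point."""
    d = np.array([d_nn(query, x, sigma) for x in xs])
    w = 1.0 / d**power
    return float(np.sum(w * vals) / np.sum(w))

xs = np.array([0.0, 1.0, 2.0, 3.0])    # sample locations (uncertain)
vals = np.array([1.0, 2.0, 0.5, 1.5])  # observed values
print(idw(1.2, xs, vals, sigma=0.1))
print(idw(1.0, xs, vals, sigma=0.1))   # finite weight even at a sample point
</syntaxhighlight>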


See also

* Probabilistic metric space
* Statistical distance


References

* Massimiliano Proietti, Alexander Pickston, Francesco Graffitti, Peter Barrow, Dmytro Kundys, Cyril Branciard, Martin Ringbauer, Alessandro Fedrizzi, "Experimental test of local observer independence", ''Science Advances'', 20 Sep 2019, Vol. 5, no. 9, eaaw9832, DOI: 10.1126/sciadv.aaw9832.