In statistics, an exchangeable sequence of random variables (also sometimes interchangeable) is a sequence ''X''<sub>1</sub>, ''X''<sub>2</sub>, ''X''<sub>3</sub>, ... (which may be finitely or infinitely long) whose joint probability distribution does not change when the positions in the sequence in which finitely many of them appear are altered. In other words, the joint distribution is invariant to finite permutation. Thus, for example, the sequences
: <math>X_1, X_2, X_3, X_4, X_5, X_6 \quad \text{and} \quad X_3, X_6, X_1, X_5, X_2, X_4</math>
both have the same joint probability distribution.
It is closely related to the use of independent and identically distributed random variables in statistical models. Exchangeable sequences of random variables arise in cases of simple random sampling.
Definition
Formally, an exchangeable sequence of random variables is a finite or infinite sequence ''X''<sub>1</sub>, ''X''<sub>2</sub>, ''X''<sub>3</sub>, ... of random variables such that for any finite permutation σ of the indices 1, 2, 3, ... (the permutation acts on only finitely many indices, with the rest fixed), the joint probability distribution of the permuted sequence
: <math>X_{\sigma(1)}, X_{\sigma(2)}, X_{\sigma(3)}, \dots</math>
is the same as the joint probability distribution of the original sequence.
In short, the order of the sequence of random variables does not affect its joint probability distribution (Chow and Teicher 1997).
(A sequence ''E''<sub>1</sub>, ''E''<sub>2</sub>, ''E''<sub>3</sub>, ... of events is said to be exchangeable precisely if the sequence of its indicator functions is exchangeable.) The distribution function <math>F_{X_1,\ldots,X_n}(x_1, \ldots, x_n)</math> of a finite sequence of exchangeable random variables is symmetric in its arguments <math>x_1, \ldots, x_n</math>.
Olav Kallenberg provided an appropriate definition of exchangeability for continuous-time stochastic processes (Kallenberg 2005).
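For small discrete cases the definition can be checked mechanically. The following is a minimal sketch in Python, assuming the joint distribution is supplied as a dictionary from outcome tuples to exact probabilities. The function name <code>is_exchangeable</code> and both example distributions are illustrative choices, not taken from the cited sources.

```python
from fractions import Fraction
from itertools import permutations


def is_exchangeable(pmf):
    """Return True if a finite joint pmf is invariant under every
    permutation of the coordinates.

    `pmf` maps outcome tuples (all of the same length) to probabilities;
    outcomes missing from the dictionary are treated as probability 0.
    """
    n = len(next(iter(pmf)))
    for outcome, prob in pmf.items():
        for perm in permutations(range(n)):
            permuted = tuple(outcome[i] for i in perm)
            if pmf.get(permuted, Fraction(0)) != prob:
                return False
    return True


# Hypothetical example 1: the probability depends only on the number of
# ones, so reordering the coordinates cannot change it.
weight_by_count = {0: Fraction(1, 10), 1: Fraction(1, 5),
                   2: Fraction(1, 10), 3: Fraction(0)}
symmetric_pmf = {
    (a, b, c): weight_by_count[a + b + c]
    for a in (0, 1) for b in (0, 1) for c in (0, 1)
}

# Hypothetical example 2: X1 is always 1 and X2 is always 0, so swapping
# the two coordinates changes the distribution.
ordered_pmf = {(1, 0): Fraction(1)}

print(is_exchangeable(symmetric_pmf))  # True
print(is_exchangeable(ordered_pmf))    # False
```

The check enumerates all ''n''! permutations, so it is only practical for very short sequences; it is meant to illustrate the definition rather than to serve as a general test.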
History
The concept was introduced by William Ernest Johnson in his 1924 book ''Logic, Part III: The Logical Foundations of Science''. Exchangeability is equivalent to the concept of statistical control introduced by Walter Shewhart, also in 1924.
Exchangeability and the i.i.d. statistical model
The property of exchangeability is closely related to the use of independent and identically distributed (i.i.d.) random variables in statistical models. A sequence of random variables that are i.i.d., conditional on some underlying distributional form, is exchangeable. This follows directly from the structure of the joint probability distribution generated by the i.i.d. form.
Mixtures of exchangeable sequences (in particular, sequences of i.i.d. variables) are exchangeable. The converse can be established for infinite sequences, through an important representation theorem by Bruno de Finetti (later extended by other probability theorists such as Halmos and Savage). The extended versions of the theorem show that in any infinite sequence of exchangeable random variables, the random variables are conditionally independent and identically distributed, given the underlying distributional form. This theorem is stated briefly below. (De Finetti's original theorem only showed this to be true for random indicator variables, but this was later extended to encompass all sequences of random variables.) Another way of putting this is that de Finetti's theorem characterizes exchangeable sequences as mixtures of i.i.d. sequences: while an exchangeable sequence need not itself be unconditionally i.i.d., it can be expressed as a mixture of underlying i.i.d. sequences.
This means that infinite sequences of exchangeable random variables can be regarded equivalently as sequences of conditionally i.i.d. random variables, based on some underlying distributional form. (Note that this equivalence does not quite hold for finite exchangeability. However, for finite vectors of random variables there is a close approximation to the i.i.d. model.) An infinite exchangeable sequence is strictly stationary and so a law of large numbers in the form of the Birkhoff–Khinchin theorem applies.
This means that the underlying distribution can be given an operational interpretation as the limiting empirical distribution of the sequence of values. The close relationship between exchangeable sequences of random variables and the i.i.d. form means that the latter can be justified on the basis of infinite exchangeability. This notion is central to Bruno de Finetti's development of predictive inference and to Bayesian statistics. It can also be shown to be a useful foundational assumption in frequentist statistics and to link the two paradigms.
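The difference between "exchangeable" and "i.i.d." can be seen in a small exact computation. The sketch below assumes a hypothetical two-point mixing distribution (a coin whose success probability is 1/4 or 3/4 with equal probability); the names <code>MIXING</code> and <code>mixture_pmf</code> are illustrative. It shows that the resulting sequence is exchangeable but not independent.

```python
from fractions import Fraction
from itertools import product

# Assumed mixing distribution: the success probability theta is 1/4 or 3/4,
# each with probability 1/2.
MIXING = {Fraction(1, 4): Fraction(1, 2), Fraction(3, 4): Fraction(1, 2)}


def mixture_pmf(seq, mixing=MIXING):
    """P(X_1 = seq[0], ..., X_n = seq[-1]) for a mixture of i.i.d.
    Bernoulli sequences."""
    ones = sum(seq)
    zeros = len(seq) - ones
    return sum(w * theta**ones * (1 - theta)**zeros
               for theta, w in mixing.items())


# Exchangeability: the probability depends on the sequence only through its
# number of ones, so any reordering has the same probability.
pmf3 = {seq: mixture_pmf(seq) for seq in product((0, 1), repeat=3)}
assert pmf3[(1, 0, 0)] == pmf3[(0, 1, 0)] == pmf3[(0, 0, 1)]

# Not independent: the joint probability of two ones differs from the
# product of the marginal probabilities.
p_one = mixture_pmf((1,))
p_two_ones = mixture_pmf((1, 1))
print(p_one, p_two_ones, p_one * p_one)  # 1/2 5/16 1/4
```

Conditional on the value of the success probability the draws are i.i.d.; it is only after averaging over the mixing distribution that the dependence appears.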
The representation theorem: This statement is based on the presentation in O'Neill (2009), listed below. Given an infinite sequence of random variables <math>\mathbf{X} = (X_1, X_2, X_3, \ldots)</math> we define the limiting empirical distribution function <math>F_\mathbf{X}</math> by
: <math>F_\mathbf{X}(x) = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^n I(X_i \le x).</math>
(This is the Cesàro limit of the indicator functions. In cases where the Cesàro limit does not exist this function can actually be defined as the Banach limit of the indicator functions, which is an extension of this limit. This latter limit always exists for sums of indicator functions, so that the empirical distribution is always well-defined.) This means that for any vector of random variables in the sequence we have joint distribution function given by
: <math>\Pr(X_1 \le x_1, X_2 \le x_2, \ldots, X_n \le x_n) = \int \prod_{i=1}^n F_\mathbf{X}(x_i)\,dP(F_\mathbf{X}).</math>
If the distribution function <math>F_\mathbf{X}</math> is indexed by another parameter <math>\theta</math> then (with densities appropriately defined) we have
: <math>p_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = \int \prod_{i=1}^n p_{X_i}(x_i \mid \theta)\,dP(\theta).</math>
These equations show the joint distribution or density characterised as a mixture distribution based on the underlying limiting empirical distribution (or a parameter indexing this distribution).
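As a worked special case (a standard illustration, not part of O'Neill's presentation), suppose the variables are binary and the mixing distribution over the success probability <math>\theta</math> is assumed to be a Beta distribution with parameters <math>a</math> and <math>b</math>. Writing <math>k</math> for the number of ones among <math>x_1, \ldots, x_n</math> and <math>B</math> for the beta function, the mixture integral can be evaluated in closed form:

```latex
\Pr(X_1 = x_1, \ldots, X_n = x_n)
  = \int_0^1 \theta^{k} (1-\theta)^{n-k}\,
      \frac{\theta^{a-1}(1-\theta)^{b-1}}{B(a,b)}\, d\theta
  = \frac{B(a+k,\; b+n-k)}{B(a,b)},
\qquad k = \sum_{i=1}^{n} x_i .
```

The result depends on the observed values only through <math>k</math>, which makes the permutation invariance of the joint distribution explicit.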
Note that not all finite exchangeable sequences are mixtures of i.i.d. sequences. To see this, consider sampling without replacement from a finite set until no elements are left. The resulting sequence is exchangeable, but not a mixture of i.i.d. sequences. Indeed, conditioned on all other elements in the sequence, the remaining element is known.
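This counterexample can be verified exactly for a small set. The sketch below assumes the three-element population {1, 2, 3}, chosen only for illustration. It checks that the last draw is determined by the others, that two draws are never equal, and that the covariance between two draws is negative. Under any mixture of i.i.d. sequences on a finite set, two draws coincide with positive probability, so the sequence cannot be such a mixture.

```python
from fractions import Fraction
from itertools import permutations

population = (1, 2, 3)  # assumed finite set, for illustration only

# Drawing without replacement until the set is empty makes every ordering
# of the population equally likely.
orderings = list(permutations(population))
prob = Fraction(1, len(orderings))

# The first two draws are never equal.
p_equal = sum(prob for o in orderings if o[0] == o[1])

# Covariance between the first two draws, computed exactly.
mean = sum(prob * o[0] for o in orderings)
cov = sum(prob * (o[0] - mean) * (o[1] - mean) for o in orderings)

# Given all draws but the last, the last draw is determined.
last_given_rest = {}
for o in orderings:
    last_given_rest.setdefault(o[:-1], set()).add(o[-1])
determined = all(len(v) == 1 for v in last_given_rest.values())

print(p_equal, cov, determined)  # 0 -1/3 True
```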
Covariance and correlation
Exchangeable sequences have some basic covariance and correlation properties which mean that they are generally positively correlated. For infinite sequences of exchangeable random variables, the covariance between the random variables is equal to the variance of the mean of the underlying distribution function.
For finite exchangeable sequences the covariance is also a fixed value which does not depend on the particular random variables in the sequence. There is a weaker lower bound than for infinite exchangeability and it is possible for negative correlation to exist.
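Covariance for exchangeable sequences (infinite): If the sequence <math>X_1, X_2, X_3, \ldots</math> is exchangeable, then
: <math>\operatorname{cov}(X_i, X_j) = \operatorname{var}(\operatorname{E}(X_i \mid F_\mathbf{X})) = \operatorname{var}(\operatorname{E}(X_i \mid \theta)) \ge 0 \quad \text{for } i \ne j.</math>
Covariance for exchangeable sequences (finite): If <math>X_1, X_2, \ldots, X_n</math> is exchangeable with <math>\sigma^2 = \operatorname{var}(X_i)</math>, then
: <math>\operatorname{cov}(X_i, X_j) \ge -\frac{\sigma^2}{n-1} \quad \text{for } i \ne j.</math>
The finite sequence result may be proved as follows. Using the fact that the values are exchangeable, we have
: <math>\begin{align} 0 &\le \operatorname{var}(X_1 + \cdots + X_n) \\ &= \operatorname{var}(X_1) + \cdots + \operatorname{var}(X_n) + \sum_{i \ne j} \operatorname{cov}(X_i, X_j) \\ &= n\sigma^2 + n(n-1)\operatorname{cov}(X_1, X_2). \end{align}</math>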
We can then solve the inequality for the covariance yielding the stated lower bound. The non-negativity of the covariance for the infinite sequence can then be obtained as a limiting result from this finite sequence result.
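Spelled out, with <math>\sigma^2</math> denoting the common variance of the variables and <math>n</math> the length of the sequence, the rearrangement and the limiting step read as follows (a short sketch of the algebra only):

```latex
0 \le n\sigma^2 + n(n-1)\operatorname{cov}(X_1, X_2)
\;\Longrightarrow\;
\operatorname{cov}(X_1, X_2) \ge -\frac{n\sigma^2}{n(n-1)} = -\frac{\sigma^2}{n-1},
\qquad
\lim_{n\to\infty}\left(-\frac{\sigma^2}{n-1}\right) = 0 .
```

For an infinite exchangeable sequence every initial segment of length <math>n</math> is itself exchangeable, so the bound holds for every <math>n</math> and the covariance must be non-negative.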
Equality of the lower bound for finite sequences is achieved in a simple urn model: An urn contains 1 red marble and ''n'' − 1 green marbles, and these are sampled without replacement until the urn is empty. Let ''X''<sub>''i''</sub> = 1 if the red marble is drawn on the ''i''-th trial and 0 otherwise. A finite sequence that achieves the lower covariance bound cannot be extended to a longer exchangeable sequence.
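The equality claim for this urn can be confirmed with exact arithmetic. The following sketch uses a hypothetical helper, <code>red_marble_moments</code>, that relies on two facts about the model: the red marble is equally likely to appear at each of the ''n'' positions, and two different indicators can never both equal 1.

```python
from fractions import Fraction


def red_marble_moments(n):
    """Exact variance and covariance of the indicators X_i for an urn with
    1 red and n - 1 green marbles drawn without replacement."""
    p = Fraction(1, n)  # P(X_i = 1): the red marble is equally likely at each position
    variance = p * (1 - p)
    # E[X_i * X_j] = 0 for i != j, because the single red marble is drawn once.
    covariance = Fraction(0) - p * p
    return variance, covariance


for n in (2, 5, 10):
    variance, covariance = red_marble_moments(n)
    lower_bound = -variance / (n - 1)
    # The covariance coincides with the lower bound for every n.
    print(n, covariance, lower_bound)
```

For each ''n'' the covariance is −1/''n''<sup>2</sup>, which is exactly −σ<sup>2</sup>/(''n'' − 1) with σ<sup>2</sup> = (''n'' − 1)/''n''<sup>2</sup>.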
Examples
* Any convex combination or mixture distribution of i.i.d. sequences of random variables is exchangeable. A converse proposition is de Finetti's theorem.
* Suppose an urn contains <math>n</math> red and <math>m</math> blue marbles. Suppose marbles are drawn without replacement until the urn is empty. Let <math>X_i</math> be the indicator random variable of the event that the <math>i</math>-th marble drawn is red. Then <math>\{X_i\}_{i=1,\ldots,n+m}</math> is an exchangeable sequence. This sequence cannot be extended to any longer exchangeable sequence.
* Suppose an urn contains <math>n</math> red and <math>m</math> blue marbles. Further suppose a marble is drawn from the urn and then replaced, with an extra marble of the same colour. Let <math>X_i</math> be the indicator random variable of the event that the <math>i</math>-th marble drawn is red. Then <math>\{X_i\}_{i \in \mathbb{N}}</math> is an exchangeable sequence. This model is called Pólya's urn.
* Let <math>(X, Y)</math> have a bivariate normal distribution with parameters <math>\mu = 0</math>, <math>\sigma_x = \sigma_y = 1</math> and an arbitrary correlation coefficient <math>\rho \in (-1, 1)</math>. The random variables <math>X</math> and <math>Y</math> are then exchangeable, but independent only if <math>\rho = 0</math>. The density function is <math>p(x, y) = p(y, x) \propto \exp\left[-\frac{1}{2(1-\rho^2)}(x^2 + y^2 - 2\rho xy)\right].</math>
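The Pólya's urn example above can be checked directly: the probability of a colour sequence depends only on how many red and blue marbles it contains, not on their order. The sketch below is illustrative; the function name <code>polya_sequence_prob</code> and the starting composition (2 red, 3 blue) are assumptions made for the example.

```python
from fractions import Fraction
from itertools import permutations


def polya_sequence_prob(red, blue, seq):
    """Exact probability that the first len(seq) draws from a Polya urn show
    the colours in `seq` (1 = red, 0 = blue), starting from `red` red and
    `blue` blue marbles. Each drawn marble is returned together with one
    extra marble of the same colour."""
    prob = Fraction(1)
    for x in seq:
        total = red + blue
        if x == 1:
            prob *= Fraction(red, total)
            red += 1
        else:
            prob *= Fraction(blue, total)
            blue += 1
    return prob


seq = (1, 1, 0, 1, 0)
# Collect the probabilities of every reordering of the same colours.
probs = {polya_sequence_prob(2, 3, p) for p in permutations(seq)}
print(probs)  # {Fraction(2, 105)}
```

All reorderings share the single value 2/105, because the numerators 2·3·4 (red) and 3·4 (blue) and the denominators 5·6·7·8·9 appear in every ordering, only in a different sequence.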
Applications
The von Neumann extractor is a randomness extractor that depends on exchangeability: it gives a method to take an exchangeable sequence of 0s and 1s (Bernoulli trials), with some probability ''p'' of 0 and <math>q = 1 - p</math> of 1, and produce a (shorter) exchangeable sequence of 0s and 1s with probability 1/2.
Partition the sequence into non-overlapping pairs: if the two elements of the pair are equal (00 or 11), discard it; if the two elements of the pair are unequal (01 or 10), keep the first. This yields a sequence of Bernoulli trials with <math>p = 1/2,</math> as, by exchangeability, the odds of a given pair being 01 or 10 are equal.
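The pairing rule described above can be sketched in a few lines of Python. The function name <code>von_neumann_extract</code> is illustrative, the input is assumed to be a list (or other sliceable sequence) of 0s and 1s, and a trailing unpaired bit is simply ignored, which is a choice made here rather than something the description specifies.

```python
def von_neumann_extract(bits):
    """Apply the von Neumann extractor to a sequence of 0s and 1s.

    The sequence is split into non-overlapping pairs; equal pairs are
    discarded and the first element of each unequal pair is kept. A
    trailing unpaired bit is ignored.
    """
    output = []
    for first, second in zip(bits[0::2], bits[1::2]):
        if first != second:
            output.append(first)
    return output


# Pairs: 01 -> keep 0, 10 -> keep 1, 00 -> discard, 11 -> discard, 10 -> keep 1
print(von_neumann_extract([0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1]))  # [0, 1, 1]
```

The output is shorter than the input, and how much shorter depends on how often the two bits in a pair differ.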
Exchangeable random variables arise in the study of U statistics, particularly in the Hoeffding decomposition.
Exchangeability is a key assumption of the distribution-free inference method of conformal prediction.
See also
* De Finetti's theorem
* Hewitt–Savage zero–one law
* Resampling
* Permutation tests, statistical tests based on exchanging between groups
References
Further reading
* Aldous, David J., ''Exchangeability and related topics'', in: École d'Été de Probabilités de Saint-Flour XIII — 1983, Lecture Notes in Math. 1117, pp. 1–198, Springer, Berlin, 1985.
* Chow, Yuan Shih and Teicher, Henry, ''Probability theory. Independence, interchangeability, martingales,'' Springer Texts in Statistics, 3rd ed., Springer, New York, 1997. xxii+488 pp.
* Kallenberg, O., ''Probabilistic symmetries and invariance principles''. Springer-Verlag, New York (2005). 510 pp.
* Kingman, J. F. C., ''Uses of exchangeability'', Ann. Probability 6 (1978) 183–197.
* O'Neill, B. (2009) Exchangeability, Correlation and Bayes' Effect. ''International Statistical Review'' 77(2), pp. 241–250.