Independence is a fundamental notion in probability theory, as in statistics and the theory of stochastic processes. Two events are independent, statistically independent, or stochastically independent if, informally speaking, the occurrence of one does not affect the probability of occurrence of the other or, equivalently, does not affect the odds. Similarly, two random variables are independent if the realization of one does not affect the probability distribution of the other.

When dealing with collections of more than two events, two notions of independence need to be distinguished. The events are called pairwise independent if any two events in the collection are independent of each other, while mutual independence (or collective independence) of events means, informally speaking, that each event is independent of any combination of other events in the collection. A similar notion exists for collections of random variables. Mutual independence implies pairwise independence, but not the other way around. In the standard literature of probability theory, statistics, and stochastic processes, independence without further qualification usually refers to mutual independence.


Definition


For events


Two events

Two events A and B are independent (often written as A \perp B or A \perp\!\!\!\perp B, where the latter symbol is often also used for conditional independence) if and only if their joint probability equals the product of their probabilities:
:\mathrm{P}(A \cap B) = \mathrm{P}(A)\mathrm{P}(B)
If both events have nonzero probability, this forces \mathrm{P}(A \cap B) > 0 and hence A \cap B \neq \emptyset: the two events share outcomes in their sample space, so they are not mutually exclusive (A and B are mutually exclusive iff A \cap B = \emptyset).

Why this defines independence is made clear by rewriting with the conditional probability \mathrm{P}(A \mid B) = \frac{\mathrm{P}(A \cap B)}{\mathrm{P}(B)}, the probability that the event A occurs provided that the event B has occurred or is assumed to have occurred:
:\mathrm{P}(A \cap B) = \mathrm{P}(A)\mathrm{P}(B) \iff \mathrm{P}(A \mid B) = \frac{\mathrm{P}(A \cap B)}{\mathrm{P}(B)} = \mathrm{P}(A)
and similarly
:\mathrm{P}(A \cap B) = \mathrm{P}(A)\mathrm{P}(B) \iff \mathrm{P}(B \mid A) = \frac{\mathrm{P}(A \cap B)}{\mathrm{P}(A)} = \mathrm{P}(B)
Thus, the occurrence of B does not affect the probability of A, and vice versa. In other words, A and B are independent of each other. Although the derived expressions may seem more intuitive, they are not the preferred definition, as the conditional probabilities may be undefined if \mathrm{P}(A) or \mathrm{P}(B) is 0. Furthermore, the preferred definition makes clear by symmetry that when A is independent of B, B is also independent of A.
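The product rule can be checked directly on a small finite sample space. The following Python sketch is only an illustration: the fair six-sided die, the uniform measure, and the helper names `prob` and `independent` are assumptions introduced here. It uses the product form rather than the conditional form, so it also works for events of probability 0.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (an illustrative assumption).
OMEGA = frozenset(range(1, 7))

def prob(event):
    """Probability of an event (a subset of OMEGA) under the uniform measure."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def independent(a, b):
    """Product-rule test: P(A and B) == P(A) * P(B); valid even if P(B) == 0."""
    return prob(a & b) == prob(a) * prob(b)

even = frozenset({2, 4, 6})
at_most_four = frozenset({1, 2, 3, 4})
at_most_three = frozenset({1, 2, 3})

print(independent(even, at_most_four))   # both sides equal 1/3
print(independent(even, at_most_three))  # 1/6 on the left, 1/4 on the right
```

Exact rational arithmetic (`Fraction`) is used so that the equality test is not disturbed by floating-point rounding.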


Log probability and information content

Stated in terms of log probability, two events are independent if and only if the log probability of the joint event is the sum of the log probabilities of the individual events:
:\log \mathrm{P}(A \cap B) = \log \mathrm{P}(A) + \log \mathrm{P}(B)
In information theory, negative log probability is interpreted as information content, and thus two events are independent if and only if the information content of the combined event equals the sum of the information content of the individual events:
:\mathrm{I}(A \cap B) = \mathrm{I}(A) + \mathrm{I}(B)
See information content for details.
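A small numerical check of this additivity, as a sketch only: the probabilities 1/2 and 1/4 and the function name `information` are assumed for illustration, and the base-2 logarithm is chosen so that the result is in bits.

```python
import math

p_a, p_b = 0.5, 0.25   # assumed marginal probabilities
p_ab = p_a * p_b       # joint probability of two independent events

def information(p):
    """Information content (surprisal) in bits: I = -log2(p), for p > 0."""
    return -math.log2(p)

lhs = information(p_ab)                    # information content of the joint event
rhs = information(p_a) + information(p_b)  # sum of the individual contents
print(lhs, rhs)
```

Both quantities equal 3 bits here (1 bit plus 2 bits). With general probabilities a tolerance-based comparison such as `math.isclose` is safer than exact equality.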


Odds

Stated in terms of odds, two events are independent if and only if the odds ratio of A and B is unity (1). Analogously with probability, this is equivalent to the conditional odds being equal to the unconditional odds:
:O(A \mid B) = O(A) \text{ and } O(B \mid A) = O(B),
or to the odds of one event, given the other event, being the same as the odds of the event, given the other event not occurring:
:O(A \mid B) = O(A \mid \neg B) \text{ and } O(B \mid A) = O(B \mid \neg A).
The odds ratio can be defined as
:O(A \mid B) : O(A \mid \neg B),
or symmetrically for the odds of B given A, and thus is 1 if and only if the events are independent.
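The odds-ratio criterion can be sketched from the four cell probabilities of a 2×2 joint table. The function name `odds_ratio`, its argument order, and both example tables are assumptions made for illustration. The sketch requires all four cells to be nonzero.

```python
from fractions import Fraction

def odds_ratio(p_ab, p_a_notb, p_nota_b, p_nota_notb):
    """Odds ratio O(A|B) : O(A|not B) from the four joint cell probabilities."""
    odds_a_given_b = p_ab / p_nota_b            # P(A|B) / P(not A|B)
    odds_a_given_notb = p_a_notb / p_nota_notb  # P(A|not B) / P(not A|not B)
    return odds_a_given_b / odds_a_given_notb

# Independent case with P(A) = 1/2 and P(B) = 1/3 (assumed values).
independent_table = (Fraction(1, 6), Fraction(1, 3), Fraction(1, 6), Fraction(1, 3))
# Dependent case: same marginals, probability mass shifted toward "A and B".
dependent_table = (Fraction(1, 4), Fraction(1, 4), Fraction(1, 12), Fraction(5, 12))

print(odds_ratio(*independent_table))  # 1
print(odds_ratio(*dependent_table))    # 5
```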


More than two events

A finite set of events \{A_i\}_{i=1}^{n} is pairwise independent if every pair of events is independent, that is, if and only if for all distinct pairs of indices m,k,
:\mathrm{P}(A_m \cap A_k) = \mathrm{P}(A_m)\mathrm{P}(A_k)
A finite set of events is mutually independent if every event is independent of any intersection of the other events, that is, if and only if for every k \leq n and for every k indices 1\le i_1 < \dots < i_k \le n,
:\mathrm{P}\left(\bigcap_{j=1}^{k} A_{i_j}\right) = \prod_{j=1}^{k} \mathrm{P}(A_{i_j})
This is called the ''multiplication rule'' for independent events. Note that it is not a single condition involving only the product of all the probabilities of all single events; it must hold true for all subsets of events. For more than two events, a mutually independent set of events is (by definition) pairwise independent; but the converse is not necessarily true.
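The difference between the two conditions can be made concrete by brute force. The sketch below assumes two fair coin tosses and three events chosen for illustration; all function names are hypothetical helpers. It tests the multiplication rule on every pair and then on every subfamily.

```python
from fractions import Fraction
from itertools import combinations, product

# Two fair coin tosses; each outcome is a pair such as ("H", "T").
OMEGA = frozenset(product("HT", repeat=2))

def prob(event):
    return Fraction(len(event), len(OMEGA))

def product_rule_holds(events):
    """Check P(intersection of the events) == product of their probabilities."""
    joint = OMEGA
    expected = Fraction(1)
    for e in events:
        joint = joint & e
        expected *= prob(e)
    return prob(joint) == expected

def pairwise_independent(events):
    return all(product_rule_holds(pair) for pair in combinations(events, 2))

def mutually_independent(events):
    # The multiplication rule must hold for every subfamily of size 2..n.
    return all(
        product_rule_holds(sub)
        for k in range(2, len(events) + 1)
        for sub in combinations(events, k)
    )

first_heads = frozenset(w for w in OMEGA if w[0] == "H")
second_heads = frozenset(w for w in OMEGA if w[1] == "H")
tosses_agree = frozenset(w for w in OMEGA if w[0] == w[1])
events = [first_heads, second_heads, tosses_agree]

print(pairwise_independent(events), mutually_independent(events))  # True False
```

Each pair intersects in the single outcome (H, H) with probability 1/4 = 1/2 · 1/2, but the triple intersection also has probability 1/4 rather than 1/8, so the family is pairwise but not mutually independent.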


For real valued random variables


Two random variables

Two random variables X and Y are independent if and only if (iff) the elements of the π-system generated by them are independent; that is to say, for every x and y, the events \{X \le x\} and \{Y \le y\} are independent events (as defined above). That is, X and Y with cumulative distribution functions F_X(x) and F_Y(y) are independent iff the combined random variable (X,Y) has a joint cumulative distribution function
:F_{X,Y}(x,y) = F_X(x) F_Y(y) \quad \text{for all } x,y
or equivalently, if the probability densities f_X(x) and f_Y(y) and the joint probability density f_{X,Y}(x,y) exist,
:f_{X,Y}(x,y) = f_X(x) f_Y(y) \quad \text{for all } x,y.
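For discrete random variables with finitely many values, the factorization of the joint cumulative distribution function can be verified by direct summation. The following is a minimal sketch under that assumption; the function name `cdf_factorizes` and the two example distributions are illustrative. Because the distribution functions are step functions that change only at support points, checking the grid of support points is sufficient in this setting.

```python
from fractions import Fraction
from itertools import product

def cdf_factorizes(pmf):
    """pmf maps (x, y) -> probability.

    Returns True if F_XY(x, y) == F_X(x) * F_Y(y) at every grid point
    built from the support of X and the support of Y.
    """
    xs = sorted({x for x, _ in pmf})
    ys = sorted({y for _, y in pmf})

    def joint_cdf(x, y):
        return sum(p for (a, b), p in pmf.items() if a <= x and b <= y)

    def cdf_x(x):
        return sum(p for (a, _), p in pmf.items() if a <= x)

    def cdf_y(y):
        return sum(p for (_, b), p in pmf.items() if b <= y)

    return all(joint_cdf(x, y) == cdf_x(x) * cdf_y(y) for x, y in product(xs, ys))

# X and Y are two independent fair bits.
independent_pmf = {(x, y): Fraction(1, 4) for x in (0, 1) for y in (0, 1)}
# Y is a copy of X, so the pair is fully dependent.
copy_pmf = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}

print(cdf_factorizes(independent_pmf), cdf_factorizes(copy_pmf))  # True False
```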


More than two random variables

A finite set of n random variables \{X_1,\ldots,X_n\} is pairwise independent if and only if every pair of random variables is independent. Even if the set of random variables is pairwise independent, it is not necessarily mutually independent as defined next.

A finite set of n random variables \{X_1,\ldots,X_n\} is mutually independent if and only if for any sequence of numbers \{x_1,\ldots,x_n\}, the events \{X_1 \le x_1\}, \ldots, \{X_n \le x_n\} are mutually independent events (as defined above). This is equivalent to the following condition on the joint cumulative distribution function F_{X_1,\ldots,X_n}(x_1,\ldots,x_n): a finite set of n random variables \{X_1,\ldots,X_n\} is mutually independent if and only if
:F_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = F_{X_1}(x_1) \cdot \ldots \cdot F_{X_n}(x_n) \quad \text{for all } x_1,\ldots,x_n
Notice that it is not necessary here to require that the probability distribution factorizes for all possible subsets as in the case for n events. This is not required because, for example, F_{X_1,X_2,X_3}(x_1,x_2,x_3) = F_{X_1}(x_1) \cdot F_{X_2}(x_2) \cdot F_{X_3}(x_3) implies F_{X_1,X_3}(x_1,x_3) = F_{X_1}(x_1) \cdot F_{X_3}(x_3), as can be seen by letting x_2 \to \infty on both sides.

The measure-theoretically inclined may prefer to substitute events \{X \in A\} for events \{X \le x\} in the above definition, where A is any Borel set. That definition is exactly equivalent to the one above when the values of the random variables are real numbers. It has the advantage of working also for complex-valued random variables or for random variables taking values in any measurable space (which includes topological spaces endowed with appropriate σ-algebras).


For real valued random vectors

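Two random vectors \mathbf{X}=(X_1,\ldots,X_m)^\mathrm{T} and \mathbf{Y}=(Y_1,\ldots,Y_n)^\mathrm{T} are called independent if
:F_{\mathbf{X},\mathbf{Y}}(\mathbf{x},\mathbf{y}) = F_{\mathbf{X}}(\mathbf{x}) \cdot F_{\mathbf{Y}}(\mathbf{y}) \quad \text{for all } \mathbf{x},\mathbf{y}
where F_{\mathbf{X}}(\mathbf{x}) and F_{\mathbf{Y}}(\mathbf{y}) denote the cumulative distribution functions of \mathbf{X} and \mathbf{Y} and F_{\mathbf{X},\mathbf{Y}}(\mathbf{x},\mathbf{y}) denotes their joint cumulative distribution function. Independence of \mathbf{X} and \mathbf{Y} is often denoted by \mathbf{X} \perp\!\!\!\perp \mathbf{Y}. Written component-wise, \mathbf{X} and \mathbf{Y} are called independent if
:F_{X_1,\ldots,X_m,Y_1,\ldots,Y_n}(x_1,\ldots,x_m,y_1,\ldots,y_n) = F_{X_1,\ldots,X_m}(x_1,\ldots,x_m) \cdot F_{Y_1,\ldots,Y_n}(y_1,\ldots,y_n) \quad \text{for all } x_1,\ldots,x_m,y_1,\ldots,y_n.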


For stochastic processes


For one stochastic process

The definition of independence may be extended from random vectors to a stochastic process. An independent stochastic process is required to have the property that the random variables obtained by sampling the process at any n distinct times t_1,\ldots,t_n are independent random variables for any n. Formally, a stochastic process \left\{X_t\right\}_{t\in\mathcal{T}} is called independent if and only if for all n\in \mathbb{N} and for all distinct t_1,\ldots,t_n\in\mathcal{T}
:F_{X_{t_1},\ldots,X_{t_n}}(x_1,\ldots,x_n) = F_{X_{t_1}}(x_1) \cdot \ldots \cdot F_{X_{t_n}}(x_n) \quad \text{for all } x_1,\ldots,x_n
where F_{X_{t_1},\ldots,X_{t_n}}(x_1,\ldots,x_n) = \mathrm{P}(X(t_1) \leq x_1,\ldots,X(t_n) \leq x_n). Independence of a stochastic process is a property ''within'' a stochastic process, not between two stochastic processes.


For two stochastic processes

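Independence of two stochastic processes is a property between two stochastic processes \left\{X_t\right\}_{t\in\mathcal{T}} and \left\{Y_t\right\}_{t\in\mathcal{T}} that are defined on the same probability space (\Omega,\mathcal{F},P). Formally, two stochastic processes \left\{X_t\right\}_{t\in\mathcal{T}} and \left\{Y_t\right\}_{t\in\mathcal{T}} are said to be independent if for all n\in \mathbb{N} and for all t_1,\ldots,t_n\in\mathcal{T}, the random vectors (X(t_1),\ldots,X(t_n)) and (Y(t_1),\ldots,Y(t_n)) are independent, i.e. if
:F_{X_{t_1},\ldots,X_{t_n},Y_{t_1},\ldots,Y_{t_n}}(x_1,\ldots,x_n,y_1,\ldots,y_n) = F_{X_{t_1},\ldots,X_{t_n}}(x_1,\ldots,x_n) \cdot F_{Y_{t_1},\ldots,Y_{t_n}}(y_1,\ldots,y_n) \quad \text{for all } x_1,\ldots,x_n,y_1,\ldots,y_n.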


Independent σ-algebras

The definitions above are generalized by the following definition of independence for σ-algebras. Let (\Omega, \Sigma, \mathrm{P}) be a probability space and let \mathcal{A} and \mathcal{B} be two sub-σ-algebras of \Sigma. \mathcal{A} and \mathcal{B} are said to be independent if, whenever A \in \mathcal{A} and B \in \mathcal{B},
:\mathrm{P}(A \cap B) = \mathrm{P}(A) \mathrm{P}(B).
Likewise, a finite family of σ-algebras (\tau_i)_{i\in I}, where I is an index set, is said to be independent if and only if
:\forall \left(A_i\right)_{i\in I} \in \prod\nolimits_{i\in I}\tau_i \ : \ \mathrm{P}\left(\bigcap\nolimits_{i\in I}A_i\right) = \prod\nolimits_{i\in I}\mathrm{P}\left(A_i\right)
and an infinite family of σ-algebras is said to be independent if all its finite subfamilies are independent.

The new definition relates to the previous ones very directly:
* Two events are independent (in the old sense) if and only if the σ-algebras that they generate are independent (in the new sense). The σ-algebra generated by an event E \in \Sigma is, by definition,
::\sigma(\{E\}) = \{\emptyset, E, \Omega \setminus E, \Omega\}.
* Two random variables X and Y defined over \Omega are independent (in the old sense) if and only if the σ-algebras that they generate are independent (in the new sense). The σ-algebra generated by a random variable X taking values in some measurable space S consists, by definition, of all subsets of \Omega of the form X^{-1}(U), where U is any measurable subset of S.

Using this definition, it is easy to show that if X and Y are random variables and Y is constant, then X and Y are independent, since the σ-algebra generated by a constant random variable is the trivial σ-algebra \{\emptyset, \Omega\}. Probability zero events cannot affect independence, so independence also holds if Y is only \mathrm{P}-almost surely constant.
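The link between independent events and independent generated σ-algebras can be illustrated on a finite space. In this sketch the fair die, the uniform measure, and the helper names are assumptions. It builds the four-element σ-algebra generated by a single event and tests the product rule on every pair of members.

```python
from fractions import Fraction
from itertools import product

OMEGA = frozenset(range(1, 7))  # one roll of a fair die (assumed example)

def prob(event):
    return Fraction(len(event), len(OMEGA))

def generated_sigma_algebra(event):
    """sigma({E}): the empty set, E, the complement of E, and OMEGA."""
    return {frozenset(), event, OMEGA - event, OMEGA}

def sigma_algebras_independent(f, g):
    """Product rule for every A in f and every B in g."""
    return all(prob(a & b) == prob(a) * prob(b) for a, b in product(f, g))

even = frozenset({2, 4, 6})
at_most_four = frozenset({1, 2, 3, 4})
at_most_three = frozenset({1, 2, 3})

print(sigma_algebras_independent(
    generated_sigma_algebra(even), generated_sigma_algebra(at_most_four)))   # True
print(sigma_algebras_independent(
    generated_sigma_algebra(even), generated_sigma_algebra(at_most_three)))  # False
```

The first pair of events satisfies the product rule, and so do their complements; the second pair fails already on the generating events themselves.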


Properties


Self-independence

Note that an event is independent of itself if and only if
:\mathrm{P}(A) = \mathrm{P}(A \cap A) = \mathrm{P}(A) \cdot \mathrm{P}(A) \iff \mathrm{P}(A) = 0 \text{ or } \mathrm{P}(A) = 1.
Thus an event is independent of itself if and only if it almost surely occurs or its complement almost surely occurs; this fact is useful when proving zero–one laws.


Expectation and covariance

If X and Y are independent random variables, then the expectation operator \operatorname{E} has the property
:\operatorname{E}[XY] = \operatorname{E}[X] \operatorname{E}[Y]
and the covariance \operatorname{cov}[X,Y] is zero, as follows from
:\operatorname{cov}[X,Y] = \operatorname{E}[XY] - \operatorname{E}[X] \operatorname{E}[Y]
The converse does not hold: two random variables with a covariance of 0 may still fail to be independent. See uncorrelated. Similarly for two stochastic processes \left\{X_t\right\}_{t\in\mathcal{T}} and \left\{Y_t\right\}_{t\in\mathcal{T}}: if they are independent, then they are uncorrelated.
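A standard counterexample to the converse can be checked exactly. The sketch below assumes a toy distribution introduced here for illustration: X is uniform on {-1, 0, 1} and Y = X². The covariance vanishes although Y is a deterministic function of X.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1} and Y = X**2 (an assumed toy distribution).
pmf = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

def expect(f):
    """Expectation of f(X, Y) under the joint pmf."""
    return sum(p * f(x, y) for (x, y), p in pmf.items())

e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - e_x * e_y

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0).
p_x0 = sum(p for (x, _), p in pmf.items() if x == 0)
p_y0 = sum(p for (_, y), p in pmf.items() if y == 0)
p_joint = pmf.get((0, 0), Fraction(0))

print(cov)                    # 0
print(p_joint, p_x0 * p_y0)   # 1/3 1/9
```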


Characteristic function

Two random variables X and Y are independent if and only if the characteristic function of the random vector (X,Y) satisfies
:\varphi_{(X,Y)}(t,s) = \varphi_{X}(t)\cdot \varphi_{Y}(s).
In particular the characteristic function of their sum is the product of their marginal characteristic functions:
:\varphi_{X+Y}(t) = \varphi_X(t)\cdot\varphi_Y(t),
though the reverse implication is not true. Random variables that satisfy the latter condition are called subindependent.


Examples


Rolling dice

The event of getting a 6 the first time a die is rolled and the event of getting a 6 the second time are ''independent''. By contrast, the event of getting a 6 the first time a die is rolled and the event that the sum of the numbers seen on the first and second trial is 8 are ''not'' independent.
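Both claims can be confirmed by enumerating the 36 equally likely outcomes of two rolls. This is a minimal sketch; the predicate and helper names are chosen here for illustration.

```python
from fractions import Fraction
from itertools import product

OMEGA = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes

def prob(pred):
    """Probability that the predicate holds for a uniformly random outcome."""
    return Fraction(sum(1 for w in OMEGA if pred(w)), len(OMEGA))

def first_six(w):
    return w[0] == 6

def second_six(w):
    return w[1] == 6

def sum_eight(w):
    return w[0] + w[1] == 8

def independent(a, b):
    return prob(lambda w: a(w) and b(w)) == prob(a) * prob(b)

print(independent(first_six, second_six))  # True: 1/36 == 1/6 * 1/6
print(independent(first_six, sum_eight))   # False: 1/36 != 1/6 * 5/36
```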


Drawing cards

If two cards are drawn ''with'' replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are ''independent''. By contrast, if two cards are drawn ''without'' replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are ''not'' independent, because a deck that has had a red card removed has proportionately fewer red cards.
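The two drawing schemes can be compared by enumerating ordered pairs of cards. The sketch assumes a standard 52-card deck with 26 red cards and tracks only the colour; the function name `red_red_check` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

DECK = ["red"] * 26 + ["black"] * 26  # only the colour matters here

def red_red_check(outcomes):
    """Return (P(first red), P(second red), P(both red)) for equally likely ordered pairs."""
    outcomes = list(outcomes)
    n = len(outcomes)
    p_first = Fraction(sum(1 for a, b in outcomes if a == "red"), n)
    p_second = Fraction(sum(1 for a, b in outcomes if b == "red"), n)
    p_both = Fraction(sum(1 for a, b in outcomes if a == b == "red"), n)
    return p_first, p_second, p_both

# With replacement: all 52 * 52 ordered pairs, the same card may appear twice.
with_replacement = red_red_check(product(DECK, repeat=2))
# Without replacement: all 52 * 51 ordered pairs of distinct cards.
without_replacement = red_red_check(permutations(DECK, 2))

print(with_replacement)     # marginals 1/2 and 1/2, joint 1/4
print(without_replacement)  # marginals 1/2 and 1/2, joint 25/102
```

In both schemes each draw is red with probability 1/2, but only with replacement does the joint probability equal the product 1/4.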


Pairwise and mutual independence

Consider the two probability spaces shown. In both cases, \mathrm{P}(A) = \mathrm{P}(B) = 1/2 and \mathrm{P}(C) = 1/4. The events in the first space are pairwise independent because \mathrm{P}(A \mid B) = \mathrm{P}(A \mid C) = 1/2 = \mathrm{P}(A), \mathrm{P}(B \mid A) = \mathrm{P}(B \mid C) = 1/2 = \mathrm{P}(B), and \mathrm{P}(C \mid A) = \mathrm{P}(C \mid B) = 1/4 = \mathrm{P}(C); but the three events are not mutually independent. The events in the second space are both pairwise independent and mutually independent.

To illustrate the difference, consider conditioning on two events. In the pairwise independent case, although any one event is independent of each of the other two individually, it is not independent of the intersection of the other two:
:\mathrm{P}(A \mid BC) = \frac{\mathrm{P}(A \cap B \cap C)}{\mathrm{P}(B \cap C)} \ne \mathrm{P}(A)
:\mathrm{P}(B \mid AC) = \frac{\mathrm{P}(A \cap B \cap C)}{\mathrm{P}(A \cap C)} \ne \mathrm{P}(B)
:\mathrm{P}(C \mid AB) = \frac{\mathrm{P}(A \cap B \cap C)}{\mathrm{P}(A \cap B)} \ne \mathrm{P}(C)
In the mutually independent case, however,
:\mathrm{P}(A \mid BC) = \frac{\mathrm{P}(A \cap B \cap C)}{\mathrm{P}(B \cap C)} = \tfrac{1}{2} = \mathrm{P}(A)
:\mathrm{P}(B \mid AC) = \frac{\mathrm{P}(A \cap B \cap C)}{\mathrm{P}(A \cap C)} = \tfrac{1}{2} = \mathrm{P}(B)
:\mathrm{P}(C \mid AB) = \frac{\mathrm{P}(A \cap B \cap C)}{\mathrm{P}(A \cap B)} = \tfrac{1}{4} = \mathrm{P}(C)


Triple-independence but no pairwise-independence

It is possible to create a three-event example in which
:\mathrm{P}(A \cap B \cap C) = \mathrm{P}(A)\mathrm{P}(B)\mathrm{P}(C),
and yet no two of the three events are pairwise independent, and hence the set of events is not mutually independent (George, Glyn, "Testing for the independence of three events," ''Mathematical Gazette'' 88, November 2004, 568). This example shows that mutual independence involves requirements on the products of probabilities of all combinations of events, not just the single events as in this example.
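One way to build such a configuration is sketched below. It is not the example from the cited article: the eight-point uniform space and the particular sets A, B and C are assumptions constructed here for illustration.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = frozenset(range(1, 9))  # eight equally likely outcomes (an assumed toy space)

def prob(event):
    return Fraction(len(event), len(OMEGA))

A = frozenset({1, 2, 3, 4})
B = frozenset({1, 5, 6, 7})
C = frozenset({1, 5, 6, 8})

# Triple product rule: P(A and B and C) == P(A) P(B) P(C).
triple_ok = prob(A & B & C) == prob(A) * prob(B) * prob(C)
# Pairwise product rule for (A, B), (A, C), (B, C), in that order.
pairs_ok = [prob(x & y) == prob(x) * prob(y) for x, y in combinations((A, B, C), 2)]

print(triple_ok)  # True: 1/8 == 1/2 * 1/2 * 1/2
print(pairs_ok)   # [False, False, False]
```

Each event has probability 1/2, so every pair would need an intersection of probability 1/4; the actual pair intersections have probabilities 1/8, 1/8 and 3/8.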


Conditional independence


For events

The events A and B are conditionally independent given an event C when
:\mathrm{P}(A \cap B \mid C) = \mathrm{P}(A \mid C) \cdot \mathrm{P}(B \mid C).


For random variables

Intuitively, two random variables X and Y are conditionally independent given Z if, once Z is known, the value of Y does not add any additional information about X. For instance, two measurements X and Y of the same underlying quantity Z are not independent, but they are conditionally independent given Z (unless the errors in the two measurements are somehow connected).

The formal definition of conditional independence is based on the idea of conditional distributions. If X, Y, and Z are discrete random variables, then we define X and Y to be ''conditionally independent given'' Z if
:\mathrm{P}(X \le x, Y \le y \mid Z = z) = \mathrm{P}(X \le x \mid Z = z) \cdot \mathrm{P}(Y \le y \mid Z = z)
for all x, y and z such that \mathrm{P}(Z=z)>0. On the other hand, if the random variables are continuous and have a joint probability density function f_{XYZ}(x,y,z), then X and Y are conditionally independent given Z if
:f_{XY \mid Z}(x, y \mid z) = f_{X \mid Z}(x \mid z) \cdot f_{Y \mid Z}(y \mid z)
for all real numbers x, y and z such that f_Z(z)>0.

If discrete X and Y are conditionally independent given Z, then
:\mathrm{P}(X = x \mid Y = y , Z = z) = \mathrm{P}(X = x \mid Z = z)
for any x, y and z with \mathrm{P}(Y=y, Z=z)>0. That is, the conditional distribution for X given Y and Z is the same as that given Z alone. A similar equation holds for the conditional probability density functions in the continuous case.

Independence can be seen as a special kind of conditional independence, since probability can be seen as a kind of conditional probability given no events.
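The measurement example can be made concrete with a small discrete model. Everything in the sketch is an assumption chosen for illustration: Z is a fair bit, and X and Y are two noisy readings that each report Z correctly with probability 3/4, with errors that do not influence each other. For brevity the check uses point probabilities, which for discrete variables is equivalent to the form with cumulative events. The joint distribution is built to be conditionally independent, so the first check confirms the construction while the second shows that X and Y are nevertheless dependent once Z is not conditioned on.

```python
from fractions import Fraction
from itertools import product

def noisy(z, v):
    """P(reading = v | Z = z): the reading equals Z with probability 3/4."""
    return Fraction(3, 4) if v == z else Fraction(1, 4)

# Joint pmf of (X, Y, Z): Z is a fair bit; X and Y are separate noisy readings of Z.
pmf = {
    (x, y, z): Fraction(1, 2) * noisy(z, x) * noisy(z, y)
    for x, y, z in product((0, 1), repeat=3)
}

def p(pred):
    """Probability that pred(x, y, z) holds."""
    return sum(q for w, q in pmf.items() if pred(*w))

def conditionally_independent():
    """P(X=x, Y=y | Z=z) == P(X=x | Z=z) * P(Y=y | Z=z) for all x, y, z."""
    for x, y, z in product((0, 1), repeat=3):
        pz = p(lambda a, b, c: c == z)
        lhs = p(lambda a, b, c: (a, b, c) == (x, y, z)) / pz
        rhs = (p(lambda a, b, c: a == x and c == z) / pz) * (
            p(lambda a, b, c: b == y and c == z) / pz
        )
        if lhs != rhs:
            return False
    return True

def marginally_independent():
    """P(X=x, Y=y) == P(X=x) * P(Y=y) for all x, y."""
    return all(
        p(lambda a, b, c: a == x and b == y)
        == p(lambda a, b, c: a == x) * p(lambda a, b, c: b == y)
        for x, y in product((0, 1), repeat=2)
    )

print(conditionally_independent(), marginally_independent())  # True False
```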


See also

* Copula (statistics)
* Independent and identically distributed random variables
* Mutually exclusive events
* Pairwise independent events
* Subindependence
* Conditional independence
* Normally distributed and uncorrelated does not imply independent
* Mean dependence

