The proposition in probability theory known as the law of total expectation, the law of iterated expectations (LIE), Adam's law, the tower rule, and the smoothing theorem, among other names, states that if X is a random variable whose expected value \operatorname{E}(X) is defined, and Y is any random variable on the same probability space, then

:\operatorname{E}(X) = \operatorname{E}(\operatorname{E}(X \mid Y)),

i.e., the expected value of the conditional expected value of X given Y is the same as the expected value of X.

One special case states that if \{A_i\} is a finite or countable partition of the sample space, then

:\operatorname{E}(X) = \sum_i \operatorname{E}(X \mid A_i) \operatorname{P}(A_i).

Note: The conditional expected value E(''X'' | ''Z'') is a random variable whose value depends on the value of ''Z''. Note that the conditional expected value of ''X'' given the ''event'' ''Z'' = ''z'' is a function of ''z''. If we write E(''X'' | ''Z'' = ''z'') = ''g''(''z''), then the random variable E(''X'' | ''Z'') is ''g''(''Z''). Similar comments apply to the conditional covariance.
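The identity can be checked empirically. The following Python sketch (not part of the original article; the die-plus-noise setup is a hypothetical example) estimates \operatorname{E}(X) directly and also as the weighted average of the group means \operatorname{E}(X \mid Y = y), which is the sample analogue of \operatorname{E}(\operatorname{E}(X \mid Y)):

```python
import random

random.seed(0)

# Hypothetical setup: Y is a fair die roll, X = Y + Gaussian noise.
# Then E(X | Y = y) = y, so E(E(X | Y)) = E(Y) = 3.5 = E(X).
n = 100_000
samples_y = [random.randint(1, 6) for _ in range(n)]
samples_x = [y + random.gauss(0, 1) for y in samples_y]

# Direct estimate of E(X).
e_x = sum(samples_x) / n

# Estimate E(X | Y = y) per group, then average over the distribution of Y.
by_y = {}
for x, y in zip(samples_x, samples_y):
    by_y.setdefault(y, []).append(x)
e_e = sum((sum(v) / len(v)) * (len(v) / n) for v in by_y.values())

print(round(e_x, 2), round(e_e, 2))  # both approximately 3.5
```

The two estimates agree because averaging group means weighted by group frequencies is algebraically the same as the overall mean.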


Example

Suppose that only two factories supply light bulbs to the market. Factory X's bulbs work for an average of 5000 hours, whereas factory Y's bulbs work for an average of 4000 hours. It is known that factory X supplies 60% of the total bulbs available. What is the expected length of time that a purchased bulb will work for?

Applying the law of total expectation, we have:

:\begin{align} \operatorname{E}(L) &= \operatorname{E}(L \mid X) \operatorname{P}(X) + \operatorname{E}(L \mid Y) \operatorname{P}(Y) \\ &= 5000(0.6) + 4000(0.4) \\ &= 4600 \end{align}

where
* \operatorname{E}(L) is the expected life of the bulb;
* \operatorname{P}(X) = 0.6 is the probability that the purchased bulb was manufactured by factory X;
* \operatorname{P}(Y) = 0.4 is the probability that the purchased bulb was manufactured by factory Y;
* \operatorname{E}(L \mid X) = 5000 is the expected lifetime of a bulb manufactured by X;
* \operatorname{E}(L \mid Y) = 4000 is the expected lifetime of a bulb manufactured by Y.

Thus each purchased light bulb has an expected lifetime of 4600 hours.
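The light-bulb calculation above is a direct weighted average, which can be written out in a few lines of Python (the dictionary names are illustrative, not from the article):

```python
# Direct computation of the light-bulb example: E(L) = sum over factories
# of E(L | factory) * P(factory).
e_life_given = {"X": 5000, "Y": 4000}   # E(L | factory), in hours
p_factory = {"X": 0.6, "Y": 0.4}        # P(factory)

e_life = sum(e_life_given[f] * p_factory[f] for f in p_factory)
print(e_life)  # 4600.0
```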


Proof in the finite and countable cases

Let the random variables X and Y, defined on the same probability space, assume a finite or countably infinite set of finite values. Assume that \operatorname{E}[X] is defined, i.e. \min(\operatorname{E}[X_+], \operatorname{E}[X_-]) < \infty. If \{A_i\} is a partition of the probability space \Omega, then

:\operatorname{E}(X) = \sum_i \operatorname{E}(X \mid A_i) \operatorname{P}(A_i).

Proof.

:\begin{align} \operatorname{E}\left(\operatorname{E}(X \mid Y)\right) &= \operatorname{E} \Bigg[ \sum_x x \cdot \operatorname{P}(X=x \mid Y) \Bigg] \\ &= \sum_y \Bigg[ \sum_x x \cdot \operatorname{P}(X=x \mid Y=y) \Bigg] \cdot \operatorname{P}(Y=y) \\ &= \sum_y \sum_x x \cdot \operatorname{P}(X=x, Y=y). \end{align}

If the series is finite, then we can switch the summations around, and the previous expression will become

:\begin{align} \sum_x \sum_y x \cdot \operatorname{P}(X=x, Y=y) &= \sum_x x \sum_y \operatorname{P}(X=x, Y=y) \\ &= \sum_x x \cdot \operatorname{P}(X=x) \\ &= \operatorname{E}(X). \end{align}

If, on the other hand, the series is infinite, then its convergence cannot be conditional, due to the assumption that \min(\operatorname{E}[X_+], \operatorname{E}[X_-]) < \infty. The series converges absolutely if both \operatorname{E}[X_+] and \operatorname{E}[X_-] are finite, and diverges to an infinity when either \operatorname{E}[X_+] or \operatorname{E}[X_-] is infinite. In both scenarios, the above summations may be exchanged without affecting the sum.
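The summation swap at the heart of this proof can be verified exactly on a small joint distribution. The sketch below (a made-up pmf, using exact rational arithmetic so the two summation orders can be compared without rounding) computes \sum_y \sum_x x \operatorname{P}(X=x, Y=y) in the order of \operatorname{E}(\operatorname{E}(X \mid Y)) and in the swapped order that yields \operatorname{E}(X):

```python
from fractions import Fraction as F

# A small joint pmf p(x, y) on {0,1,2} x {0,1}; probabilities sum to 1.
joint = {
    (0, 0): F(1, 8), (1, 0): F(1, 8), (2, 0): F(1, 4),
    (0, 1): F(1, 8), (1, 1): F(1, 4), (2, 1): F(1, 8),
}
xs = {x for x, _ in joint}
ys = {y for _, y in joint}

# Inner sum over x first (the E(E(X | Y)) order), then over y.
p_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}
e_x_given_y = {y: sum(x * joint[(x, y)] for x in xs) / p_y[y] for y in ys}
lhs = sum(e_x_given_y[y] * p_y[y] for y in ys)

# Sum over y first: this recovers the marginal of X, hence E(X).
p_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}
rhs = sum(x * p_x[x] for x in xs)

print(lhs, rhs)  # 9/8 9/8
```

With finitely many terms the exchange is always valid; the absolute-convergence argument above is what licenses it in the countably infinite case.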


Proof in the general case

Let (\Omega, \mathcal{F}, \operatorname{P}) be a probability space on which two sub-σ-algebras \mathcal{G}_1 \subseteq \mathcal{G}_2 \subseteq \mathcal{F} are defined. For a random variable X on such a space, the smoothing law states that if \operatorname{E}[X] is defined, i.e. \min(\operatorname{E}[X_+], \operatorname{E}[X_-]) < \infty, then

:\operatorname{E}[\operatorname{E}[X \mid \mathcal{G}_2] \mid \mathcal{G}_1] = \operatorname{E}[X \mid \mathcal{G}_1] \quad \text{(a.s.)}.

Proof. Since a conditional expectation is a Radon–Nikodym derivative, verifying the following two properties establishes the smoothing law:

* \operatorname{E}[\operatorname{E}[X \mid \mathcal{G}_2] \mid \mathcal{G}_1] is \mathcal{G}_1-measurable;
* \int_{G_1} \operatorname{E}[\operatorname{E}[X \mid \mathcal{G}_2] \mid \mathcal{G}_1] \, d\operatorname{P} = \int_{G_1} X \, d\operatorname{P}, for all G_1 \in \mathcal{G}_1.

The first of these properties holds by definition of the conditional expectation. To prove the second one, note that

:\begin{align} \min\left(\int_{G_1} X_+ \, d\operatorname{P}, \int_{G_1} X_- \, d\operatorname{P}\right) &\leq \min\left(\int_\Omega X_+ \, d\operatorname{P}, \int_\Omega X_- \, d\operatorname{P}\right) \\ &= \min(\operatorname{E}[X_+], \operatorname{E}[X_-]) < \infty, \end{align}

so the integral \int_{G_1} X \, d\operatorname{P} is defined (not equal to \infty - \infty).

The second property thus holds, since G_1 \in \mathcal{G}_1 \subseteq \mathcal{G}_2 implies

:\int_{G_1} \operatorname{E}[\operatorname{E}[X \mid \mathcal{G}_2] \mid \mathcal{G}_1] \, d\operatorname{P} = \int_{G_1} \operatorname{E}[X \mid \mathcal{G}_2] \, d\operatorname{P} = \int_{G_1} X \, d\operatorname{P}.

Corollary. In the special case when \mathcal{G}_1 = \{\emptyset, \Omega\} and \mathcal{G}_2 = \sigma(Y), the smoothing law reduces to

:\operatorname{E}[\operatorname{E}[X \mid Y]] = \operatorname{E}[X].

Alternative proof for \operatorname{E}[\operatorname{E}[X \mid Y]] = \operatorname{E}[X]: this is a simple consequence of the measure-theoretic definition of conditional expectation. By definition, \operatorname{E}[X \mid Y] := \operatorname{E}[X \mid \sigma(Y)] is a \sigma(Y)-measurable random variable that satisfies

:\int_A \operatorname{E}[X \mid Y] \, d\operatorname{P} = \int_A X \, d\operatorname{P},

for every measurable set A \in \sigma(Y). Taking A = \Omega proves the claim.
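In the discrete case, nested σ-algebras correspond to nested partitions, and conditional expectation given a partition is just the block average. The following sketch (a hypothetical 8-point uniform sample space, not from the article) illustrates the smoothing law with a coarser partition generating \mathcal{G}_1 and a finer one generating \mathcal{G}_2:

```python
# Smoothing law on a finite space: conditioning on the finer sigma-algebra
# G2 first and then on the coarser G1 equals conditioning on G1 alone.
omega = list(range(8))
x = {0: 3.0, 1: 5.0, 2: 2.0, 3: 6.0, 4: 1.0, 5: 1.0, 6: 4.0, 7: 2.0}

g2_blocks = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]   # finer partition (generates G2)
g1_blocks = [{0, 1, 2, 3}, {4, 5, 6, 7}]       # coarser partition (generates G1)

def cond_exp(values, blocks):
    """E(values | sigma(blocks)) under the uniform measure: block averages,
    returned as a function of the outcome (a dict over omega)."""
    out = {}
    for b in blocks:
        avg = sum(values[w] for w in b) / len(b)
        for w in b:
            out[w] = avg
    return out

inner = cond_exp(x, g2_blocks)        # E(X | G2)
lhs = cond_exp(inner, g1_blocks)      # E(E(X | G2) | G1)
rhs = cond_exp(x, g1_blocks)          # E(X | G1)

print(lhs == rhs)  # True
```

Because each \mathcal{G}_1-block is a union of \mathcal{G}_2-blocks of equal probability, averaging the block averages reproduces the coarser average exactly.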


Proof of partition formula

:\begin{align} \sum_i \operatorname{E}(X \mid A_i) \operatorname{P}(A_i) &= \sum_i \int_\Omega X(\omega) \operatorname{P}(d\omega \mid A_i) \cdot \operatorname{P}(A_i) \\ &= \sum_i \int_\Omega X(\omega) \operatorname{P}(d\omega \cap A_i) \\ &= \sum_i \int_\Omega X(\omega) I_{A_i}(\omega) \operatorname{P}(d\omega) \\ &= \sum_i \operatorname{E}(X I_{A_i}), \end{align}

where I_{A_i} is the indicator function of the set A_i.

If the partition \{A_i\}_{i=1}^n is finite, then, by linearity, the previous expression becomes

:\operatorname{E}\left(\sum_{i=1}^n X I_{A_i}\right) = \operatorname{E}(X),

and we are done. If, however, the partition \{A_i\}_{i=1}^\infty is infinite, then we use the dominated convergence theorem to show that

:\operatorname{E}\left(\sum_{i=1}^n X I_{A_i}\right) \to \operatorname{E}(X).

Indeed, for every n \geq 0,

:\left| \sum_{i=1}^n X I_{A_i} \right| \leq |X| \, I_{\bigcup_{i=1}^n A_i} \leq |X|.

Since every element of the set \Omega falls into a specific partition cell A_i, it is straightforward to verify that the sequence \left\{ \sum_{i=1}^n X I_{A_i} \right\}_{n=1}^\infty converges pointwise to X. By initial assumption, \operatorname{E}|X| < \infty. Applying the dominated convergence theorem yields the desired result.
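For a finite partition, the chain of equalities above can be reproduced numerically. The sketch below (a made-up uniform sample space and partition) computes \sum_i \operatorname{E}(X \mid A_i)\operatorname{P}(A_i) cell by cell and compares it with \operatorname{E}(X):

```python
# Partition formula on a hypothetical 6-outcome uniform sample space,
# partitioned into A_1 = {0,1}, A_2 = {2,3,4}, A_3 = {5}.
omega = [0, 1, 2, 3, 4, 5]
x_vals = {0: 2.0, 1: 4.0, 2: 1.0, 3: 1.0, 4: 7.0, 5: 10.0}
partition = [{0, 1}, {2, 3, 4}, {5}]
p = 1 / len(omega)  # probability of each outcome

# Left-hand side: sum_i E(X | A_i) P(A_i).
lhs = 0.0
for a in partition:
    p_a = len(a) * p                               # P(A_i)
    e_given_a = sum(x_vals[w] for w in a) * p / p_a  # E(X | A_i)
    lhs += e_given_a * p_a

# Right-hand side: E(X) computed directly over the sample space.
rhs = sum(x_vals[w] * p for w in omega)

print(lhs, rhs)  # both equal 25/6, approximately 4.1667
```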


See also

* The fundamental theorem of poker for one practical application.
* Law of total probability
* Law of total variance
* Law of total covariance
* Law of total cumulance
* Product distribution#expectation (application of the law for proving that the product expectation is the product of expectations)


References

* (Theorem 34.4)
* Christopher Sims, "Notes on Random Variables, Expectations, Probability Densities, and Martingales", especially equations (16) through (18)