In probability theory, calculation of the sum of normally distributed random variables is an instance of the arithmetic of random variables, which can be quite complex based on the probability distributions of the random variables involved and their relationships. This is not to be confused with the sum of normal distributions, which forms a mixture distribution.


Independent random variables

Let ''X'' and ''Y'' be independent random variables that are normally distributed (and therefore also jointly so); then their sum is also normally distributed. That is, if
:X \sim N(\mu_X, \sigma_X^2)
:Y \sim N(\mu_Y, \sigma_Y^2)
:Z = X + Y,
then
:Z \sim N(\mu_X + \mu_Y, \sigma_X^2 + \sigma_Y^2).
This means that the sum of two independent normally distributed random variables is normal, with its mean being the sum of the two means, and its variance being the sum of the two variances (i.e., the square of the standard deviation is the square root of the sum of the squares of the standard deviations).

In order for this result to hold, the assumption that ''X'' and ''Y'' are independent cannot be dropped, although it can be weakened to the assumption that ''X'' and ''Y'' are jointly, rather than separately, normally distributed. (See here for an example.) The result about the mean holds in all cases, while the result for the variance requires uncorrelatedness, but not independence.
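A quick numerical sanity check of this statement is a Monte Carlo simulation. The minimal sketch below (assuming NumPy; the means and standard deviations are arbitrary illustrative values) draws independent normal samples and compares the empirical mean and variance of their sum with \mu_X + \mu_Y and \sigma_X^2 + \sigma_Y^2.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (hypothetical) parameters
mu_x, sigma_x = 1.0, 2.0
mu_y, sigma_y = -3.0, 0.5

n = 1_000_000
x = rng.normal(mu_x, sigma_x, n)
y = rng.normal(mu_y, sigma_y, n)
z = x + y

print(z.mean(), mu_x + mu_y)             # empirical mean vs. mu_X + mu_Y
print(z.var(), sigma_x**2 + sigma_y**2)  # empirical variance vs. sigma_X^2 + sigma_Y^2
</syntaxhighlight>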


Proofs


Proof using characteristic functions

The characteristic function
:\varphi_{X+Y}(t) = \operatorname{E}\left(e^{it(X+Y)}\right)
of the sum of two independent random variables ''X'' and ''Y'' is just the product of the two separate characteristic functions:
:\varphi_X(t) = \operatorname{E}\left(e^{itX}\right), \qquad \varphi_Y(t) = \operatorname{E}\left(e^{itY}\right)
of ''X'' and ''Y''.

The characteristic function of the normal distribution with expected value μ and variance σ² is
:\varphi(t) = \exp\left(it\mu - \frac{\sigma^2 t^2}{2}\right).
So
:\begin{align}
\varphi_{X+Y}(t) = \varphi_X(t)\,\varphi_Y(t) &= \exp\left(it\mu_X - \frac{\sigma_X^2 t^2}{2}\right) \exp\left(it\mu_Y - \frac{\sigma_Y^2 t^2}{2}\right) \\
&= \exp\left(it(\mu_X + \mu_Y) - \frac{(\sigma_X^2 + \sigma_Y^2) t^2}{2}\right).
\end{align}
This is the characteristic function of the normal distribution with expected value \mu_X + \mu_Y and variance \sigma_X^2 + \sigma_Y^2.

Finally, recall that no two distinct distributions can both have the same characteristic function, so the distribution of ''X'' + ''Y'' must be just this normal distribution.
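The identity used in this proof can also be checked numerically. A rough sketch (assuming NumPy, with illustrative parameter values) estimates \operatorname{E}\left(e^{itZ}\right) for Z = X + Y by Monte Carlo and compares it with the closed form \exp\left(it(\mu_X+\mu_Y) - \tfrac{(\sigma_X^2+\sigma_Y^2)t^2}{2}\right) at a few values of ''t''.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)

# Illustrative (hypothetical) parameters
mu_x, sigma_x = 1.0, 2.0
mu_y, sigma_y = -3.0, 0.5

n = 500_000
z = rng.normal(mu_x, sigma_x, n) + rng.normal(mu_y, sigma_y, n)

for t in (0.1, 0.5, 1.0):
    empirical = np.mean(np.exp(1j * t * z))        # Monte Carlo estimate of E[e^{itZ}]
    closed_form = np.exp(1j * t * (mu_x + mu_y)
                         - 0.5 * (sigma_x**2 + sigma_y**2) * t**2)
    print(t, empirical, closed_form)               # the two should be close for each t
</syntaxhighlight>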


Proof using convolutions

For independent random variables ''X'' and ''Y'', the distribution f_Z of ''Z'' = ''X'' + ''Y'' equals the convolution of f_X and f_Y:
:f_Z(z) = \int_{-\infty}^\infty f_Y(z-x) f_X(x) \, dx
Given that f_X and f_Y are normal densities,
:\begin{align}
f_X(x) = \mathcal{N}(x; \mu_X, \sigma_X^2) = \frac{1}{\sqrt{2\pi}\,\sigma_X} e^{-(x-\mu_X)^2/(2\sigma_X^2)} \\
f_Y(y) = \mathcal{N}(y; \mu_Y, \sigma_Y^2) = \frac{1}{\sqrt{2\pi}\,\sigma_Y} e^{-(y-\mu_Y)^2/(2\sigma_Y^2)}
\end{align}
Substituting into the convolution:
:\begin{align}
f_Z(z) &= \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi}\,\sigma_Y} \exp\left[-\frac{(z-x-\mu_Y)^2}{2\sigma_Y^2}\right] \frac{1}{\sqrt{2\pi}\,\sigma_X} \exp\left[-\frac{(x-\mu_X)^2}{2\sigma_X^2}\right] \, dx \\
&= \int_{-\infty}^\infty \frac{1}{2\pi\,\sigma_X\sigma_Y} \exp\left[-\frac{\sigma_X^2(z-x-\mu_Y)^2 + \sigma_Y^2(x-\mu_X)^2}{2\sigma_X^2\sigma_Y^2}\right] \, dx
\end{align}
Defining \sigma_Z = \sqrt{\sigma_X^2 + \sigma_Y^2}, and completing the square in ''x'':
:\begin{align}
f_Z(z) &= \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi}\,\sigma_Z} \exp\left[-\frac{(z-(\mu_X+\mu_Y))^2}{2\sigma_Z^2}\right] \frac{1}{\sqrt{2\pi}\,\frac{\sigma_X\sigma_Y}{\sigma_Z}} \exp\left[-\frac{\left(x - \frac{\sigma_X^2(z-\mu_Y) + \sigma_Y^2\mu_X}{\sigma_Z^2}\right)^2}{2\left(\frac{\sigma_X\sigma_Y}{\sigma_Z}\right)^2}\right] \, dx \\
&= \frac{1}{\sqrt{2\pi}\,\sigma_Z} \exp\left[-\frac{(z-(\mu_X+\mu_Y))^2}{2\sigma_Z^2}\right] \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\frac{\sigma_X\sigma_Y}{\sigma_Z}} \exp\left[-\frac{\left(x - \frac{\sigma_X^2(z-\mu_Y) + \sigma_Y^2\mu_X}{\sigma_Z^2}\right)^2}{2\left(\frac{\sigma_X\sigma_Y}{\sigma_Z}\right)^2}\right] \, dx
\end{align}
The expression in the integral is a normal density distribution on ''x'', and so the integral evaluates to 1. The desired result follows:
:f_Z(z) = \frac{1}{\sqrt{2\pi}\,\sigma_Z} \exp\left[-\frac{(z-(\mu_X+\mu_Y))^2}{2\sigma_Z^2}\right]
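The convolution integral itself can be evaluated numerically and compared against the closed-form density. The sketch below is a plain Riemann sum with NumPy, using hypothetical parameter values and an arbitrary evaluation point ''z''; it is only a cross-check of the algebra above.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical parameters and an arbitrary evaluation point
mu_x, sigma_x = 0.5, 1.2
mu_y, sigma_y = -1.0, 0.8
z = 1.7

def normal_pdf(t, mu, sigma):
    return np.exp(-(t - mu)**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

# Riemann-sum approximation of the convolution integral over x
x = np.linspace(-20.0, 20.0, 200_001)
dx = x[1] - x[0]
numeric = np.sum(normal_pdf(z - x, mu_y, sigma_y) * normal_pdf(x, mu_x, sigma_x)) * dx

# Closed form: N(z; mu_X + mu_Y, sigma_X^2 + sigma_Y^2)
closed = normal_pdf(z, mu_x + mu_y, np.sqrt(sigma_x**2 + sigma_y**2))

print(numeric, closed)   # the two values agree to many decimal places
</syntaxhighlight>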


= Using the convolution theorem =

It can be shown that the Fourier transform of a Gaussian, f_X(x) = \mathcal{N}(x; \mu_X, \sigma_X^2), is
:\mathcal{F}\{f_X\} = F_X(\omega) = \exp\left[-j\omega\mu_X\right] \exp\left[-\tfrac{\sigma_X^2\omega^2}{2}\right]
By the convolution theorem:
:\begin{align}
f_Z(z) &= (f_X * f_Y)(z) \\
&= \mathcal{F}^{-1}\big\{F_X F_Y\big\} \\
&= \mathcal{F}^{-1}\big\{ \exp\left[-j\omega\mu_X\right] \exp\left[-\tfrac{\sigma_X^2\omega^2}{2}\right] \exp\left[-j\omega\mu_Y\right] \exp\left[-\tfrac{\sigma_Y^2\omega^2}{2}\right] \big\} \\
&= \mathcal{F}^{-1}\big\{ \exp\left[-j\omega(\mu_X+\mu_Y)\right] \exp\left[-\tfrac{(\sigma_X^2+\sigma_Y^2)\omega^2}{2}\right] \big\} \\
&= \mathcal{N}(z; \mu_X + \mu_Y, \sigma_X^2 + \sigma_Y^2)
\end{align}
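As an illustration of the convolution theorem in this setting, the following sketch (assuming SciPy's FFT-based linear convolution, with hypothetical parameters) convolves two sampled Gaussian densities and compares the result with \mathcal{N}(z; \mu_X+\mu_Y, \sigma_X^2+\sigma_Y^2) on the corresponding grid.

<syntaxhighlight lang="python">
import numpy as np
from scipy.signal import fftconvolve

# Hypothetical parameters for illustration
mu_x, sigma_x = 1.0, 0.7
mu_y, sigma_y = -0.5, 1.1

def normal_pdf(t, mu, sigma):
    return np.exp(-(t - mu)**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

# Sample both densities on a grid wide enough that their tails are negligible
t = np.arange(-15.0, 15.0, 0.001)
dt = t[1] - t[0]

# fftconvolve multiplies the transforms and inverts them, i.e. it applies the
# convolution theorem; the dt factor turns the discrete sum into an integral
fz = fftconvolve(normal_pdf(t, mu_x, sigma_x), normal_pdf(t, mu_y, sigma_y)) * dt

# Sample j of the full linear convolution corresponds to z = 2*t[0] + j*dt
z = 2 * t[0] + dt * np.arange(fz.size)
closed = normal_pdf(z, mu_x + mu_y, np.sqrt(sigma_x**2 + sigma_y**2))

print(np.max(np.abs(fz - closed)))   # should be very small
</syntaxhighlight>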


Geometric proof

First consider the normalized case when ''X'', ''Y'' ~ ''N''(0, 1), so that their PDFs are
:f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}
and
:g(y) = \frac{1}{\sqrt{2\pi}} e^{-y^2/2}.
Let ''Z'' = ''X'' + ''Y''. Then the CDF for ''Z'' will be
:z \mapsto \int_{x+y \leq z} f(x)g(y) \, dx \, dy.
This integral is over the half-plane which lies under the line ''x'' + ''y'' = ''z''. The key observation is that the function
:f(x)g(y) = \frac{1}{2\pi} e^{-(x^2+y^2)/2}
is radially symmetric. So we rotate the coordinate plane about the origin, choosing new coordinates x', y' such that the line ''x'' + ''y'' = ''z'' is described by the equation x' = c where c = c(z) is determined geometrically. Because of the radial symmetry, we have f(x)g(y) = f(x')g(y'), and the CDF for ''Z'' is
:\int_{x' \leq c} f(x')g(y') \, dx' \, dy'.
This is easy to integrate; we find that the CDF for ''Z'' is
:\int_{-\infty}^{c(z)} f(x') \, dx' = \Phi(c(z)).

To determine the value c(z), note that we rotated the plane so that the line ''x'' + ''y'' = ''z'' now runs vertically with ''x''-intercept equal to ''c''. So ''c'' is just the distance from the origin to the line ''x'' + ''y'' = ''z'' along the perpendicular bisector, which meets the line at its nearest point to the origin, in this case (z/2, z/2). So the distance is
:c = \sqrt{(z/2)^2 + (z/2)^2} = z/\sqrt{2},
and the CDF for ''Z'' is \Phi(z/\sqrt{2}), i.e., Z = X + Y \sim N(0, 2).

Now, if ''a'', ''b'' are any real constants (not both zero) then the probability that aX + bY \leq z is found by the same integral as above, but with the bounding line ax + by = z. The same rotation method works, and in this more general case we find that the closest point on the line to the origin is located a (signed) distance
:\frac{z}{\sqrt{a^2 + b^2}}
away, so that
:aX + bY \sim N(0, a^2 + b^2).
The same argument in higher dimensions shows that if
:X_i \sim N(0, \sigma_i^2), \qquad i = 1, \dots, n,
then
:X_1 + \cdots + X_n \sim N(0, \sigma_1^2 + \cdots + \sigma_n^2).

Now we are essentially done, because
:X \sim N(\mu, \sigma^2) \Leftrightarrow \frac{1}{\sigma}(X - \mu) \sim N(0, 1).
So in general, if
:X_i \sim N(\mu_i, \sigma_i^2), \qquad i = 1, \dots, n,
then
:\sum_{i=1}^n a_i X_i \sim N\left(\sum_{i=1}^n a_i \mu_i, \sum_{i=1}^n (a_i \sigma_i)^2 \right).
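The general linear-combination result at the end of this argument is easy to probe by simulation. The sketch below (NumPy, with hypothetical coefficients a_i, means \mu_i and standard deviations \sigma_i) compares the empirical mean and variance of \sum_i a_i X_i with \sum_i a_i \mu_i and \sum_i (a_i \sigma_i)^2.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical coefficients a_i, means mu_i and standard deviations sigma_i
a = np.array([2.0, -1.0, 0.5])
mu = np.array([1.0, 0.0, -4.0])
sigma = np.array([1.0, 3.0, 0.5])

n = 1_000_000
samples = rng.normal(mu, sigma, size=(n, 3))   # row k holds (X_1, X_2, X_3)
s = samples @ a                                # sum_i a_i X_i for every row

print(s.mean(), (a * mu).sum())                # empirical vs. sum_i a_i mu_i
print(s.var(), ((a * sigma)**2).sum())         # empirical vs. sum_i (a_i sigma_i)^2
</syntaxhighlight>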


Correlated random variables

In the event that the variables ''X'' and ''Y'' are jointly normally distributed random variables, then ''X'' + ''Y'' is still normally distributed (see Multivariate normal distribution) and the mean is the sum of the means. However, the variances are not additive due to the correlation. Indeed,
:\sigma_{X+Y} = \sqrt{\sigma_X^2 + \sigma_Y^2 + 2\rho\sigma_X\sigma_Y},
where ρ is the correlation. In particular, whenever ρ < 0, then the variance is less than the sum of the variances of ''X'' and ''Y''.

Extensions of this result can be made for more than two random variables, using the covariance matrix.
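A simulation sketch of the correlated case (NumPy, with illustrative parameters including a negative correlation) shows the standard deviation of ''X'' + ''Y'' matching \sqrt{\sigma_X^2 + \sigma_Y^2 + 2\rho\sigma_X\sigma_Y} rather than \sqrt{\sigma_X^2 + \sigma_Y^2}, while the means still add.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical parameters, including a negative correlation
mu_x, mu_y = 1.0, 2.0
sigma_x, sigma_y = 1.5, 0.5
rho = -0.6

cov = [[sigma_x**2,              rho * sigma_x * sigma_y],
       [rho * sigma_x * sigma_y, sigma_y**2]]

xy = rng.multivariate_normal([mu_x, mu_y], cov, size=1_000_000)
z = xy[:, 0] + xy[:, 1]

print(z.mean(), mu_x + mu_y)   # the means still add
print(z.std(), np.sqrt(sigma_x**2 + sigma_y**2 + 2 * rho * sigma_x * sigma_y))  # formula above
</syntaxhighlight>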


Proof

In this case (with ''X'' and ''Y'' having zero means), one needs to consider
:\frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}} \iint_{x\,y} \exp\left[-\frac{1}{2(1-\rho^2)}\left(\frac{x^2}{\sigma_X^2} + \frac{y^2}{\sigma_Y^2} - \frac{2\rho x y}{\sigma_X\sigma_Y}\right)\right]\delta(z - (x+y))\, \mathrm{d}x\,\mathrm{d}y.
As above, one makes the substitution y \rightarrow z - x. This integral is more complicated to simplify analytically, but can be done easily using a symbolic mathematics program. The probability distribution f_Z(z) is given in this case by
:f_Z(z) = \frac{1}{\sqrt{2\pi}\,\sigma_+}\exp\left(-\frac{z^2}{2\sigma_+^2}\right)
where
:\sigma_+ = \sqrt{\sigma_X^2 + \sigma_Y^2 + 2\rho\sigma_X\sigma_Y}.
If one considers instead ''Z'' = ''X'' − ''Y'', then one obtains
:f_Z(z) = \frac{1}{\sqrt{2\pi}\,\sigma_-}\exp\left(-\frac{z^2}{2\sigma_-^2}\right)
which also can be rewritten with
:\sigma_- = \sqrt{\sigma_X^2 + \sigma_Y^2 - 2\rho\sigma_X\sigma_Y}.
The standard deviations of each distribution are obvious by comparison with the standard normal distribution.
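Instead of a symbolic program, the integral after the substitution y \rightarrow z - x can also be checked numerically. The sketch below is a Riemann sum with NumPy, using hypothetical zero-mean parameters and an arbitrary evaluation point ''z'', and compares it with the closed form involving \sigma_+.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical zero-mean parameters and an arbitrary evaluation point
sigma_x, sigma_y, rho = 1.3, 0.8, -0.4
z = 0.9

# Joint density of (X, Y) evaluated at (x, z - x), integrated over x
x = np.linspace(-30.0, 30.0, 400_001)
dx = x[1] - x[0]
quad_form = (x**2 / sigma_x**2 + (z - x)**2 / sigma_y**2
             - 2 * rho * x * (z - x) / (sigma_x * sigma_y))
integrand = (np.exp(-quad_form / (2 * (1 - rho**2)))
             / (2 * np.pi * sigma_x * sigma_y * np.sqrt(1 - rho**2)))
numeric = integrand.sum() * dx

# Closed form with sigma_+ = sqrt(sigma_X^2 + sigma_Y^2 + 2 rho sigma_X sigma_Y)
sigma_plus = np.sqrt(sigma_x**2 + sigma_y**2 + 2 * rho * sigma_x * sigma_y)
closed = np.exp(-z**2 / (2 * sigma_plus**2)) / (np.sqrt(2 * np.pi) * sigma_plus)

print(numeric, closed)   # agree to many decimal places
</syntaxhighlight>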




See also

* Propagation of uncertainty
* Algebra of random variables
* Stable distribution
* Standard error (statistics)
* Ratio distribution
* Product distribution
* Slash distribution
* List of convolutions of probability distributions