In probability and statistics, the truncated normal distribution is the probability distribution derived from that of a normally distributed random variable by bounding the random variable from either below or above (or both). The truncated normal distribution has wide applications in statistics and

econometrics.

Definitions

Suppose

X

has a normal distribution with mean

\mu

and variance

\sigma^2

and lies within the interval

(a,b), \text{ with } \; -\infty \leq a < b \leq \infty

. Then

X

conditional on

a < X < b

has a truncated normal distribution. Its

probability density function

f

, for

a \leq x \leq b

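, is given by

f(x;\mu,\sigma,a,b) = \frac{1}{\sigma}\,\frac{\phi\left(\frac{x-\mu}{\sigma}\right)}{\Phi\left(\frac{b-\mu}{\sigma}\right)-\Phi\left(\frac{a-\mu}{\sigma}\right)}

and by

f=0

otherwise. Here,

\phi(\xi)=\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}\xi^2\right)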

is the probability density function of the

standard normal distribution

and

\Phi(\cdot)

is its cumulative distribution function

\Phi(x)=\frac{1}{2}\left(1+\operatorname{erf}(x/\sqrt{2})\right).

By definition, if

b=\infty

, then

\Phi\left(\tfrac{b-\mu}{\sigma}\right) = 1

, and similarly, if

a=-\infty

, then

\Phi\left(\tfrac{a-\mu}{\sigma}\right) = 0

. The above formulae show that when

-\infty < a < b < +\infty the scale parameter \sigma^2 of the truncated normal distribution is allowed to assume negative values. The parameter \sigma is in this case imaginary, but the function f is nevertheless real, positive, and normalizable. The scale parameter \sigma^2 of the untruncated normal distribution must be positive because the distribution would not be normalizable otherwise. The doubly truncated normal distribution, on the other hand, can in principle have a negative scale parameter (which is different from the variance, see summary formulae), because no such integrability problems arise on a bounded domain. In this case the distribution cannot be interpreted as an untruncated normal conditional on a < X < b, but it can still be interpreted as a maximum-entropy distribution with first and second moments as constraints. It also has a peculiar feature: it has ''two'' local maxima instead of one, located at x=a and x=b.
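The density defined at the start of this section can be transcribed almost directly into code. The following is a minimal Python sketch, assuming a real \sigma > 0 (the ordinary case, not the negative-\sigma^2 case discussed above); the function names are illustrative and do not come from any particular library. It evaluates the formula as written, so it makes no attempt to stay accurate when the interval lies far in a tail.

```python
import math


def std_normal_pdf(xi):
    """phi(xi): density of the standard normal distribution."""
    return math.exp(-0.5 * xi * xi) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x):
    """Phi(x) written with the error function; erf(+-inf) = +-1, so
    infinite truncation points give Phi = 1 or Phi = 0 as in the text."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def truncnorm_pdf(x, mu, sigma, a=-math.inf, b=math.inf):
    """Density of a normal(mu, sigma^2) variable conditioned on a < X < b.

    Illustrative helper (hypothetical name); assumes sigma > 0 and a < b.
    """
    if not (sigma > 0.0 and a < b):
        raise ValueError("need sigma > 0 and a < b")
    if x < a or x > b:
        return 0.0  # f = 0 outside the truncation interval
    # Normalizing constant Phi((b - mu)/sigma) - Phi((a - mu)/sigma).
    z = std_normal_cdf((b - mu) / sigma) - std_normal_cdf((a - mu) / sigma)
    return std_normal_pdf((x - mu) / sigma) / (sigma * z)


if __name__ == "__main__":
    # Density of a standard normal truncated to (-1, 2), evaluated at 0.
    print(truncnorm_pdf(0.0, mu=0.0, sigma=1.0, a=-1.0, b=2.0))
```

With the default arguments (no truncation) the function reduces to the ordinary normal density, which gives a quick sanity check.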

Properties

The truncated normal is the

maximum entropy probability distribution

for a fixed mean and variance, with the random variate ''X'' constrained to be in the interval [a,b]. Truncated normals with fixed support form an exponential family. Nielsen reported closed-form formulas for calculating the Kullback-Leibler divergence and the Bhattacharyya distance between two truncated normal distributions with the support of the first distribution nested into the support of the second distribution.

Moments

If the random variable has been truncated only from below, some probability mass has been shifted to higher values, giving a first-order stochastically dominating distribution and hence increasing the mean to a value higher than the mean

\mu

of the original normal distribution. Likewise, if the random variable has been truncated only from above, the truncated distribution has a mean less than

\mu.

Regardless of whether the random variable is bounded above, below, or both, the truncation is a

mean-preserving contraction

combined with a mean-changing rigid shift, and hence the variance of the truncated distribution is less than the variance

\sigma^2

of the original normal distribution.

Two sided truncation

Let

\alpha=(a-\mu)/\sigma

and

\beta=(b-\mu)/\sigma

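. Then

\operatorname{E}(X \mid a<X<b) = \mu - \sigma\frac{\phi(\beta)-\phi(\alpha)}{\Phi(\beta)-\Phi(\alpha)}

and

\operatorname{Var}(X \mid a<X<b) = \sigma^2\left[1 - \frac{\beta\phi(\beta)-\alpha\phi(\alpha)}{\Phi(\beta)-\Phi(\alpha)} - \left(\frac{\phi(\beta)-\phi(\alpha)}{\Phi(\beta)-\Phi(\alpha)}\right)^2\right]

Care must be taken in the numerical evaluation of these formulas, which can suffer from catastrophic cancellation when the interval [a,b] does not include \mu. There are better ways to rewrite them that avoid this issue.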

One sided truncation (of lower tail)

In this case

\; b=\infty, \; \phi(\beta)=0, \; \Phi(\beta)=1,

then

\operatorname{E}(X \mid X>a) = \mu +\sigma \phi(\alpha)/Z

and

\operatorname{Var}(X \mid X>a) = \sigma^2\left[1 + \alpha \phi(\alpha)/Z - (\phi(\alpha)/Z)^2\right]

where

Z=1-\Phi(\alpha).

One sided truncation (of upper tail)

In this case

\; a=\alpha=-\infty, \; \phi(\alpha)=0, \; \Phi(\alpha)=0,

then

\operatorname{E}(X \mid X<b) = \mu - \sigma\frac{\phi(\beta)}{\Phi(\beta)}

and

\operatorname{Var}(X \mid X<b) = \sigma^2\left[1 - \beta\frac{\phi(\beta)}{\Phi(\beta)} - \left(\frac{\phi(\beta)}{\Phi(\beta)}\right)^2\right]

Barr and Sherrill (1999) give a simpler expression for the variance of one-sided truncations. Their formula is in terms of the chi-square CDF, which is implemented in standard software libraries. Bebu and Mathew (2009) provide formulas for (generalized) confidence intervals around the truncated moments.
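The one-sided formulas for the lower and upper tail can be sketched in Python as follows. The function names are hypothetical; the code relies on `statistics.NormalDist` from the Python standard library (Python 3.8 or later) for \phi and \Phi, and it evaluates the textbook expressions directly, so it is not intended for truncation points far in the tail. It does not implement the chi-square form of Barr and Sherrill.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal: mean 0, standard deviation 1


def lower_truncated_moments(mu, sigma, a):
    """Mean and variance of normal(mu, sigma^2) conditioned on X > a."""
    alpha = (a - mu) / sigma
    lam = _STD.pdf(alpha) / (1.0 - _STD.cdf(alpha))  # phi(alpha) / Z
    mean = mu + sigma * lam
    var = sigma ** 2 * (1.0 + alpha * lam - lam ** 2)
    return mean, var


def upper_truncated_moments(mu, sigma, b):
    """Mean and variance of normal(mu, sigma^2) conditioned on X < b."""
    beta = (b - mu) / sigma
    lam = _STD.pdf(beta) / _STD.cdf(beta)  # phi(beta) / Phi(beta)
    mean = mu - sigma * lam
    var = sigma ** 2 * (1.0 - beta * lam - lam ** 2)
    return mean, var


if __name__ == "__main__":
    # Standard normal truncated to X > 0 (the half-normal case).
    print(lower_truncated_moments(0.0, 1.0, 0.0))
    # Standard normal truncated to X < 0, the mirror image.
    print(upper_truncated_moments(0.0, 1.0, 0.0))
```

The two functions are mirror images of each other: truncating X from above at b is the same as truncating -X from below at -b, so the means differ only in sign and the variances agree.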

A recursive formula

As for the non-truncated case, there is a recursive formula for the truncated moments.

Multivariate

Computing the moments of a multivariate truncated normal is harder.

Generating values from the truncated normal distribution

A random variate

x

defined as

x = \Phi^{-1}( \Phi(\alpha) + U\cdot(\Phi(\beta)-\Phi(\alpha)))\sigma + \mu

with

\Phi

the cumulative distribution function and

\Phi^{-1}

its inverse,

U

a uniform random number on

(0, 1)

, follows the distribution truncated to the range

(a, b)

. This is simply the

inverse transform method

for simulating random variables. Although it is one of the simplest methods, it can either fail when sampling in the tail of the normal distribution or be much too slow. Thus, in practice, one has to find alternative methods of simulation.

One such truncated normal generator (implemented in Matlab and in R as trandn.R) is based on an acceptance-rejection idea due to Marsaglia. Despite the slightly suboptimal acceptance rate of Marsaglia (1964) in comparison with Robert (1995), Marsaglia's method is typically faster, because it does not require the costly numerical evaluation of the exponential function.

For more on simulating a draw from the truncated normal distribution, see Robert (1995), Lynch (2007) Section 8.1.3 (pages 200–206), and Devroye (1986). R packages, including truncnorm, provide functions to draw from a truncated normal.

Chopin (2011, available on arXiv) proposed an algorithm inspired by the Ziggurat algorithm of Marsaglia and Tsang (1984, 2000), which is usually considered the fastest Gaussian sampler, and is also very close to Ahrens's algorithm (1995). Implementations can be found in C, C++, Matlab and Python.

Sampling from the ''multivariate'' truncated normal distribution is considerably more difficult. Exact or perfect simulation is only feasible in the case of truncation of the normal distribution to a polytope region. In more general cases, Damien and Walker (2001) introduce a general methodology for sampling truncated densities within a Gibbs sampling framework. Their algorithm introduces one latent variable and, within a Gibbs sampling framework, is more computationally efficient than the algorithm of Robert (1995).
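The inverse transform method described at the start of this section can be sketched in Python with the standard library alone. This is an illustrative sketch, not one of the generators cited above: the function name is hypothetical, `statistics.NormalDist` (Python 3.8 or later) supplies \Phi and \Phi^{-1}, and the code deliberately raises an error in the tail case where the method breaks down.

```python
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal: Phi is .cdf, Phi^{-1} is .inv_cdf


def sample_truncnorm_inverse(mu, sigma, a, b, rng=random):
    """Draw one value from normal(mu, sigma^2) truncated to (a, b)
    by the inverse transform method.

    Illustrative helper (hypothetical name). `rng` is any object with a
    random() method returning floats in [0, 1), such as random.Random.
    """
    alpha = (a - mu) / sigma
    beta = (b - mu) / sigma
    lo = _STD.cdf(alpha)
    hi = _STD.cdf(beta)
    if not hi > lo:
        # In floating point Phi(alpha) and Phi(beta) coincide, which
        # happens when the interval lies far in a tail.
        raise ValueError("inverse transform fails: interval too far in the tail")
    while True:
        u = rng.random()
        p = lo + u * (hi - lo)
        if 0.0 < p < 1.0:  # inv_cdf is only defined on the open interval
            break
    return mu + sigma * _STD.inv_cdf(p)


if __name__ == "__main__":
    rng = random.Random(12345)  # fixed seed for reproducibility
    draws = [sample_truncnorm_inverse(0.0, 1.0, -1.0, 2.0, rng) for _ in range(5)]
    print(draws)
```

For a standard normal truncated to (10, 11), both CDF values round to 1 in double precision and the sketch raises `ValueError`, which illustrates the tail failure mentioned above and the reason rejection-based generators are preferred there.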


References

* Norman L. Johnson and Samuel Kotz (1970). ''Continuous univariate distributions-1'', chapter 13. John Wiley & Sons.
* Nicolas Chopin (2011). "Fast simulation of truncated Gaussian distributions". ''Statistics and Computing'' 21(2): 275–288. doi:10.1007/s11222-009-9168-1