The folded normal distribution is a probability distribution related to the normal distribution. Given a normally distributed random variable ''X'' with mean ''μ'' and variance ''σ''2, the random variable ''Y'' = |''X''| has a folded normal distribution. Such a case may be encountered if only the magnitude of some variable is recorded, but not its sign. The distribution is called "folded" because probability mass to the left of ''x'' = 0 is folded over by taking the absolute value. In the physics of heat conduction, the folded normal distribution is a fundamental solution of the heat equation on the half space; it corresponds to having a perfect insulator on a hyperplane through the origin.


Definitions


Density

The probability density function (PDF) is given by

: f_Y(x;\mu,\sigma^2)= \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} + \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x+\mu)^2}{2\sigma^2}}

for ''x'' ≥ 0, and 0 everywhere else. An alternative formulation is given by

: f(x)=\sqrt{\frac{2}{\pi\sigma^2}} \, e^{-\frac{x^2+\mu^2}{2\sigma^2}}\cosh\left(\frac{\mu x}{\sigma^2}\right),

where cosh is the hyperbolic cosine function. It follows that the cumulative distribution function (CDF) is given by

: F_Y(x; \mu, \sigma^2) = \frac12\left[\operatorname{erf}\left(\frac{x+\mu}{\sqrt{2\sigma^2}}\right) + \operatorname{erf}\left(\frac{x-\mu}{\sqrt{2\sigma^2}}\right)\right]

for ''x'' ≥ 0, where erf() is the error function. This expression reduces to the CDF of the half-normal distribution when ''μ'' = 0. The mean of the folded distribution is then

: \mu_Y = \sigma \sqrt{\frac{2}{\pi}} \, \exp\left(\frac{-\mu^2}{2\sigma^2}\right) + \mu \, \operatorname{erf}\left(\frac{\mu}{\sqrt{2\sigma^2}}\right)

or

: \mu_Y = \sqrt{\frac{2}{\pi}}\,\sigma e^{-\frac{\mu^2}{2\sigma^2}}+\mu\left[1-2\Phi\left(-\frac{\mu}{\sigma}\right) \right],

where \Phi is the normal cumulative distribution function:

: \Phi(x) = \frac12\left[1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right].

The variance is then expressed easily in terms of the mean:

: \sigma_Y^2 = \mu^2 + \sigma^2 - \mu_Y^2.

Both the mean (''μ'') and variance (''σ''2) of ''X'' in the original normal distribution can be interpreted as the location and scale parameters of ''Y'' in the folded distribution.


Properties


Mode

The mode of the distribution is the value of x for which the density is maximised. To find this value, we take the first derivative of the density with respect to x and set it equal to zero. Unfortunately, there is no closed form. We can, however, write the derivative in a better way and end up with a non-linear equation:

: \frac{\partial f(x)}{\partial x}=0 \Rightarrow -\frac{x-\mu}{\sigma^2}e^{-\frac{(x-\mu)^2}{2\sigma^2}}- \frac{x+\mu}{\sigma^2}e^{-\frac{(x+\mu)^2}{2\sigma^2}}=0

: x\left[e^{-\frac{(x-\mu)^2}{2\sigma^2}}+e^{-\frac{(x+\mu)^2}{2\sigma^2}}\right]- \mu \left[e^{-\frac{(x-\mu)^2}{2\sigma^2}}-e^{-\frac{(x+\mu)^2}{2\sigma^2}}\right]=0

: x\left(1+e^{-\frac{2\mu x}{\sigma^2}}\right)-\mu\left(1-e^{-\frac{2\mu x}{\sigma^2}}\right)=0

: \left(\mu+x\right)e^{-\frac{2\mu x}{\sigma^2}}=\mu-x

: x=-\frac{\sigma^2}{2\mu}\log\frac{\mu-x}{\mu+x}.

Tsagris et al. (2014) found from numerical investigation that when \mu<\sigma , the maximum is attained at x=0 , and when \mu becomes greater than 3\sigma , the maximum approaches \mu . This is to be expected, since in this case the folded normal converges to the normal distribution. To avoid any trouble with negative variances during estimation, exponentiation of the variance parameter is suggested. Alternatively, a constraint can be added, such that if the optimiser proposes a negative variance, the value of the log-likelihood is returned as NA or something very small.
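The non-linear equation above has the trivial stationary point x = 0 and, when \mu>\sigma , a second root near \mu that is the mode. A short illustrative sketch (not from the source) brackets that root of (\mu+x)e^{-2\mu x/\sigma^2}=\mu-x by bisection on (0, \mu), falling back to 0 when \mu\le\sigma per the observation of Tsagris et al.:

```python
import math

def folded_pdf(x, mu, sigma):
    """Folded normal density for x >= 0."""
    c = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return c * (math.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2))
                + math.exp(-(x + mu) ** 2 / (2.0 * sigma ** 2)))

def folded_mode(mu, sigma):
    """Solve (mu + x) exp(-2 mu x / sigma^2) = mu - x on (0, mu) by bisection."""
    mu = abs(mu)            # the density depends on mu only through |mu|
    if mu <= sigma:         # density is maximised at zero in this regime
        return 0.0
    g = lambda x: (mu + x) * math.exp(-2.0 * mu * x / sigma ** 2) - (mu - x)
    lo, hi = 1e-12, mu      # g(lo) < 0 < g(hi) whenever mu > sigma
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For \mu=3\sigma the computed mode is already indistinguishable from \mu , matching the numerical observation above.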


Characteristic function and other related functions

* The characteristic function is given by

: \varphi_x\left(t\right)=e^{-\frac{\sigma^2t^2}{2}+i\mu t}\Phi\left(\frac{\mu}{\sigma}+i\sigma t \right) + e^{-\frac{\sigma^2t^2}{2}-i\mu t}\Phi\left(-\frac{\mu}{\sigma}+i\sigma t \right).

* The moment generating function is given by

: M_x\left(t\right)=\varphi_x\left(-it\right)=e^{\frac{\sigma^2t^2}{2}+\mu t}\Phi\left(\frac{\mu}{\sigma}+\sigma t \right) + e^{\frac{\sigma^2t^2}{2}-\mu t}\Phi\left(-\frac{\mu}{\sigma}+\sigma t \right).

* The cumulant generating function is given by

: K_x\left(t\right)=\log M_x\left(t\right)= \left(\frac{\sigma^2t^2}{2}+\mu t\right) + \log\left[\Phi\left(\frac{\mu}{\sigma}+\sigma t\right)+e^{-2\mu t}\Phi\left(-\frac{\mu}{\sigma}+\sigma t\right)\right].

* The Laplace transformation is given by

: E\left(e^{-tx}\right)=e^{\frac{\sigma^2t^2}{2}-\mu t}\left[1 -\Phi\left(-\frac{\mu}{\sigma}+\sigma t \right) \right] + e^{\frac{\sigma^2t^2}{2}+\mu t}\left[1 -\Phi\left(\frac{\mu}{\sigma}+\sigma t \right) \right].

* The Fourier transform is given by

: \hat{f}\left(t\right)=\varphi_x\left(-2\pi t\right)= e^{-2\pi^2\sigma^2t^2-i2\pi\mu t}\left[1 -\Phi\left(-\frac{\mu}{\sigma}-i2\pi \sigma t \right) \right] + e^{-2\pi^2\sigma^2t^2+i2\pi\mu t}\left[1 -\Phi\left(\frac{\mu}{\sigma}-i2\pi \sigma t \right) \right].
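The moment generating function can be checked numerically against the mean formula from the Density section, since M'(0) = E(Y) and M(0) = 1. An illustrative sketch (Φ implemented via the error function; function names are assumptions):

```python
import math

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def folded_mgf(t, mu, sigma):
    """M(t) = e^{s^2 t^2/2 + mu t} Phi(mu/s + s t) + e^{s^2 t^2/2 - mu t} Phi(-mu/s + s t)."""
    a = math.exp(0.5 * sigma ** 2 * t ** 2 + mu * t) * Phi(mu / sigma + sigma * t)
    b = math.exp(0.5 * sigma ** 2 * t ** 2 - mu * t) * Phi(-mu / sigma + sigma * t)
    return a + b

def folded_mean(mu, sigma):
    """Closed-form mean of the folded normal, for comparison."""
    return (sigma * math.sqrt(2.0 / math.pi) * math.exp(-mu ** 2 / (2.0 * sigma ** 2))
            + mu * math.erf(mu / (sigma * math.sqrt(2.0))))

# a central difference of M at zero should recover E(Y)
h = 1e-5
deriv = (folded_mgf(h, 1.0, 2.0) - folded_mgf(-h, 1.0, 2.0)) / (2.0 * h)
```

M(0) = Φ(μ/σ) + Φ(−μ/σ) = 1, a useful quick sanity check on the signs in the formula.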


Related distributions

* When \mu=0 , the distribution of Y=|X| is a half-normal distribution.
* The random variable \left(Y/\sigma\right)^2 has a noncentral chi-squared distribution with 1 degree of freedom and noncentrality equal to \left(\mu/\sigma\right)^2 .
* The folded normal distribution can also be seen as the limit of the folded non-standardized t distribution as the degrees of freedom go to infinity.
* There is a bivariate version developed by Psarakis and Panaretos (2001) as well as a multivariate version developed by Chakraborty and Chatterjee (2013).
* The Rice distribution is a multivariate generalization of the folded normal distribution.
* The modified half-normal distribution (MHN) is a related family with pdf on (0, \infty) given as f(x)= \frac{2\beta^{\frac{\alpha}{2}} x^{\alpha-1} \exp(-\beta x^2+ \gamma x )}{\Psi\left(\frac{\alpha}{2}, \frac{\gamma}{\sqrt{\beta}}\right)}, where \Psi\left(\alpha,z\right)={}_1\Psi_1\left(\begin{matrix}\left(\alpha,\frac{1}{2}\right)\\(1,0)\end{matrix};z \right) denotes the Fox–Wright Psi function.
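The noncentral chi-squared connection is easy to verify by simulation: if Y = |X| with X ~ N(\mu, \sigma^2), then E[(Y/\sigma)^2] should equal the noncentral chi-squared mean k+\lambda = 1+(\mu/\sigma)^2. A small Monte Carlo sketch (parameter values are illustrative):

```python
import random

random.seed(1)
mu, sigma, n = 2.0, 1.0, 200000
# (Y/sigma)^2 follows a noncentral chi-squared law, 1 d.f., noncentrality (mu/sigma)^2
samples = [(abs(random.gauss(mu, sigma)) / sigma) ** 2 for _ in range(n)]
empirical_mean = sum(samples) / n
expected_mean = 1.0 + (mu / sigma) ** 2   # k + lambda with k = 1
```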


Statistical inference


Estimation of parameters

There are a few ways of estimating the parameters of the folded normal. All of them are essentially the maximum likelihood estimation procedure, but in some cases a numerical maximisation is performed, whereas in other cases the root of an equation is searched. The log-likelihood of the folded normal, when a sample x_i of size n is available, can be written in the following way:

: l = -\frac{n}{2}\log\left(2\pi\sigma^2\right)+\sum_{i=1}^n\log\left[e^{-\frac{\left(x_i-\mu\right)^2}{2\sigma^2}}+e^{-\frac{\left(x_i+\mu\right)^2}{2\sigma^2}}\right]

: l = -\frac{n}{2}\log\left(2\pi\sigma^2\right)+\sum_{i=1}^n\log\left[e^{-\frac{\left(x_i-\mu\right)^2}{2\sigma^2}}\left(1+e^{-\frac{2\mu x_i}{\sigma^2}}\right)\right]

: l = -\frac{n}{2}\log\left(2\pi\sigma^2\right)-\sum_{i=1}^n\frac{\left(x_i-\mu\right)^2}{2\sigma^2}+\sum_{i=1}^n\log\left(1+e^{-\frac{2\mu x_i}{\sigma^2}}\right)

In R (programming language), using the package Rfast, one can obtain the MLE very quickly (command foldnorm.mle). Alternatively, the command optim or nlm will fit this distribution. The maximisation is easy, since only two parameters (\mu and \sigma^2) are involved. Note that both positive and negative values for \mu are acceptable, since \mu belongs to the real line of numbers; hence the sign is not important, because the distribution is symmetric with respect to it. The partial derivatives of the log-likelihood are written as

: \frac{\partial l}{\partial \mu} = \frac{\sum_{i=1}^n\left(x_i-\mu\right)}{\sigma^2}- \frac{2}{\sigma^2}\sum_{i=1}^n\frac{x_ie^{-\frac{2\mu x_i}{\sigma^2}}}{1+e^{-\frac{2\mu x_i}{\sigma^2}}} = \frac{\sum_{i=1}^n\left(x_i-\mu\right)}{\sigma^2}-\frac{2}{\sigma^2}\sum_{i=1}^n\frac{x_i}{1+e^{\frac{2\mu x_i}{\sigma^2}}}

and

: \frac{\partial l}{\partial \sigma^2} = -\frac{n}{2\sigma^2}+\frac{\sum_{i=1}^n\left(x_i-\mu\right)^2}{2\sigma^4}+ \frac{2\mu}{\sigma^4}\sum_{i=1}^n\frac{x_ie^{-\frac{2\mu x_i}{\sigma^2}}}{1+e^{-\frac{2\mu x_i}{\sigma^2}}} = -\frac{n}{2\sigma^2}+\frac{\sum_{i=1}^n\left(x_i-\mu\right)^2}{2\sigma^4}+ \frac{2\mu}{\sigma^4}\sum_{i=1}^n\frac{x_i}{1+e^{\frac{2\mu x_i}{\sigma^2}}}.

By equating the first partial derivative of the log-likelihood to zero, we obtain a nice relationship:

: \sum_{i=1}^n\frac{x_i}{1+e^{\frac{2\mu x_i}{\sigma^2}}}=\frac{\sum_{i=1}^nx_i-n\mu}{2}.

Note that the above equation has three solutions, one at zero and two more with the opposite sign. By substituting the above equation into the partial derivative of the log-likelihood w.r.t. \sigma^2 and equating it to zero, we get the following expression for the variance:

: \sigma^2=\frac{\sum_{i=1}^n\left(x_i-\mu\right)^2}{n}+\frac{2\mu\left(\sum_{i=1}^nx_i-n\mu\right)}{n}=\frac{\sum_{i=1}^nx_i^2}{n}-\mu^2,

which is the same formula as in the normal distribution. A main difference here is that \mu and \sigma^2 are not statistically independent. The above relationships can be used to obtain maximum likelihood estimates in an efficient recursive way. We start with an initial value for \sigma^2 and find the positive root (\mu) of the last equation. Then, we get an updated value of \sigma^2. The procedure is repeated until the change in the log-likelihood value is negligible. Another easier and more efficient way is to perform a search algorithm. Let us write the last equation in a more elegant way:

: 2\sum_{i=1}^n\frac{x_i}{1+e^{\frac{2\mu x_i}{\sigma^2}}}- \sum_{i=1}^nx_i+n\mu = 0

: \sum_{i=1}^nx_i\frac{1-e^{\frac{2\mu x_i}{\sigma^2}}}{1+e^{\frac{2\mu x_i}{\sigma^2}}}+n\mu = 0 \Rightarrow -\sum_{i=1}^nx_i\tanh\left(\frac{\mu x_i}{\sigma^2}\right)+n\mu = 0.

It becomes clear that the optimisation of the log-likelihood with respect to the two parameters has turned into a root search of a function. This is of course identical to the previous root search. Tsagris et al. (2014) spotted that there are three roots to this equation for \mu , i.e. three possible values of \mu that satisfy this equation: -\mu and +\mu , which are the maximum likelihood estimates, and 0, which corresponds to the minimum log-likelihood.
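The estimation described above can be sketched numerically. Because \sigma^2=\sum x_i^2/n-\mu^2 holds exactly at the optimum, the log-likelihood can be maximised along that curve in \mu alone, which passes through the joint MLE. The sketch below (not the Rfast implementation; it assumes the restricted log-likelihood is unimodal on [0, \sqrt{\sum x_i^2/n}), which holds for data like these) uses a golden-section search:

```python
import math
import random

def fnorm_loglik(x, mu, s2):
    """l = -n/2 log(2 pi s2) - sum (xi-mu)^2/(2 s2) + sum log(1 + e^{-2 mu xi / s2})."""
    ll = -0.5 * len(x) * math.log(2.0 * math.pi * s2)
    for v in x:
        ll += -(v - mu) ** 2 / (2.0 * s2) + math.log1p(math.exp(-2.0 * mu * v / s2))
    return ll

def fnorm_mle(x):
    """Golden-section search for mu on the curve s2 = mean(x^2) - mu^2."""
    m2 = sum(v * v for v in x) / len(x)
    f = lambda mu: fnorm_loglik(x, mu, m2 - mu * mu)
    a, b = 0.0, math.sqrt(m2) * (1.0 - 1e-9)   # s2 must stay positive
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > 1e-9:
        if fc > fd:                 # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = f(c)
        else:                       # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = f(d)
    mu = 0.5 * (a + b)
    return mu, m2 - mu * mu

# fit on simulated folded-normal data, Y = |X| with X ~ N(2, 1)
random.seed(7)
ys = [abs(random.gauss(2.0, 1.0)) for _ in range(50000)]
mu_hat, s2_hat = fnorm_mle(ys)
```

Only the positive root is searched for; by the symmetry noted above, -\hat\mu is an equally valid estimate.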


See also

* Folded cumulative distribution
* Half-normal distribution
* Modified half-normal distribution
* Truncated normal distribution




External links


Random (formerly Virtual Laboratories): The Folded Normal Distribution