In statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is
: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$
The parameter $\mu$ is the mean or expectation of the distribution (and also its median and mode), while the parameter $\sigma$ is its standard deviation. The variance of the distribution is $\sigma^2$. A random variable with a Gaussian distribution is said to be normally distributed, and is called a normal deviate.
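As an illustration, the density can be evaluated directly from the standard form $\frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$. The following is a minimal sketch; the function name `normal_pdf` and its argument names are illustrative choices, not notation from the article.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density at x of a normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma  # distance from the mean in units of sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# The density is highest at the mean and symmetric around it.
peak = normal_pdf(2.0, mu=2.0, sigma=0.5)
left = normal_pdf(1.5, mu=2.0, sigma=0.5)   # one standard deviation below the mean
right = normal_pdf(2.5, mu=2.0, sigma=0.5)  # one standard deviation above the mean
```

Here `left` and `right` coincide because the density depends on `x` only through the squared distance from the mean.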
Normal distributions are important in statistics and are often used in the natural and social sciences to represent real-valued random variables whose distributions are not known. Their importance is partly due to the central limit theorem. It states that, under some conditions, the average of many samples (observations) of a random variable with finite mean and variance is itself a random variable whose distribution converges to a normal distribution as the number of samples increases. Therefore, physical quantities that are expected to be the sum of many independent processes, such as measurement errors, often have distributions that are nearly normal.
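The effect of the central limit theorem can be seen in a small simulation. The sketch below assumes draws from a uniform distribution on [0, 1] (mean 1/2, variance 1/12) as the underlying random variable; the function name, sample sizes, and seed are arbitrary illustrative choices.

```python
import random
import statistics

def sample_means(n_per_mean, n_means, seed=0):
    """Return n_means averages, each taken over n_per_mean uniform(0, 1) draws."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.random() for _ in range(n_per_mean)])
        for _ in range(n_means)
    ]

means = sample_means(n_per_mean=50, n_means=5000)
center = statistics.fmean(means)   # should be close to 1/2
spread = statistics.stdev(means)   # should be close to sqrt((1/12) / 50)

# For a normal distribution, about 68% of values lie within one
# standard deviation of the mean; the averages should come close to that.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)
```

Although the individual draws are uniform rather than bell-shaped, the averages cluster around 1/2 with roughly the fraction within one standard deviation that a normal distribution would give.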
Moreover, Gaussian distributions have some unique properties that are valuable in analytic studies. For instance, any linear combination of a fixed collection of independent normal deviates is a normal deviate. Many results and methods, such as propagation of uncertainty and least squares parameter fitting, can be derived analytically in explicit form when the relevant variables are normally distributed.
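For independent normal deviates, the parameters of a linear combination follow from two rules: the means combine linearly, and the variances add after scaling by the squared coefficients. The sketch below assumes independence throughout; the helper name `combine_independent_normals` is hypothetical.

```python
import math

def combine_independent_normals(terms):
    """Mean and standard deviation of sum(a_i * X_i) for independent X_i ~ N(mu_i, sigma_i^2).

    terms is a list of (coefficient, mean, standard deviation) tuples.
    """
    mean = sum(a * mu for a, mu, _ in terms)
    variance = sum((a * sigma) ** 2 for a, _, sigma in terms)
    return mean, math.sqrt(variance)

# 2*X - Y with X ~ N(1, 3^2) and Y ~ N(4, 4^2), assumed independent:
# mean = 2*1 - 4 = -2, variance = (2*3)^2 + (1*4)^2 = 52
mean, sd = combine_independent_normals([(2.0, 1.0, 3.0), (-1.0, 4.0, 4.0)])
```

This is the simplest case of propagation of uncertainty; correlated variables would need covariance terms that this sketch omits.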
A normal distribution is sometimes informally called a bell curve. However, many other distributions are bell-shaped (such as the Cauchy, Student's ''t'', and logistic distributions). For other names, see Naming.
The univariate probability distribution is generalized for vectors in the multivariate normal distribution and for matrices in the matrix normal distribution.
Definitions
Standard normal distribution
The simplest case of a normal distribution is known as the ''standard normal distribution'' or ''unit normal distribution''. This is a special case when $\mu = 0$ and $\sigma = 1$, and it is described by this probability density function (or density):
: $\varphi(z) = \frac{e^{-z^2/2}}{\sqrt{2\pi}}$
The variable $z$ has a mean of 0 and a variance and standard deviation of 1. The density $\varphi(z)$ has its peak $1/\sqrt{2\pi}$ at $z = 0$ and inflection points at $z = +1$ and $z = -1$.
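The location of the peak and of the inflection points can be checked by differentiating the standard normal density. The short derivation below is a sketch that assumes the usual notation $\varphi(z) = e^{-z^2/2}/\sqrt{2\pi}$.

```latex
\begin{align*}
\varphi(z)   &= \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}, \\
\varphi'(z)  &= -z\,\varphi(z), \\
\varphi''(z) &= (z^2 - 1)\,\varphi(z).
\end{align*}
```

Since $\varphi(z) > 0$ everywhere, the first derivative vanishes only at $z = 0$, where $\varphi(0) = 1/\sqrt{2\pi} \approx 0.3989$, and the second derivative changes sign exactly at $z = \pm 1$.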
Although the density above is most commonly known as the ''standard normal,'' a few authors have used that term to describe other versions of the normal distribution. Carl Friedrich Gauss, for example, once defined the standard normal as
: $\varphi(z) = \frac{e^{-z^2}}{\sqrt{\pi}}$
which has a variance of 1/2, and Stephen Stigler once defined the standard normal as
: $\varphi(z) = e^{-\pi z^2}$
which has a simple functional form and a variance of $\sigma^2 = 1/(2\pi)$.
General normal distribution
Every normal distribution is a version of the standard normal distribution, whose domain has been stretched by a factor $\sigma$ (the standard deviation) and then translated by $\mu$ (the mean value):
: $f(x \mid \mu, \sigma^2) = \frac{1}{\sigma} \varphi\left(\frac{x-\mu}{\sigma}\right)$
The probability density must be scaled by $1/\sigma$ so that the integral is still 1.
If $Z$ is a standard normal deviate, then $X = \sigma Z + \mu$ will have a normal distribution with expected value $\mu$ and standard deviation $\sigma$. This is equivalent to saying that the "standard" normal distribution $Z$ can be scaled/stretched by a factor of $\sigma$ and shifted by $\mu$ to yield a different normal distribution, called $X$. Conversely, if $X$ is a normal deviate with parameters $\mu$ and $\sigma^2$, then this $X$ distribution can be re-scaled and shifted via the formula $Z = (X - \mu)/\sigma$ to convert it to the "standard" normal distribution. This variate is also called the standardized form of $X$.
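The two conversions, $X = \sigma Z + \mu$ and $Z = (X - \mu)/\sigma$, can be sketched in a few lines. The function names and the example parameters (mean 100, standard deviation 15) are illustrative assumptions; `random.gauss` from the Python standard library supplies the standard normal deviates.

```python
import random
import statistics

def from_standard(z, mu, sigma):
    """Scale a standard normal deviate by sigma and shift it by mu."""
    return sigma * z + mu

def to_standard(x, mu, sigma):
    """Standardize a normal deviate with mean mu and standard deviation sigma."""
    return (x - mu) / sigma

rng = random.Random(1)
zs = [rng.gauss(0.0, 1.0) for _ in range(20000)]          # standard normal deviates
xs = [from_standard(z, mu=100.0, sigma=15.0) for z in zs]

sample_mean = statistics.fmean(xs)   # close to 100
sample_sd = statistics.stdev(xs)     # close to 15

# Standardizing the transformed values recovers the original deviates
# up to floating-point rounding.
round_trip_error = max(abs(to_standard(x, 100.0, 15.0) - z) for x, z in zip(xs, zs))
```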
Notation
The probability density of the standard Gaussian distribution (standard normal distribution, with zero mean and unit variance) is often denoted with the Greek letter $\phi$ (phi). The alternative form of the Greek letter phi, $\varphi$, is also used quite often.
The normal distribution is often referred to as $N(\mu, \sigma^2)$ or $\mathcal{N}(\mu, \sigma^2)$. Thus when a random variable $X$ is normally distributed with mean $\mu$ and standard deviation $\sigma$, one may write
: $X \sim \mathcal{N}(\mu, \sigma^2)$
Alternative parameterizations
Some authors advocate using the precision $\tau$ as the parameter defining the width of the distribution, instead of the deviation $\sigma$ or the variance $\sigma^2$. The precision is normally defined as the reciprocal of the variance, $1/\sigma^2$. The formula for the distribution then becomes
: $f(x) = \sqrt{\frac{\tau}{2\pi}}\, e^{-\tau (x-\mu)^2/2}$
This choice is claimed to have advantages in numerical computations when $\sigma$ is very close to zero, and simplifies formulas in some contexts, such as in the Bayesian inference of variables with multivariate normal distribution.
Alternatively, the reciprocal of the standard deviation $\tau' = 1/\sigma$ might be defined as the ''precision'', in which case the expression of the normal distribution becomes
: $f(x) = \frac{\tau'}{\sqrt{2\pi}}\, e^{-(\tau')^2 (x-\mu)^2/2}$
According to Stigler, this formulation is advantageous because of a much simpler and easier-to-remember formula, and simple approximate formulas for the quantiles of the distribution.
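The parameterizations describe the same density, which a quick numerical comparison can confirm. In this sketch `tau` stands for the reciprocal of the variance and `tau_prime` for the reciprocal of the standard deviation; the function names and test values are illustrative assumptions.

```python
import math

def normal_pdf_sigma(x, mu, sigma):
    """Density parameterized by the standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def normal_pdf_precision(x, mu, tau):
    """Density parameterized by the precision tau = 1 / sigma**2."""
    return math.sqrt(tau / (2.0 * math.pi)) * math.exp(-0.5 * tau * (x - mu) ** 2)

def normal_pdf_inverse_sd(x, mu, tau_prime):
    """Density parameterized by tau_prime = 1 / sigma."""
    return tau_prime / math.sqrt(2.0 * math.pi) * math.exp(-0.5 * (tau_prime * (x - mu)) ** 2)

sigma = 0.25
values = (
    normal_pdf_sigma(1.1, 1.0, sigma),
    normal_pdf_precision(1.1, 1.0, 1.0 / sigma ** 2),
    normal_pdf_inverse_sd(1.1, 1.0, 1.0 / sigma),
)
# All three entries agree up to floating-point rounding.
```

Note that the precision forms never divide by `sigma`, which is one way to read the claimed numerical advantage when the standard deviation is very small.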
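Normal distributions form an exponential family with natural parameters $\theta_1 = \frac{\mu}{\sigma^2}$ and $\theta_2 = \frac{-1}{2\sigma^2}$, and natural statistics $x$ and $x^2$. The dual expectation parameters for the normal distribution are $\eta_1 = \mu$ and $\eta_2 = \mu^2 + \sigma^2$.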
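Converting between the usual parameters and the exponential-family parameters is a matter of simple algebra. The sketch below assumes the standard definitions $\theta_1 = \mu/\sigma^2$, $\theta_2 = -1/(2\sigma^2)$, $\eta_1 = \mu$, and $\eta_2 = \mu^2 + \sigma^2$; the function names are hypothetical.

```python
def to_natural(mu, sigma):
    """Natural parameters (theta1, theta2) for mean mu and standard deviation sigma."""
    variance = sigma ** 2
    return mu / variance, -1.0 / (2.0 * variance)

def from_natural(theta1, theta2):
    """Recover (mu, sigma) from the natural parameters; theta2 must be negative."""
    if theta2 >= 0:
        raise ValueError("theta2 must be negative")
    variance = -1.0 / (2.0 * theta2)
    return theta1 * variance, variance ** 0.5

def expectation_parameters(mu, sigma):
    """Expected values of the natural statistics x and x**2."""
    return mu, mu ** 2 + sigma ** 2

theta1, theta2 = to_natural(3.0, 2.0)      # (0.75, -0.125)
mu, sigma = from_natural(theta1, theta2)   # back to (3.0, 2.0)
eta1, eta2 = expectation_parameters(3.0, 2.0)
```

The second expectation parameter is the second moment $E[x^2]$, not the variance, which is why it includes the $\mu^2$ term.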
Cumulative distribution functions
The cumulative distribution function (CDF) of the standard normal distribution, usually denoted with the capital Greek letter $\Phi$ (phi), is the integral
: $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2} \, dt$
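This integral has no elementary closed form, but it can be evaluated through the error function using the identity $\Phi(x) = \frac{1}{2}\left[1 + \operatorname{erf}\left(x/\sqrt{2}\right)\right]$. The sketch below relies on `math.erf` from the Python standard library; the function names are illustrative.

```python
import math

def standard_normal_cdf(x):
    """Probability that a standard normal deviate is less than or equal to x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, obtained by standardizing x first."""
    return standard_normal_cdf((x - mu) / sigma)

# Probability of falling within one standard deviation of the mean (about 0.6827).
within_one_sd = standard_normal_cdf(1.0) - standard_normal_cdf(-1.0)
```

For values far in the lower tail, subtracting from 1 loses precision; a form based on `math.erfc` is a common alternative in that regime.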
The related error function $\operatorname{erf}(x)$ gives the probability of a random variable, with normal distribution of mean 0 and variance 1/2, falling in the range $[-x, x]$