In statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is
:<math>f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}</math>
The parameter <math>\mu</math> is the mean or expectation of the distribution (and also its median and mode), while the parameter <math>\sigma</math> is its standard deviation. The variance of the distribution is <math>\sigma^2</math>. A random variable with a Gaussian distribution is said to be normally distributed, and is called a normal deviate.
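The density can be evaluated directly from its mean and standard deviation. The following is a minimal sketch; the function name `normal_pdf` and the example values (mean 10, standard deviation 2) are illustrative assumptions, not part of the article.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical example values: mean 10, standard deviation 2.
peak = normal_pdf(10.0, mu=10.0, sigma=2.0)   # value at the mean (also the mode)
left = normal_pdf(8.0, mu=10.0, sigma=2.0)    # one standard deviation below
right = normal_pdf(12.0, mu=10.0, sigma=2.0)  # one standard deviation above
```

Because the density is symmetric about the mean, `left` and `right` agree, and both are smaller than `peak`.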
Normal distributions are important in statistics and are often used in the natural and social sciences to represent real-valued random variables whose distributions are not known. Their importance is partly due to the central limit theorem. It states that, under some conditions, the average of many samples (observations) of a random variable with finite mean and variance is itself a random variable, whose distribution converges to a normal distribution as the number of samples increases. Therefore, physical quantities that are expected to be the sum of many independent processes, such as measurement errors, often have distributions that are nearly normal.
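The convergence described by the central limit theorem can be illustrated with a small simulation. This is only a sketch under stated assumptions: the underlying variable is taken to be uniform on (0, 1), the sample sizes and seed are arbitrary, and `sample_means` is a hypothetical helper.

```python
import random
import statistics

def sample_means(n_samples, n_means, rng):
    """Return n_means averages, each taken over n_samples uniform(0, 1) draws."""
    means = []
    for _ in range(n_means):
        draws = [rng.random() for _ in range(n_samples)]
        means.append(sum(draws) / n_samples)
    return means

rng = random.Random(0)  # fixed seed so the run is repeatable
means = sample_means(n_samples=50, n_means=5000, rng=rng)

# Uniform(0, 1) has mean 1/2 and variance 1/12, so an average of 50 draws
# has mean 1/2 and standard deviation sqrt((1/12) / 50).
expected_sd = (1.0 / 12.0 / 50) ** 0.5
observed_mean = statistics.mean(means)
observed_sd = statistics.stdev(means)

# Fraction of averages within one standard deviation of 1/2; for a normal
# distribution this fraction is about 68%.
within_one_sd = sum(abs(m - 0.5) <= expected_sd for m in means) / len(means)
```

Although each individual draw is uniform rather than bell-shaped, the averages cluster around 1/2 with roughly the spread and one-standard-deviation coverage that a normal distribution would give.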
Moreover, Gaussian distributions have some unique properties that are valuable in analytic studies. For instance, any linear combination of a fixed collection of independent normal deviates is a normal deviate. Many results and methods, such as propagation of uncertainty and least squares parameter fitting, can be derived analytically in explicit form when the relevant variables are normally distributed.
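As a sketch of the linear-combination property, Python's standard-library `statistics.NormalDist` can combine normal variables, which it treats as independent. The two distributions and the coefficients 2 and 3 below are arbitrary example values.

```python
from statistics import NormalDist

# Two normal variables, assumed independent, with example parameters.
x = NormalDist(mu=1.0, sigma=2.0)
y = NormalDist(mu=-3.0, sigma=0.5)

# The linear combination 2*X + 3*Y is again normal.
combo = x * 2 + y * 3

# Means combine linearly; variances combine with squared coefficients.
expected_mean = 2 * 1.0 + 3 * (-3.0)
expected_variance = (2 ** 2) * (2.0 ** 2) + (3 ** 2) * (0.5 ** 2)
```

Here `combo.mean` and `combo.variance` match `expected_mean` and `expected_variance`, which is the usual rule behind propagation of uncertainty for independent errors.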
A normal distribution is sometimes informally called a bell curve. However, many other distributions are bell-shaped (such as the Cauchy, Student's ''t'', and logistic distributions). For other names, see Naming.
The univariate probability distribution is generalized for vectors in the multivariate normal distribution and for matrices in the matrix normal distribution.
Definitions
Standard normal distribution
The simplest case of a normal distribution is known as the ''standard normal distribution'' or ''unit normal distribution''. This is a special case when <math>\mu = 0</math> and <math>\sigma = 1</math>, and it is described by this probability density function (or density):
:<math>\varphi(z) = \frac{e^{-z^2/2}}{\sqrt{2\pi}}</math>
The variable <math>z</math> has a mean of 0 and a variance and standard deviation of 1. The density <math>\varphi(z)</math> has its peak <math>1/\sqrt{2\pi}</math> at <math>z = 0</math> and inflection points at <math>z = +1</math> and <math>z = -1</math>.
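The location of the peak and of the inflection points can be checked numerically. The sketch below uses a central-difference approximation to the second derivative; the helper names and the step size are illustrative choices.

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def second_derivative(f, z, h=1e-4):
    """Central-difference estimate of the second derivative of f at z."""
    return (f(z + h) - 2.0 * f(z) + f(z - h)) / (h * h)

peak_value = phi(0.0)                       # equals 1 / sqrt(2 * pi)
curv_inside = second_derivative(phi, 0.9)   # negative: curve bends downward
curv_outside = second_derivative(phi, 1.1)  # positive: curve bends upward
curv_at_one = second_derivative(phi, 1.0)   # approximately zero
```

The sign of the curvature changes between 0.9 and 1.1 and is close to zero at 1, consistent with an inflection point one standard deviation from the mean; by symmetry the same holds at -1.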
Although the density above is most commonly known as the ''standard normal,'' a few authors have used that term to describe other versions of the normal distribution. Carl Friedrich Gauss, for example, once defined the standard normal as
:<math>\varphi(z) = \frac{e^{-z^2}}{\sqrt{\pi}}</math>
which has a variance of 1/2, and Stephen Stigler once defined the standard normal as
:<math>\varphi(z) = e^{-\pi z^2}</math>
which has a simple functional form and a variance of <math>\sigma^2 = 1/(2\pi)</math>.
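The variances of these two alternative forms can be verified by numerical integration. The sketch below takes Gauss's form to be exp(-z^2)/sqrt(pi) and Stigler's to be exp(-pi z^2), and uses a simple midpoint rule on a truncated interval; the helper names, interval, and step count are assumptions made for illustration.

```python
import math

def integrate(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def gauss_form(z):
    return math.exp(-z * z) / math.sqrt(math.pi)

def stigler_form(z):
    return math.exp(-math.pi * z * z)

def variance(density, lo=-10.0, hi=10.0):
    """Variance of a zero-mean density, truncated to [lo, hi]."""
    return integrate(lambda z: z * z * density(z), lo, hi)

gauss_mass = integrate(gauss_form, -10.0, 10.0)
stigler_mass = integrate(stigler_form, -10.0, 10.0)
gauss_variance = variance(gauss_form)
stigler_variance = variance(stigler_form)
```

Both densities integrate to approximately 1, with variances close to 1/2 and 1/(2π) respectively.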
General normal distribution
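Every normal distribution is a version of the standard normal distribution, whose domain has been stretched by a factor <math>\sigma</math> (the standard deviation) and then translated by <math>\mu</math> (the mean value):
:<math>f(x \mid \mu, \sigma^2) = \frac{1}{\sigma} \varphi\left(\frac{x - \mu}{\sigma}\right)</math>
The probability density must be scaled by <math>1/\sigma</math> so that the integral is still 1.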
If <math>Z</math> is a standard normal deviate, then <math>X = \sigma Z + \mu</math> will have a normal distribution with expected value <math>\mu</math> and standard deviation <math>\sigma</math>. This is equivalent to saying that the "standard" normal distribution <math>Z</math> can be scaled/stretched by a factor of <math>\sigma</math> and shifted by <math>\mu</math> to yield a different normal distribution, called <math>X</math>. Conversely, if <math>X</math> is a normal deviate with parameters <math>\mu</math> and <math>\sigma^2</math>, then this <math>X</math> distribution can be re-scaled and shifted via the formula <math>Z = (X - \mu)/\sigma</math> to convert it to the "standard" normal distribution. This variate is also called the standardized form of <math>X</math>.
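The scaling and standardization steps can be sketched in a few lines. The function names, the seed, and the target parameters (mean 100, standard deviation 15) are hypothetical example choices.

```python
import random
import statistics

def to_general(z, mu, sigma):
    """Map a standard normal deviate to one with mean mu and standard deviation sigma."""
    return sigma * z + mu

def standardize(x, mu, sigma):
    """Map a normal deviate with mean mu and standard deviation sigma back to standard form."""
    return (x - mu) / sigma

rng = random.Random(42)   # fixed seed so the run is repeatable
mu, sigma = 100.0, 15.0   # hypothetical target parameters

z_values = [rng.gauss(0.0, 1.0) for _ in range(20000)]
x_values = [to_general(z, mu, sigma) for z in z_values]

sample_mean = statistics.mean(x_values)
sample_sd = statistics.stdev(x_values)

# Standardizing the transformed values recovers the original deviates.
round_trip = [standardize(x, mu, sigma) for x in x_values]
```

The sample mean and standard deviation of `x_values` land near the chosen parameters, and `round_trip` reproduces `z_values` up to floating-point rounding.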
Notation
The probability density of the standard Gaussian distribution (standard normal distribution, with zero mean and unit variance) is often denoted with the Greek letter <math>\phi</math> (phi). The alternative form of the Greek letter phi, <math>\varphi</math>, is also used quite often.
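The normal distribution is often referred to as <math>N(\mu, \sigma^2)</math> or <math>\mathcal{N}(\mu, \sigma^2)</math>. Thus when a random variable <math>X</math> is normally distributed with mean <math>\mu</math> and standard deviation <math>\sigma</math>, one may write
:<math>X \sim \mathcal{N}(\mu, \sigma^2).</math>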
Alternative parameterizations
Some authors advocate using the precision <math>\tau</math> as the parameter defining the width of the distribution, instead of the deviation <math>\sigma</math> or the variance <math>\sigma^2</math>. The precision is normally defined as the reciprocal of the variance, <math>1/\sigma^2</math>. The formula for the distribution then becomes
:<math>f(x) = \sqrt{\frac{\tau}{2\pi}}\, e^{-\tau (x - \mu)^2 / 2}</math>
This choice is claimed to have advantages in numerical computations when <math>\sigma</math> is very close to zero, and simplifies formulas in some contexts, such as in the Bayesian inference of variables with multivariate normal distribution.
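Alternatively, the reciprocal of the standard deviation <math>\tau' = 1/\sigma</math> might be defined as the ''precision'', in which case the expression of the normal distribution becomes
:<math>f(x) = \frac{\tau'}{\sqrt{2\pi}}\, e^{-(\tau')^2 (x - \mu)^2 / 2}</math>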
According to Stigler, this formulation is advantageous because of a much simpler and easier-to-remember formula, and simple approximate formulas for the quantiles of the distribution.
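The parameterizations above describe the same density. The sketch below writes it three ways (by standard deviation, by precision defined as the reciprocal of the variance, and by the reciprocal of the standard deviation) and evaluates each at a few points; the function names and example values are assumptions made for illustration.

```python
import math

def pdf_sigma(x, mu, sigma):
    """Density parameterized by the standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def pdf_precision(x, mu, tau):
    """Density parameterized by the precision tau = 1 / sigma**2."""
    return math.sqrt(tau / (2.0 * math.pi)) * math.exp(-0.5 * tau * (x - mu) ** 2)

def pdf_inverse_sd(x, mu, tau_prime):
    """Density parameterized by tau_prime = 1 / sigma."""
    return tau_prime / math.sqrt(2.0 * math.pi) * math.exp(-0.5 * (tau_prime * (x - mu)) ** 2)

mu, sigma = 1.5, 0.25            # example values
points = [0.5, 1.0, 1.5, 2.0]

by_sigma = [pdf_sigma(x, mu, sigma) for x in points]
by_precision = [pdf_precision(x, mu, 1.0 / sigma ** 2) for x in points]
by_inverse_sd = [pdf_inverse_sd(x, mu, 1.0 / sigma) for x in points]
```

All three lists agree to within floating-point rounding, so the choice between them is a matter of convenience for the formulas or computations at hand.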
Normal distributions form an exponential family with natural parameters <math>\theta_1 = \mu/\sigma^2</math> and <math>\theta_2 = -1/(2\sigma^2)</math>, and natural statistics ''x'' and ''x''<sup>2</sup>. The dual expectation parameters for normal distribution are <math>\eta_1 = \mu</math> and <math>\eta_2 = \mu^2 + \sigma^2</math>.
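The conversion between the usual parameters and the exponential-family parameters can be sketched as follows, taking the natural parameters to be mu/sigma^2 and -1/(2 sigma^2) and the expectation parameters to be the first two raw moments. The function names and example values are hypothetical.

```python
def to_natural(mu, sigma):
    """Natural parameters (theta1, theta2) from mean and standard deviation."""
    variance = sigma * sigma
    return mu / variance, -1.0 / (2.0 * variance)

def from_natural(theta1, theta2):
    """Recover (mu, sigma) from natural parameters; requires theta2 < 0."""
    if theta2 >= 0:
        raise ValueError("theta2 must be negative")
    variance = -1.0 / (2.0 * theta2)
    return theta1 * variance, variance ** 0.5

def to_expectation(mu, sigma):
    """Expectation parameters: E[x] and E[x**2]."""
    return mu, mu * mu + sigma * sigma

theta1, theta2 = to_natural(2.0, 0.5)      # example: mu = 2, sigma = 0.5
mu_back, sigma_back = from_natural(theta1, theta2)
eta1, eta2 = to_expectation(2.0, 0.5)
```

The round trip through the natural parameters returns the original mean and standard deviation, and the second expectation parameter exceeds the square of the first by exactly the variance.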
Cumulative distribution functions
The cumulative distribution function (CDF) of the standard normal distribution, usually denoted with the capital Greek letter <math>\Phi</math> (phi), is the integral
:<math>\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt</math>
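The integral has no elementary closed form, but it can be approximated numerically. The sketch below integrates the standard normal density with a midpoint rule, replacing the infinite lower limit with a finite cutoff (an assumption that introduces a negligible error), and compares the result with the standard-library `statistics.NormalDist().cdf`. The function name `std_normal_cdf`, the cutoff, and the step count are illustrative.

```python
import math
from statistics import NormalDist

def phi(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(x, lower=-12.0, n=20000):
    """Midpoint-rule approximation of the standard normal CDF at x.

    The lower limit of minus infinity is replaced by `lower`; the mass
    below -12 is far smaller than the integration error.
    """
    if x <= lower:
        return 0.0
    h = (x - lower) / n
    return h * sum(phi(lower + (i + 0.5) * h) for i in range(n))

approx_at_zero = std_normal_cdf(0.0)
approx_at_196 = std_normal_cdf(1.96)
library_at_196 = NormalDist().cdf(1.96)
```

By symmetry the value at zero is one half, and the approximation at 1.96 agrees closely with the library value.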
The related error function <math>\operatorname{erf}(x)</math> gives the probability of a random variable, with normal distribution of mean 0 and variance 1/2, falling in the range <math>[-x, x]</math>