In
probability
Probability is a branch of mathematics and statistics concerning events and numerical descriptions of how likely they are to occur. The probability of an event is a number between 0 and 1; the larger the probability, the more likely an e ...
and
statistics
Statistics (from German language, German: ', "description of a State (polity), state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a s ...
, a mixture distribution is the
probability distribution
In probability theory and statistics, a probability distribution is a Function (mathematics), function that gives the probabilities of occurrence of possible events for an Experiment (probability theory), experiment. It is a mathematical descri ...
of a
random variable
A random variable (also called random quantity, aleatory variable, or stochastic variable) is a Mathematics, mathematical formalization of a quantity or object which depends on randomness, random events. The term 'random variable' in its mathema ...
that is derived from a collection of other random variables as follows: first, a random variable is selected by chance from the collection according to given probabilities of selection, and then the value of the selected random variable is realized. The underlying random variables may be random real numbers, or they may be
random vector
In probability, and statistics, a multivariate random variable or random vector is a list or vector of mathematical variables each of whose value is unknown, either because the value has not yet occurred or because there is imperfect knowledge ...
s (each having the same dimension), in which case the mixture distribution is a
multivariate distribution.
In cases where each of the underlying random variables is
continuous, the outcome variable will also be continuous and its
probability density function
In probability theory, a probability density function (PDF), density function, or density of an absolutely continuous random variable, is a Function (mathematics), function whose value at any given sample (or point) in the sample space (the s ...
is sometimes referred to as a mixture density. The
cumulative distribution function
In probability theory and statistics, the cumulative distribution function (CDF) of a real-valued random variable X, or just distribution function of X, evaluated at x, is the probability that X will take a value less than or equal to x.
Ever ...
(and the
probability density function
In probability theory, a probability density function (PDF), density function, or density of an absolutely continuous random variable, is a Function (mathematics), function whose value at any given sample (or point) in the sample space (the s ...
if it exists) can be expressed as a
convex combination
In convex geometry and Vector space, vector algebra, a convex combination is a linear combination of point (geometry), points (which can be vector (geometric), vectors, scalar (mathematics), scalars, or more generally points in an affine sp ...
(i.e. a weighted sum, with non-negative weights that sum to 1) of other distribution functions and density functions. The individual distributions that are combined to form the mixture distribution are called the mixture components, and the probabilities (or weights) associated with each component are called the mixture weights. The number of components in a mixture distribution is often restricted to being finite, although in some cases the components may be
countably infinite
In mathematics, a set is countable if either it is finite or it can be made in one to one correspondence with the set of natural numbers. Equivalently, a set is ''countable'' if there exists an injective function from it into the natural numbe ...
in number. More general cases (i.e. an
uncountable
In mathematics, an uncountable set, informally, is an infinite set that contains too many elements to be countable. The uncountability of a set is closely related to its cardinal number: a set is uncountable if its cardinal number is larger tha ...
set of component distributions), as well as the countable case, are treated under the title of
compound distributions.
A distinction needs to be made between a
random variable
A random variable (also called random quantity, aleatory variable, or stochastic variable) is a Mathematics, mathematical formalization of a quantity or object which depends on randomness, random events. The term 'random variable' in its mathema ...
whose distribution function or density is the sum of a set of components (i.e. a mixture distribution) and a random variable whose value is the sum of the values of two or more underlying random variables, in which case the distribution is given by the
convolution
In mathematics (in particular, functional analysis), convolution is a operation (mathematics), mathematical operation on two function (mathematics), functions f and g that produces a third function f*g, as the integral of the product of the two ...
operator. As an example, the sum of two
jointly normally distributed random variables, each with different means, will still have a normal distribution. On the other hand, a mixture density created as a mixture of two normal distributions with different means will have two peaks provided that the two means are far enough apart, showing that this distribution is radically different from a normal distribution.
Mixture distributions arise in many contexts in the literature and arise naturally where a
statistical population
In statistics, a population is a set of similar items or events which is of interest for some question or experiment. A statistical population can be a group of existing objects (e.g. the set of all stars within the Milky Way galaxy) or a hyp ...
contains two or more
subpopulations. They are also sometimes used as a means of representing non-normal distributions. Data analysis concerning
statistical model
A statistical model is a mathematical model that embodies a set of statistical assumptions concerning the generation of Sample (statistics), sample data (and similar data from a larger Statistical population, population). A statistical model repre ...
s involving mixture distributions is discussed under the title of
mixture model
In statistics, a mixture model is a probabilistic model for representing the presence of subpopulations within an overall population, without requiring that an observed data set should identify the sub-population to which an individual observati ...
s, while the present article concentrates on simple probabilistic and statistical properties of mixture distributions and how these relate to properties of the underlying distributions.
Finite and countable mixtures
Given a finite set of probability density functions , ..., , or corresponding cumulative distribution functions ..., and weights , ..., such that and , the mixture distribution can be represented by writing either the density, , or the distribution function, , as a sum (which in both cases is a convex combination):
This type of mixture, being a finite sum, is called a finite mixture, and in applications, an unqualified reference to a "mixture density" usually means a finite mixture. The case of a countably infinite set of components is covered formally by allowing
.
Uncountable mixtures
Where the set of component distributions is
uncountable
In mathematics, an uncountable set, informally, is an infinite set that contains too many elements to be countable. The uncountability of a set is closely related to its cardinal number: a set is uncountable if its cardinal number is larger tha ...
, the result is often called a
compound probability distribution
In probability and statistics, a compound probability distribution (also known as a mixture distribution or contagious distribution) is the probability distribution that results from assuming that a random variable is distributed according to some ...
. The construction of such distributions has a formal similarity to that of mixture distributions, with either infinite summations or integrals replacing the finite summations used for finite mixtures.
Consider a probability density function for a variable , parameterized by . That is, for each value of in some set , is a probability density function with respect to . Given a probability density function (meaning that is nonnegative and integrates to 1), the function
is again a probability density function for . A similar integral can be written for the cumulative distribution function. Note that the formulae here reduce to the case of a finite or infinite mixture if the density is allowed to be a
generalized function
In mathematics, generalized functions are objects extending the notion of functions on real or complex numbers. There is more than one recognized theory, for example the theory of distributions. Generalized functions are especially useful for tr ...
representing the "derivative" of the cumulative distribution function of a
discrete distribution
In probability theory and statistics, a probability distribution is a function that gives the probabilities of occurrence of possible events for an experiment. It is a mathematical description of a random phenomenon in terms of its sample spac ...
.
Mixtures within a parametric family
The mixture components are often not arbitrary probability distributions, but instead are members of a
parametric family
In mathematics and its applications, a parametric family or a parameterized family is a indexed family, family of objects (a set of related objects) whose differences depend only on the chosen values for a set of parameters.
Common examples are p ...
(such as normal distributions), with different values for a parameter or parameters. In such cases, assuming that it exists, the density can be written in the form of a sum as:
for one parameter, or
for two parameters, and so forth.
Properties
Convexity
A general
linear combination
In mathematics, a linear combination or superposition is an Expression (mathematics), expression constructed from a Set (mathematics), set of terms by multiplying each term by a constant and adding the results (e.g. a linear combination of ''x'' a ...
of probability density functions is not necessarily a probability density, since it may be negative or it may integrate to something other than 1. However, a
convex combination
In convex geometry and Vector space, vector algebra, a convex combination is a linear combination of point (geometry), points (which can be vector (geometric), vectors, scalar (mathematics), scalars, or more generally points in an affine sp ...
of probability density functions preserves both of these properties (non-negativity and integrating to 1), and thus mixture densities are themselves probability density functions.
Moments
Let , ..., denote random variables from the component distributions, and let denote a random variable from the mixture distribution. Then, for any function for which