In probability and statistics, a compound probability distribution (also known as a mixture distribution or contagious distribution) is the probability distribution that results from assuming that a random variable is distributed according to some parametrized distribution, with (some of) the parameters of that distribution themselves being random variables. If the parameter is a scale parameter, the resulting mixture is also called a scale mixture. The compound distribution ("unconditional distribution") is the result of marginalizing (integrating) over the ''latent'' random variable(s) representing the parameter(s) of the parametrized distribution ("conditional distribution").


Definition

A compound probability distribution is the probability distribution that results from assuming that a random variable X is distributed according to some parametrized distribution F with an unknown parameter \theta that is in turn distributed according to some other distribution G. The resulting distribution H is said to be the distribution that results from compounding F with G. The parameter's distribution G is also called the mixing distribution or latent distribution. Technically, the ''unconditional'' distribution H results from ''marginalizing'' over G, i.e., from integrating out the unknown parameter(s) \theta. Its probability density function is given by:

:p_H(x) = \int p_F(x \mid \theta) \, p_G(\theta) \, d\theta

The same formula applies analogously if some or all of the variables are vectors.

From the above formula, one can see that a compound distribution essentially is a special case of a marginal distribution: the ''joint distribution'' of x and \theta is given by p(x,\theta) = p(x \mid \theta) \, p(\theta), and the compound results as its marginal distribution: p(x) = \int p(x,\theta) \, d\theta. If the domain of \theta is discrete, then the distribution is again a special case of a mixture distribution.
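The marginalization integral can be checked numerically. The following is a minimal sketch, assuming a normal conditional distribution F whose mean \theta follows a normal mixing distribution G; the parameter values and the helper names normal_pdf and compound_pdf are illustrative choices, not part of the definition. It approximates p_H(x) with the trapezoidal rule and compares the result with the normal density with variance \tau^2 + \sigma^2 that this particular pair is known to produce.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def compound_pdf(x, mu, sigma2, tau2, steps=4000):
    """Approximate p_H(x) = integral of p_F(x | theta) * p_G(theta) d(theta)
    with the trapezoidal rule on mu +/- 10 standard deviations of G."""
    half_width = 10.0 * math.sqrt(sigma2)
    lower, upper = mu - half_width, mu + half_width
    h = (upper - lower) / steps
    total = 0.0
    for i in range(steps + 1):
        theta = lower + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end points count half
        total += weight * normal_pdf(x, theta, tau2) * normal_pdf(theta, mu, sigma2)
    return total * h

# Illustrative (assumed) parameters: theta ~ N(mu, sigma2), X | theta ~ N(theta, tau2).
mu, sigma2, tau2 = 1.0, 4.0, 9.0
x = 0.5
numeric = compound_pdf(x, mu, sigma2, tau2)
closed_form = normal_pdf(x, mu, sigma2 + tau2)
print(numeric, closed_form)
```

The two printed values should agree closely; for other choices of F and G the same integral applies, but a closed form may not exist.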


Properties

The compound distribution H will depend on the specific expression of each distribution, as well as on which parameter of F is distributed according to the distribution G, and the parameters of H will include any parameters of G that are not marginalized, or integrated, out. The support of H is the same as that of F, and if the latter is a two-parameter distribution parameterized with the mean and variance, some general properties exist.

The compound distribution's first two moments are given by:

:\operatorname{E}_H[X] = \operatorname{E}_G\bigl[\operatorname{E}_F[X \mid \theta]\bigr] (law of total expectation)

:\operatorname{Var}_H(X) = \operatorname{E}_G\bigl[\operatorname{Var}_F(X \mid \theta)\bigr] + \operatorname{Var}_G\bigl(\operatorname{E}_F[X \mid \theta]\bigr) (law of total variance)

If the mean of F is distributed as G, which in turn has mean \mu and variance \sigma^2, the expressions above imply \operatorname{E}_H[X] = \operatorname{E}_G[\theta] = \mu and \operatorname{Var}_H(X) = \operatorname{Var}_F(X \mid \theta) + \operatorname{Var}_G(\theta) = \tau^2 + \sigma^2, where \tau^2 is the variance of F.
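The moment formulas do not require F or G to be normal. The sketch below is a simulation under assumed, purely illustrative choices: F is a uniform distribution centred on \theta with half-width 3 (so \tau^2 = 3), and G is a gamma distribution with shape 4 and scale 0.5 (so \mu = 2 and \sigma^2 = 1). The sample mean and variance of the simulated compound variable should then be close to \mu and \tau^2 + \sigma^2.

```python
import random

rng = random.Random(0)           # fixed seed so the run is reproducible

half_width = 3.0                 # X | theta ~ Uniform(theta - 3, theta + 3)
tau2 = half_width ** 2 / 3.0     # variance of F: (2 * 3)^2 / 12 = 3
shape, scale = 4.0, 0.5          # theta ~ Gamma(shape, scale)
mu = shape * scale               # mean of G: 2
sigma2 = shape * scale ** 2      # variance of G: 1

n = 100_000
samples = []
for _ in range(n):
    theta = rng.gammavariate(shape, scale)   # draw the latent parameter from G
    samples.append(rng.uniform(theta - half_width, theta + half_width))  # draw X from F

mean_h = sum(samples) / n
var_h = sum((x - mean_h) ** 2 for x in samples) / n
print(mean_h, mu)                # sample mean vs. mu
print(var_h, tau2 + sigma2)      # sample variance vs. tau^2 + sigma^2
```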


Proof

Let F and G be probability distributions parameterized with mean and variance as

\begin{align} x &\sim \mathcal{F}(\theta,\tau^2) \\ \theta &\sim \mathcal{G}(\mu,\sigma^2) \end{align}

then, denoting the probability density functions as f(x \mid \theta) = p_F(x \mid \theta) and g(\theta) = p_G(\theta) respectively, and h(x) being the probability density of H, we have

\begin{align} \operatorname{E}_H[X] = \int_F x h(x)\,dx &= \int_F x \int_G f(x \mid \theta) g(\theta)\, d\theta\, dx \\ &= \int_G \int_F x f(x \mid \theta)\, dx\ g(\theta)\, d\theta \\ &= \int_G \operatorname{E}_F[X \mid \theta]\, g(\theta)\, d\theta \end{align}

and we have from the parameterization \mathcal{F} and \mathcal{G} that

\begin{align} \operatorname{E}_F[X \mid \theta] &= \int_F x f(x \mid \theta)\,dx = \theta \\ \operatorname{E}_G[\theta] &= \int_G \theta g(\theta)\,d\theta = \mu \end{align}

and therefore the mean of the compound distribution is \operatorname{E}_H[X] = \mu, as per the expression for its first moment above.

The variance of H is given by \operatorname{E}_H[X^2] - (\operatorname{E}_H[X])^2, and

\begin{align} \operatorname{E}_H[X^2] = \int_F x^2 h(x)\,dx &= \int_F x^2 \int_G f(x \mid \theta) g(\theta)\, d\theta\, dx \\ &= \int_G g(\theta)\int_F x^2 f(x \mid \theta)\, dx\ d\theta \\ &= \int_G g(\theta)(\tau^2+\theta^2)\,d\theta \\ &= \tau^2\int_G g(\theta)\,d\theta+\int_G g(\theta)\theta^2\,d\theta \\ &= \tau^2+(\sigma^2+\mu^2), \end{align}

given the fact that \int_F x^2 f(x \mid \theta)\, dx = \operatorname{E}_F[X^2 \mid \theta] = \operatorname{Var}_F(X \mid \theta) + (\operatorname{E}_F[X \mid \theta])^2 and \int_G \theta^2 g(\theta)\,d\theta = \operatorname{E}_G[\theta^2] = \operatorname{Var}_G(\theta) + (\operatorname{E}_G[\theta])^2. Finally we get

\begin{align} \operatorname{Var}_H(X) &= \operatorname{E}_H[X^2] - (\operatorname{E}_H[X])^2 \\ &= \tau^2 + \sigma^2 + \mu^2 - \mu^2 \\ &= \tau^2 + \sigma^2 \end{align}
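As a worked instance of the steps above, take the assumed, purely illustrative values \mu = 2, \sigma^2 = 1 and \tau^2 = 3. Substituting into the expressions for the first two moments gives:

```latex
\begin{align}
\operatorname{E}_H[X]   &= \mu = 2 \\
\operatorname{E}_H[X^2] &= \tau^2 + (\sigma^2 + \mu^2) = 3 + (1 + 4) = 8 \\
\operatorname{Var}_H(X) &= \operatorname{E}_H[X^2] - (\operatorname{E}_H[X])^2 = 8 - 4 = 4 = \tau^2 + \sigma^2
\end{align}
```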


Applications


Testing

Distributions of common test statistics result as compound distributions under their null hypothesis, for example in Student's t-test (where the test statistic results as the ratio of a normal random variable and the square root of a scaled chi-squared random variable), or in the F-test (where the test statistic is the ratio of two scaled chi-squared random variables).
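The t statistic can be read as a normal variable whose scale is itself random: conditional on a chi-squared variable V with \nu degrees of freedom, Z / \sqrt{V/\nu} is normal with variance \nu / V. The following sketch simulates this construction with the standard library; the function name sample_t_statistic, the seed and the choice \nu = 10 are assumptions made for illustration. The sample variance is compared with \nu / (\nu - 2), the variance of Student's t-distribution.

```python
import math
import random

def sample_t_statistic(nu, rng):
    """Draw Z / sqrt(V / nu) with Z standard normal and V chi-squared(nu)."""
    z = rng.gauss(0.0, 1.0)
    v = rng.gammavariate(nu / 2.0, 2.0)   # chi-squared(nu) is Gamma(shape nu/2, scale 2)
    return z / math.sqrt(v / nu)

rng = random.Random(1)
nu = 10
n = 100_000
draws = [sample_t_statistic(nu, rng) for _ in range(n)]

mean_t = sum(draws) / n
var_t = sum((t - mean_t) ** 2 for t in draws) / n
print(mean_t)                  # should be close to 0
print(var_t, nu / (nu - 2))    # sample variance vs. theoretical variance of t(nu)
```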


Overdispersion modeling

Compound distributions are useful for modeling outcomes exhibiting overdispersion, i.e., a greater amount of variability than would be expected under a certain model. For example, count data are commonly modeled using the Poisson distribution, whose variance is equal to its mean. The distribution may be generalized by allowing for variability in its rate parameter, implemented via a gamma distribution, which results in a marginal negative binomial distribution. This distribution is similar in its shape to the Poisson distribution, but it allows for larger variances. Similarly, a binomial distribution may be generalized to allow for additional variability by compounding it with a beta distribution for its success probability parameter, which results in a beta-binomial distribution.
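The overdispersion produced by a random rate can be seen in a small simulation. This is a sketch under assumed values: the rate is drawn from a gamma distribution with shape 3 and scale 2 (mean 6, variance 12), and counts are drawn with a simple multiplication-based Poisson sampler (Knuth's method), written out here because the Python standard library has no Poisson generator. By the law of total variance the marginal variance should be near 6 + 12 = 18, well above the mean of 6.

```python
import math
import random

def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for moderate rates."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

rng = random.Random(3)
shape, scale = 3.0, 2.0        # assumed gamma mixing distribution for the rate
n = 50_000
counts = []
for _ in range(n):
    lam = rng.gammavariate(shape, scale)   # latent rate parameter
    counts.append(sample_poisson(lam, rng))

mean_x = sum(counts) / n
var_x = sum((c - mean_x) ** 2 for c in counts) / n
print(mean_x, shape * scale)                             # mean of the counts
print(var_x, shape * scale + shape * scale ** 2)         # variance exceeds the mean
```

A plain Poisson model fitted to such counts would match the mean but understate the variance by roughly a factor of three.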


Bayesian inference

Besides ubiquitous marginal distributions that may be seen as special cases of compound distributions, in Bayesian inference, compound distributions arise when, in the notation above, ''F'' represents the distribution of future observations and ''G'' is the posterior distribution of the parameters of ''F'', given the information in a set of observed data. This gives a posterior predictive distribution. Correspondingly, for the prior predictive distribution, ''F'' is the distribution of a new data point while ''G'' is the prior distribution of the parameters.
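A standard conjugate case illustrates the posterior predictive construction. The sketch below assumes a Beta(1, 1) prior on a success probability and hypothetical data of 7 successes in 10 trials, so that the posterior G is Beta(8, 4); compounding a binomial distribution F for 5 future trials with this posterior gives a beta-binomial predictive distribution. The helper names log_beta and beta_binomial_pmf are introduced only for this example.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_pmf(k, m, a, b):
    """P(k successes in m trials) when the success probability is Beta(a, b)."""
    log_choose = math.lgamma(m + 1) - math.lgamma(k + 1) - math.lgamma(m - k + 1)
    return math.exp(log_choose + log_beta(k + a, m - k + b) - log_beta(a, b))

# Assumed prior Beta(1, 1) and hypothetical data: 7 successes, 3 failures.
a_post, b_post = 1 + 7, 1 + 3
m = 5                                        # number of future trials
predictive = [beta_binomial_pmf(k, m, a_post, b_post) for k in range(m + 1)]

predictive_mean = sum(k * p for k, p in enumerate(predictive))
print(predictive)
print(predictive_mean, m * a_post / (a_post + b_post))
```

Replacing a_post and b_post with the prior parameters gives the prior predictive distribution for the same number of trials.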


Convolution

Convolution of probability distributions (to derive the probability distribution of sums of random variables) may also be seen as a special case of compounding; here the sum's distribution essentially results from considering one summand as a random location parameter for the other summand.
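For discrete variables the compounding integral becomes a sum, p(s) = \sum_\theta p(s \mid \theta) \, p(\theta), where p(s \mid \theta) is the distribution of the first summand shifted by \theta. A minimal sketch with two fair six-sided dice (an assumed example; the function name compound_sum is hypothetical) uses exact fractions:

```python
from fractions import Fraction

# Distribution of a single fair die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def compound_sum(pmf_x, pmf_theta):
    """Distribution of X + theta, treating theta as a random location shift of X."""
    out = {}
    for theta, p_theta in pmf_theta.items():       # marginalize over the shift
        for x, p_x in pmf_x.items():               # conditional pmf of X shifted by theta
            s = x + theta
            out[s] = out.get(s, Fraction(0)) + p_x * p_theta
    return out

two_dice = compound_sum(die, die)
print(two_dice[7])   # 1/6
```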


Computation

Compound distributions derived from exponential family distributions often have a closed form. If analytical integration is not possible, numerical methods may be necessary.

Compound distributions may relatively easily be investigated using Monte Carlo methods, i.e., by generating random samples. It is often easy to generate random numbers from the distributions p(\theta) as well as p(x \mid \theta) and then utilize these to perform ''collapsed Gibbs sampling'' to generate samples from p(x).

A compound distribution may usually also be approximated to a sufficient degree by a mixture distribution using a finite number of mixture components, which allows an approximate density, distribution function, etc. to be derived.

Parameter estimation (maximum-likelihood or maximum-a-posteriori estimation) within a compound distribution model may sometimes be simplified by utilizing the EM-algorithm.
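One simple way to combine the Monte Carlo and finite-mixture ideas is to draw parameter values \theta_i from p(\theta) and average the conditional densities p(x \mid \theta_i), which is an equal-weight finite mixture. The sketch below is one such approach under stated assumptions, not the only possible one: the conditional distribution is normal with mean 0, and its variance follows an exponential distribution with rate 1/2, a case for which the compound distribution is a Laplace distribution with scale 1 and the approximation can therefore be checked.

```python
import math
import random

def normal_pdf(x, var):
    """Density at x of a normal distribution with mean 0 and variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

rng = random.Random(2)
n = 20_000
# Draws from the mixing distribution: variance ~ Exponential(rate 1/2), mean 2.
# The max() guards against an exact zero draw, which would divide by zero.
variances = [max(rng.expovariate(0.5), 1e-300) for _ in range(n)]

def mixture_pdf(x):
    """Equal-weight finite mixture of the conditional densities p(x | theta_i)."""
    return sum(normal_pdf(x, v) for v in variances) / n

def laplace_pdf(x, b=1.0):
    """Exact compound density for this example (Laplace with scale b)."""
    return math.exp(-abs(x) / b) / (2.0 * b)

for x in (1.0, 2.0):
    print(x, mixture_pdf(x), laplace_pdf(x))
```

The estimate is evaluated away from x = 0 on purpose: at the origin the conditional density is unbounded as the variance approaches zero, so the simple average converges slowly there.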


Examples

* Gaussian scale mixtures:
** Compounding a normal distribution with variance distributed according to an inverse gamma distribution (or equivalently, with precision distributed as a gamma distribution) yields a non-standardized Student's t-distribution. This distribution has the same symmetrical shape as a normal distribution with the same central point, but has greater variance and heavy tails.
** Compounding a Gaussian (or normal) distribution with variance distributed according to an exponential distribution (or with standard deviation according to a Rayleigh distribution) yields a Laplace distribution. More generally, compounding a Gaussian (or normal) distribution with variance distributed according to a gamma distribution yields a variance-gamma distribution.
** Compounding a Gaussian distribution with variance distributed according to an exponential distribution whose rate parameter is itself distributed according to a gamma distribution yields a normal-exponential-gamma distribution. (This involves two compounding stages. The variance itself then follows a Lomax distribution; see below.)
** Compounding a Gaussian distribution with standard deviation distributed according to a (standard) inverse uniform distribution yields a slash distribution.
* Other Gaussian mixtures:
** Compounding a Gaussian distribution with mean distributed according to another Gaussian distribution yields (again) a Gaussian distribution.
** Compounding a Gaussian distribution with mean distributed according to a shifted exponential distribution yields an exponentially modified Gaussian distribution.
* Compounding a Bernoulli distribution with probability of success p distributed according to a distribution X that has a defined expected value yields a Bernoulli distribution with success probability \operatorname{E}[X]. An interesting consequence is that the dispersion of X does not influence the dispersion of the resulting compound distribution.
* Compounding a binomial distribution with probability of success distributed according to a beta distribution yields a beta-binomial distribution. It possesses three parameters: a parameter n (number of samples) from the binomial distribution and shape parameters \alpha and \beta from the beta distribution.
* Compounding a multinomial distribution with probability vector distributed according to a Dirichlet distribution yields a Dirichlet-multinomial distribution.
* Compounding a Poisson distribution with rate parameter distributed according to a gamma distribution yields a negative binomial distribution.
* Compounding a Poisson distribution with rate parameter distributed according to an exponential distribution yields a geometric distribution.
* Compounding an exponential distribution with its rate parameter distributed according to a gamma distribution yields a Lomax distribution.
* Compounding a gamma distribution with inverse scale parameter distributed according to another gamma distribution yields a three-parameter beta prime distribution.
* Compounding a half-normal distribution with its scale parameter distributed according to a Rayleigh distribution yields an exponential distribution. This follows immediately from the Laplace distribution resulting as a normal scale mixture; see above. The roles of conditional and mixing distributions may also be exchanged here; consequently, compounding a Rayleigh distribution with its scale parameter distributed according to a half-normal distribution ''also'' yields an exponential distribution.
* A Gamma(k=2,θ)-distributed random variable whose scale parameter θ again is uniformly distributed marginally yields an exponential distribution.
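Entries in the list above can be checked by evaluating the marginalization integral numerically. The following sketch does so for the Poisson distribution with an exponentially distributed rate: with an assumed exponential rate parameter \beta = 1, the integral should reproduce the geometric probabilities p(1-p)^k on k = 0, 1, 2, ... with p = \beta / (1 + \beta) = 1/2. The function names and the integration settings (upper limit 60, step 0.001) are illustrative choices.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of the count k for the rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def compound_pmf(k, beta, upper=60.0, steps=60_000):
    """Trapezoidal approximation of the integral over lam of
    Poisson(k | lam) * beta * exp(-beta * lam), truncated at `upper`."""
    h = upper / steps

    def integrand(lam):
        return poisson_pmf(k, lam) * beta * math.exp(-beta * lam)

    total = 0.5 * (integrand(0.0) + integrand(upper))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h

beta = 1.0                         # assumed rate of the exponential mixing distribution
p = beta / (1.0 + beta)            # success probability of the resulting geometric law
numeric = [compound_pmf(k, beta) for k in range(5)]
geometric = [p * (1.0 - p) ** k for k in range(5)]
for k in range(5):
    print(k, numeric[k], geometric[k])
```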


Similar terms

The notion of "compound distribution" as used e.g. in the definition of a compound Poisson distribution or compound Poisson process is different from the definition found in this article. The meaning in this article corresponds to what is used in e.g. Bayesian hierarchical modeling.

The special case for compound probability distributions where the parametrized distribution F is the Poisson distribution is also called a mixed Poisson distribution.


See also

* Mixture distribution
* Mixed Poisson distribution
* Bayesian hierarchical modeling
* Marginal distribution
* Conditional distribution
* Joint distribution
* Convolution
* Overdispersion
* EM-algorithm


Further reading

* Johnson, N. L.; Kemp, A. W.; Kotz, S. (2005), "8 ''Mixture distributions''", ''Univariate discrete distributions'', New York: Wiley, ISBN 978-0-471-27246-5