Marginal distribution
In probability theory and statistics, the marginal distribution of a subset of a collection of random variables is the probability distribution of the variables contained in the subset. It gives the probabilities of various values of the variables in the subset without reference to the values of the other variables. This contrasts with a conditional distribution, which gives the probabilities contingent upon the values of the other variables.

Marginal variables are those variables in the subset of variables being retained. These concepts are "marginal" because they can be found by summing values in a table along rows or columns, and writing the sum in the margins of the table. The distribution of the marginal variables (the marginal distribution) is obtained by marginalizing (that is, focusing on the sums in the margin) over the distribution of the variables being discarded, and the discarded variables are said to have been marginalized out.

The context here is that the theoretical studies being undertaken, or the data analysis being done, involves a wider set of random variables, but attention is being limited to a reduced number of those variables. In many applications, an analysis may start with a given collection of random variables, then first extend the set by defining new ones (such as the sum of the original random variables) and finally reduce the number by placing interest in the marginal distribution of a subset (such as the sum). Several different analyses may be done, each treating a different subset of variables as the marginal variables.


Definition


Marginal probability mass function

Given a known joint distribution of two discrete random variables, say, ''X'' and ''Y'', the marginal distribution of either variable – ''X'' for example – is the probability distribution of ''X'' when the values of ''Y'' are not taken into consideration. This can be calculated by summing the joint probability distribution over all values of ''Y''. Naturally, the converse is also true: the marginal distribution can be obtained for ''Y'' by summing over the separate values of ''X''.

:p_X(x_i)=\sum_j p(x_i,y_j), and p_Y(y_j)=\sum_i p(x_i,y_j)

A marginal probability can always be written as an expected value:

:p_X(x) = \int_y p_{X \mid Y}(x \mid y) \, p_Y(y) \, \mathrm{d}y = \operatorname{E}_Y\left[ p_{X \mid Y}(x \mid Y) \right].

Intuitively, the marginal probability of ''X'' is computed by examining the conditional probability of ''X'' given a particular value of ''Y'', and then averaging this conditional probability over the distribution of all values of ''Y''. This follows from the definition of expected value (after applying the law of the unconscious statistician):

:\operatorname{E}_Y\left[ f(Y) \right] = \int_y f(y) \, p_Y(y) \, \mathrm{d}y.

Therefore, marginalization provides the rule for the transformation of the probability distribution of a random variable ''Y'' and another random variable X = g(Y):

:p_X(x) = \int_y p_{X \mid Y}(x \mid y) \, p_Y(y) \, \mathrm{d}y = \int_y \delta\big(x - g(y)\big) \, p_Y(y) \, \mathrm{d}y.
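As a concrete sketch (assuming NumPy and a small, hypothetical joint PMF), both marginals fall out of summing the joint table along one axis:

```python
import numpy as np

# Hypothetical joint PMF p(x_i, y_j): rows index values of X, columns values of Y.
joint = np.array([[0.10, 0.20],
                  [0.05, 0.15],
                  [0.30, 0.20]])
assert np.isclose(joint.sum(), 1.0)  # a valid joint PMF sums to 1

# Marginalize Y out (sum each row) to get p_X, and X out (sum each column) to get p_Y.
p_X = joint.sum(axis=1)  # p_X(x_i) = sum_j p(x_i, y_j)
p_Y = joint.sum(axis=0)  # p_Y(y_j) = sum_i p(x_i, y_j)

print(p_X)  # [0.3  0.2  0.5]
print(p_Y)  # [0.45 0.55]
```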


Marginal probability density function

Given two continuous random variables ''X'' and ''Y'' whose joint distribution is known, then the marginal probability density function can be obtained by integrating the joint probability distribution, f(x,y), over ''Y'', and vice versa. That is

:f_X(x) = \int_c^d f(x,y) \, dy, and f_Y(y) = \int_a^b f(x,y) \, dx

where x \in [a, b], and y \in [c, d].
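The integral can also be evaluated numerically. The sketch below (assuming SciPy is available) marginalizes a standard bivariate normal joint density with a hypothetical correlation ρ = 0.5, and checks the result against the standard normal density it should reduce to:

```python
import numpy as np
from scipy import integrate, stats

rho = 0.5  # hypothetical correlation of a standard bivariate normal

def f_xy(x, y):
    # Joint density of a standard bivariate normal with correlation rho.
    norm_const = 1.0 / (2 * np.pi * np.sqrt(1 - rho**2))
    return norm_const * np.exp(-(x**2 - 2*rho*x*y + y**2) / (2 * (1 - rho**2)))

x = 0.7
# f_X(x) = integral of f(x, y) dy over the whole real line.
f_x, _ = integrate.quad(lambda y: f_xy(x, y), -np.inf, np.inf)
print(f_x, stats.norm.pdf(x))  # both approximately 0.312, as expected
```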


Marginal cumulative distribution function

Finding the marginal cumulative distribution function from the joint cumulative distribution function is easy. Recall that:
* For discrete random variables, F(x,y) = P(X \leq x, Y \leq y)
* For continuous random variables, F(x,y) = \int_a^x \int_c^y f(x', y') \, dy' \, dx'

If ''X'' and ''Y'' jointly take values on [a, b] × [c, d] then

:F_X(x) = F(x, d) and F_Y(y) = F(b, y)

If ''d'' is ∞, then this becomes a limit F_X(x) = \lim_{y \to \infty} F(x,y). Likewise for F_Y(y).
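As a minimal illustration of F_X(x) = F(x, d), consider two independent Uniform(0, 1) variables, whose joint CDF on [0, 1] × [0, 1] is F(x,y) = xy; evaluating it at the upper endpoint d = 1 recovers the marginal CDF:

```python
def joint_cdf(x: float, y: float) -> float:
    """Joint CDF of two independent Uniform(0, 1) random variables."""
    clamp = lambda t: max(0.0, min(1.0, t))
    return clamp(x) * clamp(y)

# Evaluating the joint CDF at the upper endpoint of Y's range (d = 1)
# gives the marginal CDF of X: F_X(x) = F(x, d).
def marginal_cdf_x(x: float) -> float:
    return joint_cdf(x, 1.0)

print(marginal_cdf_x(0.3))  # 0.3, the Uniform(0, 1) CDF at 0.3
```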


Marginal distribution vs. conditional distribution


Definition

The marginal probability is the probability of a single event occurring, independent of other events. A conditional probability, on the other hand, is the probability that an event occurs given that another specific event ''has already'' occurred. This means that the calculation for one variable is dependent on another variable.

The conditional distribution of a variable given another variable is the joint distribution of both variables divided by the marginal distribution of the other variable. That is,
* For discrete random variables, p_{Y \mid X}(y \mid x) = P(Y = y \mid X = x) = \frac{P(X = x, Y = y)}{P_X(x)}
* For continuous random variables, f_{Y \mid X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)}
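Continuing the earlier hypothetical joint-PMF sketch, the discrete formula amounts to dividing each row of the joint table by the corresponding marginal entry:

```python
import numpy as np

# Same hypothetical joint PMF as before: rows are values of X, columns values of Y.
joint = np.array([[0.10, 0.20],
                  [0.05, 0.15],
                  [0.30, 0.20]])
p_X = joint.sum(axis=1)  # marginal of X

# p_{Y|X}(y | x_i): divide row i of the joint table by p_X(x_i).
cond_Y_given_X = joint / p_X[:, None]
print(cond_Y_given_X)              # each row is a conditional PMF of Y
print(cond_Y_given_X.sum(axis=1))  # [1. 1. 1.] — each conditional PMF sums to 1
```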


Example

Suppose there is data from a classroom of 200 students on the amount of time studied (''X'') and the percentage of correct answers (''Y''). Assuming that ''X'' and ''Y'' are discrete random variables, the joint distribution of ''X'' and ''Y'' can be described by listing all the possible values of p(x_i, y_j), as shown in Table 3. The marginal distribution can be used to determine how many students scored 20 or below:

:p_Y(y_1) = P_Y(Y = y_1) = \sum_{i=1}^{4} P(x_i, y_1) = \frac{10}{200},

meaning 10 students, or 5%.

The conditional distribution can be used to determine the probability that a student who studied 60 minutes or more obtains a score of 20 or below:

:p_{Y \mid X}(y_1 \mid x_4) = P(Y = y_1 \mid X = x_4) = \frac{P(X = x_4, Y = y_1)}{P(X = x_4)} \approx 11\%,

meaning there is about an 11% probability of scoring 20 or below after having studied for at least 60 minutes.
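Since Table 3 itself is not reproduced here, the following sketch uses hypothetical counts, chosen only so that the two quoted probabilities (5% marginal and roughly 11% conditional) come out as stated; the actual table entries may differ:

```python
import numpy as np

# Hypothetical data for 200 students: for each study-time bracket x1..x4,
# the number of students in the bracket and, of those, the number scoring
# 20 or below (y1). Illustrative values only, not the real Table 3.
n_in_bracket = np.array([10, 40, 80, 70])  # sums to 200
n_low_score  = np.array([ 2,  0,  0,  8])  # sums to 10

p_y1 = n_low_score.sum() / 200                    # marginal: 10/200 = 0.05
p_y1_given_x4 = n_low_score[3] / n_in_bracket[3]  # conditional: 8/70 ≈ 0.114
print(p_y1, p_y1_given_x4)
```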


Real-world example

Suppose that the probability that a pedestrian will be hit by a car, while crossing the road at a pedestrian crossing, without paying attention to the traffic light, is to be computed. Let H be a discrete random variable taking one value from {Hit, Not Hit}. Let L (for traffic light) be a discrete random variable taking one value from {Red, Yellow, Green}.

Realistically, H will be dependent on L. That is, P(H = Hit) will take different values depending on whether L is red, yellow or green (and likewise for P(H = Not Hit)). A person is, for example, far more likely to be hit by a car when trying to cross while the lights for perpendicular traffic are green than if they are red. In other words, for any given possible pair of values for H and L, one must consider the joint probability distribution of H and L to find the probability of that pair of events occurring together if the pedestrian ignores the state of the light.

However, in trying to calculate the marginal probability P(H = Hit), what is being sought is the probability that H = Hit in the situation in which the particular value of L is unknown and in which the pedestrian ignores the state of the light. In general, a pedestrian can be hit if the lights are red OR if the lights are yellow OR if the lights are green. So, the answer for the marginal probability can be found by summing P(H | L) over all possible values of L, with each value of L weighted by its probability of occurring.

Here is a table showing the conditional probabilities of being hit, depending on the state of the lights. (Note that the columns in this table must add up to 1 because the probability of being hit or not hit is 1 regardless of the state of the light.)

                 L = Red   L = Yellow   L = Green
  H = Hit          0.01        0.1          0.8
  H = Not Hit      0.99        0.9          0.2

To find the joint probability distribution, more data is required. For example, suppose P(L = red) = 0.2, P(L = yellow) = 0.1, and P(L = green) = 0.7. Multiplying each column in the conditional distribution by the probability of that column occurring results in the joint probability distribution of H and L, given in the central 2×3 block of entries. (Note that the cells in this 2×3 block add up to 1.)

                 L = Red   L = Yellow   L = Green   Marginal
  H = Hit          0.002       0.01         0.56      0.572
  H = Not Hit      0.198       0.09         0.14      0.428

The marginal probability P(H = Hit) is the sum 0.572 along the H = Hit row of this joint distribution table, as this is the probability of being hit when the lights are red OR yellow OR green. Similarly, the marginal probability P(H = Not Hit) is the sum along the H = Not Hit row.
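This weighted sum is the law of total probability; a minimal sketch, using the conditional values from the table above:

```python
# Conditional probabilities of being hit, given each light state (table above).
p_hit_given_L = {"red": 0.01, "yellow": 0.1, "green": 0.8}
# Marginal distribution of the light state.
p_L = {"red": 0.2, "yellow": 0.1, "green": 0.7}

# P(H = Hit) = sum over light states of P(Hit | L = l) * P(L = l).
p_hit = sum(p_hit_given_L[l] * p_L[l] for l in p_L)
print(p_hit)  # 0.572
```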


Multivariate distributions

For multivariate distributions, formulae similar to those above apply with the symbols ''X'' and/or ''Y'' being interpreted as vectors. In particular, each summation or integration would be over all variables except those contained in ''X''.

That means, if X_1, X_2, \dots, X_n are discrete random variables, then the marginal probability mass function should be

:p_{X_i}(k) = \sum p(x_1, x_2, \dots, x_{i-1}, k, x_{i+1}, \dots, x_n);

if X_1, X_2, \dots, X_n are continuous random variables, then the marginal probability density function should be

:f_{X_i}(x_i) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, x_2, \dots, x_n) \, dx_1 \, dx_2 \cdots dx_{i-1} \, dx_{i+1} \cdots dx_n.
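In code, the discrete case amounts to summing a multidimensional probability array over every axis except the one of interest. A minimal NumPy sketch with a hypothetical three-variable joint PMF:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical joint PMF of three discrete variables X1, X2, X3 with
# 2, 3, and 4 possible values respectively.
joint = rng.random((2, 3, 4))
joint /= joint.sum()  # normalize so the table is a valid PMF

# Marginal PMF of X2: sum out every axis except axis 1.
p_X2 = joint.sum(axis=(0, 2))
print(p_X2, p_X2.sum())  # a length-3 vector summing to 1
```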


See also

* Compound probability distribution
* Joint probability distribution
* Marginal likelihood
* Wasserstein metric
* Conditional distribution

