In probability theory and statistics, the moment-generating function of a real-valued random variable is an alternative specification of its probability distribution. Thus, it provides the basis of an alternative route to analytical results compared with working directly with probability density functions or cumulative distribution functions. There are particularly simple results for the moment-generating functions of distributions defined by the weighted sums of random variables. However, not all random variables have moment-generating functions.

As its name implies, the moment-generating function can be used to compute a distribution's moments: the $n$-th moment about 0 is the $n$-th derivative of the moment-generating function, evaluated at 0.

In addition to univariate real-valued distributions, moment-generating functions can also be defined for vector- or matrix-valued random variables, and can even be extended to more general cases.

The moment-generating function of a real-valued distribution does not always exist, unlike the characteristic function. There are relations between the behavior of the moment-generating function of a distribution and properties of the distribution, such as the existence of moments.

Definition

Let $X$ be a random variable with CDF $F_X$. The moment generating function (mgf) of $X$ (or $F_X$), denoted by $M_X(t)$, is

$$M_X(t) = \operatorname{E}\left[e^{tX}\right]$$

provided this expectation exists for $t$ in some open neighborhood of 0. That is, there is an $h > 0$ such that for all $t$ satisfying $-h < t < h$, the expectation $\operatorname{E}\left[e^{tX}\right]$ exists. If the expectation does not exist in an open neighborhood of 0, we say that the moment generating function does not exist.

In other words, the moment-generating function of $X$ is the expectation of the random variable $e^{tX}$. More generally, when $\mathbf X = (X_1, \ldots, X_n)^{\mathrm T}$ is an $n$-dimensional random vector and $\mathbf t$ is a fixed vector, one uses $\mathbf t \cdot \mathbf X = \mathbf t^{\mathrm T}\mathbf X$ instead of $tX$:

$$M_{\mathbf X}(\mathbf t) := \operatorname{E}\left[e^{\mathbf t^{\mathrm T}\mathbf X}\right].$$

$M_X(0)$ always exists and is equal to 1. However, a key problem with moment-generating functions is that moments and the moment-generating function may not exist, as the integrals need not converge absolutely. By contrast, the characteristic function or Fourier transform always exists (because it is the integral of a bounded function on a space of finite measure), and for some purposes may be used instead.

The moment-generating function is so named because it can be used to find the moments of the distribution. The series expansion of $e^{tX}$ is

$$e^{tX} = 1 + t X + \frac{t^2 X^2}{2!} + \frac{t^3 X^3}{3!} + \cdots + \frac{t^n X^n}{n!} + \cdots.$$

Hence,

$$\begin{aligned} M_X(t) = \operatorname{E}\left[e^{tX}\right] &= 1 + t \operatorname{E}[X] + \frac{t^2 \operatorname{E}[X^2]}{2!} + \frac{t^3 \operatorname{E}[X^3]}{3!} + \cdots + \frac{t^n \operatorname{E}[X^n]}{n!} + \cdots \\ &= 1 + t m_1 + \frac{t^2 m_2}{2!} + \frac{t^3 m_3}{3!} + \cdots + \frac{t^n m_n}{n!} + \cdots, \end{aligned}$$

where $m_n$ is the $n$-th moment. Differentiating $M_X(t)$ $i$ times with respect to $t$ and setting $t = 0$, we obtain the $i$-th moment about the origin, $m_i$; see Calculations of moments below.

If $X$ is a continuous random variable, the following relation between its moment-generating function $M_X(t)$ and the two-sided Laplace transform of its probability density function $f_X(x)$ holds:

$$M_X(t) = \mathcal{L}\{f_X\}(-t),$$

since the PDF's two-sided Laplace transform is given as

$$\mathcal{L}\{f_X\}(s) = \int_{-\infty}^\infty e^{-sx} f_X(x)\, dx,$$

and the moment-generating function's definition expands (by the law of the unconscious statistician) to

$$M_X(t) = \operatorname{E}\left[e^{tX}\right] = \int_{-\infty}^\infty e^{tx} f_X(x)\, dx.$$

This is consistent with the characteristic function of $X$ being a Wick rotation of $M_X(t)$ when the moment generating function exists, as the characteristic function of a continuous random variable $X$ is the Fourier transform of its probability density function $f_X(x)$, and in general when a function $f(x)$ is of exponential order, the Fourier transform of $f$ is a Wick rotation of its two-sided Laplace transform in the region of convergence. See the relation of the Fourier and Laplace transforms for further information.
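The definition and the series expansion can be checked numerically on a small discrete example. The sketch below assumes a fair six-sided die (an illustrative choice, not taken from the text) and uses hypothetical helper names `mgf`, `raw_moment` and `mgf_series`; it evaluates $\operatorname{E}[e^{tX}]$ directly and compares it with a truncated version of the series $\sum_n t^n m_n / n!$.

```python
import math

# Illustrative assumption: X is a fair six-sided die.
VALUES = [1, 2, 3, 4, 5, 6]
PROBS = [1 / 6] * 6


def mgf(t):
    """M_X(t) = E[e^{tX}], computed directly from the probability mass function."""
    return sum(p * math.exp(t * x) for x, p in zip(VALUES, PROBS))


def raw_moment(n):
    """n-th moment about the origin, m_n = E[X^n]."""
    return sum(p * x**n for x, p in zip(VALUES, PROBS))


def mgf_series(t, terms=60):
    """Truncated expansion 1 + t*m_1 + t^2*m_2/2! + ... with `terms` terms."""
    return sum(t**n * raw_moment(n) / math.factorial(n) for n in range(terms))


t = 0.3
print(mgf(0.0))               # approximately 1, as M_X(0) = 1 for every distribution
print(mgf(t), mgf_series(t))  # the two values agree up to rounding error
```

A bounded variable such as a die has a moment-generating function for every real $t$, so the series converges everywhere; for heavier-tailed distributions the same comparison is only meaningful inside the open interval on which the expectation is finite.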

Examples

Comparing the moment-generating function with the characteristic function for standard distributions shows that the characteristic function is a Wick rotation of the moment-generating function $M_X(t)$, that is $\varphi_X(t) = M_X(it)$, when the latter exists.
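As one concrete comparison, the sketch below takes a Poisson distribution with rate $\lambda = 2$ (an assumption made only for illustration), whose moment-generating function has the well-known closed form $e^{\lambda(e^t - 1)}$. It evaluates that closed form at the imaginary argument $it$ and compares the result with the characteristic function summed directly from the probability mass function. The function names are hypothetical.

```python
import cmath
import math

lam = 2.0  # illustrative Poisson rate (an assumption for this sketch)


def poisson_mgf(s):
    """Closed form M_X(s) = exp(lam * (e^s - 1)); s may be real or complex."""
    return cmath.exp(lam * (cmath.exp(s) - 1))


def poisson_cf_direct(t, kmax=100):
    """Characteristic function E[e^{itX}], summed over the pmf for k < kmax."""
    total = 0j
    pmf = math.exp(-lam)  # P(X = 0)
    for k in range(kmax):
        total += pmf * cmath.exp(1j * t * k)
        pmf *= lam / (k + 1)  # P(X = k + 1) obtained from P(X = k)
    return total


t = 0.7
print(poisson_mgf(1j * t))   # M_X(it)
print(poisson_cf_direct(t))  # E[e^{itX}], which should match the line above
```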

Calculation

Because the moment-generating function is the expectation of a function of the random variable, it can be written as:

* For a discrete probability mass function, $M_X(t)=\sum_{i=0}^\infty e^{tx_i}\, p_i$.
* For a continuous probability density function, $M_X(t) = \int_{-\infty}^\infty e^{tx} f(x)\,dx$.
* In the general case, $M_X(t) = \int_{-\infty}^\infty e^{tx}\,dF(x)$, using the Riemann–Stieltjes integral, where $F$ is the cumulative distribution function. This is simply the Laplace-Stieltjes transform of $F$, but with the sign of the argument reversed.

Note that for the case where $X$ has a continuous probability density function $f(x)$, $M_X(-t)$ is the two-sided Laplace transform of $f(x)$. Expanding the exponential inside the integral gives

$$\begin{aligned} M_X(t) &= \int_{-\infty}^\infty e^{tx} f(x)\,dx \\ &= \int_{-\infty}^\infty \left(1 + tx + \frac{t^2x^2}{2!} + \cdots + \frac{t^nx^n}{n!} + \cdots\right) f(x)\,dx \\ &= 1 + tm_1 + \frac{t^2m_2}{2!} + \cdots + \frac{t^nm_n}{n!} +\cdots, \end{aligned}$$

where $m_n$ is the $n$-th moment.
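The continuous case can be illustrated by approximating the integral numerically. The sketch below assumes an exponential distribution with rate $\lambda = 1.5$ (chosen only for illustration), for which the closed form $\lambda/(\lambda - t)$, valid for $t < \lambda$, is a standard result. The midpoint rule, the truncation point `upper` and the step count are arbitrary choices of this sketch, not a recommended integration scheme.

```python
import math

lam = 1.5  # rate of an exponential distribution (illustrative assumption)


def pdf(x):
    """Density f(x) = lam * e^{-lam x} for x >= 0."""
    return lam * math.exp(-lam * x)


def mgf_numeric(t, upper=40.0, steps=100_000):
    """Midpoint-rule approximation of the integral of e^{tx} f(x) over [0, upper]."""
    h = upper / steps
    return h * sum(
        math.exp(t * (i + 0.5) * h) * pdf((i + 0.5) * h) for i in range(steps)
    )


def mgf_exact(t):
    """Closed form lam / (lam - t); only valid for t < lam."""
    return lam / (lam - t)


for t in (-1.0, 0.0, 0.5):
    print(t, mgf_numeric(t), mgf_exact(t))
```

For $t \ge \lambda$ the integrand no longer decays and the integral diverges, which is the sense in which the moment-generating function exists only on part of the real line for this distribution.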

Linear transformations of random variables

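If random variable $X$ has moment generating function $M_X(t)$, then $\alpha X + \beta$ has moment generating function $M_{\alpha X + \beta}(t) = e^{\beta t}M_X(\alpha t)$:

$$M_{\alpha X + \beta}(t) = \operatorname{E}\left[e^{(\alpha X + \beta) t}\right] = e^{\beta t} \operatorname{E}\left[e^{\alpha X t}\right] = e^{\beta t} M_X(\alpha t).$$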
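A quick numerical check of this identity, assuming a fair six-sided die for $X$ and arbitrary illustrative values of $\alpha$, $\beta$ and $t$ (none of which come from the text):

```python
import math

VALUES = [1, 2, 3, 4, 5, 6]  # fair die, an illustrative assumption
PROBS = [1 / 6] * 6


def mgf_x(t):
    """M_X(t) computed from the probability mass function of X."""
    return sum(p * math.exp(t * x) for x, p in zip(VALUES, PROBS))


def mgf_affine_direct(t, alpha, beta):
    """E[e^{t(alpha X + beta)}], computed from the definition."""
    return sum(p * math.exp(t * (alpha * x + beta)) for x, p in zip(VALUES, PROBS))


def mgf_affine_formula(t, alpha, beta):
    """e^{beta t} * M_X(alpha t), the right-hand side of the identity."""
    return math.exp(beta * t) * mgf_x(alpha * t)


t, alpha, beta = 0.2, -1.5, 4.0
print(mgf_affine_direct(t, alpha, beta), mgf_affine_formula(t, alpha, beta))
```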

Linear combination of independent random variables

If $S_n = \sum_{i=1}^n a_i X_i$, where the $X_i$ are independent random variables and the $a_i$ are constants, then the probability density function for $S_n$ is the convolution of the probability density functions of each of the $a_i X_i$, and the moment-generating function for $S_n$ is given by

$$M_{S_n}(t) = M_{X_1}(a_1t) M_{X_2}(a_2t) \cdots M_{X_n}(a_nt) \, .$$
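The product rule can be verified by brute force for two small independent discrete variables. In the sketch below the distributions (a Bernoulli variable and a fair die) and the constants $a_1 = 2$, $a_2 = -1$ are illustrative assumptions; the direct computation enumerates the joint probability mass function, which factorizes because of independence.

```python
import itertools
import math

# Illustrative independent variables (assumptions for this sketch):
# X1 is Bernoulli(0.3), X2 is a fair six-sided die.
X1 = {0: 0.7, 1: 0.3}
X2 = {x: 1 / 6 for x in range(1, 7)}
A1, A2 = 2.0, -1.0  # the constants a_1 and a_2


def mgf(pmf, t):
    """M(t) for a discrete distribution given as {value: probability}."""
    return sum(p * math.exp(t * x) for x, p in pmf.items())


def mgf_sum_direct(t):
    """E[e^{t(a1 X1 + a2 X2)}], enumerating the joint pmf p1 * p2."""
    return sum(
        p1 * p2 * math.exp(t * (A1 * x1 + A2 * x2))
        for (x1, p1), (x2, p2) in itertools.product(X1.items(), X2.items())
    )


def mgf_sum_product(t):
    """M_{X1}(a1 t) * M_{X2}(a2 t), the product formula."""
    return mgf(X1, A1 * t) * mgf(X2, A2 * t)


t = 0.4
print(mgf_sum_direct(t), mgf_sum_product(t))
```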

Vector-valued random variables

For vector-valued random variables $\mathbf X$ with real components, the moment-generating function is given by

$$M_{\mathbf X}(\mathbf t) = \operatorname{E}\left[e^{\langle \mathbf t, \mathbf X \rangle}\right]$$

where $\mathbf t$ is a vector and $\langle \cdot, \cdot \rangle$ is the dot product.
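A minimal sketch of the vector-valued definition, assuming a two-dimensional random vector with a small made-up joint probability mass function. Setting the second component of $\mathbf t$ to zero recovers the moment-generating function of the first component alone, which the code checks against the Bernoulli marginal implied by the table.

```python
import math

# Illustrative joint pmf of X = (X1, X2); the probabilities are assumptions.
JOINT_PMF = {
    (0, 0): 0.1,
    (0, 1): 0.2,
    (1, 0): 0.3,
    (1, 1): 0.4,
}


def dot(u, v):
    """Dot product <u, v> of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def vector_mgf(t):
    """M_X(t) = E[e^{<t, X>}] for the joint pmf above."""
    return sum(p * math.exp(dot(t, x)) for x, p in JOINT_PMF.items())


def marginal_mgf_x1(t1):
    """MGF of X1 alone: P(X1 = 0) = 0.3 and P(X1 = 1) = 0.7 from the table."""
    return 0.3 + 0.7 * math.exp(t1)


print(vector_mgf((0.5, -0.2)))
print(vector_mgf((0.5, 0.0)), marginal_mgf_x1(0.5))  # these two should agree
```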

Important properties

Moment generating functions are positive and log-convex, with $M(0) = 1$.

An important property of the moment-generating function is that it uniquely determines the distribution. In other words, if $X$ and $Y$ are two random variables and for all values of $t$,

$$M_X(t) = M_Y(t),$$

then

$$F_X(x) = F_Y(x)$$

for all values of $x$ (or equivalently $X$ and $Y$ have the same distribution). This statement is not equivalent to the statement "if two distributions have the same moments, then they are identical at all points." This is because in some cases, the moments exist and yet the moment-generating function does not, because the limit

$$\lim_{n \to \infty} \sum_{i=0}^n \frac{t^i m_i}{i!}$$

may not exist. The log-normal distribution is an example of when this occurs.
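The log-normal case can be made concrete. For a log-normal variable with parameters $\mu = 0$ and $\sigma$, the moments are $m_n = e^{n^2\sigma^2/2}$ (a standard result), so every moment is finite, yet the terms $t^n m_n / n!$ eventually grow without bound for any $t > 0$. The sketch below works with logarithms of the terms to avoid overflow; the choice $t = 0.01$ and the helper name `log_term` are assumptions made for illustration.

```python
import math


def log_term(n, t, sigma=1.0):
    """log of t^n * m_n / n!, with m_n = exp(n^2 sigma^2 / 2) for a log-normal, mu = 0."""
    return n * math.log(t) + 0.5 * (n * sigma) ** 2 - math.lgamma(n + 1)


t = 0.01  # even a very small positive t does not rescue the series
for n in (1, 5, 10, 20, 40, 80):
    print(n, log_term(n, t))
```

The printed values first decrease and then increase rapidly, because the $n^2/2$ growth of $\log m_n$ outpaces both $n \log t$ and $\log n!$. The series therefore diverges for every $t > 0$.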

Calculations of moments

The moment-generating function is so called because if it exists on an open interval around $t = 0$, then it is the exponential generating function of the moments of the probability distribution:

$$m_n = \operatorname{E}\left[X^n\right] = M_X^{(n)}(0) = \left. \frac{d^n M_X}{dt^n}\right|_{t=0}.$$

That is, with $n$ being a nonnegative integer, the $n$-th moment about 0 is the $n$-th derivative of the moment generating function, evaluated at $t = 0$.
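As a worked illustration, take an exponential distribution with rate $\lambda$ (an example chosen here, not drawn from the text above), whose moment-generating function is $\lambda/(\lambda - t)$ for $t < \lambda$. Repeated differentiation gives the moments directly:

```latex
\begin{aligned}
M_X(t) &= \frac{\lambda}{\lambda - t}, \qquad t < \lambda, \\
M_X'(t) &= \frac{\lambda}{(\lambda - t)^2}
  &&\Rightarrow\quad m_1 = M_X'(0) = \frac{1}{\lambda}, \\
M_X''(t) &= \frac{2\lambda}{(\lambda - t)^3}
  &&\Rightarrow\quad m_2 = M_X''(0) = \frac{2}{\lambda^2}, \\
M_X^{(n)}(t) &= \frac{n!\,\lambda}{(\lambda - t)^{n+1}}
  &&\Rightarrow\quad m_n = M_X^{(n)}(0) = \frac{n!}{\lambda^n}.
\end{aligned}
```

The variance then follows as $m_2 - m_1^2 = 1/\lambda^2$.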

Other properties

Jensen's inequality provides a simple lower bound on the moment-generating function:

$$M_X(t) \geq e^{\mu t},$$

where $\mu$ is the mean of $X$.

The moment-generating function can be used in conjunction with Markov's inequality to bound the upper tail of a real random variable $X$. This statement is also called the Chernoff bound. Since $x \mapsto e^{xt}$ is monotonically increasing for $t>0$, we have

$$\Pr(X \ge a) = \Pr(e^{tX} \ge e^{ta}) \le e^{-at} \operatorname{E}\left[e^{tX}\right] = e^{-at}M_X(t)$$

for any $t>0$ and any $a$, provided $M_X(t)$ exists. For example, when $X$ has a standard normal distribution and $a > 0$, we can choose $t=a$ and recall that $M_X(t)=e^{t^2/2}$. This gives $\Pr(X\ge a)\le e^{-a^2/2}$, which is within a factor of order $a$ of the exact value for large $a$.

Various lemmas, such as Hoeffding's lemma or Bennett's inequality, provide bounds on the moment-generating function in the case of a zero-mean, bounded random variable.

When $X$ is non-negative, the moment generating function gives a simple, useful bound on the moments:

$$\operatorname{E}[X^m] \le \left(\frac{m}{te}\right)^m M_X(t),$$

for any $m\ge 0$ and $t>0$. This follows from the inequality $1+x\le e^x$, into which we can substitute $x'=tx/m-1$ to obtain $tx/m\le e^{tx/m-1}$ for any real $x$ and $t$ and any $m > 0$. Now, if $t > 0$ and $x,m\ge 0$, this can be rearranged to $x^m \le (m/(te))^m e^{tx}$. Taking the expectation on both sides gives the bound on $\operatorname{E}[X^m]$ in terms of $\operatorname{E}[e^{tX}]$.

As an example, consider $X\sim\text{Chi-Squared}$ with $k$ degrees of freedom. Its moment-generating function is $M_X(t) = (1-2t)^{-k/2}$ for $t < 1/2$. Picking $t=m/(2m+k)$ and substituting into the bound:

$$\operatorname{E}[X^m] \le (1+2m/k)^{k/2} e^{-m} (k+2m)^m.$$

In this case the exact value is known: $\operatorname{E}[X^m] = 2^m \Gamma(m+k/2)/\Gamma(k/2)$. To compare the two, we can consider the asymptotics for large $k$. Here the moment-generating function bound is $k^m(1+m^2/k + O(1/k^2))$, whereas the exact value is $k^m(1+(m^2-m)/k + O(1/k^2))$. The moment-generating function bound is thus very strong in this case.
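Both bounds in this section can be compared against exact values numerically. The sketch below uses only the Python standard library: `math.erfc` gives the exact standard normal tail, and `math.lgamma` gives the exact chi-squared moment. The function names and the particular values of $a$, $m$ and $k$ are illustrative assumptions.

```python
import math


def chernoff_normal(a, t):
    """Chernoff bound e^{-at} M_X(t) for a standard normal X, with M_X(t) = e^{t^2/2}."""
    return math.exp(-a * t + 0.5 * t * t)


def normal_tail(a):
    """Exact Pr(X >= a) for a standard normal X."""
    return 0.5 * math.erfc(a / math.sqrt(2.0))


def chi2_moment_bound(m, k):
    """MGF-based bound (1 + 2m/k)^{k/2} e^{-m} (k + 2m)^m on E[X^m]."""
    return (1 + 2 * m / k) ** (k / 2) * math.exp(-m) * (k + 2 * m) ** m


def chi2_moment_exact(m, k):
    """Exact E[X^m] = 2^m Gamma(m + k/2) / Gamma(k/2), via log-gamma for stability."""
    return math.exp(m * math.log(2.0) + math.lgamma(m + k / 2) - math.lgamma(k / 2))


# Chernoff bound with t = a versus the exact tail probability.
for a in (0.5, 1.0, 2.0, 3.0):
    print(a, normal_tail(a), chernoff_normal(a, a))

# Moment bound versus the exact third moment as k grows.
m = 3
for k in (2, 10, 100, 1000):
    print(k, chi2_moment_exact(m, k), chi2_moment_bound(m, k))
```

The choice $t = a$ minimizes $e^{-at + t^2/2}$ over $t$, which is why it is the natural choice in the normal example. In the chi-squared comparison the ratio of bound to exact value approaches 1 as $k$ increases, in line with the asymptotic expansions above.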

Relation to other functions

Related to the moment-generating function are a number of other transforms that are common in probability theory:

* Characteristic function: The characteristic function $\varphi_X(t)$ is related to the moment-generating function via $\varphi_X(t) = M_{iX}(t) = M_X(it)$: the characteristic function is the moment-generating function of $iX$, or the moment generating function of $X$ evaluated on the imaginary axis. This function can also be viewed as the Fourier transform of the probability density function, which can therefore be deduced from it by inverse Fourier transform.
* Cumulant-generating function: The cumulant-generating function is defined as the logarithm of the moment-generating function; some instead define the cumulant-generating function as the logarithm of the characteristic function, while others call this latter the second cumulant-generating function.
* Probability-generating function: The probability-generating function is defined as $G(z) = \operatorname{E}\left[z^X\right]$. This immediately implies that $G(e^t) = \operatorname{E}\left[e^{tX}\right] = M_X(t)$.
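The relation between the probability-generating and moment-generating functions is easy to confirm for a discrete distribution. The sketch below assumes a binomial distribution with $n = 5$ and $p = 0.4$ (illustrative values) and computes both functions straight from the probability mass function; `math.comb` requires Python 3.8 or later.

```python
import math

N, P = 5, 0.4  # binomial parameters, chosen only for illustration


def pmf(k):
    """P(X = k) for X ~ Binomial(N, P)."""
    return math.comb(N, k) * P**k * (1 - P) ** (N - k)


def pgf(z):
    """G(z) = E[z^X]."""
    return sum(pmf(k) * z**k for k in range(N + 1))


def mgf(t):
    """M_X(t) = E[e^{tX}]."""
    return sum(pmf(k) * math.exp(t * k) for k in range(N + 1))


t = 0.3
print(pgf(math.exp(t)), mgf(t))  # G(e^t) and M_X(t) should agree
```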

