The scaled inverse chi-squared distribution is the distribution for ''x'' = 1/''s''², where ''s''² is a sample mean of the squares of ν independent normal random variables that have mean 0 and inverse variance 1/σ² = τ². The distribution is therefore parametrised by the two quantities ν and τ², referred to as the ''number of chi-squared degrees of freedom'' and the ''scaling parameter'', respectively.
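This construction can be checked by simulation. The sketch below (parameter values chosen only for illustration) draws ''x'' = 1/''s''² repeatedly and compares the empirical mean with the theoretical mean ντ²/(ν − 2) of the scaled inverse chi-squared distribution, which holds for ν > 2.

```python
import random

# Monte Carlo check of the construction above (nu, sigma chosen for illustration):
# s^2 is the mean of nu squared draws from N(0, sigma^2), and x = 1/s^2.
random.seed(42)
nu = 10
sigma = 2.0                      # so tau^2 = 1/sigma^2 = 0.25
tau2 = 1.0 / sigma ** 2

def draw_x():
    s2 = sum(random.gauss(0.0, sigma) ** 2 for _ in range(nu)) / nu
    return 1.0 / s2

n_reps = 100_000
empirical_mean = sum(draw_x() for _ in range(n_reps)) / n_reps
theoretical_mean = nu * tau2 / (nu - 2)   # mean of Scale-inv-chi2(nu, tau2) for nu > 2
print(empirical_mean, theoretical_mean)
```

With these values the theoretical mean is 10 × 0.25 / 8 = 0.3125, and the empirical mean agrees to well under a percent at this sample size.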
This family of scaled inverse chi-squared distributions is closely related to two other distribution families: the inverse-chi-squared distribution and the inverse-gamma distribution. Compared to the inverse-chi-squared distribution, the scaled distribution has an extra parameter τ², which scales the distribution horizontally and vertically, and which represents the inverse-variance of the original underlying process. Also, the scaled inverse chi-squared distribution is presented as the distribution for the inverse of the ''mean'' of ν squared deviates, rather than the inverse of their ''sum''. The two distributions thus have the relation that if
:X \sim \mbox{Scale-inv-}\chi^2(\nu, \tau^2)
then
:\frac{X}{\tau^2 \nu} \sim \mbox{inv-}\chi^2(\nu)
Compared to the inverse gamma distribution, the scaled inverse chi-squared distribution describes the same data distribution, but using a different
parametrization, which may be more convenient in some circumstances. Specifically, if
:X \sim \mbox{Scale-inv-}\chi^2(\nu, \tau^2)
then
:X \sim \textrm{Inv-Gamma}\!\left(\frac{\nu}{2}, \frac{\nu\tau^2}{2}\right)
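The equivalence of the two parametrizations can be confirmed numerically. The sketch below (ν and τ² chosen arbitrarily) evaluates both densities, written out from their standard formulas, at a few points and checks that they coincide.

```python
import math

# Check that Scale-inv-chi2(nu, tau2) and Inv-Gamma(alpha = nu/2, beta = nu*tau2/2)
# have the same probability density (values of nu, tau2 are arbitrary).
def scale_inv_chi2_pdf(x, nu, tau2):
    return ((tau2 * nu / 2) ** (nu / 2) / math.gamma(nu / 2)
            * math.exp(-nu * tau2 / (2 * x)) / x ** (1 + nu / 2))

def inv_gamma_pdf(x, alpha, beta):
    return beta ** alpha / math.gamma(alpha) * x ** (-alpha - 1) * math.exp(-beta / x)

nu, tau2 = 5, 0.8
for x in (0.1, 0.5, 1.0, 3.0):
    a = scale_inv_chi2_pdf(x, nu, tau2)
    b = inv_gamma_pdf(x, nu / 2, nu * tau2 / 2)
    assert math.isclose(a, b, rel_tol=1e-9)
print("densities agree")
```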
Either form may be used to represent the maximum entropy distribution for a fixed first inverse moment E(1/X) and first logarithmic moment E(ln X).
The scaled inverse chi-squared distribution also has a particular use in Bayesian statistics, somewhat unrelated to its use as a predictive distribution for ''x'' = 1/''s''². Specifically, the scaled inverse chi-squared distribution can be used as a conjugate prior for the variance parameter of a normal distribution. In this context the scaling parameter is denoted by σ₀² rather than by τ², and has a different interpretation. The application has more usually been presented using the inverse-gamma distribution formulation instead; however, some authors, following in particular Gelman ''et al.'' (1995/2004), argue that the inverse chi-squared parametrisation is more intuitive.
Characterization
The probability density function of the scaled inverse chi-squared distribution extends over the domain x > 0 and is
:f(x; \nu, \tau^2) = \frac{(\tau^2\nu/2)^{\nu/2}}{\Gamma(\nu/2)}\, \frac{\exp\left(\frac{-\nu\tau^2}{2x}\right)}{x^{1+\nu/2}}
where \nu is the degrees of freedom parameter and \tau^2 is the scale parameter. The cumulative distribution function is
:F(x; \nu, \tau^2) = \Gamma\!\left(\frac{\nu}{2}, \frac{\tau^2\nu}{2x}\right) \Big/\, \Gamma\!\left(\frac{\nu}{2}\right)
:= Q\!\left(\frac{\nu}{2}, \frac{\tau^2\nu}{2x}\right)
where \Gamma(a, x) is the incomplete gamma function, \Gamma(a) is the gamma function and Q(a, x) is a regularized gamma function. The characteristic function is
:\varphi(t; \nu, \tau^2) = \frac{2}{\Gamma\left(\frac{\nu}{2}\right)} \left(\frac{-i\tau^2\nu t}{2}\right)^{\frac{\nu}{4}} K_{\frac{\nu}{2}}\!\left(\sqrt{-2i\tau^2\nu t}\right)
where K_{\frac{\nu}{2}}(z) is the modified Bessel function of the second kind.
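These formulas can be exercised numerically. For ν = 6, ν/2 = 3 is an integer, so the regularized upper incomplete gamma function has the closed form Q(3, z) = e^(−z)(1 + z + z²/2); the sketch below (parameters assumed for illustration) checks the resulting CDF against simulated draws, using the fact that a Scale-inv-χ²(ν, τ²) variate is ντ² divided by a chi-squared variate with ν degrees of freedom.

```python
import math
import random

# Sketch with assumed parameters: nu = 6 makes nu/2 = 3 an integer, so
# Q(3, z) = exp(-z) * (1 + z + z^2/2) and F(x) = Q(nu/2, nu*tau2/(2x)).
nu, tau2 = 6, 1.0

def cdf(x):
    z = nu * tau2 / (2 * x)
    return math.exp(-z) * (1 + z + z * z / 2)   # Q(3, z)

random.seed(0)
# A Scale-inv-chi2(nu, tau2) draw is nu*tau2 divided by a chi-squared(nu) draw.
draws = [nu * tau2 / sum(random.gauss(0.0, 1.0) ** 2 for _ in range(nu))
         for _ in range(100_000)]

for x in (0.5, 1.0, 2.0):
    frac = sum(d <= x for d in draws) / len(draws)
    assert abs(frac - cdf(x)) < 0.01
print("CDF matches simulation")
```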
Parameter estimation
The maximum likelihood estimate of \tau^2 is
:\hat{\tau}^2 = \frac{n}{\sum_{i=1}^n \frac{1}{x_i}},
the harmonic mean of the observations.
The maximum likelihood estimate of \frac{\nu}{2} can be found using Newton's method on:
:\ln\!\left(\frac{\nu}{2}\right) - \psi\!\left(\frac{\nu}{2}\right) = \frac{1}{n}\sum_{i=1}^n \ln(x_i) - \ln\!\left(\hat{\tau}^2\right)
where \psi(x) is the digamma function. An initial estimate can be found by taking the formula for the mean and solving it for \nu. Let \bar{x} = \frac{1}{n}\sum_{i=1}^n x_i be the sample mean. Then an initial estimate for \nu is given by:
:\frac{\nu}{2} = \frac{\bar{x}}{\bar{x} - \hat{\tau}^2}
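A minimal sketch of this estimation procedure on synthetic data, assuming true parameters ν = 8 and τ² = 0.5 and a sample size chosen for illustration (the digamma approximation below is a standard asymptotic series, not taken from the source):

```python
import math
import random

def digamma(x):
    # Asymptotic series with upward recurrence; adequate accuracy for x > 0 here.
    r = 0.0
    while x < 6.0:
        r -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    return r + math.log(x) - 0.5 / x - f * (1/12 - f * (1/120 - f / 252))

# Synthetic data: a Scale-inv-chi2(nu, tau2) draw is nu*tau2 / chi-squared(nu).
random.seed(1)
nu_true, tau2_true, n = 8, 0.5, 20_000
xs = [nu_true * tau2_true / sum(random.gauss(0, 1) ** 2 for _ in range(nu_true))
      for _ in range(n)]

tau2_hat = n / sum(1.0 / x for x in xs)                 # harmonic-mean MLE of tau^2
c = sum(math.log(x) for x in xs) / n - math.log(tau2_hat)

def g(h):
    return math.log(h) - digamma(h) - c                 # root gives h = nu/2

xbar = sum(xs) / n
h = xbar / (xbar - tau2_hat)                            # moment-based initial estimate
for _ in range(30):                                     # Newton iterations
    d = 1e-5
    h_new = h - g(h) / ((g(h + d) - g(h - d)) / (2 * d))
    h = h_new if h_new > 0 else h / 2                   # guard against overshoot
nu_hat = 2 * h
print(round(nu_hat, 2), round(tau2_hat, 3))
```

The numerical central difference stands in for the trigamma function in the Newton step; with an analytic trigamma the iteration would be the textbook form.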
Bayesian estimation of the variance of a normal distribution
The scaled inverse chi-squared distribution has a second important application, in the Bayesian estimation of the variance of a Normal distribution.
According to Bayes' theorem, the posterior probability distribution for quantities of interest is proportional to the product of a prior distribution for the quantities and a likelihood function:
:p(\sigma^2 \mid D, I) \propto p(\sigma^2 \mid I)\; p(D \mid \sigma^2, I)
where ''D'' represents the data and ''I'' represents any initial information about σ² that we may already have.
The simplest scenario arises if the mean μ is already known; or, alternatively, if it is the conditional distribution of σ² that is sought, for a particular assumed value of μ. Then the likelihood term ''L''(σ² | ''D'') = ''p''(''D'' | σ²) has the familiar form
:\mathcal{L}(\sigma^2 \mid D, \mu) = \frac{1}{\left(\sqrt{2\pi}\,\sigma\right)^n} \exp\left(-\frac{\sum_{i=1}^n (x_i - \mu)^2}{2\sigma^2}\right)
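Combining this likelihood with a Scale-inv-χ²(ν₀, σ₀²) prior yields the standard conjugate update: the posterior is Scale-inv-χ²(ν₀ + n, (ν₀σ₀² + SS)/(ν₀ + n)), where SS = Σᵢ(xᵢ − μ)². A sketch with toy numbers (all values illustrative):

```python
# Conjugate update for the variance of a normal with known mean mu
# (standard result; prior, mean, and data values are toy numbers):
#   prior      sigma^2 ~ Scale-inv-chi2(nu0, s0sq)
#   posterior  sigma^2 ~ Scale-inv-chi2(nu0 + n, (nu0*s0sq + SS) / (nu0 + n))
nu0, s0sq = 4, 2.0          # prior: worth 4 pseudo-observations of variance 2
mu = 1.0                    # known mean
data = [0.2, 1.9, 2.4, -0.5, 1.1, 0.8]

ss = sum((x - mu) ** 2 for x in data)   # sum of squared deviations from mu
nu_post = nu0 + len(data)
s_post = (nu0 * s0sq + ss) / nu_post
print(nu_post, round(s_post, 4))        # -> 10 1.371
```

The update has a natural reading: the prior acts like ν₀ earlier observations with average squared deviation σ₀², and the posterior scale is the pooled average squared deviation over prior and data.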