Generalized Chi-square Distribution

In probability theory and statistics, the generalized chi-squared distribution (or generalized chi-square distribution) is the distribution of a quadratic form of a multinormal variable (normal vector), or a linear combination of different normal variables and squares of normal variables. Equivalently, it is also a linear sum of independent noncentral chi-square variables and a normal variable. There are several other such generalizations for which the same term is sometimes used; some of them are special cases of the family discussed here, for example the gamma distribution.


Definition

The generalized chi-squared variable may be described in multiple ways. One is to write it as a linear sum of independent noncentral chi-square variables and a normal variable (Davies, R.B. (1973) "Numerical inversion of a characteristic function", ''Biometrika'', 60 (2), 415–417; Davies, R.B. (1980) "Algorithm AS155: The distribution of a linear combination of ''χ''² random variables", ''Applied Statistics'', 29, 323–333):

:\xi=\sum_i w_i y_i + x, \quad y_i \sim \chi'^2(k_i,\lambda_i), \quad x \sim N(m,s^2).

Here the parameters are the weights w_i, the degrees of freedom k_i and non-centralities \lambda_i of the constituent chi-squares, and the normal parameters m and s. Some important special cases have all weights w_i of the same sign, have central chi-squared components, or omit the normal term.

Since a non-central chi-squared variable is a sum of squares of normal variables with different means, the generalized chi-square variable is also defined as a sum of squares of independent normal variables, plus an independent normal variable: that is, a quadratic in normal variables.

Another equivalent way is to formulate it as a quadratic form of a normal vector \boldsymbol{z}:

:\xi = q(\boldsymbol{z}) = \boldsymbol{z}' \mathbf{Q} \boldsymbol{z} + \boldsymbol{q}' \boldsymbol{z} + q_0.

Here \mathbf{Q} is a matrix, \boldsymbol{q} is a vector, and q_0 is a scalar. These, together with the mean \boldsymbol{\mu} and covariance matrix \mathbf{\Sigma} of the normal vector \boldsymbol{z}, parameterize the distribution. The parameters of the former expression (in terms of non-central chi-squares, a normal and a constant) can be calculated in terms of the parameters of the latter expression (quadratic form of a normal vector). If (and only if) \mathbf{Q} in this formulation is positive-definite, then all the w_i in the first formulation will have the same sign.

For the most general case, a reduction towards a common standard form can be made by using a representation of the following form (Sheil, J., O'Muircheartaigh, I. (1977) "Algorithm AS106: The distribution of non-negative quadratic forms in normal variables", ''Applied Statistics'', 26, 92–98):

:X=(z+a)^\mathrm{T} A(z+a)+c^\mathrm{T} z = (x+b)^\mathrm{T} D(x+b)+d^\mathrm{T} x+e,

where ''D'' is a diagonal matrix and where ''x'' represents a vector of uncorrelated standard normal random variables.
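As a numerical illustration of the first definition, the following Python sketch (the parameter values are arbitrary choices for illustration, not from any source) draws Monte Carlo samples of \xi and checks them against the closed-form moments E[\xi] = \sum_i w_i (k_i + \lambda_i) + m and \operatorname{Var}[\xi] = \sum_i 2 w_i^2 (k_i + 2\lambda_i) + s^2, which follow from the moments of the noncentral chi-square:

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear-sum form: xi = sum_i w_i * y_i + x, with
# y_i ~ noncentral chi-square(k_i, lambda_i) and x ~ N(m, s^2).
w = np.array([1.0, -2.0, 0.5])    # weights, mixed signs allowed
k = np.array([1.0, 2.0, 3.0])     # degrees of freedom
lam = np.array([0.5, 1.0, 0.25])  # non-centralities
m, s = 1.0, 2.0                   # parameters of the normal term

n = 1_000_000
y = rng.noncentral_chisquare(k, lam, size=(n, 3))
x = rng.normal(m, s, size=n)
xi = y @ w + x

# Closed-form moments, from E[y_i] = k_i + lambda_i and
# Var[y_i] = 2 (k_i + 2 lambda_i):
mean_th = w @ (k + lam) + m
var_th = (w**2) @ (2 * (k + 2 * lam)) + s**2

print(mean_th, xi.mean())  # the two means should agree closely
print(var_th, xi.var())    # likewise the variances
```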


Computing the pdf/cdf/inverse cdf/random numbers

The probability density, cumulative distribution, and inverse cumulative distribution functions of a generalized chi-squared variable do not have simple closed-form expressions. However, numerical algorithms and computer code (Fortran and C, MATLAB, Python) have been published to evaluate some of these, and to generate random samples. When the quadratic form has no linear term, exact expressions for the mean and variance of \xi can be obtained, as shown in the article on quadratic forms.
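The published routines perform careful numerical inversion of the characteristic function. A minimal, unoptimized sketch of the same idea, via the Gil-Pelaez inversion formula in Python with SciPy (the function name gx2_cdf and the parameter choices are mine, not from the published code), is:

```python
import numpy as np
from scipy import integrate, stats

def gx2_cdf(x, w, k, lam, m=0.0, s=0.0):
    """CDF of xi = sum_i w_i chi'^2(k_i, lam_i) + N(m, s^2),
    by Gil-Pelaez inversion of the characteristic function."""
    w, k, lam = map(np.asarray, (w, k, lam))

    def phi(u):
        # characteristic function of xi at u
        t = 1 - 2j * w * u
        cf = np.prod(t ** (-k / 2) * np.exp(1j * lam * w * u / t))
        return cf * np.exp(1j * m * u - 0.5 * (s * u) ** 2)

    integrand = lambda u: (phi(u) * np.exp(-1j * u * x)).imag / u
    val, _ = integrate.quad(integrand, 0, np.inf, limit=500)
    return 0.5 - val / np.pi

# Sanity checks against known special cases:
print(gx2_cdf(2.0, [1], [4], [0]), stats.chi2.cdf(2.0, 4))   # plain chi-square
print(gx2_cdf(2.0, [0], [1], [0], m=1.0, s=2.0),
      stats.norm.cdf(2.0, 1.0, 2.0))                         # pure normal term
```

Production code such as Davies's algorithm bounds the truncation and discretization errors explicitly; naive quadrature as above can struggle when the characteristic function decays slowly.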


Applications

The generalized chi-squared is the distribution of statistical estimates in cases where the usual statistical theory does not hold, as in the examples below.


In model fitting and selection

If a predictive model is fitted by least squares, but the residuals have either autocorrelation or heteroscedasticity, then alternative models can be compared (in model selection) by relating changes in the sum of squares to an asymptotically valid generalized chi-squared distribution (Jones, D.A. (1983) "Statistical analysis of empirical models fitted by optimisation", ''Biometrika'', 70 (1), 67–88).


Classifying normal vectors using Gaussian discriminant analysis

If \boldsymbol{z} is a normal vector, its log likelihood is a quadratic form of \boldsymbol{z}, and is hence distributed as a generalized chi-squared. The log likelihood ratio that \boldsymbol{z} arises from one normal distribution versus another is also a quadratic form, so distributed as a generalized chi-squared.

In Gaussian discriminant analysis, samples from multinormal distributions are optimally separated by using a quadratic classifier, a boundary that is a quadratic function (e.g. the curve defined by setting the likelihood ratio between two Gaussians to 1). The classification error rates of different types (false positives and false negatives) are integrals of the normal distributions within the quadratic regions defined by this classifier. Since this is mathematically equivalent to integrating a quadratic form of a normal vector, the result is an integral of a generalized chi-squared variable.
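As an illustration, the following Python sketch (the two class distributions are arbitrary choices of mine) estimates the two error rates of such a quadratic classifier by Monte Carlo, using the log likelihood ratio directly as the decision statistic:

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)

# Two illustrative 2-D Gaussian classes with unequal covariances,
# so the optimal decision boundary is quadratic.
mu_a, cov_a = np.array([0.0, 0.0]), np.array([[1.0, 0.3], [0.3, 1.0]])
mu_b, cov_b = np.array([2.0, 1.0]), np.array([[2.0, -0.4], [-0.4, 0.5]])
pa = multivariate_normal(mu_a, cov_a)
pb = multivariate_normal(mu_b, cov_b)

n = 200_000
xa = pa.rvs(n, random_state=rng)   # samples truly from class A
xb = pb.rvs(n, random_state=rng)   # samples truly from class B

# Log likelihood ratio: a quadratic form of the sample, hence a
# generalized chi-squared variable under either class.
llr_a = pb.logpdf(xa) - pa.logpdf(xa)
llr_b = pb.logpdf(xb) - pa.logpdf(xb)

# Decision rule: assign class B when the likelihood ratio exceeds 1.
false_pos = (llr_a > 0).mean()    # A samples classified as B
false_neg = (llr_b <= 0).mean()   # B samples classified as A
print(false_pos, false_neg)
```

Each error rate is the probability that a generalized chi-squared variable (the log likelihood ratio under the true class) falls on the wrong side of zero, which exact methods compute by integrating its density.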


In signal processing

The following application arises in the context of Fourier analysis in signal processing, renewal theory in probability theory, and multi-antenna (MIMO) systems in wireless communication. The common factor of these areas is that the sum of exponentially distributed variables is of importance (or identically, the sum of squared magnitudes of circularly-symmetric centered complex Gaussian variables).

If Z_i are ''k'' independent, circularly-symmetric centered complex Gaussian random variables with mean 0 and variance \sigma_i^2, then the random variable

:\tilde{y} = \sum_{i=1}^k |Z_i|^2

has a generalized chi-squared distribution of a particular form. The difference from the standard chi-squared distribution is that the Z_i are complex and can have different variances, and the difference from the more general generalized chi-squared distribution is that the relevant scaling matrix ''A'' is diagonal. If \mu = \sigma_i^2 for all ''i'', then \tilde{y}, scaled down by \mu/2 (i.e. multiplied by 2/\mu), has a chi-squared distribution \chi^2(2k), also known as an Erlang distribution.

If the \sigma_i^2 have distinct values for all ''i'', then \tilde{y} has the pdf

: f(x; k, \sigma_1^2, \ldots, \sigma_k^2) = \sum_{i=1}^k \frac{e^{-\frac{x}{\sigma_i^2}}}{\sigma_i^2 \prod_{j=1, j \neq i}^k \left(1 - \frac{\sigma_j^2}{\sigma_i^2}\right)}, \quad \text{for } x \geq 0.

If there are sets of repeated variances among the \sigma_i^2, assume that they are divided into ''M'' sets, each representing a certain variance value. Denote \mathbf{r} = (r_1, r_2, \dots, r_M) to be the number of repetitions in each group. That is, the ''m''th set contains r_m variables that have variance \sigma_m^2. It represents an arbitrary linear combination of independent \chi^2-distributed random variables with different degrees of freedom:

:\tilde{y} = \sum_{m=1}^M \frac{\sigma_m^2}{2} Q_m, \quad Q_m \sim \chi^2(2 r_m) \, .

The pdf of \tilde{y} is (E. Björnson, D. Hammarwall, B. Ottersten (2009) "Exploiting Quantized Channel Norm Feedback through Conditional Statistics in Arbitrarily Correlated MIMO Systems", ''IEEE Transactions on Signal Processing'', 57, 4027–4041)

: f(x; \mathbf{r}, \sigma_1^2, \dots, \sigma_M^2) = \prod_{m=1}^M \frac{1}{\sigma_m^{2 r_m}} \sum_{k=1}^M \sum_{l=1}^{r_k} \frac{\Psi_{k,l,\mathbf{r}}}{(r_k - l)!} (-x)^{r_k - l} e^{-\frac{x}{\sigma_k^2}}, \quad \text{for } x \geq 0,

where

:\Psi_{k,l,\mathbf{r}} = (-1)^{r_k - l} \sum_{\mathbf{i} \in \Omega_{k,l}} \prod_{j \neq k} \binom{i_j + r_j - 1}{i_j} \left(\frac{1}{\sigma_j^2} - \frac{1}{\sigma_k^2}\right)^{-(r_j + i_j)},

with \mathbf{i} = [i_1, \ldots, i_M]^\mathrm{T} from the set \Omega_{k,l} of all partitions of l-1 (with i_k = 0) defined as

: \Omega_{k,l} = \left\{ [i_1, \ldots, i_M] \in \mathbb{Z}^M : \sum_{j=1}^M i_j = l - 1,\ i_k = 0,\ i_j \geq 0 \text{ for all } j \right\}.
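The distinct-variance pdf above is the hypoexponential density with rates 1/\sigma_i^2. A quick Python check (the function name pdf_distinct and the variance values are my own, for illustration) that the closed form matches a direct simulation of \tilde{y}, using the fact that |Z_i|^2 is exponential with mean \sigma_i^2:

```python
import numpy as np
from scipy import integrate

rng = np.random.default_rng(2)

# Distinct variances (illustrative values)
sig2 = np.array([1.0, 2.0, 3.5])

def pdf_distinct(x, sig2):
    """Closed-form pdf of sum_i |Z_i|^2 for distinct variances sig2, x >= 0."""
    total = 0.0
    for i, s2 in enumerate(sig2):
        denom = s2 * np.prod([1 - sj / s2 for j, sj in enumerate(sig2) if j != i])
        total += np.exp(-x / s2) / denom
    return total

# |Z_i|^2 is exponential with mean sigma_i^2, so the sum is easy to simulate.
samples = rng.exponential(sig2, size=(500_000, len(sig2))).sum(axis=1)

x0 = 5.0
cdf_formula, _ = integrate.quad(pdf_distinct, 0, x0, args=(sig2,))
cdf_mc = (samples <= x0).mean()
print(cdf_formula, cdf_mc)   # the two CDF values should agree closely
```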


See also

* Degrees of freedom (statistics)#Alternative
* Noncentral chi-squared distribution
* Chi-squared distribution


References


External links


Davies, R.B.: Fortran and C source code for "Linear combination of chi-squared random variables"

Das, A: MATLAB code to compute the statistics, pdf, cdf, inverse cdf and random numbers of the generalized chi-square distribution.