Confidence Region

In statistics, a confidence region is a multi-dimensional generalization of a confidence interval. It is a set of points in an ''n''-dimensional space, often represented as an ellipsoid around a point which is an estimated solution to a problem, although other shapes can occur.


Interpretation

The confidence region is calculated in such a way that, if a set of measurements were repeated many times and a confidence region were calculated in the same way on each set of measurements, then a certain percentage of the time (e.g. 95%) the confidence region would include the point representing the "true" values of the set of variables being estimated. However, unless certain assumptions about prior probabilities are made, this does not mean that, once one particular confidence region has been calculated, there is a 95% probability that the "true" values lie inside it: no particular probability distribution is assumed for the "true" values, and we may or may not have other information about where they are likely to lie.
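The repeated-sampling reading can be checked numerically. The following is a minimal sketch, not part of the original text, under several assumptions chosen purely for illustration: a straight-line model with ''p'' = 2 parameters, ten fixed design points, made-up true parameter values and noise level, and the ellipsoidal region for independent normal errors that is described in the next section. For ''p'' = 2 the ''F'' quantile has a closed form, so only the standard library is needed. All function names are hypothetical.

```python
import random

def f_quantile_2(alpha, nu):
    """Exact 1 - alpha quantile of the F-distribution with (2, nu) degrees of freedom."""
    return (nu / 2.0) * (alpha ** (-2.0 / nu) - 1.0)

def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1*x; returns the estimates, s^2 and X'X."""
    n = len(xs)
    sx, sxx = sum(xs), sum(x * x for x in xs)
    sy, sxy = sum(ys), sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    rss = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))
    s2 = rss / (n - 2)              # unbiased estimate of sigma^2 (p = 2)
    xtx = ((n, sx), (sx, sxx))      # X'X for the design matrix with columns [1, x]
    return (b0, b1), s2, xtx

def region_contains(point, estimate, s2, xtx, n, alpha):
    """True if `point` satisfies (est - b)' X'X (est - b) <= p s^2 F, with p = 2."""
    d0 = estimate[0] - point[0]
    d1 = estimate[1] - point[1]
    quad = xtx[0][0] * d0 * d0 + 2 * xtx[0][1] * d0 * d1 + xtx[1][1] * d1 * d1
    return quad <= 2 * s2 * f_quantile_2(alpha, n - 2)

def coverage(trials=2000, alpha=0.05, seed=0):
    """Fraction of simulated data sets whose region contains the true parameters."""
    rng = random.Random(seed)
    xs = [float(i) for i in range(10)]   # assumed design points
    true_beta = (1.0, 0.5)               # assumed "true" values
    hits = 0
    for _ in range(trials):
        ys = [true_beta[0] + true_beta[1] * x + rng.gauss(0.0, 2.0) for x in xs]
        estimate, s2, xtx = fit_line(xs, ys)
        hits += region_contains(true_beta, estimate, s2, xtx, len(xs), alpha)
    return hits / trials

print(coverage())
```

With ''α'' = 0.05 the printed fraction should be close to 0.95, up to simulation noise. The statement concerns the procedure over many repetitions, not any single computed region.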


The case of independent, identically normally-distributed errors

Suppose we have found a solution \hat{\boldsymbol{\beta}} to the following overdetermined problem:
:\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}
where \mathbf{Y} is an ''n''-dimensional column vector containing observed values of the dependent variable, \mathbf{X} is an ''n''-by-''p'' matrix of observed values of independent variables (which can represent a physical model) and is assumed to be known exactly, \boldsymbol{\beta} is a column vector containing the ''p'' parameters which are to be estimated, and \boldsymbol{\varepsilon} is an ''n''-dimensional column vector of errors which are assumed to be independently distributed with normal distributions with zero mean and each having the same unknown variance \sigma^2.

A joint 100(1 − ''α'') % confidence region for the elements of \boldsymbol{\beta} is represented by the set of values of the vector \mathbf{b} which satisfy the following inequality:
: (\hat{\boldsymbol{\beta}} - \mathbf{b})^\prime \mathbf{X}^\prime\mathbf{X}(\hat{\boldsymbol{\beta}} - \mathbf{b}) \le p s^2 F_{1-\alpha}(p,\nu) ,
where the variable \mathbf{b} represents any point in the confidence region, ''p'' is the number of parameters, i.e. the number of elements of the vector \boldsymbol{\beta}, \hat{\boldsymbol{\beta}} is the vector of estimated parameters, and s^2 is the reduced chi-squared, an unbiased estimate of \sigma^2 equal to
:s^2 = \frac{\hat{\boldsymbol{\varepsilon}}^\prime\hat{\boldsymbol{\varepsilon}}}{n - p} ,
with \hat{\boldsymbol{\varepsilon}} = \mathbf{Y} - \mathbf{X}\hat{\boldsymbol{\beta}} the vector of residuals. Further, F_{1-\alpha}(p,\nu) is the quantile function of the ''F''-distribution with ''p'' and \nu = n - p degrees of freedom, evaluated at 1 - \alpha, where \alpha is the statistical significance level, and the symbol \mathbf{X}^\prime means the transpose of \mathbf{X}.

The expression can be rewritten as:
: (\hat{\boldsymbol{\beta}} - \mathbf{b})^\prime \mathbf{C}_{\boldsymbol{\beta}}^{-1} (\hat{\boldsymbol{\beta}} - \mathbf{b}) \le p F_{1-\alpha}(p,\nu) ,
where \mathbf{C}_{\boldsymbol{\beta}} = s^2 \left( \mathbf{X}^\prime\mathbf{X} \right)^{-1} is the least-squares scaled covariance matrix of \hat{\boldsymbol{\beta}}.

The above inequality defines an ellipsoidal region in the ''p''-dimensional Cartesian parameter space \mathbb{R}^p. The centre of the ellipsoid is at the estimate \hat{\boldsymbol{\beta}}. According to Press et al., it is easier to plot the ellipsoid after taking the singular value decomposition of \mathbf{X}. The lengths of the axes of the ellipsoid are proportional to the reciprocals of the singular values (the diagonal entries of the diagonal matrix), and the directions of these axes are given by the rows of the third matrix of the decomposition.
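The singular value decomposition recipe can be sketched as follows. This is an illustration rather than the procedure of Press et al.: it assumes NumPy, a made-up six-point data set for a straight-line model, and a critical value supplied by the caller. The function name `confidence_ellipsoid` is hypothetical.

```python
import numpy as np

def confidence_ellipsoid(X, Y, f_crit):
    """Centre, axis directions, semi-axis lengths and squared radius of the region
    (beta_hat - b)' X'X (beta_hat - b) <= p * s^2 * f_crit."""
    n, p = X.shape
    # Thin SVD: X = U @ diag(w) @ Vt, hence X'X = Vt.T @ diag(w**2) @ Vt.
    U, w, Vt = np.linalg.svd(X, full_matrices=False)
    beta_hat = Vt.T @ ((U.T @ Y) / w)     # least-squares solution via the SVD
    resid = Y - X @ beta_hat
    s2 = resid @ resid / (n - p)          # unbiased estimate of sigma^2
    radius2 = p * s2 * f_crit             # right-hand side of the inequality
    semi_axes = np.sqrt(radius2) / w      # proportional to 1 / singular value
    return beta_hat, Vt, semi_axes, radius2   # rows of Vt are the axis directions

# Assumed example data: y = b0 + b1*x observed at six points.
x = np.arange(6.0)
X = np.column_stack([np.ones_like(x), x])
Y = np.array([1.1, 1.4, 2.3, 2.4, 3.2, 3.4])
f_crit = 6.944                            # 0.95 quantile of F(2, 4), from tables
beta_hat, Vt, semi_axes, radius2 = confidence_ellipsoid(X, Y, f_crit)

# Points on the boundary, one at the end of each principal axis.
axis_ends = [beta_hat + semi_axes[i] * Vt[i] for i in range(len(semi_axes))]
print(beta_hat, semi_axes)
```

Passing the critical value in keeps the sketch free of any statistics library. If SciPy is available, `scipy.stats.f.ppf(1 - alpha, p, n - p)` gives the same quantity for other choices of ''α'', ''p'' and ''n''.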


Weighted and generalised least squares

Now consider the more general case where some distinct elements of \boldsymbol{\varepsilon} have known nonzero covariance (in other words, the errors in the observations are not independently distributed), and/or the standard deviations of the errors are not all equal. Suppose the covariance matrix of \boldsymbol{\varepsilon} is \mathbf{V}\sigma^2, where \mathbf{V} is an ''n''-by-''n'' nonsingular matrix. In the more specific case handled in the previous section, \mathbf{V} was equal to the identity matrix \mathbf{I}; here it is allowed to have nonzero off-diagonal elements, representing the covariance of pairs of individual observations, and its diagonal elements need not all be equal.

It is possible to find a nonsingular symmetric matrix \mathbf{P} such that
:\mathbf{P}^\prime\mathbf{P} = \mathbf{P}\mathbf{P} = \mathbf{V}
In effect, \mathbf{P} is a square root of the matrix \mathbf{V}.

The least-squares problem
:\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}
can then be transformed by left-multiplying each term by the inverse of \mathbf{P}, forming the new problem formulation
:\mathbf{Z} = \mathbf{Q}\boldsymbol{\beta} + \mathbf{f} ,
where
:\mathbf{Z} = \mathbf{P}^{-1}\mathbf{Y}
:\mathbf{Q} = \mathbf{P}^{-1}\mathbf{X}
and
:\mathbf{f} = \mathbf{P}^{-1}\boldsymbol{\varepsilon}
The transformed errors \mathbf{f} have covariance matrix \mathbf{I}\sigma^2, so the analysis of the previous section applies to the transformed problem. A joint confidence region for the parameters, i.e. for the elements of \boldsymbol{\beta}, is then bounded by the ellipsoid given by:
: (\mathbf{b} - \hat{\boldsymbol{\beta}})^\prime \mathbf{Q}^\prime\mathbf{Q}(\mathbf{b} - \hat{\boldsymbol{\beta}}) = \frac{p}{n - p} (\mathbf{Z}^\prime\mathbf{Z} - \hat{\boldsymbol{\beta}}^\prime\mathbf{Q}^\prime\mathbf{Z}) F_{1-\alpha}(p,n-p).
Here \hat{\boldsymbol{\beta}} is the least-squares estimate for the transformed problem, \mathbf{b} is any point on the boundary, and \mathbf{Z}^\prime\mathbf{Z} - \hat{\boldsymbol{\beta}}^\prime\mathbf{Q}^\prime\mathbf{Z} is the residual sum of squares of the transformed problem, so the right-hand side equals p s^2 F_{1-\alpha}(p,n-p) with s^2 computed from the transformed residuals. F_{1-\alpha}(p,n-p) is the 100(1 − ''α'') percentage point of the ''F''-distribution, whose parameters are the degrees of freedom ''p'' and ''n'' − ''p''.
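The transformation by the inverse of P can be sketched numerically. This is a minimal illustration that assumes NumPy, a symmetric positive-definite V (so that a real symmetric square root exists), and a made-up covariance structure and data set. The function names are hypothetical, and the critical value is taken from tables.

```python
import numpy as np

def symmetric_sqrt(V):
    """Symmetric matrix P with P @ P = V, for symmetric positive-definite V."""
    eigvals, eigvecs = np.linalg.eigh(V)
    return eigvecs @ np.diag(np.sqrt(eigvals)) @ eigvecs.T

def whiten(X, Y, P):
    """Left-multiply by the inverse of P: returns Q = P^-1 X and Z = P^-1 Y."""
    return np.linalg.solve(P, X), np.linalg.solve(P, Y)

def region_rhs(Q, Z, f_crit):
    """Estimate and right-hand side p/(n-p) * (Z'Z - beta_hat' Q'Z) * F."""
    n, p = Q.shape
    beta_hat = np.linalg.lstsq(Q, Z, rcond=None)[0]
    rss = Z @ Z - beta_hat @ Q.T @ Z      # residual sum of squares, transformed problem
    return beta_hat, p / (n - p) * rss * f_crit

# Assumed example: straight-line model, correlation decaying with separation.
x = np.arange(6.0)
X = np.column_stack([np.ones_like(x), x])
Y = np.array([1.1, 1.4, 2.3, 2.4, 3.2, 3.4])
idx = np.arange(6)
V = 0.5 ** np.abs(idx[:, None] - idx[None, :])

P = symmetric_sqrt(V)
Q, Z = whiten(X, Y, P)
beta_hat, rhs = region_rhs(Q, Z, f_crit=6.944)   # 0.95 quantile of F(2, 4)

def on_or_inside(b):
    """True if b lies inside or on the bounding ellipsoid."""
    d = b - beta_hat
    return d @ Q.T @ Q @ d <= rhs

print(beta_hat, on_or_inside(beta_hat))
```

The symmetric square root mirrors the text. A Cholesky factor of V would serve equally well for whitening, although it is triangular rather than symmetric.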


Nonlinear problems

Confidence regions can be defined for any probability distribution. The experimenter can choose the significance level and the shape of the region, and then the size of the region is determined by the probability distribution. A natural choice is to use as a boundary a set of points with constant \chi^2 (chi-squared) values. One approach is to use a linear approximation to the nonlinear model, which may be a close approximation in the vicinity of the solution, and then apply the analysis for a linear problem to find an approximate confidence region. This may be a reasonable approach if the confidence region is not very large and the second derivatives of the model are also not very large.
Bootstrapping approaches can also be used; see Hutton TJ, Buxton BF, Hammond P, Potts HWW (2003), "Estimating average growth trajectories in shape-space using kernel smoothing", ''IEEE Transactions on Medical Imaging'', 22(6):747-53.
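The linear-approximation approach can be written out as follows. This is a sketch under assumptions not stated in the text: the model is written \mathbf{Y} = \mathbf{f}(\boldsymbol{\theta}) + \boldsymbol{\varepsilon} with ''p'' parameters \boldsymbol{\theta}, the errors are independent and normal with equal variance as in the linear case, and \mathbf{J} denotes the ''n''-by-''p'' Jacobian of \mathbf{f} evaluated at the estimate \hat{\boldsymbol{\theta}}. The symbols \mathbf{f}, \boldsymbol{\theta} and \mathbf{J} are introduced here for illustration only.

```latex
% First-order expansion of the model about the estimate
\mathbf{f}(\boldsymbol{\theta}) \approx \mathbf{f}(\hat{\boldsymbol{\theta}})
  + \mathbf{J}\,(\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}),
\qquad
J_{ij} = \left.\frac{\partial f_i}{\partial \theta_j}\right|_{\boldsymbol{\theta} = \hat{\boldsymbol{\theta}}}

% Approximate region: the linear result with X replaced by J
(\hat{\boldsymbol{\theta}} - \boldsymbol{\theta})^\prime \mathbf{J}^\prime\mathbf{J}\,
(\hat{\boldsymbol{\theta}} - \boldsymbol{\theta}) \le p\, s^2\, F_{1-\alpha}(p, n-p),
\qquad
s^2 = \frac{\left(\mathbf{Y} - \mathbf{f}(\hat{\boldsymbol{\theta}})\right)^\prime
            \left(\mathbf{Y} - \mathbf{f}(\hat{\boldsymbol{\theta}})\right)}{n - p}
```

The Jacobian plays the role that \mathbf{X} plays in the linear case, so the resulting region is again an ellipsoid centred on the estimate. Its coverage is only approximate, and the approximation degrades as the curvature of the model grows.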


See also

* Circular error probable
* Linear regression
* Confidence band
* Credible region


References

* Press, W.H.; Teukolsky, S.A.; Vetterling, W.T.; Flannery, B.P. (1992) [1988]. ''Numerical Recipes in C: The Art of Scientific Computing'' (2nd ed.). Cambridge, UK: Cambridge University Press. https://archive.org/details/numericalrecipes00pres_0