In statistical estimation theory, the coverage probability, or coverage for short, is the probability that a confidence interval or confidence region will include the true value (parameter) of interest. It can be defined as the proportion of instances where the interval surrounds the true value as assessed by long-run frequency.
In statistical prediction, the coverage probability is the probability that a prediction interval will include an out-of-sample value of the random variable. The coverage probability can be defined as the proportion of instances where the interval surrounds an out-of-sample value as assessed by long-run frequency.
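The long-run-frequency reading of prediction coverage can be sketched with a small simulation. The setting below is an assumption chosen for illustration, not something stated in the article: normal data with a known standard deviation `sigma`, an unknown mean, and the interval `xbar ± z * sigma * sqrt(1 + 1/n)`. The function name `prediction_coverage` and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def prediction_coverage(mu=0.0, sigma=1.0, n=10, level=0.90, reps=20000, seed=1):
    """Monte Carlo estimate of prediction-interval coverage.

    Assumed setting (illustrative only): normal data, known sigma, unknown mean.
    """
    rng = random.Random(seed)
    # Two-sided standard normal critical value for the nominal level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # The extra "1 +" accounts for the variability of the future observation.
    half_width = z * sigma * (1 + 1 / n) ** 0.5
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        future = rng.gauss(mu, sigma)  # out-of-sample value
        if abs(future - fmean(sample)) <= half_width:
            hits += 1
    return hits / reps


pred_cov = prediction_coverage()
print(f"estimated prediction coverage: {pred_cov:.3f}")
```

Under these assumptions the interval is exact, so the estimated proportion should land near the nominal 0.90, up to simulation noise.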
Concept
The fixed degree of certainty pre-specified by the analyst, referred to as the ''confidence level'' or ''confidence coefficient'' of the constructed interval, is effectively the nominal coverage probability of the procedure for constructing confidence intervals. Hence, referring to a "nominal confidence level" or "nominal confidence coefficient" (e.g., as a synonym for ''nominal coverage probability'') generally has to be considered tautological and misleading, as the notion of ''confidence level'' itself already implies nominality. The nominal coverage probability is often set at 0.95. By contrast, the (true) coverage probability is the ''actual'' probability that the interval contains the parameter.
If all assumptions used in deriving a confidence interval are met, the nominal coverage probability will equal the coverage probability (termed "true" or "actual" coverage probability for emphasis). If any assumptions are not met, the actual coverage probability could be either less than or greater than the nominal coverage probability. When the actual coverage probability is greater than the nominal coverage probability, the interval is termed a conservative (confidence) interval; if it is less than the nominal coverage probability, the interval is termed anti-conservative, or permissive. For example, suppose the interest is in the mean number of months that people with a particular type of cancer remain in remission following successful treatment with chemotherapy. The confidence interval aims to contain the unknown mean remission duration with a given probability. In this example, the coverage probability would be the real probability that the interval actually contains the true mean remission duration.
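One way to see an anti-conservative interval is to violate an assumption on purpose. The sketch below is an illustrative setup, not taken from the article: it uses the normal critical value as if the standard deviation were known, but plugs in the sample standard deviation from only five observations. The function name `plug_in_z_coverage` and the parameter values are hypothetical.

```python
import random
from statistics import NormalDist, fmean, stdev


def plug_in_z_coverage(n=5, level=0.95, reps=20000, seed=2):
    """Estimate the actual coverage of a z-interval that uses the sample SD.

    The derivation of the z-interval assumes a known standard deviation;
    plugging in an estimate from a small sample breaks that assumption.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    mu, sigma = 0.0, 1.0  # true values, hidden from the "analyst"
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        half_width = z * stdev(sample) / n ** 0.5
        if abs(fmean(sample) - mu) <= half_width:
            hits += 1
    return hits / reps


nominal = 0.95
actual = plug_in_z_coverage(level=nominal)
print(f"nominal {nominal:.2f}, estimated actual coverage {actual:.3f}")
```

In this setting the estimated coverage falls noticeably below the nominal 0.95, which is what the text calls an anti-conservative (permissive) interval. Replacing the normal critical value with the corresponding Student's t value would be the usual remedy.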
A discrepancy between the coverage probability and the nominal coverage probability frequently occurs when approximating a discrete distribution with a continuous one. The construction of binomial confidence intervals is a classic example where coverage probabilities rarely equal nominal levels.
For the binomial case, several techniques for constructing intervals have been developed. The Wilson score interval is one well-known construction based on the normal distribution. Other constructions include the Wald, exact, Agresti-Coull, and likelihood intervals. While the Wilson score interval may not be the most conservative estimate, it produces average coverage probabilities that are close to nominal levels while still producing a comparatively narrow confidence interval.
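For a binomial proportion the coverage probability can be computed exactly rather than simulated, by summing the binomial probabilities of every outcome whose interval contains the true proportion. The sketch below compares the Wald and Wilson score intervals this way. The helper names (`wald`, `wilson`, `exact_coverage`) and the choice of `n = 20`, `p = 0.1` are assumptions made for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist


def wald(x, n, z):
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    h = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - h, p_hat + h


def wilson(x, n, z):
    """Wilson score interval for a binomial proportion."""
    p_hat = x / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    h = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - h, centre + h


def exact_coverage(interval, n, p, level=0.95):
    """Sum the Binomial(n, p) probabilities of outcomes whose interval covers p."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    total = 0.0
    for x in range(n + 1):
        lo, hi = interval(x, n, z)
        if lo <= p <= hi:
            total += comb(n, x) * p ** x * (1 - p) ** (n - x)
    return total


n, p = 20, 0.1
wald_cov = exact_coverage(wald, n, p)
wilson_cov = exact_coverage(wilson, n, p)
print(f"Wald: {wald_cov:.3f}  Wilson: {wilson_cov:.3f}  (nominal 0.95)")
```

For this particular `n` and `p`, the Wald interval covers well below the nominal level, partly because the outcome `x = 0` yields a degenerate interval, while the Wilson interval stays close to 0.95. Other values of `n` and `p` give different results, since coverage of a discrete procedure jumps as outcomes enter or leave the covering set.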
The "probability" in ''coverage probability'' is interpreted with respect to a set of hypothetical repetitions of the entire data collection and analysis procedure. In these hypothetical repetitions, independent data sets following the same probability distribution as the actual data are considered, and a confidence interval is computed from each of these data sets; see Neyman construction. The coverage probability is the fraction of these computed confidence intervals that include the desired but unobservable parameter value.
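The hypothetical repetitions can be mimicked on a computer, where the true parameter is known to the simulation. The following is a minimal sketch under assumed conditions (normal data, known standard deviation, a two-sided z-interval for the mean); the function name `simulate_coverage` and its default values are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def simulate_coverage(mu=10.0, sigma=2.0, n=25, level=0.95, reps=20000, seed=0):
    """Fraction of repeated-sample z-intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        # One hypothetical repetition: a fresh, independent data set.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps


coverage = simulate_coverage()
print(f"estimated coverage: {coverage:.3f}")
```

Because every assumption behind the interval holds in this simulation, the estimated fraction should sit near the nominal 0.95, differing only by Monte Carlo error.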
Probability matching
In estimation, when the coverage probability is equal to the nominal coverage probability, that is known as probability matching.
In prediction, when the coverage probability is equal to the nominal coverage probability, that is known as predictive probability matching.
Formula
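The construction of the confidence interval ensures that the probability of finding the true parameter <math>\theta</math> in the sample-dependent interval <math>(T_u, T_v)</math> is (at least) <math>\gamma</math>:
: <math>P(T_u < \theta < T_v) \geq \gamma</math>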
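As a worked illustration, assume independent normal observations with unknown mean <math>\mu</math> and known standard deviation <math>\sigma</math> (a setting chosen here for simplicity, with symbols introduced only for this example). Writing <math>\bar{X}</math> for the sample mean of <math>n</math> observations and <math>z_{1-\alpha/2}</math> for the standard normal quantile, the coverage probability of the usual interval follows directly from the distribution of the standardized mean:
: <math>P\left(\bar{X} - z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = P\left(\left|\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\right| < z_{1-\alpha/2}\right) = 1-\alpha</math>
Here the coverage probability equals the nominal level <math>1-\alpha</math> exactly, for every value of <math>\mu</math>.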
See also
* Binomial proportion confidence interval
* Confidence distribution
* False coverage rate
* Interval estimation