In statistics, an exact (significance) test is a test such that if the null hypothesis is true, then all assumptions made during the derivation of the distribution of the test statistic are met. Using an exact test provides a significance test that maintains the type I error rate of the test (\alpha) at the desired significance level of the test. For example, an exact test at a significance level of \alpha = 5\%, when repeated over many samples where the null hypothesis is true, will reject at most 5\% of the time. This is in contrast to an ''approximate test'', in which the desired type I error rate is only approximately maintained (i.e., the test might reject more than 5\% of the time), although the approximation may be made as close to \alpha as desired by making the sample size sufficiently large.

Exact tests based on discrete test statistics may be conservative: the actual rejection rate lies below the nominal significance level \alpha. This is the case, for example, for Fisher's exact test and its more powerful alternative, Boschloo's test. If the test statistic is continuous, the test will reach the significance level exactly.

Parametric tests, such as those used in exact statistics, are exact tests when the parametric assumptions are fully met, but in practice the term ''exact'' (significance) ''test'' is reserved for non-parametric tests, i.e., tests that do not rest on parametric assumptions. Most software implementations of non-parametric tests, however, use asymptotic algorithms to obtain the significance value, which renders the test non-exact. Hence, when a result of a statistical analysis is termed an "exact test" or reports an "exact p-value", this implies that the test is defined without parametric assumptions and is evaluated without approximate algorithms. In principle it could also mean that a parametric test has been employed in a situation where all parametric assumptions are fully met, but in most real-world situations it is impossible to prove this completely. Exceptions in which parametric tests are certainly exact include tests based on the binomial or Poisson distributions.

The term ''permutation test'' is sometimes used as a synonym for exact test; however, while all permutation tests are exact tests, not all exact tests are permutation tests.


Formulation

The basic equation underlying exact tests is

:\Pr(\text{exact}) = \sum_{\mathbf{y}\,:\,T(\mathbf{y}) \ge T(\mathbf{x})} \Pr(\mathbf{y})

where:
:* \mathbf{x} is the actual observed outcome,
:* \Pr(\mathbf{y}) is the probability under the null hypothesis of a potentially observed outcome \mathbf{y},
:* ''T''(\mathbf{y}) is the value of the test statistic for an outcome \mathbf{y}, with larger values of ''T'' representing cases which notionally represent greater departures from the null hypothesis,

and where the sum ranges over all outcomes \mathbf{y} (including the observed one) whose test statistic is at least as large as that of the observed sample \mathbf{x}.
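The sum above can be transcribed almost directly into code. The following minimal sketch (standard library only; the function names are illustrative, not from any particular package) specializes it to an exact two-sided binomial test of a fair coin, one of the parametric cases where an exact test is available:

```python
# Direct transcription of Pr(exact) = sum of Pr(y) over all outcomes y
# with T(y) >= T(x), specialized to a binomial experiment.
from math import comb

def exact_binomial_p(x, n, test_stat):
    """Exact p-value: sum Pr(y) under H0 over all y with T(y) >= T(x)."""
    t_obs = test_stat(x)
    return sum(comb(n, y) * 0.5 ** n        # Pr(y) under H0: p = 1/2
               for y in range(n + 1)
               if test_stat(y) >= t_obs)

# T(y): distance of the head count from its null expectation n/2; larger
# values represent greater departures from the null hypothesis.
n = 10
T = lambda y: abs(y - n / 2)
p = exact_binomial_p(8, n, T)   # 8 heads in 10 tosses
print(p)                        # 112/1024 = 0.109375
```

Because the outcome space is enumerated exhaustively and \Pr(\mathbf{y}) is computed exactly under the null hypothesis, no asymptotic approximation enters; this is what makes the test exact.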


Example: Pearson's chi-squared test versus an exact test

A simple example of this concept involves the observation that Pearson's chi-squared test is an approximate test. Suppose Pearson's chi-squared test is used to ascertain whether a six-sided die is "fair", i.e., yields each of the six possible outcomes equally often. If the die is thrown ''n'' times, then one "expects" to see each outcome ''n''/6 times. The test statistic is

: \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}} = \sum_{k=1}^6 \frac{(X_k - n/6)^2}{n/6},

where ''X''''k'' is the number of times outcome ''k'' is observed. If the null hypothesis of "fairness" is true, then the probability distribution of the test statistic can be made as close as desired to the chi-squared distribution with 5 degrees of freedom by making the sample size ''n'' sufficiently large. On the other hand, if ''n'' is small, then the probabilities based on chi-squared distributions may not be sufficiently close approximations. Finding the exact probability that this test statistic exceeds a certain value would then require combinatorial enumeration of all outcomes of the experiment that give rise to such a large value of the test statistic. It is then questionable whether the same test statistic ought to be used: a likelihood-ratio test might be preferred, and its test statistic might not be a monotone function of the one above.
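For small ''n'' the combinatorial enumeration is feasible and can be compared against the chi-squared approximation directly. A sketch under illustrative assumptions (the helper names are our own; the closed-form survival function is the standard expression for odd degrees of freedom):

```python
# Exact vs. approximate p-value for the die-fairness test, by brute-force
# enumeration of all 6^n equally likely throw sequences.
import itertools
import math

def pearson_stat(counts, n):
    """Pearson statistic: sum over k of (X_k - n/6)^2 / (n/6)."""
    e = n / 6
    return sum((c - e) ** 2 / e for c in counts)

def exact_p_value(observed_counts):
    """Sum the probability of every outcome whose statistic is at least
    as large as the observed one (small n only: cost grows as 6^n)."""
    n = sum(observed_counts)
    t_obs = pearson_stat(observed_counts, n)
    p = 0.0
    for seq in itertools.product(range(6), repeat=n):
        counts = [seq.count(k) for k in range(6)]
        if pearson_stat(counts, n) >= t_obs - 1e-9:
            p += (1 / 6) ** n
    return p

def chi2_sf_5df(x):
    """Chi-squared survival function with 5 degrees of freedom
    (closed form, valid for odd df)."""
    y = x / 2
    return math.erfc(math.sqrt(y)) + math.exp(-y) / math.sqrt(math.pi) * (
        2 * math.sqrt(y) + (4 / 3) * y ** 1.5)

# Five throws, all showing the same face: the most extreme outcome.
obs = [5, 0, 0, 0, 0, 0]
p_exact = exact_p_value(obs)                    # 6/6^5 = 1/1296
p_approx = chi2_sf_5df(pearson_stat(obs, 5))
print(p_exact, p_approx)
```

With only five throws the exact tail probability (about 0.00077) and the chi-squared approximation (about 0.00014) disagree by a factor of five, illustrating how poor the asymptotic approximation can be for small ''n''.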


Example: Fisher's exact test

Fisher's exact test, based on the work of Ronald Fisher and E. J. G. Pitman in the 1930s, is exact because the sampling distribution (conditional on the marginals) is known exactly. This should be compared with Pearson's chi-squared test, which (although it tests the same null) is not exact because the distribution of the test statistic is only asymptotically correct.
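The exactly known conditional distribution is the hypergeometric distribution, which a short sketch can sum directly (standard library only; function names are illustrative):

```python
# Minimal sketch of Fisher's exact test for a 2x2 table: with both
# margins fixed, the top-left cell follows a hypergeometric distribution.
from math import comb

def hypergeom_prob(a, row1, row2, col1):
    """P(top-left cell = a) given row sums row1, row2 and column sum col1."""
    return comb(row1, a) * comb(row2, col1 - a) / comb(row1 + row2, col1)

def fisher_exact_p(table):
    """Two-sided p-value: sum the probabilities of all tables with the
    same margins that are no more probable than the observed one."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    p_obs = hypergeom_prob(a, row1, row2, col1)
    total = 0.0
    for x in range(max(0, col1 - row2), min(col1, row1) + 1):
        p = hypergeom_prob(x, row1, row2, col1)
        if p <= p_obs + 1e-12:   # table at least as extreme as observed
            total += p
    return total

print(fisher_exact_p([[3, 1], [1, 3]]))   # 34/70, about 0.486
```

Every probability is an exact ratio of binomial coefficients, so the p-value involves no large-sample approximation; this is precisely the sense in which the test is exact.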


See also

* Exact statistics
* Optimal discriminant analysis


References

* Ronald Fisher (1954). ''Statistical Methods for Research Workers''. Oliver and Boyd.
* Mehta, C. R.; Patel, N. R. (1998). "Exact Inference for Categorical Data". In P. Armitage and T. Colton, eds., ''Encyclopedia of Biostatistics''. Chichester: John Wiley, pp. 1411–1422.