
Boschloo's test is a statistical hypothesis test for analysing 2x2 contingency tables. It examines the association of two Bernoulli distributed random variables and is a uniformly more powerful alternative to Fisher's exact test. It was proposed in 1970 by R. D. Boschloo.


Setting

A 2x2 contingency table visualizes n independent observations of two binary variables A and B:

: \begin{array}{c|cc|c} & B = 1 & B = 0 & \mbox{Total}\\ \hline A = 1 & x_{11} & x_{10} & n_1 \\ A = 0 & x_{01} & x_{00} & n_0 \\ \hline \mbox{Total} & s_1 & s_0 & n\\ \end{array}

The probability distribution of such tables can be classified into three distinct cases.

# The row sums n_1, n_0 and column sums s_1, s_0 are fixed in advance and not random. Then all x_{ij} are determined by x_{11}. If A and B are independent, x_{11} follows a hypergeometric distribution with parameters n, n_1, s_1: x_{11} \sim \mbox{Hypergeometric}(n, n_1, s_1).
# The row sums n_1, n_0 are fixed in advance but the column sums s_1, s_0 are not. Then all random parameters are determined by x_{11} and x_{01}, and x_{11}, x_{01} follow a binomial distribution with probabilities p_1, p_0: x_{11} \sim B(n_1, p_1), x_{01} \sim B(n_0, p_0).
# Only the total number n is fixed but the row sums n_1, n_0 and the column sums s_1, s_0 are not. Then the random vector (x_{11}, x_{10}, x_{01}, x_{00}) follows a multinomial distribution with probability vector (p_{11}, p_{10}, p_{01}, p_{00}).

Fisher's exact test is designed for the first case and is therefore an exact conditional test (because it conditions on the column sums). The typical example of such a case is the Lady tasting tea: A lady tastes 8 cups of tea with milk. In 4 of those cups the milk is poured in before the tea. In the other 4 cups the tea is poured in first. The lady tries to assign the cups to the two categories. Following our notation, the random variable A represents the method used (1 = milk first, 0 = milk last) and B represents the lady's guesses (1 = milk first guessed, 0 = milk last guessed). Then the row sums are the fixed numbers of cups prepared with each method: n_1 = 4, n_0 = 4. The lady knows that there are 4 cups in each category, so she will assign 4 cups to each method. Thus, the column sums are also fixed in advance: s_1 = 4, s_0 = 4. If she is not able to tell the difference, A and B are independent and the number x_{11} of correctly classified cups with milk first follows the hypergeometric distribution \mbox{Hypergeometric}(8, 4, 4).

Boschloo's test is designed for the second case and is therefore an exact unconditional test. Examples of such a case are often found in medical research, where a binary endpoint is compared between two patient groups. Following our notation, A = 1 represents the first group that receives some medication of interest. A = 0 represents the second group that receives a placebo. B indicates the cure of a patient (1 = cure, 0 = no cure). Then the row sums equal the group sizes and are usually fixed in advance. The column sums are the total numbers of cures and of disease continuations, respectively, and are not fixed in advance.

An example for the third case can be constructed as follows: Simultaneously flip two distinguishable coins A and B and do this n times. If we count the number of results in our 2x2 table (1 = head, 0 = tail), we neither know in advance how often coin A shows head or tail (row sums random), nor do we know how often coin B shows head or tail (column sums random).
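The hypergeometric distribution of the lady tasting tea example can be tabulated directly. The following is a minimal sketch using only the Python standard library; the helper name `hypergeom_pmf` is introduced here for illustration and is not part of the original description.

```python
from math import comb

def hypergeom_pmf(k, n, n1, s1):
    """P(x_11 = k) for x_11 ~ Hypergeometric(n, n_1, s_1)."""
    if k < max(0, n1 + s1 - n) or k > min(n1, s1):
        return 0.0
    return comb(n1, k) * comb(n - n1, s1 - k) / comb(n, s1)

# Lady tasting tea: n = 8 cups, n_1 = 4 milk-first cups, s_1 = 4 "milk first" guesses.
pmf = [hypergeom_pmf(k, 8, 4, 4) for k in range(5)]
for k, prob in enumerate(pmf):
    print(f"P(x_11 = {k}) = {prob:.4f}")

# Probability of classifying all four milk-first cups correctly by pure guessing.
p_all_correct = hypergeom_pmf(4, 8, 4, 4)
```

Under independence, only one of the 70 equally likely assignments classifies every cup correctly, so `p_all_correct` equals 1/70.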


Test hypothesis

The null hypothesis of Boschloo's one-tailed test (high values of x_1, the count in cell A = 1, B = 1, favor the alternative hypothesis) is:
: H_0: p_1 \le p_0
The null hypothesis of the one-tailed test can also be formulated in the other direction (small values of x_1 favor the alternative hypothesis):
: H_0: p_1 \ge p_0
The null hypothesis of the two-tailed test is:
: H_0: p_1 = p_0
There is no universal definition of the two-tailed version of Fisher's exact test. Since Boschloo's test is based on Fisher's exact test, a universal two-tailed version of Boschloo's test does not exist either. In the following we deal with the one-tailed test and H_0: p_1 \le p_0.


Boschloo's idea

We denote the desired significance level by \alpha. Fisher's exact test is a conditional test and appropriate for the first of the above mentioned cases. But if we treat the observed column sum s_1 as fixed in advance, Fisher's exact test can also be applied to the second case. The true size of the test then depends on the nuisance parameters p_1 and p_0. It can be shown that the size maximum \max\limits_{p_1 \le p_0}\big(\mbox{size}(p_1, p_0)\big) is taken for equal proportions p = p_1 = p_0 and is still controlled by \alpha. However, Boschloo stated that for small sample sizes, the maximal size is often considerably smaller than \alpha. This leads to an undesirable loss of power.

Boschloo proposed to use Fisher's exact test with a greater nominal level \alpha^* > \alpha. Here, \alpha^* should be chosen as large as possible such that the maximal size is still controlled by \alpha:
: \max\limits_{p \in [0, 1]}\big(\mbox{size}(p)\big) \le \alpha.
This method was especially advantageous at the time of Boschloo's publication because \alpha^* could be looked up for common values of \alpha, n_1 and n_0. This made performing Boschloo's test computationally easy.
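The gap between the nominal level and the true size can be explored numerically. The sketch below is an illustration under stated assumptions, not Boschloo's original tabulation: it approximates the maximum over p by an equally spaced grid (so the true maximum may be slightly larger), restricts candidate values of \alpha^* to the attainable Fisher p-values, and uses hypothetical helper names (`fisher_p`, `max_size`, `raised_level`).

```python
from math import comb

def fisher_p(x1, x0, n1, n0):
    """One-sided Fisher p-value: P(X >= x1), X ~ Hypergeometric(n1 + n0, n1, x1 + x0)."""
    s = x1 + x0
    tail = sum(comb(n1, k) * comb(n0, s - k) for k in range(x1, min(n1, s) + 1))
    return tail / comb(n1 + n0, s)

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def max_size(level, n1, n0, grid=200):
    """Largest rejection probability over a grid of p = p_1 = p_0 (approximation)."""
    reject = [(x1, x0) for x1 in range(n1 + 1) for x0 in range(n0 + 1)
              if fisher_p(x1, x0, n1, n0) <= level + 1e-12]
    return max(
        sum(binom_pmf(x1, n1, p) * binom_pmf(x0, n0, p) for x1, x0 in reject)
        for p in (i / grid for i in range(grid + 1))
    )

def raised_level(alpha, n1, n0):
    """Largest attainable Fisher p-value whose (grid) maximal size stays <= alpha."""
    candidates = sorted({fisher_p(x1, x0, n1, n0)
                         for x1 in range(n1 + 1) for x0 in range(n0 + 1)})
    best = 0.0
    for c in candidates:
        # The rejection region only grows with c, so we can stop at the first failure.
        if max_size(c, n1, n0) > alpha:
            break
        best = c
    return best

alpha, n1, n0 = 0.05, 10, 10
size_fisher = max_size(alpha, n1, n0)       # size of Fisher's test at nominal level alpha
alpha_star = raised_level(alpha, n1, n0)    # raised nominal level
size_boschloo = max_size(alpha_star, n1, n0)
print(f"max size at nominal {alpha}: {size_fisher:.4f}")
print(f"alpha* = {alpha_star:.4f}, max size at alpha*: {size_boschloo:.4f}")
```

Because the rejection region at \alpha^* contains the one at \alpha, the second size is never smaller than the first, while both stay at or below \alpha on the grid.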


Test statistic

The decision rule of Boschloo's approach is based on Fisher's exact test. An equivalent way of formulating the test is to use the p-value of Fisher's exact test as test statistic. Fisher's p-value is calculated from the hypergeometric distribution (for ease of notation we write x_1, x_0 instead of x_{11}, x_{01}):
: p_F = 1 - F_{\mbox{Hypergeometric}(n, n_1, x_1 + x_0)}(x_1 - 1)
The distribution of p_F is determined by the binomial distributions of x_1 and x_0 and depends on the unknown nuisance parameter p. For a specified significance level \alpha, the critical value of p_F is the maximal value \alpha^* that satisfies
: \max\limits_{p \in [0, 1]} P(p_F \le \alpha^*) \le \alpha.
The critical value \alpha^* is equal to the nominal level of Boschloo's original approach.
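The same construction yields a p-value: the maximum over p of the probability that p_F is at most its observed value. The following sketch computes Fisher's p-value and this maximised probability for one table. It is an illustration only; the grid over p is an approximation of the maximum, and the function names are chosen here rather than taken from any library.

```python
from math import comb

def fisher_p(x1, x0, n1, n0):
    """p_F = 1 - F(x1 - 1) for Hypergeometric(n, n_1, x_1 + x_0), i.e. P(X >= x1)."""
    s = x1 + x0
    tail = sum(comb(n1, k) * comb(n0, s - k) for k in range(x1, min(n1, s) + 1))
    return tail / comb(n1 + n0, s)

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def boschloo_p(x1, x0, n1, n0, grid=200):
    """Return (Fisher p-value, grid approximation of max_p P(p_F <= observed p_F))."""
    observed = fisher_p(x1, x0, n1, n0)
    # All tables that are at least as extreme as the observed one in terms of p_F.
    region = [(a, b) for a in range(n1 + 1) for b in range(n0 + 1)
              if fisher_p(a, b, n1, n0) <= observed + 1e-12]
    worst = max(
        sum(binom_pmf(a, n1, p) * binom_pmf(b, n0, p) for a, b in region)
        for p in (i / grid for i in range(grid + 1))
    )
    return observed, worst

# Hypothetical data: 7 of 10 cured under treatment, 2 of 10 under placebo.
p_fisher, p_boschloo = boschloo_p(7, 2, 10, 10)
print(f"Fisher p-value:   {p_fisher:.4f}")
print(f"Boschloo p-value: {p_boschloo:.4f}")
```

For the one-tailed test the maximised probability cannot exceed Fisher's p-value, which is the sense in which Boschloo's test is never less powerful.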


Modification

Boschloo's test deals with the unknown nuisance parameter p by taking the maximum over the whole parameter space [0, 1]. The Berger & Boos procedure takes a different approach by maximizing P(p_F \le \alpha^*) over a (1-\gamma) confidence interval of p = p_1 = p_0 and adding \gamma. \gamma is usually a small value such as 0.001 or 0.0001. This results in a modified Boschloo's test which is also exact.
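A rough sketch of the modified p-value follows. The text above does not say which confidence interval is used, so this example assumes a Clopper-Pearson interval for the pooled proportion based on x_1 + x_0 successes in n_1 + n_0 trials, found by bisection; the grid over the interval is again only an approximation, and all function names are hypothetical.

```python
from math import comb

def fisher_p(x1, x0, n1, n0):
    s = x1 + x0
    tail = sum(comb(n1, k) * comb(n0, s - k) for k in range(x1, min(n1, s) + 1))
    return tail / comb(n1 + n0, s)

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_sf(k, n, p):
    """P(X > k) for X ~ B(n, p); increasing in p."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1, n + 1))

def bisect_increasing(f, target, steps=60):
    """Solve f(p) = target on [0, 1] for an increasing function f."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(s, n, gamma):
    """Assumed (1 - gamma) interval for p: P(X >= s | lower) = P(X <= s | upper) = gamma / 2."""
    lower = 0.0 if s == 0 else bisect_increasing(lambda p: binom_sf(s - 1, n, p), gamma / 2)
    upper = 1.0 if s == n else bisect_increasing(lambda p: binom_sf(s, n, p), 1 - gamma / 2)
    return lower, upper

def berger_boos_p(x1, x0, n1, n0, gamma=0.001, grid=200):
    observed = fisher_p(x1, x0, n1, n0)
    region = [(a, b) for a in range(n1 + 1) for b in range(n0 + 1)
              if fisher_p(a, b, n1, n0) <= observed + 1e-12]
    lower, upper = clopper_pearson(x1 + x0, n1 + n0, gamma)
    # Maximise only over the confidence interval, then add gamma as a penalty.
    points = (min(1.0, lower + (upper - lower) * i / grid) for i in range(grid + 1))
    worst = max(
        sum(binom_pmf(a, n1, p) * binom_pmf(b, n0, p) for a, b in region)
        for p in points
    )
    return min(1.0, worst + gamma)

gamma = 0.001
p_modified = berger_boos_p(7, 2, 10, 10, gamma=gamma)
print(f"modified Boschloo p-value: {p_modified:.4f}")
```

The added \gamma compensates for the chance that the true p lies outside the interval, which is why the modified test keeps its exactness.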


Comparison to other exact tests

All exact tests hold the specified significance level but can have varying power in different situations. Mehrotra et al. compared the power of some exact tests in different situations. The results regarding Boschloo's test are summarized in the following.


Modified Boschloo's test

Boschloo's test and the modified Boschloo's test have similar power in all considered scenarios. Boschloo's test has slightly more power in some cases, and the modified test has slightly more power in others.


Fisher's exact test

Boschloo's test is by construction uniformly more powerful than Fisher's exact test. For small sample sizes (e.g. 10 per group) the power difference is large, ranging from 16 to 20 percentage points in the considered cases. The power difference is smaller for larger sample sizes.


Exact Z-Pooled test

This test is based on the test statistic
: Z_P(x_1, x_0) = \frac{\hat p_1 - \hat p_0}{\sqrt{\tilde p (1 - \tilde p)\left(\frac{1}{n_1} + \frac{1}{n_0}\right)}},
where \hat p_i = \frac{x_i}{n_i} are the group event rates and \tilde p = \frac{x_1 + x_0}{n_1 + n_0} is the pooled event rate. The power of this test is similar to that of Boschloo's test in most scenarios. In some cases, the Z-Pooled test has greater power, with differences mostly ranging from 1 to 5 percentage points. In very few cases, the difference goes up to 9 percentage points. This test can also be modified by the Berger & Boos procedure. However, the resulting test has very similar power to the unmodified test in all scenarios.


Exact Z-Unpooled test

This test is based on the test statistic
: Z_U(x_1, x_0) = \frac{\hat p_1 - \hat p_0}{\sqrt{\frac{\hat p_1 (1 - \hat p_1)}{n_1} + \frac{\hat p_0 (1 - \hat p_0)}{n_0}}},
where \hat p_i = \frac{x_i}{n_i} are the group event rates. The power of this test is similar to that of Boschloo's test in many scenarios. In some cases, the Z-Unpooled test has greater power, with differences ranging from 1 to 5 percentage points. However, in some other cases, Boschloo's test has noticeably greater power, with differences up to 68 percentage points. This test can also be modified by the Berger & Boos procedure. The resulting test has similar power to the unmodified test in most scenarios. In some cases, the power is considerably improved by the modification but the overall power comparison to Boschloo's test remains unchanged.
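The pooled and unpooled statistics differ only in the variance estimate in the denominator. The sketch below evaluates both for one table; how to treat a zero denominator is not specified above, so the conventions used here (0 for a zero difference, signed infinity otherwise) are assumptions made for illustration.

```python
from math import copysign, inf, sqrt

def z_pooled(x1, x0, n1, n0):
    """Z_P: difference of event rates scaled by the pooled variance estimate."""
    diff = x1 / n1 - x0 / n0
    pooled = (x1 + x0) / (n1 + n0)
    var = pooled * (1 - pooled) * (1 / n1 + 1 / n0)
    if var == 0:
        return 0.0  # assumed convention: all events or no events, so diff is 0 as well
    return diff / sqrt(var)

def z_unpooled(x1, x0, n1, n0):
    """Z_U: difference of event rates scaled by the separate variance estimates."""
    p1, p0 = x1 / n1, x0 / n0
    diff = p1 - p0
    var = p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0
    if var == 0:
        # assumed convention for degenerate tables
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / sqrt(var)

# Hypothetical data: 7 of 10 events in the first group, 2 of 10 in the second.
zp = z_pooled(7, 2, 10, 10)
zu = z_unpooled(7, 2, 10, 10)
print(f"Z_P = {zp:.4f}, Z_U = {zu:.4f}")
```

In an exact unconditional test, either statistic would take the place of p_F as the ordering of tables, with the maximisation over p carried out as for Boschloo's test.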


Software

Boschloo's test can be calculated with the following software:
* The function scipy.stats.boschloo_exact from SciPy
* Packages Exact and exact2x2 of the programming language R
* StatXact
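As a usage illustration for the SciPy function listed above, the sketch below runs the one-tailed test on hypothetical data. It assumes SciPy 1.7 or later and relies on a reading of the SciPy documentation in which the two binomial samples are the columns of the table, so the table is the transpose of the layout used in this article; that orientation should be checked against the installed version.

```python
from scipy.stats import boschloo_exact

# Hypothetical data: 7 of 10 cured under treatment (A = 1), 2 of 10 under placebo (A = 0).
# Assumed SciPy layout: one column per group, first row = cures, second row = no cures.
table = [[7, 2],
         [3, 8]]

# alternative="greater" is taken to correspond to H_0: p_1 <= p_0 against p_1 > p_0.
res = boschloo_exact(table, alternative="greater")

print(f"Fisher p-value (test statistic): {res.statistic:.4f}")
print(f"Boschloo p-value:                {res.pvalue:.4f}")
```

The returned `statistic` is documented as the p-value of Fisher's exact test, so comparing it with `pvalue` shows how much the unconditional maximisation gains for the given table.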


See also

* Fisher's exact test
* Barnard's test

