In statistics, Barnard’s test is an exact test used in the analysis of 2 × 2 contingency tables with one margin fixed. Barnard’s tests are really a class of hypothesis tests, also known as unconditional exact tests for two independent binomials. These tests examine the association of two categorical variables and are often a more powerful alternative to Fisher’s exact test for 2 × 2 contingency tables. Although first published in 1945 by G.A. Barnard, the test did not gain popularity, owing to the computational difficulty of calculating the p-value and to Fisher’s specious disapproval. Nowadays, for small to moderate sample sizes, computers can often carry out Barnard’s test in a few seconds.

Purpose and scope

Barnard’s test is used to test the independence of rows and columns in a contingency table. The test assumes each response is independent. Under independence, there are three types of study designs that yield a 2 × 2 table, and Barnard’s test applies to the second type. To distinguish the different types of designs, suppose a researcher is interested in testing whether a treatment quickly heals an infection.

# One possible study design would be to sample 100 infected subjects and, for each subject, record whether they received the novel treatment or the old, standard medicine, and whether the infection is still present after a set time. This type of design is common in cross-sectional studies, or ‘field observations’ such as epidemiology.
# Another possible study design would be to give 50 infected subjects the treatment and 50 infected subjects the placebo, and see whether the infection is still present after a set time. This type of design is common in clinical trials.
# The final possible study design would be to give 50 infected subjects the treatment and 50 infected subjects the placebo, and stop the experiment once a pre-determined number of subjects has healed from the infection. This type of design is rare, but has the same structure as the ''lady tasting tea'' study that led R.A. Fisher to create Fisher’s exact test.

Although the results of each design of experiment can be laid out in nearly identical-appearing tables, their statistics are different, and hence the criteria for a "significant" result are different for each:

# The probability of a table under the first study design is given by the multinomial distribution, where the total number of samples taken is the only statistical constraint. This is a form of uncontrolled experiment, or "field observation", where the experimenter simply "takes the data as it comes".
# The second study design is given by the product of two independent binomial distributions; the totals in one of the margins (either the row totals or the column totals) are constrained by the experimental design, but the totals in the other margin are free. This is by far the most common form of experimental design, where the experimenter constrains part of the experiment, say by assigning half of the subjects to be provided with a new medicine and the other half to receive an older, conventional medicine, but has no control over the numbers of individuals in each controlled category who either recover or succumb to the illness.
# The third design is given by the hypergeometric distribution, where the totals in both the columns and the rows are constrained. For example, an individual is allowed to taste 8 cups of soda but must assign four to each category, "brand X" and "brand Y", so that both the row totals and the column totals are constrained to four. This kind of experiment is complicated to manage, and is almost unknown in practical experiments.

The operational difference between Barnard’s exact test and Fisher’s ‘exact’ test is how they handle the nuisance parameter(s) of the common success probability when calculating the p-value.
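The three sampling models can be written out explicitly, which makes it easier to see where the nuisance parameter enters. The notation below is an assumption introduced here for illustration and does not come from the text: x_{ij} are the cell counts of the 2 × 2 table, n_1 and n_2 are the row (group) totals, m_1 and m_2 are the column totals, N is the grand total, \pi_{ij} are cell probabilities, and p_1, p_2 are the success probabilities of the two groups.

```latex
% Design 1: only the grand total N is fixed (multinomial)
P(x_{11}, x_{12}, x_{21}, x_{22})
  = \frac{N!}{x_{11}!\, x_{12}!\, x_{21}!\, x_{22}!}
    \prod_{i=1}^{2} \prod_{j=1}^{2} \pi_{ij}^{x_{ij}}

% Design 2: the row totals n_1, n_2 are fixed (product of two binomials)
P(x_{11}, x_{21})
  = \binom{n_1}{x_{11}} p_1^{x_{11}} (1 - p_1)^{n_1 - x_{11}}
    \binom{n_2}{x_{21}} p_2^{x_{21}} (1 - p_2)^{n_2 - x_{21}}

% Design 2 under the null hypothesis p_1 = p_2 = p
P(x_{11}, x_{21})
  = \binom{n_1}{x_{11}} \binom{n_2}{x_{21}}\, p^{m_1} (1 - p)^{m_2}

% Design 3: both margins are fixed (hypergeometric)
P(x_{11})
  = \frac{\binom{n_1}{x_{11}} \binom{n_2}{m_1 - x_{11}}}{\binom{N}{m_1}}
```

Under the null hypothesis the second model still depends on the unknown common success probability p, which is the nuisance parameter; the third model contains no unknown parameter.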

Fisher’s test avoids estimating the nuisance parameter(s) by conditioning on both margins, an approximately ancillary statistic that constrains the possible outcomes. Barnard’s test considers all legitimate possible values of the nuisance parameter(s) and chooses the value(s) that maximizes the p-value. The theoretical difference between the tests is that Barnard’s test uses the double-binomial distribution, whereas Fisher’s test, because of the conditioning, uses the hypergeometric distribution. However, even when the data come from a double-binomial distribution, the conditioning (which leads to using the hypergeometric distribution for calculating the Fisher’s exact p-value) produces a valid test.

Both tests are valid, that is, they bound the type I error rate at the alpha level. However, Barnard’s test can be more powerful than Fisher’s test because, by not conditioning on the second margin, it considers more ‘as or more extreme’ tables, which the procedure for Fisher’s test ignores. In fact, one variant of Barnard’s test, called Boschloo’s test, is uniformly more powerful than Fisher’s test. A more detailed description of Barnard’s test is given by Mehta and Senchaudhuri (2003). Barnard’s test has been used alongside Fisher’s exact test in project management research.
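The contrast between maximizing over the nuisance parameter and conditioning on both margins can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the pooled Z statistic, the simple grid search over p, and the function names `pooled_z`, `barnard_pvalue` and `fisher_pvalue` are all choices made here and do not come from the text.

```python
from math import comb, sqrt


def pooled_z(x1, n1, x2, n2):
    """Pooled-variance Z statistic for the difference of two proportions (assumed choice)."""
    p_hat = (x1 + x2) / (n1 + n2)
    var = p_hat * (1 - p_hat) * (1 / n1 + 1 / n2)
    if var == 0:  # all successes or all failures: no evidence of a difference
        return 0.0
    return (x1 / n1 - x2 / n2) / sqrt(var)


def barnard_pvalue(x1, n1, x2, n2, grid=1000):
    """Two-sided unconditional p-value, maximized over a grid of nuisance values p."""
    t_obs = abs(pooled_z(x1, n1, x2, n2))
    # All tables with the same row totals that are at least as extreme as the observed one.
    extreme = [(a, b) for a in range(n1 + 1) for b in range(n2 + 1)
               if abs(pooled_z(a, n1, b, n2)) >= t_obs - 1e-12]
    best = 0.0
    for i in range(1, grid):
        p = i / grid
        # Binomial probabilities for each group under the common success probability p.
        pmf1 = [comb(n1, a) * p**a * (1 - p)**(n1 - a) for a in range(n1 + 1)]
        pmf2 = [comb(n2, b) * p**b * (1 - p)**(n2 - b) for b in range(n2 + 1)]
        tail = sum(pmf1[a] * pmf2[b] for a, b in extreme)
        best = max(best, tail)
    return best


def fisher_pvalue(x1, n1, x2, n2):
    """Two-sided Fisher p-value: sum of hypergeometric probabilities not exceeding the observed one."""
    m, total = x1 + x2, n1 + n2

    def hyper(a):
        return comb(n1, a) * comb(n2, m - a) / comb(total, m)

    p_obs = hyper(x1)
    lo, hi = max(0, m - n2), min(n1, m)
    return sum(hyper(a) for a in range(lo, hi + 1) if hyper(a) <= p_obs * (1 + 1e-9))


# Hypothetical data: 7 of 15 successes in one group, 12 of 15 in the other.
print("Barnard:", barnard_pvalue(7, 15, 12, 15))
print("Fisher: ", fisher_pvalue(7, 15, 12, 15))
```

For this hypothetical table the unconditional p-value comes out smaller than the conditional one. A finite grid can slightly underestimate the true maximum, so a production implementation would refine the search around the maximizing value of p.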

Criticisms

Under specious pressure from Fisher, Barnard retracted his test in a published paper. However, many researchers prefer Barnard’s exact test over Fisher’s exact test for analyzing 2 × 2 contingency tables, since its statistics are more powerful for the vast majority of experimental designs. Fisher’s exact test statistics are conservative, meaning that the p-values it reports are too high; this leads the experimenter to dismiss as insignificant results that would be statistically significant using the less conservative double-binomial statistics of Barnard’s tests rather than the hypergeometric statistics of Fisher’s exact test. Barnard’s tests are not appropriate in the rare case of an experimental design that constrains both marginal results (e.g. ‘taste tests’); although rare, experimentally imposed constraints on both marginal totals make the true sampling distribution for the table hypergeometric.

Barnard’s test can be applied to larger tables, but the computation time increases and the power advantage quickly decreases. It remains unclear which test statistic is preferred when implementing Barnard’s test; however, most test statistics yield uniformly more powerful tests than Fisher’s exact test.
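The conservativeness of Fisher’s exact test under the double-binomial design can be checked numerically. The sketch below is an illustration under stated assumptions: the group sizes, the grid over the common success probability, and the function names `fisher_pvalue` and `fisher_actual_size` are hypothetical choices made here. It collects every table that Fisher’s test would reject at level alpha and computes the largest probability of landing in that rejection region when both groups share the same success probability.

```python
from math import comb


def fisher_pvalue(x1, n1, x2, n2):
    """Two-sided Fisher p-value from the hypergeometric distribution."""
    m, total = x1 + x2, n1 + n2

    def hyper(a):
        return comb(n1, a) * comb(n2, m - a) / comb(total, m)

    p_obs = hyper(x1)
    lo, hi = max(0, m - n2), min(n1, m)
    return sum(hyper(a) for a in range(lo, hi + 1) if hyper(a) <= p_obs * (1 + 1e-9))


def fisher_actual_size(n1, n2, alpha=0.05, grid=200):
    """Largest type I error rate of Fisher's test over a grid of common success probabilities."""
    # Tables (successes in group 1, successes in group 2) that Fisher's test rejects.
    rejected = [(a, b) for a in range(n1 + 1) for b in range(n2 + 1)
                if fisher_pvalue(a, n1, b, n2) <= alpha]
    worst = 0.0
    for i in range(1, grid):
        p = i / grid
        pmf1 = [comb(n1, a) * p**a * (1 - p)**(n1 - a) for a in range(n1 + 1)]
        pmf2 = [comb(n2, b) * p**b * (1 - p)**(n2 - b) for b in range(n2 + 1)]
        worst = max(worst, sum(pmf1[a] * pmf2[b] for a, b in rejected))
    return worst


# Hypothetical design: 10 subjects per group, nominal level 0.05.
print("Actual size of Fisher's test:", fisher_actual_size(10, 10))
```

The returned value never exceeds the nominal level, which reflects the validity of the test; how far it falls below that level indicates how much of the allowed type I error rate goes unused.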

