Lindley's paradox is a counterintuitive situation in statistics in which the Bayesian and frequentist approaches to a hypothesis testing problem give different results for certain choices of the prior distribution. The problem of the disagreement between the two approaches was discussed in Harold Jeffreys' 1939 textbook; it became known as Lindley's paradox after Dennis Lindley called the disagreement a paradox in a 1957 paper.
Although referred to as a ''paradox'', the differing results from the Bayesian and frequentist approaches can be explained by the two methods answering fundamentally different questions, rather than by any actual disagreement between them.
Nevertheless, for a large class of priors the differences between the frequentist and Bayesian approaches are caused by keeping the significance level fixed: as even Lindley recognized, "the theory does not justify the practice of keeping the significance level fixed" and even "some computations by Prof. Pearson in the discussion to that paper emphasized how the significance level would have to change with the sample size, if the losses and prior probabilities were kept fixed". In fact, if the critical value increases with the sample size suitably fast, then the disagreement between the frequentist and Bayesian approaches becomes negligible as the sample size increases.
Description of the paradox
The result $x$ of some experiment has two possible explanations, hypotheses $H_0$ and $H_1$, and some prior distribution $\pi$ representing uncertainty as to which hypothesis is more accurate before taking into account $x$.
Lindley's paradox occurs when
# The result $x$ is "significant" by a frequentist test of $H_0$, indicating sufficient evidence to reject $H_0$, say, at the 5% level, and
# The posterior probability of $H_0$ given $x$ is high, indicating strong evidence that $H_0$ is in better agreement with $x$ than $H_1$.
These results can occur at the same time when $H_0$ is very specific, $H_1$ more diffuse, and the prior distribution does not strongly favor one or the other, as seen below.
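To make the interplay between a specific null and a diffuse alternative concrete, here is a minimal sketch. It assumes the binomial setting used in the numerical example (null $\theta = 0.5$, uniform prior on $\theta$ under the alternative, equal prior probabilities for the two hypotheses) and a normal approximation to the binomial probability under the null. The function name `posterior_h0_at_fixed_z` is hypothetical and introduced only for illustration. It holds the frequentist evidence fixed at $z = 1.96$ (borderline significant at the 5% level) and lets the sample size grow.

```python
import math

def posterior_h0_at_fixed_z(n, z, prior_h0=0.5):
    """Approximate P(H0 | data) when the observed count lies z standard
    deviations above its mean under H0: theta = 0.5.

    Assumptions (illustrative, not taken from the source text):
    - P(k | H0) is approximated by the normal density phi(z) / sigma,
      with sigma = sqrt(n) / 2;
    - under H1, theta is uniform on [0, 1], so P(k | H1) = 1 / (n + 1).
    """
    sigma = math.sqrt(n) / 2.0
    phi_z = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    like_h0 = phi_z / sigma
    like_h1 = 1.0 / (n + 1)
    prior_h1 = 1.0 - prior_h0
    return like_h0 * prior_h0 / (like_h0 * prior_h0 + like_h1 * prior_h1)

sample_sizes = [100, 10_000, 1_000_000]
posteriors = [posterior_h0_at_fixed_z(n, z=1.96) for n in sample_sizes]

for n, post in zip(sample_sizes, posteriors):
    print(f"n = {n:>9,d}  P(H0 | data) ~ {post:.3f}")
```

Under these assumptions the same "significant" $z$ value corresponds to an increasingly high posterior probability of $H_0$ as the sample size grows. This is the combination of conditions 1 and 2 above.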
Numerical example
The following numerical example illustrates Lindley's paradox. In a certain city 49,581 boys and 48,870 girls have been born over a certain time period. The observed proportion $x$ of male births is thus 49,581/98,451 ≈ 0.5036. We assume the number of male births is a binomial variable with parameter $\theta$. We are interested in testing whether $\theta$ is 0.5 or some other value. That is, our null hypothesis is $H_0: \theta = 0.5$ and the alternative is $H_1: \theta \neq 0.5$.
Frequentist approach
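The frequentist approach to testing $H_0$ is to compute a p-value, the probability of observing a fraction of boys at least as large as $x$ assuming $H_0$ is true. Because the number of births is very large, we can use a normal approximation for the number of male births $X \sim N(\mu, \sigma^2)$, with $\mu = n\theta = 98{,}451 \times 0.5 = 49{,}225.5$ and $\sigma^2 = n\theta(1 - \theta) = 98{,}451 \times 0.5 \times 0.5 = 24{,}612.75$ for $n = 98{,}451$ births, to compute
: $P(X \geq 49{,}581 \mid \mu = 49{,}225.5) = \int_{49{,}581}^{98{,}451} \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-(u - \mu)^2 / (2\sigma^2)} \, du \approx 0.0117.$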
We would have been equally surprised if we had seen 49,581 female births, i.e. $x \approx 0.4964$, so a frequentist would usually perform a two-sided test, for which the p-value would be twice the one-sided value, $p \approx 0.0235$. In both cases, the p-value is lower than the significance level, α, of 5%, so the frequentist approach rejects $H_0$ as it disagrees with the observed data.
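The frequentist calculation can be reproduced with a few lines of standard-library Python. This is a sketch under the same normal approximation. It integrates the upper tail to infinity rather than to 98,451 and applies no continuity correction, both of which are assumptions of this example. Variable names are illustrative.

```python
import math

boys, girls = 49_581, 48_870
n = boys + girls                      # 98,451 births in total

# Mean and standard deviation of the number of boys under H0: theta = 0.5
theta0 = 0.5
mu = n * theta0                       # 49,225.5
sigma = math.sqrt(n * theta0 * (1 - theta0))

# Standardised distance of the observed count from its mean under H0
z = (boys - mu) / sigma

# Upper tail of the standard normal: P(Z >= z) = erfc(z / sqrt(2)) / 2
p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))
p_two_sided = 2 * p_one_sided

print(f"z           = {z:.3f}")
print(f"one-sided p = {p_one_sided:.4f}")
print(f"two-sided p = {p_two_sided:.4f}")
```

Both p-values fall below 0.05, so the test rejects the null hypothesis at the 5% level.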
Bayesian approach
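Assuming no reason to favor one hypothesis over the other, the Bayesian approach would be to assign prior probabilities $\pi(H_0) = \pi(H_1) = 0.5$ and a uniform distribution to $\theta$ under $H_1$, and then to compute the posterior probability of $H_0$ using Bayes' theorem,
: $P(H_0 \mid k) = \frac{P(k \mid H_0) \, \pi(H_0)}{P(k \mid H_0) \, \pi(H_0) + P(k \mid H_1) \, \pi(H_1)}.$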
After observing $k = 49{,}581$ boys out of $n = 98{,}451$ births, we can compute the posterior probability of each hypothesis using the probability mass function for a binomial variable,
: $P(k \mid H_0) = \binom{n}{k} (0.5)^k (1 - 0.5)^{n - k} \approx 1.95 \times 10^{-4}$
: $P(k \mid H_1) = \int_0^1 \binom{n}{k} \theta^k (1 - \theta)^{n - k} \, d\theta = \binom{n}{k} \operatorname{B}(k + 1, n - k + 1) = \frac{1}{n + 1} \approx 1.02 \times 10^{-5}$
where $\operatorname{B}(a, b)$ is the Beta function.
From these values, we find the posterior probability $P(H_0 \mid k) \approx 0.95$, which strongly favors $H_0$ over $H_1$.
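The Bayesian calculation can be sketched in the same way. The binomial coefficient for these counts is far too large for floating-point arithmetic, so this example works in log space with `math.lgamma`. The helper name `log_binom_pmf` is hypothetical, and the closed form $1/(n + 1)$ for the marginal likelihood under a uniform prior is used directly.

```python
import math

k, n = 49_581, 98_451

def log_binom_pmf(k, n, theta):
    """Log of the binomial probability mass function, via log-gamma
    to avoid overflow in the binomial coefficient."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(theta) + (n - k) * math.log(1 - theta)

# Likelihood of the data under H0: theta = 0.5
like_h0 = math.exp(log_binom_pmf(k, n, 0.5))

# Marginal likelihood under H1 with theta uniform on [0, 1]:
# the integral of the binomial pmf over theta equals 1 / (n + 1)
like_h1 = 1.0 / (n + 1)

# Equal prior probabilities for the two hypotheses
prior_h0 = prior_h1 = 0.5
posterior_h0 = like_h0 * prior_h0 / (like_h0 * prior_h0 + like_h1 * prior_h1)

print(f"P(k | H0) = {like_h0:.3e}")
print(f"P(k | H1) = {like_h1:.3e}")
print(f"P(H0 | k) = {posterior_h0:.3f}")
```

The ratio `like_h0 / like_h1` is the Bayes factor in favor of the null, roughly 19 to 1 for these data.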
The two approaches—the Bayesian and the frequentist—appear to be in conflict, and this is the "paradox".
Reconciling the Bayesian and frequentist approaches
Naaman proposed an adaptation of the significance level to the sample size in order to control false positives: $\alpha_n$, such that $\alpha_n = n^{-r}$ with $r > 1/2$. At least in the numerical example, taking $r = 1/2$ results in a significance level of 0.00318, so the frequentist would not reject the null hypothesis, which is in agreement with the Bayesian approach.
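As a rough check of this reconciliation, the sketch below evaluates a sample-size-dependent significance level of the form $n^{-r}$ at $r = 1/2$ for the birth data and compares it with the two-sided p-value from the normal approximation. The function name `adaptive_alpha` is hypothetical, and the power-law form is taken as an assumption of this example.

```python
import math

def adaptive_alpha(n, r=0.5):
    """Significance level that shrinks with the sample size as n ** (-r)."""
    return n ** (-r)

boys, n = 49_581, 98_451

# Two-sided p-value under H0: theta = 0.5 (normal approximation)
z = (boys - n * 0.5) / math.sqrt(n * 0.25)
p_two_sided = math.erfc(z / math.sqrt(2))   # equals 2 * P(Z >= z)

alpha_n = adaptive_alpha(n, r=0.5)
reject_h0 = p_two_sided < alpha_n

print(f"alpha_n     = {alpha_n:.5f}")
print(f"two-sided p = {p_two_sided:.4f}")
print(f"reject H0?    {reject_h0}")
```

With the threshold at roughly 0.003 instead of 0.05, the p-value of about 0.0235 no longer leads to rejection.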
Uninformative priors
If we use an uninformative prior and test a hypothesis more similar to that in the frequentist approach, the paradox disappears.
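For example, if we calculate the posterior distribution $P(\theta \mid k, n)$, using a uniform prior distribution on $\theta$ (i.e. $\pi(\theta \in [0, 1]) = 1$), we find
: $\theta \mid k, n \sim \operatorname{Beta}(k + 1, \, n - k + 1).$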
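If we use this to check the probability that a newborn is more likely to be a boy than a girl, i.e. $P(\theta > 0.5 \mid k, n)$, we find
: $\int_{0.5}^{1} \operatorname{Beta}(\theta; \, k + 1, \, n - k + 1) \, d\theta \approx 0.988.$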
In other words, it is very likely that the proportion of male births is above 0.5.
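The tail probability of the Beta posterior can be approximated without special functions. The sketch below replaces the $\operatorname{Beta}(k + 1, n - k + 1)$ posterior with a normal distribution of the same mean and variance. This is an assumption of the example, reasonable only because both shape parameters are very large. An exact value would need a regularized incomplete beta function, which the Python standard library does not provide.

```python
import math

k, n = 49_581, 98_451

# Shape parameters of the Beta posterior under a uniform prior on theta
a, b = k + 1, n - k + 1

# Exact mean and variance of a Beta(a, b) distribution
mean = a / (a + b)
var = a * b / ((a + b) ** 2 * (a + b + 1))

# Normal approximation: P(theta > 0.5) = Phi((mean - 0.5) / sd)
z = (mean - 0.5) / math.sqrt(var)
prob_theta_gt_half = 0.5 * math.erfc(-z / math.sqrt(2))

print(f"posterior mean = {mean:.4f}")
print(f"posterior sd   = {math.sqrt(var):.5f}")
print(f"P(theta > 0.5) = {prob_theta_gt_half:.3f}")
```

Under this approximation the posterior probability that $\theta$ exceeds 0.5 is the complement of the one-sided p-value, which is why the two analyses agree once they address the same question.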
Neither analysis gives an estimate of the effect size directly, but both could be used to determine, for instance, if the fraction of boy births is likely to be above some particular threshold.
The lack of an actual paradox
The apparent disagreement between the two approaches is caused by a combination of factors. First, the frequentist approach above tests $H_0$ without reference to $H_1$. The Bayesian approach evaluates $H_0$ as an alternative to $H_1$, and finds the first to be in better agreement with the observations. This is because the latter hypothesis is much more diffuse, as $\theta$ can be anywhere in $[0, 1]$