Cromwell's rule, named by statistician Dennis Lindley, states that the use of prior probabilities of 1 ("the event will definitely occur") or 0 ("the event will definitely not occur") should be avoided, except when applied to statements that are logically true or false, such as 2+2 equaling 4 or 5.
The reference is to Oliver Cromwell, who wrote to the General Assembly of the Church of Scotland on 3 August 1650, shortly before the Battle of Dunbar, including a phrase that has become well known and frequently quoted:

{{blockquote|I beseech you, in the bowels of Christ, think it possible that you may be mistaken.}}
As Lindley puts it, assigning a probability should "leave a little probability for the moon being made of green cheese; it can be as small as 1 in a million, but have it there since otherwise an army of astronauts returning with samples of the said cheese will leave you unmoved." Similarly, in assessing the likelihood that tossing a coin will result in either a head or a tail facing upwards, there is a possibility, albeit remote, that the coin will land on its edge and remain in that position.
If the prior probability assigned to a hypothesis is 0 or 1, then, by Bayes' theorem, the posterior probability (probability of the hypothesis, given the evidence) is forced to be 0 or 1 as well; no evidence, no matter how strong, could have any influence.
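Written out, Bayes' theorem makes this pinning explicit (with ''H'' the hypothesis and ''E'' the evidence):

```latex
P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)}
```

If P(''H'') = 0, the numerator is zero, so P(''H'' | ''E'') = 0 for any evidence ''E''; if P(''H'') = 1, then P(¬''H'') = 0, the second term of the denominator vanishes, and P(''H'' | ''E'') = 1. In either case the likelihoods never get a chance to matter.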
A strengthened version of Cromwell's rule, applying also to statements of arithmetic and logic, alters the first rule of probability, or the convexity rule, 0 ≤ Pr(''A'') ≤ 1, to 0 < Pr(''A'') < 1.
Bayesian divergence (pessimistic)
An example of Bayesian divergence of opinion is based on Appendix A of Sharon Bertsch McGrayne's 2011 book ''The Theory That Would Not Die''. Tim and Susan disagree as to whether a stranger who has two fair coins and one unfair coin (one with heads on both sides) has tossed one of the two fair coins or the unfair one; the stranger has tossed one of his coins three times and it has come up heads each time.
Tim assumes that the stranger picked the coin randomly – i.e., assumes a prior probability distribution in which each coin had a 1/3 chance of being the one picked. Applying
Bayesian inference, Tim then calculates an 80% probability that the result of three consecutive heads was achieved by using the unfair coin, because each of the fair coins had a 1/8 chance of giving three straight heads, while the unfair coin had an 8/8 chance; out of 24 equally likely possibilities for what could happen, 8 out of the 10 that agree with the observations came from the unfair coin. If more flips are conducted, each further head increases the probability that the coin is the unfair one. If no tail ever appears, this probability converges to 1. But if a tail ever occurs, the probability that the coin is unfair immediately goes to 0 and stays at 0 permanently.
Susan assumes the stranger chose a fair coin (so the prior probability that the tossed coin is the unfair coin is 0). Consequently, Susan calculates the probability that three (or any number of consecutive heads) were tossed with the unfair coin must be 0; if still more heads are thrown, Susan does not change her probability. Tim and Susan's probabilities do not converge as more and more heads are thrown.
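The arithmetic behind Tim's and Susan's updates can be sketched in a few lines of Python using exact fractions; the function name `posterior_unfair` and the aggregation of the two fair coins into a single likelihood are our illustrative choices, not taken from the cited sources:

```python
from fractions import Fraction

def posterior_unfair(prior_unfair, heads):
    """Posterior probability that the tossed coin is the two-headed one,
    after observing `heads` consecutive heads, via Bayes' theorem."""
    prior_fair = 1 - prior_unfair
    like_fair = Fraction(1, 2) ** heads   # a fair coin gives heads with prob 1/2 per toss
    like_unfair = Fraction(1)             # the two-headed coin always gives heads
    numerator = like_unfair * prior_unfair
    return numerator / (numerator + like_fair * prior_fair)

# Tim: uniform prior over the three coins, so P(unfair) = 1/3
tim = posterior_unfair(Fraction(1, 3), 3)    # 4/5, i.e. 80%
# Susan: prior of exactly 0 -- no run of heads can ever move her
susan = posterior_unfair(Fraction(0), 3)     # 0
```

Each additional head pushes Tim's posterior closer to 1, while Susan's stays pinned at 0, which is exactly the divergence described above.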
Bayesian convergence (optimistic)
An example of Bayesian convergence of opinion is in Nate Silver's 2012 book ''The Signal and the Noise: Why so many predictions fail — but some don't''. After stating, "Absolutely nothing useful is realized when one person who holds that there is a 0 (zero) percent probability of something argues against another person who holds that the probability is 100 percent", Silver describes a simulation where three investors start out with initial guesses of 10%, 50% and 90% that the stock market is in a bull market; by the end of the simulation (shown in a graph), "all of the investors conclude they are in a bull market with almost (although not exactly of course) 100 percent certainty."
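A minimal sketch of such a simulation is below; the weekly "up" probabilities under each regime are assumed for illustration and are not Silver's actual parameters. Exact fractions are used so that, as in the exact theory, no belief ever reaches 0 or 1:

```python
import random
from fractions import Fraction

def update(p, lik_bull, lik_bear):
    """One Bayesian update of P(bull market) after a single weekly observation."""
    num = lik_bull * p
    return num / (num + lik_bear * (1 - p))

# Assumed weekly probabilities of an 'up' week under each regime
# (illustrative numbers only).
P_UP_BULL, P_UP_BEAR = Fraction(7, 10), Fraction(2, 5)

random.seed(0)
beliefs = [Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)]  # the three investors

for week in range(500):                # the true regime is a bull market
    up = random.random() < 0.7
    for i, p in enumerate(beliefs):
        if up:
            beliefs[i] = update(p, P_UP_BULL, P_UP_BEAR)
        else:
            beliefs[i] = update(p, 1 - P_UP_BULL, 1 - P_UP_BEAR)

# Because no investor started at exactly 0 or 1, every belief stays
# strictly between 0 and 1, yet all three drift toward certainty.
```

Since all three investors multiply their prior odds by the same accumulated likelihood ratio, their ordering is preserved even as their beliefs converge; an investor who had started at exactly 0 or 1 would, by contrast, never move at all.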
See also
* Additive smoothing
* Rule of succession
References
{{reflist}}
Bayesian statistics
Statistical principles