Cox's theorem, named after the physicist Richard Threlkeld Cox, is a derivation of the laws of probability theory from a certain set of postulates. This derivation justifies the so-called "logical" interpretation of probability, as the laws of probability derived by Cox's theorem are applicable to any proposition. Logical (also known as objective Bayesian) probability is a type of Bayesian probability. Other forms of Bayesianism, such as the subjective interpretation, are given other justifications.
Cox's assumptions
Cox wanted his system to satisfy the following conditions:
#Divisibility and comparability – The plausibility of a proposition is a real number and is dependent on information we have related to the proposition.
#Common sense – Plausibilities should vary sensibly with the assessment of plausibilities in the model.
#Consistency – If the plausibility of a proposition can be derived in many ways, all the results must be equal.
The postulates as stated here are taken from Arnborg and Sjödin.
[Stefan Arnborg and Gunnar Sjödin, ''On the foundations of Bayesianism,'' Preprint: Nada, KTH (1999) — ftp://ftp.nada.kth.se/pub/documents/Theory/Stefan-Arnborg/06arnborg.ps — ftp://ftp.nada.kth.se/pub/documents/Theory/Stefan-Arnborg/06arnborg.pdf][Stefan Arnborg and Gunnar Sjödin, ''A note on the foundations of Bayesianism,'' Preprint: Nada, KTH (2000a) — ftp://ftp.nada.kth.se/pub/documents/Theory/Stefan-Arnborg/fobshle.ps — ftp://ftp.nada.kth.se/pub/documents/Theory/Stefan-Arnborg/fobshle.pdf][Stefan Arnborg and Gunnar Sjödin, "Bayes rules in finite models," in ''European Conference on Artificial Intelligence,'' Berlin, (2000b) — ftp://ftp.nada.kth.se/pub/documents/Theory/Stefan-Arnborg/fobc1.ps — ftp://ftp.nada.kth.se/pub/documents/Theory/Stefan-Arnborg/fobc1.pdf]
"Common sense" includes consistency with Aristotelian logic in the sense that logically equivalent propositions shall have the same plausibility.
The postulates as originally stated by Cox were not mathematically rigorous (although more so than the informal description above), as noted by Halpern.
[Joseph Y. Halpern, "A counterexample to theorems of Cox and Fine," ''Journal of AI research,'' 10, 67–85 (1999) — http://www.jair.org/media/536/live-536-2054-jair.ps.Z ][Joseph Y. Halpern, "Technical Addendum, Cox's theorem Revisited," ''Journal of AI research,'' 11, 429–435 (1999) — http://www.jair.org/media/644/live-644-1840-jair.ps.Z ] However, it appears to be possible to augment them with various mathematical assumptions made either implicitly or explicitly by Cox to produce a valid proof.
Cox's notation:
:The plausibility of a proposition <math>A</math> given some related information <math>X</math> is denoted by <math>A \mid X</math>.
Cox's postulates and functional equations are:
*The plausibility of the conjunction <math>AB</math> of two propositions <math>A</math>, <math>B</math>, given some related information <math>X</math>, is determined by the plausibility of <math>A</math> given <math>X</math> and that of <math>B</math> given <math>AX</math>.
:In the form of a functional equation:
::<math>AB \mid X = g(A \mid X,\, B \mid AX)</math>
:Because of the associative nature of the conjunction in propositional logic, the consistency with logic gives a functional equation saying that the function <math>g</math> is an associative binary operation.
*Additionally, Cox postulates the function <math>g</math> to be monotonic.
:All strictly increasing associative binary operations on the real numbers are isomorphic to multiplication of numbers in a subinterval of <math>[0, +\infty]</math>, which means that there is a monotonic function <math>w</math> mapping plausibilities to <math>[0, +\infty]</math> such that
::<math>w(AB \mid X) = w(A \mid X)\, w(B \mid AX)</math>
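*In case <math>A</math> given <math>X</math> is certain, we have <math>AB \mid X = B \mid X</math> and <math>B \mid AX = B \mid X</math> due to the requirement of consistency. The general equation then leads to
::<math>w(B \mid X) = w(A \mid X)\, w(B \mid X)</math>
:This shall hold for any proposition <math>B</math>, which leads to
::<math>w(A \mid X) = 1</math>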
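*In case <math>A</math> given <math>X</math> is impossible, we have <math>AB \mid X = A \mid X</math> and <math>A \mid BX = A \mid X</math> due to the requirement of consistency. The general equation (with the A and B factors switched) then leads to
::<math>w(A \mid X) = w(B \mid X)\, w(A \mid X)</math>
:This shall hold for any proposition <math>B</math>, which, without loss of generality, leads to a solution
::<math>w(A \mid X) = 0</math>
:Due to the requirement of monotonicity, this means that <math>w</math> maps plausibilities to the interval <math>[0, 1]</math>.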
*The plausibility of a proposition determines the plausibility of the proposition's negation.
:This postulates the existence of a function <math>f</math> such that
::<math>w(\neg A \mid X) = f(w(A \mid X))</math>
:Because "a double negative is an affirmative", consistency with logic gives a functional equation
::<math>f(f(x)) = x</math>
:saying that the function <math>f</math> is an involution, i.e., it is its own inverse.
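*Furthermore, Cox postulates the function <math>f</math> to be monotonic.
:The above functional equations and consistency with logic imply that
::<math>w(AB \mid X) = w(A \mid X)\, f(w(\neg B \mid AX)) = w(A \mid X)\, f\!\left(\frac{w(A \neg B \mid X)}{w(A \mid X)}\right)</math>
:Since <math>AB</math> is logically equivalent to <math>BA</math>, we also get
::<math>w(A \mid X)\, f\!\left(\frac{w(A \neg B \mid X)}{w(A \mid X)}\right) = w(B \mid X)\, f\!\left(\frac{w(B \neg A \mid X)}{w(B \mid X)}\right)</math>
:If, in particular, <math>B = \neg(AD)</math> for some proposition <math>D</math>, then also <math>A \neg B = \neg B</math> and <math>B \neg A = \neg A</math> and we get
::<math>w(A \neg B \mid X) = w(\neg B \mid X) = f(w(B \mid X))</math>
:and
::<math>w(B \neg A \mid X) = w(\neg A \mid X) = f(w(A \mid X))</math>
:Abbreviating <math>w(A \mid X) = x</math> and <math>w(B \mid X) = y</math> we get the functional equation
::<math>x\, f\!\left(\frac{f(y)}{x}\right) = y\, f\!\left(\frac{f(x)}{y}\right)</math>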
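Two of the functional equations in this list can be checked numerically. The sketch below is only an illustration under stated assumptions, not part of Cox's argument: it takes addition on a logarithmic plausibility scale as a hypothetical associative, strictly increasing conjunction rule <code>g</code> with rescaling <code>w = exp</code>, and it takes <code>f(x) = (1 - x**m) ** (1/m)</code> as a candidate negation function. It then measures how far the candidate is from satisfying <code>f(f(x)) = x</code> and <code>x f(f(y)/x) = y f(f(x)/y)</code>. All function names and the grid of test points are made up for this example.

```python
import math


def w(a):
    """Hypothetical rescaling: a plausibility a on a log scale maps to exp(a)."""
    return math.exp(a)


def g(a, b):
    """Hypothetical conjunction rule on the log scale: plain addition."""
    return a + b


def f(x, m):
    """Candidate negation function f(x) = (1 - x**m) ** (1/m) on [0, 1]."""
    return (1.0 - x ** m) ** (1.0 / m)


def product_rule_gap(a, b):
    """|w(g(a, b)) - w(a) * w(b)|; zero when w turns g into multiplication."""
    return abs(w(g(a, b)) - w(a) * w(b))


def negation_equation_gap(x, y, m):
    """|x f(f(y)/x) - y f(f(x)/y)| for the candidate f."""
    left = x * f(f(y, m) / x, m)
    right = y * f(f(x, m) / y, m)
    return abs(left - right)


def worst_gaps(m, grid):
    """Largest involution error and functional-equation error over the grid."""
    involution = max(abs(f(f(x, m), m) - x) for x in grid)
    equation = 0.0
    for x in grid:
        for y in grid:
            # Stay inside the domain: f(y)/x <= 1 requires x**m + y**m >= 1.
            # The small margin keeps rounding errors away from the boundary.
            if x ** m + y ** m > 1.05:
                equation = max(equation, negation_equation_gap(x, y, m))
    return involution, equation


GRID = [0.6, 0.7, 0.8, 0.9, 1.0]

if __name__ == "__main__":
    print("product rule gap:", product_rule_gap(-0.3, -1.2))
    for m in (0.5, 1.0, 2.0, 3.0):
        print("m =", m, "gaps:", worst_gaps(m, GRID))
```

For every positive <code>m</code> tried, both gaps should be at the level of floating-point rounding. A numerical check of this kind supports the candidate family but does not show that no other solutions exist.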
Implications of Cox's postulates
The laws of probability derivable from these postulates are the following.
[Edwin Thompson Jaynes, ''Probability Theory: The Logic of Science,'' Cambridge University Press (2003). — preprint version (1996) at ; Chapters 1 to 3 of published version at http://bayes.wustl.edu/etj/prob/book.pdf]
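Let <math>A \mid B</math> be the plausibility of the proposition <math>A</math> given <math>B</math> satisfying Cox's postulates. Then there is a function <math>w</math> mapping plausibilities to the interval <math>[0, 1]</math> and a positive number <math>m</math> such that
# Certainty is represented by <math>w(A \mid B) = 1.</math>
# <math>w^m(A \mid B) + w^m(\neg A \mid B) = 1.</math>
# <math>w(AB \mid C) = w(A \mid C)\, w(B \mid AC) = w(B \mid C)\, w(A \mid BC).</math>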
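The postulates imply only these general properties. We may recover the usual laws of probability by setting a new function, conventionally denoted <math>P</math> or <math>\Pr</math>, equal to <math>w^m</math>. Then we obtain the laws of probability in a more familiar form:
# Certain truth is represented by <math>\Pr(A \mid B) = 1</math>, and certain falsehood by <math>\Pr(A \mid B) = 0.</math>
# <math>\Pr(A \mid B) + \Pr(\neg A \mid B) = 1.</math>
# <math>\Pr(AB \mid C) = \Pr(A \mid C) \Pr(B \mid AC) = \Pr(B \mid C) \Pr(A \mid BC).</math>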
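The relation between the general scale and the familiar one can be tried out on a small finite model. The following is a minimal sketch, assuming four equally likely outcomes, propositions represented as sets of outcomes, <code>Pr(A|B)</code> computed by counting, and a plausibility scale defined as <code>w = Pr ** (1/m)</code>. It measures how far <code>w</code> is from satisfying <code>w^m(A|C) + w^m(not A|C) = 1</code> and <code>w(AB|C) = w(A|C) w(B|AC)</code>. The model and every name in it are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import combinations

OUTCOMES = frozenset(range(4))  # hypothetical model: four equally likely outcomes


def pr(a, b):
    """Conditional probability Pr(A|B) by counting outcomes; B must be non-empty."""
    return Fraction(len(a & b), len(b))


def w(a, b, m):
    """A plausibility scale with w**m == Pr, defined here as Pr ** (1/m)."""
    return float(pr(a, b)) ** (1.0 / m)


def negate(a):
    """The negation of a proposition is its complement in the outcome set."""
    return OUTCOMES - a


def all_propositions():
    """Every subset of OUTCOMES, i.e. every proposition in the model."""
    items = sorted(OUTCOMES)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def max_rule_violation(m):
    """Largest deviation from the negation and conjunction rules for scale w."""
    props = list(all_propositions())
    worst = 0.0
    for c in props:
        if not c:
            continue  # conditioning on a contradiction is left undefined
        for a in props:
            # Negation rule: w^m(A|C) + w^m(not A|C) = 1
            total = w(a, c, m) ** m + w(negate(a), c, m) ** m
            worst = max(worst, abs(total - 1.0))
            if not (a & c):
                continue  # w(B|AC) is undefined when AC is impossible
            for b in props:
                # Conjunction rule: w(AB|C) = w(A|C) * w(B|AC)
                product = w(a, c, m) * w(b, a & c, m)
                worst = max(worst, abs(w(a & b, c, m) - product))
    return worst


if __name__ == "__main__":
    for m in (1.0, 2.0, 3.0):
        print("m =", m, "max violation:", max_rule_violation(m))
```

With <code>m = 1</code> the scale coincides with ordinary probability. For other positive <code>m</code> the conjunction rule keeps its product form while the negation rule holds only for the <code>m</code>-th powers, which is why the choice <code>Pr = w^m</code> is a convention rather than a consequence of the postulates.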
Rule 2 is a rule for negation, and rule 3 is a rule for conjunction. Given that any proposition containing conjunction, disjunction, and negation can be equivalently rephrased using conjunction and negation alone (the conjunctive normal form), we can now handle any compound proposition.
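As a worked instance of handling a compound proposition, the rule for a disjunction can be derived from the other two. The derivation below is a sketch that assumes rule 2 reads Pr(A|C) + Pr(¬A|C) = 1 and rule 3 reads Pr(AB|C) = Pr(A|C) Pr(B|AC), and that rewrites "A or B" as the negation of "not A and not B" (De Morgan's law). It follows the standard textbook route rather than any specific passage of Cox's paper.

```latex
\begin{align*}
\Pr(A \lor B \mid C)
  &= 1 - \Pr(\neg A\, \neg B \mid C)
     && \text{De Morgan, rule 2} \\
  &= 1 - \Pr(\neg A \mid C)\, \Pr(\neg B \mid \neg A\, C)
     && \text{rule 3} \\
  &= 1 - \Pr(\neg A \mid C)\, \bigl[ 1 - \Pr(B \mid \neg A\, C) \bigr]
     && \text{rule 2} \\
  &= \Pr(A \mid C) + \Pr(\neg A\, B \mid C)
     && \text{rules 2 and 3} \\
  &= \Pr(A \mid C) + \Pr(B \mid C)\, \Pr(\neg A \mid B\, C)
     && \text{rule 3} \\
  &= \Pr(A \mid C) + \Pr(B \mid C)\, \bigl[ 1 - \Pr(A \mid B\, C) \bigr]
     && \text{rule 2} \\
  &= \Pr(A \mid C) + \Pr(B \mid C) - \Pr(AB \mid C)
     && \text{rule 3}
\end{align*}
```

When A and B are mutually exclusive given C, the last term vanishes and the result reduces to simple addition of the two probabilities.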
The laws thus derived yield finite additivity of probability, but not countable additivity. The measure-theoretic formulation of Kolmogorov assumes that a probability measure is countably additive. This slightly stronger condition is necessary for the proof of certain theorems.
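The difference between the two conditions can be stated side by side. The block below is a sketch in conventional notation, which is an assumption of this example: the propositions A_1, A_2, ... are taken to be pairwise mutually exclusive given C. Finite additivity covers any finite disjunction; countable additivity extends the same equality to an infinite sequence.

```latex
\begin{align*}
\text{finite additivity:} \quad
  & \Pr(A_1 \lor A_2 \lor \dots \lor A_n \mid C) = \sum_{i=1}^{n} \Pr(A_i \mid C)
  \quad \text{for every finite } n, \\
\text{countable additivity:} \quad
  & \Pr\!\Bigl( \bigvee_{i=1}^{\infty} A_i \Bigm| C \Bigr) = \sum_{i=1}^{\infty} \Pr(A_i \mid C).
\end{align*}
```

The first line follows by repeated use of the two-proposition sum rule; the second does not follow from any finite number of such steps, which is why it has to be assumed separately.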
Interpretation and further discussion
Cox's theorem has come to be used as one of the justifications for the use of Bayesian probability theory. For example, in Jaynes it is discussed in detail in chapters 1 and 2 and is a cornerstone for the rest of the book. Probability is interpreted as a formal system of logic, the natural extension of Aristotelian logic (in which every statement is either true or false) into the realm of reasoning in the presence of uncertainty.
It has been debated to what degree the theorem excludes alternative models for reasoning about uncertainty. For example, if certain "unintuitive" mathematical assumptions were dropped then alternatives could be devised, e.g., an example provided by Halpern.
However, Arnborg and Sjödin suggest additional "common sense" postulates, which would allow the assumptions to be relaxed in some cases while still ruling out the Halpern example. Other approaches were devised by Hardy or Dupré and Tipler.
[Dupré, Maurice J. & Tipler, Frank J. (2009), "New Axioms for Rigorous Bayesian Probability," ''Bayesian Analysis'', 4(3): 599–606.]
The original formulation of Cox's theorem is in , which is extended with additional results and more discussion in . Jaynes
cites Abel for the first known use of the associativity functional equation.
János Aczél provides a long proof of the "associativity equation" (pages 256-267). Jaynes
reproduces the shorter proof by Cox in which differentiability is assumed. A guide to Cox's theorem by Van Horn aims at comprehensively introducing the reader to all these references.
See also
* Probability axioms
* Probability logic
References
Further reading
* Smith, C. Ray; Erickson, Gary (1989). "From Rationality and Consistency to Bayesian Probability". In Skilling, John (ed.), ''Maximum Entropy and Bayesian Methods''. Dordrecht: Kluwer. pp. 29–44. doi:10.1007/978-94-015-7860-8_2. ISBN 0-7923-0224-9.