Personal Probability

Bayesian probability is an interpretation of the concept of probability in which, instead of the frequency or propensity of some phenomenon, probability is interpreted as reasonable expectation representing a state of knowledge or as quantification of a personal belief. The Bayesian interpretation of probability can be seen as an extension of propositional logic that enables reasoning with hypotheses; that is, with propositions whose truth or falsity is unknown. In the Bayesian view, a probability is assigned to a hypothesis, whereas under frequentist inference, a hypothesis is typically tested without being assigned a probability. Bayesian probability belongs to the category of evidential probabilities; to evaluate the probability of a hypothesis, the Bayesian probabilist specifies a prior probability. This, in turn, is then updated to a posterior probability in the light of new, relevant data (evidence). The Bayesian interpretation provides a standard set of procedures and formulae to perform this calculation.

The term ''Bayesian'' derives from the 18th-century mathematician and theologian Thomas Bayes, who provided the first mathematical treatment of a non-trivial problem of statistical data analysis using what is now known as Bayesian inference. Mathematician Pierre-Simon Laplace pioneered and popularized what is now called Bayesian probability.
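
The update from prior to posterior described above is usually written as Bayes' theorem. As a brief sketch, with H standing for a hypothesis and E for the evidence (both symbols are chosen here for illustration and do not appear in the text above):

```latex
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}, \qquad P(E) > 0
```

Here P(H) is the prior probability, P(E | H) is the probability of the evidence if the hypothesis is true, and P(H | E) is the posterior probability.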


Bayesian methodology

Bayesian methods are characterized by concepts and procedures as follows:

* The use of random variables, or more generally unknown quantities, to model all sources of uncertainty in statistical models, including uncertainty resulting from lack of information (see also aleatoric and epistemic uncertainty).
* The need to determine the prior probability distribution taking into account the available (prior) information.
* The sequential use of Bayes' theorem: as more data become available, calculate the posterior distribution using Bayes' theorem; subsequently, the posterior distribution becomes the next prior.
* For the frequentist, a hypothesis is a proposition (which must be either true or false), so the frequentist probability of a hypothesis is either 0 or 1. In Bayesian statistics, the probability assigned to a hypothesis can also lie anywhere in the range from 0 to 1 if the truth value is uncertain.
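
The sequential use of Bayes' theorem can be illustrated with a minimal sketch. It assumes Bernoulli (0/1) data and a beta prior, for which the posterior is again a beta distribution; the function names, the uniform Beta(1, 1) starting prior, and the data are all made up for illustration.

```python
from fractions import Fraction


def update_beta(prior, batch):
    """Return the Beta posterior (alpha, beta) after a batch of 0/1 outcomes."""
    alpha, beta = prior
    successes = sum(batch)
    failures = len(batch) - successes
    return (alpha + successes, beta + failures)


def beta_mean(params):
    """Mean of a Beta(alpha, beta) distribution, as an exact fraction."""
    alpha, beta = params
    return Fraction(alpha, alpha + beta)


# Assumed starting point: a uniform prior Beta(1, 1) and illustrative data.
prior = (1, 1)
batches = [[1, 0, 1], [1, 1, 0, 1], [0, 1]]

belief = prior
for batch in batches:
    # The posterior from this batch becomes the prior for the next one.
    belief = update_beta(belief, batch)
    print(belief, float(beta_mean(belief)))

# Updating once on all the data at once yields the same posterior.
all_data = [x for batch in batches for x in batch]
assert update_beta(prior, all_data) == belief
```

In this conjugate setting, updating batch by batch and updating once on the pooled data give the same result, which is what allows each posterior to serve as the next prior.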


Objective and subjective Bayesian probabilities

Broadly speaking, there are two interpretations of Bayesian probability. For objectivists, who interpret probability as an extension of logic, ''probability'' quantifies the reasonable expectation that everyone (even a "robot") who shares the same knowledge should share in accordance with the rules of Bayesian statistics, which can be justified by Cox's theorem. For subjectivists, ''probability'' corresponds to a personal belief. Rationality and coherence allow for substantial variation within the constraints they pose; the constraints are justified by the Dutch book argument or by decision theory and de Finetti's theorem. The objective and subjective variants of Bayesian probability differ mainly in their interpretation and construction of the prior probability.


History

The term ''Bayesian'' derives from Thomas Bayes (1702–1761), who proved a special case of what is now called Bayes' theorem in a paper titled "An Essay towards solving a Problem in the Doctrine of Chances". In that special case, the prior and posterior distributions were beta distributions and the data came from Bernoulli trials. It was Pierre-Simon Laplace (1749–1827) who introduced a general version of the theorem and used it to approach problems in celestial mechanics, medical statistics, reliability, and jurisprudence. Early Bayesian inference, which used uniform priors following Laplace's principle of insufficient reason, was called "inverse probability" (because it infers backwards from observations to parameters, or from effects to causes). After the 1920s, "inverse probability" was largely supplanted by a collection of methods that came to be called frequentist statistics.

In the 20th century, the ideas of Laplace developed in two directions, giving rise to ''objective'' and ''subjective'' currents in Bayesian practice. Harold Jeffreys' ''Theory of Probability'' (first published in 1939) played an important role in the revival of the Bayesian view of probability, followed by works by Abraham Wald (1950) and Leonard J. Savage (1954). The adjective ''Bayesian'' itself dates to the 1950s; the derived terms ''Bayesianism'' and ''neo-Bayesianism'' are of 1960s coinage. In the objectivist stream, the statistical analysis depends on only the model assumed and the data analysed. No subjective decisions need to be involved. In contrast, "subjectivist" statisticians deny the possibility of fully objective analysis for the general case.

In the 1980s, there was a dramatic growth in research and applications of Bayesian methods, mostly attributed to the discovery of Markov chain Monte Carlo methods and the consequent removal of many of the computational problems, and to an increasing interest in nonstandard, complex applications. While frequentist statistics remains strong (as demonstrated by the fact that much of undergraduate teaching is based on it), Bayesian methods are widely accepted and used, e.g., in the field of machine learning.


Justification of Bayesian probabilities

The use of Bayesian probabilities as the basis of Bayesian inference has been supported by several arguments, such as the Cox axioms, the Dutch book argument, arguments based on decision theory, and de Finetti's theorem.


Axiomatic approach

Richard T. Cox showed that Bayesian updating follows from several axioms, including two functional equations and a hypothesis of differentiability. The assumption of differentiability or even continuity is controversial; Halpern found a counterexample based on his observation that the Boolean algebra of statements may be finite. Other axiomatizations have been suggested by various authors with the purpose of making the theory more rigorous.


Dutch book approach

Bruno de Finetti proposed the Dutch book argument based on betting. A clever bookmaker makes a Dutch book by setting the odds and bets to ensure that the bookmaker profits, at the expense of the gamblers, regardless of the outcome of the event (a horse race, for example) on which the gamblers bet. A Dutch book is possible when the probabilities implied by the odds are not coherent.

However, Ian Hacking noted that traditional Dutch book arguments did not specify Bayesian updating: they left open the possibility that non-Bayesian updating rules could avoid Dutch books. For example, Hacking writes: "And neither the Dutch book argument, nor any other in the personalist arsenal of proofs of the probability axioms, entails the dynamic assumption. Not one entails Bayesianism. So the personalist requires the dynamic assumption to be Bayesian. It is true that in consistency a personalist could abandon the Bayesian model of learning from experience. Salt could lose its savour." In fact, there are non-Bayesian updating rules that also avoid Dutch books (as discussed in the literature on "probability kinematics" following the publication of Richard C. Jeffrey's rule, which is itself regarded as Bayesian). The additional hypotheses sufficient to (uniquely) specify Bayesian updating are substantial and not universally seen as satisfactory.
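
The link between incoherent odds and a guaranteed loss can be made concrete with a small sketch. It assumes a simplified betting setup that is not spelled out in the text: an agent announces a betting quotient for each of several mutually exclusive and exhaustive outcomes, and is willing to buy or sell, at that price, a ticket paying one unit of stake if the outcome occurs. The function names and the three-horse numbers are hypothetical.

```python
from fractions import Fraction


def sure_loss(quotients, stake=1):
    """Guaranteed loss for an agent who will buy or sell, at price q * stake,
    a ticket paying `stake` if the corresponding outcome occurs.

    Outcomes are assumed mutually exclusive and exhaustive, so exactly one
    ticket pays out.
    """
    total = sum(quotients)
    # If total > 1 the opponent sells the agent every ticket; if total < 1
    # the opponent buys every ticket from the agent. Either way the agent
    # loses |total - 1| * stake whichever outcome occurs.
    return abs(total - 1) * stake


def agent_net_by_outcome(quotients, stake=1):
    """Agent's net result for each possible winning outcome."""
    total = sum(quotients)
    if total >= 1:
        # Agent bought every ticket: paid total * stake, collects stake once.
        return [stake - total * stake for _ in quotients]
    # Agent sold every ticket: received total * stake, pays out stake once.
    return [total * stake - stake for _ in quotients]


# Hypothetical three-horse race.
incoherent = [Fraction(1, 2), Fraction(2, 5), Fraction(3, 10)]  # sums to 6/5
coherent = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]    # sums to 1

print(agent_net_by_outcome(incoherent))  # same negative amount for every outcome
print(sure_loss(coherent))               # no guaranteed loss
```

This sketch only covers static coherence, that is, whether the quotients add up to one at a single point in time. It says nothing about how the quotients should change when new evidence arrives.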


Decision theory approach

A decision-theoretic justification of the use of Bayesian inference (and hence of Bayesian probabilities) was given by Abraham Wald, who proved that every admissible statistical procedure is either a Bayesian procedure or a limit of Bayesian procedures. Conversely, under suitable regularity conditions, every Bayesian procedure is admissible.


Personal probabilities and objective methods for constructing priors

Following the work on expected utility theory of Ramsey and von Neumann, decision-theorists have accounted for rational behavior using a probability distribution for the agent. Johann Pfanzagl completed the ''Theory of Games and Economic Behavior'' by providing an axiomatization of subjective probability and utility, a task left uncompleted by von Neumann and Oskar Morgenstern: their original theory supposed that all the agents had the same probability distribution, as a convenience. Pfanzagl's axiomatization was endorsed by Oskar Morgenstern: "Von Neumann and I have anticipated ... [t]he question whether probabilities might, perhaps more typically, be subjective and have stated specifically that in the latter case axioms could be found from which could derive the desired numerical utility together with a number for the probabilities (cf. p. 19 of The Theory of Games and Economic Behavior). We did not carry this out; it was demonstrated by Pfanzagl ... with all the necessary rigor".

Ramsey and Savage noted that the individual agent's probability distribution could be objectively studied in experiments. Procedures for testing hypotheses about probabilities (using finite samples) are due to Ramsey (1931) and de Finetti (1931, 1937, 1964, 1970). Both Bruno de Finetti and Frank P. Ramsey acknowledge their debts to pragmatic philosophy, particularly (for Ramsey) to Charles S. Peirce.

The "Ramsey test" for evaluating probability distributions is implementable in theory, and has kept experimental psychologists occupied for a half century. This work demonstrates that Bayesian-probability propositions can be falsified, and so meet an empirical criterion of Charles S. Peirce, whose work inspired Ramsey. (This falsifiability criterion was popularized by Karl Popper.)

Modern work on the experimental evaluation of personal probabilities uses the randomization, blinding, and Boolean-decision procedures of the Peirce-Jastrow experiment (Peirce & Jastrow, 1885). Since individuals act according to different probability judgments, these agents' probabilities are "personal" (but amenable to objective study).

Personal probabilities are problematic for science and for some applications where decision-makers lack the knowledge or time to specify an informed probability distribution (on which they are prepared to act). To meet the needs of science and of human limitations, Bayesian statisticians have developed "objective" methods for specifying prior probabilities.

Indeed, some Bayesians have argued that the prior state of knowledge defines ''the'' (unique) prior probability distribution for "regular" statistical problems; cf. well-posed problems. Finding the right method for constructing such "objective" priors (for appropriate classes of regular problems) has been the quest of statistical theorists from Laplace to John Maynard Keynes, Harold Jeffreys, and Edwin Thompson Jaynes. These theorists and their successors have suggested several methods for constructing "objective" priors (unfortunately, it is not clear how to assess the relative "objectivity" of the priors proposed under these methods):

* Maximum entropy
* Transformation group analysis
* Reference analysis

Each of these methods contributes useful priors for "regular" one-parameter problems, and each prior can handle some challenging statistical models (with "irregularity" or several parameters). Each of these methods has been useful in Bayesian practice. Indeed, methods for constructing "objective" (alternatively, "default" or "ignorance") priors have been developed by avowed subjective (or "personal") Bayesians like James Berger (Duke University) and José-Miguel Bernardo (Universitat de València), simply because such priors are needed for Bayesian practice, particularly in science. The quest for "the universal method for constructing priors" continues to attract statistical theorists.

Thus, the Bayesian statistician needs either to use informed priors (using relevant expertise or previous data) or to choose among the competing methods for constructing "objective" priors.
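
Of the methods named above, maximum entropy is perhaps the easiest to sketch in a few lines. The example below is an illustration under stated assumptions, not a general recipe: it considers a finite set of values (the faces of a die), a single constraint on the mean, and a fixed search bracket for the multiplier; the function name and the target means are hypothetical.

```python
import math


def maxent_distribution(values, target_mean, tol=1e-12):
    """Maximum-entropy distribution on `values` with a prescribed mean.

    The solution has the exponential form p_i proportional to
    exp(lam * x_i). The mean is increasing in lam, so lam is found by
    bisection on an assumed bracket [-50, 50].
    """
    if not min(values) < target_mean < max(values):
        raise ValueError("target_mean must lie strictly inside the range of values")

    def solve_for(lam):
        exponents = [lam * x for x in values]
        shift = max(exponents)  # subtract the max for numerical stability
        weights = [math.exp(e - shift) for e in exponents]
        z = sum(weights)
        probs = [w / z for w in weights]
        mean = sum(p * x for p, x in zip(probs, values))
        return mean, probs

    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if solve_for(mid)[0] < target_mean:
            lo = mid
        else:
            hi = mid
    return solve_for((lo + hi) / 2)[1]


faces = [1, 2, 3, 4, 5, 6]

# With the mean fixed at the midpoint 3.5, the result is the uniform distribution.
uniform_like = maxent_distribution(faces, 3.5)

# With a higher assumed mean, probability shifts smoothly toward the larger faces.
tilted = maxent_distribution(faces, 4.5)
print([round(p, 4) for p in tilted])
```

When the only information is the set of possible values, the maximum-entropy answer is the uniform distribution, which matches the principle of insufficient reason; adding a constraint tilts the distribution no more than the constraint requires.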


See also

* ''An Essay towards solving a Problem in the Doctrine of Chances''
* Bayesian epistemology
* Bertrand paradox: a paradox in classical probability
* Credal network
* De Finetti's game: a procedure for evaluating someone's subjective probability
* Monty Hall problem
* QBism: an interpretation of quantum mechanics based on subjective Bayesian probability
* Reference class problem


Bibliography

* Goertz, Gary and James Mahoney. 2012. ''A Tale of Two Cultures: Qualitative and Quantitative Research in the Social Sciences''. Princeton University Press.
* Winkler, R.L. 2003. ''Introduction to Bayesian Inference and Decision''. 2nd ed. Probabilistic. ISBN 978-0-9647938-4-2. Updated classic textbook; Bayesian theory clearly presented.