Bayesian probability is an interpretation of the concept of probability in which, instead of the frequency or propensity of some phenomenon, probability is interpreted as a reasonable expectation representing a state of knowledge or as a quantification of a personal belief.
The Bayesian interpretation of probability can be seen as an extension of propositional logic that enables reasoning with hypotheses; that is, with propositions whose truth or falsity is unknown. In the Bayesian view, a probability is assigned to a hypothesis, whereas under frequentist inference, a hypothesis is typically tested without being assigned a probability.
Bayesian probability belongs to the category of evidential probabilities; to evaluate the probability of a hypothesis, the Bayesian probabilist specifies a prior probability. This, in turn, is then updated to a posterior probability in the light of new, relevant data (evidence).
The Bayesian interpretation provides a standard set of procedures and formulae to perform this calculation.
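For reference, the update from prior to posterior is conventionally written as Bayes' theorem. The symbols below are notation chosen here for illustration: H is a hypothesis, E is the evidence, and the H_i are assumed to be mutually exclusive and exhaustive hypotheses.

```latex
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)},
\qquad
P(E) = \sum_{i} P(E \mid H_i)\, P(H_i)
```

Here P(H) is the prior probability, P(E | H) is the likelihood of the evidence under the hypothesis, and P(H | E) is the posterior probability.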
The term ''Bayesian'' derives from the 18th-century mathematician and theologian Thomas Bayes, who provided the first mathematical treatment of a non-trivial problem of statistical data analysis using what is now known as Bayesian inference. The mathematician Pierre-Simon Laplace pioneered and popularized what is now called Bayesian probability.
Bayesian methodology
Bayesian methods are characterized by concepts and procedures as follows:
* The use of random variables, or more generally unknown quantities, to model all sources of uncertainty in statistical models, including uncertainty resulting from lack of information (see also aleatoric and epistemic uncertainty).
* The need to determine the prior probability distribution taking into account the available (prior) information.
* The sequential use of Bayes' theorem: as more data become available, calculate the posterior distribution using Bayes' theorem; subsequently, the posterior distribution becomes the next prior.
* For the frequentist, a hypothesis is a proposition (which must be either true or false), so the frequentist probability of a hypothesis is either 0 or 1. In Bayesian statistics, by contrast, the probability assigned to a hypothesis can take any value from 0 to 1 when its truth value is uncertain.
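The sequential use of Bayes' theorem described in the list above can be sketched as follows. This is a minimal illustration, not a standard implementation: the two hypotheses ("fair" and "biased"), their heads probabilities, and the observed sequence are all hypothetical values chosen for the example.

```python
from fractions import Fraction

def update(prior, likelihoods):
    """Apply Bayes' theorem once over a discrete set of hypotheses.

    prior:       dict hypothesis -> P(H)
    likelihoods: dict hypothesis -> P(datum | H)
    """
    unnormalized = {h: prior[h] * likelihoods[h] for h in prior}
    evidence = sum(unnormalized.values())  # P(datum)
    return {h: w / evidence for h, w in unnormalized.items()}

# Hypothetical setup: the coin is either fair or biased towards heads.
HEADS_PROB = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}

def likelihood_of(outcome):
    """P(outcome | H) for each hypothesis; outcome is 'H' or 'T'."""
    return {h: (p if outcome == "H" else 1 - p) for h, p in HEADS_PROB.items()}

belief = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}  # initial prior
for outcome in ["H", "H", "T", "H"]:
    # The posterior after each datum becomes the prior for the next one.
    belief = update(belief, likelihood_of(outcome))

print(belief)
```

Exact fractions are used so that the result can be checked by hand: updating one observation at a time gives the same posterior as a single update on all four observations.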
Objective and subjective Bayesian probabilities
Broadly speaking, there are two interpretations of Bayesian probability. For objectivists, who interpret probability as an extension of logic, ''probability'' quantifies the reasonable expectation that everyone (even a "robot") who shares the same knowledge should share in accordance with the rules of Bayesian statistics, which can be justified by Cox's theorem. For subjectivists, ''probability'' corresponds to a personal belief. Rationality and coherence allow for substantial variation within the constraints they pose; the constraints are justified by the Dutch book argument or by decision theory and de Finetti's theorem.
The objective and subjective variants of Bayesian probability differ mainly in their interpretation and construction of the prior probability.
History
The term ''Bayesian'' derives from Thomas Bayes (c. 1701–1761), who proved a special case of what is now called Bayes' theorem in a paper titled "An Essay towards solving a Problem in the Doctrine of Chances". In that special case, the prior and posterior distributions were beta distributions and the data came from Bernoulli trials. It was Pierre-Simon Laplace (1749–1827) who introduced a general version of the theorem and used it to approach problems in celestial mechanics, medical statistics, reliability, and jurisprudence. Early Bayesian inference, which used uniform priors following Laplace's principle of insufficient reason, was called "inverse probability" (because it infers backwards from observations to parameters, or from effects to causes).
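The special case mentioned above, a beta prior updated with data from Bernoulli trials, can be illustrated with a short sketch. The function names and the sequence of trials are hypothetical; the uniform prior is written as Beta(1, 1), which is the standard parameterization of a flat density on the unit interval.

```python
from fractions import Fraction

def beta_update(alpha, beta, trials):
    """Update a Beta(alpha, beta) prior with Bernoulli data (1 = success, 0 = failure).

    The posterior is again a beta distribution: successes are added to alpha
    and failures to beta.
    """
    successes = sum(trials)
    failures = len(trials) - successes
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return Fraction(alpha, alpha + beta)

# Uniform prior, in the spirit of the principle of insufficient reason.
trials = [1, 1, 0, 1, 1, 1, 0, 1]  # hypothetical data: 6 successes, 2 failures
alpha_post, beta_post = beta_update(1, 1, trials)

print(alpha_post, beta_post, beta_mean(alpha_post, beta_post))
```

With a uniform prior, the posterior mean equals (successes + 1) / (trials + 2), the expression usually associated with Laplace's rule of succession.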
After the 1920s, "inverse probability" was largely supplanted by a collection of methods that came to be called frequentist statistics.
In the 20th century, the ideas of Laplace developed in two directions, giving rise to ''objective'' and ''subjective'' currents in Bayesian practice. Harold Jeffreys' ''Theory of Probability'' (first published in 1939) played an important role in the revival of the Bayesian view of probability, followed by works by Abraham Wald (1950) and Leonard J. Savage (1954). The adjective ''Bayesian'' itself dates to the 1950s; the derived terms ''Bayesianism'' and ''neo-Bayesianism'' are of 1960s coinage. In the objectivist stream, the statistical analysis depends only on the model assumed and the data analysed; no subjective decisions need to be involved. In contrast, "subjectivist" statisticians deny the possibility of fully objective analysis for the general case.
In the 1980s, there was a dramatic growth in research and applications of Bayesian methods, mostly attributed to the discovery of Markov chain Monte Carlo methods and the consequent removal of many of the computational problems, and to an increasing interest in nonstandard, complex applications. While frequentist statistics remains strong (as demonstrated by the fact that much of undergraduate teaching is based on it), Bayesian methods are widely accepted and used, e.g., in the field of machine learning.
Justification of Bayesian probabilities
The use of Bayesian probabilities as the basis of Bayesian inference has been supported by several arguments, such as the Cox axioms, the Dutch book argument, arguments based on decision theory, and de Finetti's theorem.
Axiomatic approach
Richard T. Cox showed that Bayesian updating follows from several axioms, including two functional equations and a hypothesis of differentiability. The assumption of differentiability or even continuity is controversial; Halpern found a counterexample based on his observation that the Boolean algebra of statements may be finite. Other axiomatizations have been suggested by various authors with the purpose of making the theory more rigorous.
Dutch book approach
Bruno de Finetti proposed the Dutch book argument based on betting. A clever bookmaker makes a Dutch book by setting the odds and bets to ensure that the bookmaker profits, at the expense of the gamblers, regardless of the outcome of the event (a horse race, for example) on which the gamblers bet. A Dutch book is possible when the probabilities implied by the odds are not coherent.
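A small numerical sketch may make the link between incoherence and a guaranteed loss concrete. The setup is an assumption made for illustration: an agent states the price it regards as fair for a ticket that pays 1 if a given outcome occurs, and then buys one ticket for every outcome. The outcome names and prices are hypothetical.

```python
from fractions import Fraction

def net_gain_if_agent_buys_all(prices):
    """Net gain to an agent who buys one ticket per outcome at its stated price.

    prices maps each outcome (mutually exclusive and exhaustive) to the price
    the agent considers fair for a ticket paying 1 if that outcome occurs.
    Exactly one ticket pays out, so the net gain is the same for every outcome.
    """
    total_paid = sum(prices.values())
    return {outcome: 1 - total_paid for outcome in prices}

# Incoherent prices: the implied probabilities sum to 6/5 rather than 1.
incoherent = {"rain": Fraction(3, 5), "no rain": Fraction(3, 5)}
# Coherent prices: the implied probabilities sum to 1.
coherent = {"rain": Fraction(3, 5), "no rain": Fraction(2, 5)}

loss = net_gain_if_agent_buys_all(incoherent)
fair = net_gain_if_agent_buys_all(coherent)
print(loss)  # the agent loses the same amount whichever outcome occurs
print(fair)  # no guaranteed gain or loss
```

If the stated prices summed to less than 1, the bookmaker in this sketch would take the other side and buy the tickets from the agent, again securing a profit whatever the outcome.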
However, Ian Hacking noted that traditional Dutch book arguments did not specify Bayesian updating: they left open the possibility that non-Bayesian updating rules could avoid Dutch books. For example, Hacking writes "And neither the Dutch book argument, nor any other in the personalist arsenal of proofs of the probability axioms, entails the dynamic assumption. Not one entails Bayesianism. So the personalist requires the dynamic assumption to be Bayesian. It is true that in consistency a personalist could abandon the Bayesian model of learning from experience. Salt could lose its savour."
In fact, there are non-Bayesian updating rules that also avoid Dutch books (as discussed in the literature on "probability kinematics" following the publication of Richard C. Jeffrey's rule, which is itself regarded as Bayesian). The additional hypotheses sufficient to (uniquely) specify Bayesian updating are substantial and not universally seen as satisfactory.
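As an illustration of Jeffrey's rule, the sketch below assumes its usual statement: when experience shifts the probabilities of a partition of evidence events without making any of them certain, the probabilities conditional on each event are kept fixed and reweighted by the new probabilities of the partition. The joint distribution, the labels, and the function name are hypothetical.

```python
from fractions import Fraction as F

def jeffrey_update(joint, new_marginal):
    """Jeffrey conditioning on a partition of evidence events.

    joint:        dict (a, e) -> P(a, e) before the update
    new_marginal: dict e -> new probability of evidence event e
    Each P(a | e) is preserved; only the weights of the events e change.
    """
    old_marginal = {}
    for (a, e), p in joint.items():
        old_marginal[e] = old_marginal.get(e, 0) + p
    return {
        (a, e): p / old_marginal[e] * new_marginal[e]
        for (a, e), p in joint.items()
    }

# Hypothetical joint distribution over a hypothesis and an uncertain observation.
joint = {
    ("guilty", "match"): F(3, 10),
    ("guilty", "no match"): F(1, 10),
    ("innocent", "match"): F(1, 10),
    ("innocent", "no match"): F(5, 10),
}

# The observation raises P(match) from 4/10 to 8/10 without making it certain.
updated = jeffrey_update(joint, {"match": F(8, 10), "no match": F(2, 10)})
p_guilty = sum(p for (a, _), p in updated.items() if a == "guilty")
print(p_guilty)
```

When the new probability of one evidence event is set to 1, the same function reduces to ordinary conditioning on that event.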
Decision theory approach
A decision-theoretic justification of the use of Bayesian inference (and hence of Bayesian probabilities) was given by Abraham Wald, who proved that every admissible statistical procedure is either a Bayesian procedure or a limit of Bayesian procedures. Conversely, every Bayesian procedure is admissible.
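To illustrate what a Bayesian procedure means in this setting, the following minimal sketch picks the action with the smallest posterior expected loss. The states, actions, and loss values are hypothetical, and the sketch illustrates the definition only; it does not reproduce Wald's result.

```python
from fractions import Fraction

def expected_loss(action, posterior, loss):
    """Posterior expected loss of one action."""
    return sum(posterior[state] * loss[action][state] for state in posterior)

def bayes_action(posterior, loss):
    """Return the action that minimizes posterior expected loss.

    posterior: dict state -> posterior probability
    loss:      dict action -> dict state -> loss incurred
    """
    return min(loss, key=lambda action: expected_loss(action, posterior, loss))

# Hypothetical loss table: treating a healthy patient costs 1,
# failing to treat a diseased patient costs 10.
loss = {
    "treat": {"disease": 0, "healthy": 1},
    "wait": {"disease": 10, "healthy": 0},
}

likely = {"disease": Fraction(1, 5), "healthy": Fraction(4, 5)}
unlikely = {"disease": Fraction(1, 20), "healthy": Fraction(19, 20)}

print(bayes_action(likely, loss))    # expected losses: treat 4/5, wait 2
print(bayes_action(unlikely, loss))  # expected losses: treat 19/20, wait 1/2
```

The chosen action changes as the posterior changes, which is the sense in which the decision depends on the state of knowledge rather than on the data alone.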
Personal probabilities and objective methods for constructing priors
Following the work on expected utility theory of Ramsey and von Neumann, decision-theorists have accounted for rational behavior using a probability distribution for the agent. Johann Pfanzagl completed the ''Theory of Games and Economic Behavior'' by providing an axiomatization of subjective probability and utility, a task left uncompleted by von Neumann and Oskar Morgenstern: their original theory supposed that all the agents had the same probability distribution, as a convenience. Pfanzagl's axiomatization was endorsed by Oskar Morgenstern: "Von Neumann and I have anticipated ... [t]he question whether probabilities might, perhaps more typically, be subjective and have stated specifically that in the latter case axioms could be found from which could derive the desired numerical utility together with a number for the probabilities (cf. p. 19 of The Theory of Games and Economic Behavior). We did not carry this out; it was demonstrated by Pfanzagl ... with all the necessary rigor".
Ramsey and Savage noted that the individual agent's probability distribution could be objectively studied in experiments. Procedures for testing hypotheses about probabilities (using finite samples) are due to Ramsey (1931) and de Finetti (1931, 1937, 1964, 1970). Both Bruno de Finetti and Frank P. Ramsey acknowledge their debts to pragmatic philosophy, particularly (for Ramsey) to Charles S. Peirce.
The "Ramsey test" for evaluating probability distributions is implementable in theory, and has kept experimental psychologists occupied for a half century.
This work demonstrates that Bayesian-probability propositions can be falsified, and so meet an empirical criterion of Charles S. Peirce, whose work inspired Ramsey. (This falsifiability-criterion was popularized by Karl Popper.)
Modern work on the experimental evaluation of personal probabilities uses the randomization, blinding, and Boolean-decision procedures of the Peirce-Jastrow experiment (Peirce & Jastrow, 1885). Since individuals act according to different probability judgments, these agents' probabilities are "personal" (but amenable to objective study).
Personal probabilities are problematic for science and for some applications where decision-makers lack the knowledge or time to specify an informed probability-distribution (on which they are prepared to act). To meet the needs of science and of human limitations, Bayesian statisticians have developed "objective" methods for specifying prior probabilities.
Indeed, some Bayesians have argued the prior state of knowledge defines ''the'' (unique) prior probability-distribution for "regular" statistical problems; cf. well-posed problems. Finding the right method for constructing such "objective" priors (for appropriate classes of regular problems) has been the quest of statistical theorists from Laplace to John Maynard Keynes, Harold Jeffreys, and Edwin Thompson Jaynes. These theorists and their successors have suggested several methods for constructing "objective" priors, although it is not clear how to assess the relative "objectivity" of the priors proposed under these methods:
* Maximum entropy
* Transformation group analysis
* Reference analysis
Each of these methods contributes useful priors for "regular" one-parameter problems, and each prior can handle some challenging statistical models (with "irregularity" or several parameters). Each of these methods has been useful in Bayesian practice. Indeed, methods for constructing "objective" (alternatively, "default" or "ignorance") priors have been developed by avowed subjective (or "personal") Bayesians like James Berger (Duke University) and José-Miguel Bernardo (Universitat de València), simply because such priors are needed for Bayesian practice, particularly in science. The quest for "the universal method for constructing priors" continues to attract statistical theorists.
Thus, the Bayesian statistician needs either to use informed priors (using relevant expertise or previous data) or to choose among the competing methods for constructing "objective" priors.
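The practical effect of choosing among competing "objective" priors can be sketched for the simplest regular problem, a single Bernoulli parameter. The two candidates used here are assumptions chosen for illustration: the uniform prior Beta(1, 1) and the Jeffreys prior Beta(1/2, 1/2), which is the form commonly given for this model. The data counts are hypothetical.

```python
from fractions import Fraction

# Candidate default priors for a Bernoulli success probability,
# written as (alpha, beta) parameters of a beta distribution.
PRIORS = {
    "uniform": (Fraction(1), Fraction(1)),
    "jeffreys": (Fraction(1, 2), Fraction(1, 2)),
}

def posterior_mean(prior, successes, failures):
    """Posterior mean of the success probability under a beta prior."""
    alpha, beta = prior
    return (alpha + successes) / (alpha + beta + successes + failures)

def compare(successes, failures):
    """Posterior mean under each candidate prior for the same data."""
    return {
        name: posterior_mean(prior, successes, failures)
        for name, prior in PRIORS.items()
    }

small = compare(1, 0)      # a single observed success
large = compare(600, 400)  # a thousand observations

print(small)
print(large)
```

With one observation the two priors give visibly different answers, while with a thousand observations the difference is negligible, which is one reason the choice of default prior matters most when data are scarce.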
See also
* ''An Essay towards solving a Problem in the Doctrine of Chances''
* Bayesian epistemology
* Bertrand paradox, a paradox in classical probability
* Credal network
* De Finetti's game, a procedure for evaluating someone's subjective probability
* Monty Hall problem
* QBism, an interpretation of quantum mechanics based on subjective Bayesian probability
* Reference class problem
References
Bibliography
* (translation of de Finetti, 1931)
* (translation of de Finetti, 1937, above)
* , , two volumes.
* Goertz, Gary and James Mahoney. 2012. ''A Tale of Two Cultures: Qualitative and Quantitative Research in the Social Sciences''. Princeton University Press.
* Winkler, R.L. 2003. ''Introduction to Bayesian Inference and Decision'' (2nd ed.). Probabilistic. ISBN 978-0-9647938-4-2. Updated classic textbook; Bayesian theory clearly presented.