Probabilistic causation is a concept in a group of philosophical theories that aim to characterize the relationship between cause and effect using the tools of probability theory. The central idea behind these theories is that causes raise the probabilities of their effects, all else being equal.

Deterministic versus probabilistic theory

Interpreting causation as a deterministic relation means that if ''A'' causes ''B'', then ''A'' must ''always'' be followed by ''B''. In this strict sense, war does not cause deaths, nor does smoking cause cancer, since neither is invariably followed by its supposed effect. As a result, many turn to a notion of probabilistic causation. Informally, ''A'' probabilistically causes ''B'' if ''A'''s occurrence increases the probability of ''B''. This is sometimes interpreted to reflect imperfect knowledge of a deterministic system, and at other times to mean that the causal system under study has an inherently indeterministic nature. (Propensity probability is an analogous idea, according to which probabilities have an objective existence and are not just limitations in a subject's knowledge.)

Philosophers such as Hugh Mellor and Patrick Suppes [Suppes, P. (1970). ''A Probabilistic Theory of Causality'', Amsterdam: North-Holland Publishing] have defined causation in terms of a cause preceding and increasing the probability of the effect. (Additionally, Mellor claims that cause and effect are both facts, not events, since even a non-event, such as the failure of a train to arrive, can cause effects such as my taking the bus. Suppes, by contrast, relies on events defined set-theoretically, and much of his discussion is informed by this terminology.)

Pearl [Pearl, Judea (2000). ''Causality: Models, Reasoning, and Inference'', Cambridge University Press] argues that the entire enterprise of probabilistic causation has been misguided from the very beginning, because the central notion that causes "raise the probabilities" of their effects cannot be expressed in the language of probability theory. In particular, the inequality ''Pr(effect | cause) > Pr(effect | ~cause)'', which philosophers invoked to define causation, as well as its many variations and nuances, fails to capture the intuition behind "probability raising", which is inherently a manipulative or counterfactual notion. The correct formulation, according to Pearl, should read:

''Pr(effect | do(cause)) > Pr(effect | do(~cause))''

where ''do(C)'' stands for an external intervention that compels the truth of ''C''. The conditional probability ''Pr(E | C)'', in contrast, represents a probability resulting from a passive observation of ''C'', and rarely coincides with ''Pr(E | do(C))''. Indeed, observing the barometer falling increases the probability of a storm coming, but does not "cause" the storm; were the act of manipulating the barometer to change the probability of storms, the falling barometer would qualify as a cause of storms. In general, formulating the notion of "probability raising" within the calculus of ''do''-operators resolves the difficulties that probabilistic causation has encountered in the past half-century [Cartwright, N. (1989). ''Nature's Capacities and Their Measurement'', Clarendon Press, Oxford; Eells, E. (1991). ''Probabilistic Causality'', Cambridge University Press, Cambridge], among them the infamous Simpson's paradox, and clarifies precisely what relationships exist between probabilities and causation.

Establishing cause and effect, even on this relaxed reading, is notoriously difficult, as expressed by the widely accepted statement "correlation does not imply causation". For instance, the observation that smokers have a dramatically increased lung cancer rate does not establish that smoking must be a ''cause'' of that increased cancer rate: perhaps a certain genetic defect causes both cancer and a craving for nicotine, or perhaps nicotine craving is a symptom of very early-stage lung cancer that is not otherwise detectable. Scientists are always seeking the exact mechanisms by which event ''A'' produces event ''B''. But scientists are also comfortable making a statement like "Smoking probably causes cancer" when the statistical correlation between the two, according to probability theory, is far greater than chance. In this dual approach, scientists accept both deterministic and probabilistic causation in their terminology.

In statistics, it is generally accepted that observational studies (like counting cancer cases among smokers and among non-smokers and then comparing the two) can give hints, but cannot by themselves ''establish'' cause and effect. Often, however, qualitative causal assumptions (e.g., absence of causation between some variables) may permit the derivation of consistent causal effect estimates from observational studies. The gold standard for causation here is the ''randomized experiment'': take a large number of people, randomly divide them into two groups, force one group to smoke and prohibit the other group from smoking, then determine whether one group develops a significantly higher lung cancer rate. Random assignment plays a crucial role in the inference to causation because, in the long run, it renders the two groups equivalent in terms of all other possible influences on the outcome (cancer), so that any changes in the outcome will reflect only the manipulation (smoking). Obviously, for ethical reasons this experiment cannot be performed, but the method is widely applicable for less damaging experiments. One limitation of experiments, however, is that whereas they do a good job of testing for the presence of some causal effect, they do less well at estimating the size of that effect in a population of interest. (This is a common criticism of studies of the safety of food additives that use doses much higher than people consuming the product would actually ingest.)
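The gap between passively observing a variable and intervening on it can be illustrated with a small simulation of the barometer example. This is only a sketch under invented assumptions: the function name `storm_rates`, the 30% rate of low pressure, the storm probabilities, and the 5% barometer error are all hypothetical and do not come from the cited authors. The intervention is modelled as a coin flip that sets the barometer, which is also how random assignment works in a randomized experiment.

```python
import random

def storm_rates(n=200_000, seed=0):
    """Return (observed, intervened) storm rates, each keyed by barometer state."""
    rng = random.Random(seed)
    # counts[regime][barometer_falling] = [storms, trials]
    counts = {"see": {True: [0, 0], False: [0, 0]},
              "do": {True: [0, 0], False: [0, 0]}}

    def weather():
        # Hypothetical numbers: low pressure on 30% of days, and it drives storms.
        low = rng.random() < 0.30
        storm = rng.random() < (0.80 if low else 0.10)
        return low, storm

    for _ in range(n):
        # Passive observation: the barometer tracks pressure (5% misreads).
        low, storm = weather()
        falling = low if rng.random() < 0.95 else not low
        counts["see"][falling][0] += storm
        counts["see"][falling][1] += 1

        # Intervention do(barometer): a coin flip sets the needle,
        # cutting its link to pressure, as random assignment would.
        low, storm = weather()
        forced = rng.random() < 0.5
        counts["do"][forced][0] += storm
        counts["do"][forced][1] += 1

    def rate(regime):
        return {state: s / t for state, (s, t) in counts[regime].items()}

    return rate("see"), rate("do")

if __name__ == "__main__":
    see, do = storm_rates()
    print(f"Pr(storm | falling)      = {see[True]:.3f}")
    print(f"Pr(storm | ~falling)     = {see[False]:.3f}")
    print(f"Pr(storm | do(falling))  = {do[True]:.3f}")
    print(f"Pr(storm | do(~falling)) = {do[False]:.3f}")
```

Under these assumed numbers the observed storm rate is much higher when the barometer is falling, while the two interventional rates agree up to sampling noise, so the barometer raises the probability of a storm only in the observational sense.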

Closed versus open systems

In a closed system the data may suggest that cause A * B precedes effect C in a defined interval of time τ. This relationship can determine causality with confidence bounded by τ. However, the same relationship may not be deterministic with confidence in an open system, where uncontrolled factors may affect the result [Markov Condition: ''Interpretations of Philosophy''].

An example would be a system of A, B and C, where A, B and C are known. Its characteristics are listed below and are limited to a given time (such as 50 ms, or 50 hours), reading * as conjunction and ^ as negation:

^A * ^B => ^C (99.9999998027%)
A * ^B => ^C (99.9999998027%)
^A * B => ^C (99.9999998027%)
A * B => C (99.9999998027%)

One can reasonably claim, within 6 standard deviations, that A * B causes C given the time boundary (such as 50 ms, or 50 hours) if and only if A, B and C are the only parts of the system in question. Any result outside of this may be considered a deviation.
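The figure 99.9999998027% matches the probability mass lying within 6 standard deviations of the mean of a normal distribution. The source does not name the distribution, so the normal model in this short check is an assumption, and the helper names `coverage` and `truth_table` are hypothetical. The sketch also enumerates the four cases of the example system, with C occurring only when both A and B occur.

```python
import math
from itertools import product

def coverage(k):
    """Two-sided probability mass within k standard deviations of a normal mean."""
    return 1.0 - math.erfc(k / math.sqrt(2))

def truth_table():
    """Rows (A, B, C) for the closed system: C occurs only when A and B both do."""
    return [(a, b, a and b) for a, b in product([False, True], repeat=2)]

if __name__ == "__main__":
    print(f"within 6 standard deviations: {100 * coverage(6):.10f}%")
    for a, b, c in truth_table():
        print(f"A={a!s:5} B={b!s:5} => C={c}")
```

The check only reproduces the stated confidence level. It says nothing about whether A, B and C really are the only parts of the system, which is the condition the claim depends on.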

References

* Hitchcock, Christopher. "Probabilistic Causation". ''Stanford Encyclopedia of Philosophy''. https://plato.stanford.edu/entries/causation-probabilistic/