The base rate fallacy, also called base rate neglect or base rate bias, is a type of fallacy in which people tend to ignore the base rate (i.e., general prevalence) in favor of the individuating information (i.e., information pertaining only to a specific case).
Base rate neglect is a specific form of the more general extension neglect.
False positive paradox
An example of the base rate fallacy is the false positive paradox. This paradox describes situations where there are more false positive test results than true positives. For example, if a facial recognition camera can identify wanted criminals 99% accurately but analyzes 10,000 people a day, the high accuracy is outweighed by the number of tests, and the program's list of criminals will likely have far more false positives than true positives. The probability of a positive test result is determined not only by the accuracy of the test but also by the characteristics of the sampled population. When the prevalence, the proportion of those who have a given condition, is lower than the test's false positive rate, even tests that have a very low risk of giving a false positive ''in an individual case'' will give more false than true positives ''overall''.
The paradox surprises most people.
It is especially counter-intuitive when interpreting a positive result in a test on a low-prevalence population after having dealt with positive results drawn from a high-prevalence population.
If the false positive rate of the test is higher than the proportion of the ''new'' population with the condition, then a test administrator whose experience has been drawn from testing in a high-prevalence population may conclude from experience that a positive test result usually indicates a positive subject, when in fact a false positive is far more likely to have occurred.
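The crossover described above can be sketched numerically. The 1% false positive rate, population size, and prevalence values below are illustrative assumptions, not figures from the article:

```python
# Illustrative sketch: once prevalence drops below the false positive rate,
# a positive result is more likely to be false than true.
def expected_positives(n, prevalence, false_positive_rate):
    true_pos = n * prevalence                           # no false negatives assumed
    false_pos = n * (1 - prevalence) * false_positive_rate
    return true_pos, false_pos

# False positive rate of 1%; compare prevalences above and below that rate.
for prev in (0.05, 0.01, 0.001):
    tp, fp = expected_positives(100_000, prev, 0.01)
    print(f"prevalence {prev:.1%}: {tp:.0f} true vs {fp:.0f} false positives")
```

With 5% prevalence the true positives dominate; at 0.1% prevalence the false positives outnumber them roughly ten to one.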
Examples
Example 1: Disease
High-incidence population
Imagine running an infectious disease test on a population ''A'' of 1000 persons, of which 40% are infected. The test has a false positive rate of 5% (0.05) and no false negative rate. The
expected outcome of the 1000 tests on population ''A'' would be:
:Infected and test indicates disease (true positive)
::1000 × 0.40 = 400 people would receive a true positive
:Uninfected and test indicates disease (false positive)
::1000 × 0.60 × 0.05 = 30 people would receive a false positive
:The remaining 570 tests are correctly negative.
So, in population ''A'', a person receiving a positive test could be over 93% confident (400/430 ≈ 93%) that it correctly indicates infection.
Low-incidence population
Now consider the same test applied to population ''B'', of which only 2% are infected. The
expected outcome of 1000 tests on population ''B'' would be:
:Infected and test indicates disease (true positive)
::1000 × 0.02 = 20 people would receive a true positive
:Uninfected and test indicates disease (false positive)
::1000 × 0.98 × 0.05 = 49 people would receive a false positive
:The remaining 931 tests are correctly negative.
In population ''B'', only 20 of the 69 total people with a positive test result are actually infected. So, the probability of actually being infected after one is told that one is infected is only 29% (20/69 ≈ 29%) for a test that otherwise appears to be "95% accurate".
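The expected-outcome tallies for the two populations can be checked with a short calculation; this is a sketch reusing the example's numbers (1000 people, a 5% false positive rate, no false negatives):

```python
# Expected test outcomes for a population with a given infected fraction.
def outcomes(n, infected_fraction, false_positive_rate):
    true_pos = n * infected_fraction                        # test never misses an infection
    false_pos = n * (1 - infected_fraction) * false_positive_rate
    confidence = true_pos / (true_pos + false_pos)          # P(infected | positive test)
    return true_pos, false_pos, confidence

for name, frac in [("A", 0.40), ("B", 0.02)]:
    tp, fp, conf = outcomes(1000, frac, 0.05)
    print(f"Population {name}: {tp:.0f} true, {fp:.0f} false positives; "
          f"P(infected | positive) = {conf:.1%}")
```

Population ''A'' gives 400 true against 30 false positives (about 93%), while population ''B'' gives 20 true against 49 false positives (about 29%), matching the figures above.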
A tester with experience of group ''A'' might find it a paradox that in group ''B'', a result that had usually correctly indicated infection is now usually a false positive. The confusion of the posterior probability of infection with the prior probability of receiving a false positive is a natural error after receiving a health-threatening test result.
Example 2: Drunk drivers
: A group of police officers have breathalyzers displaying false drunkenness in 5% of the cases in which the driver is sober. However, the breathalyzers never fail to detect a truly drunk person. One in a thousand drivers is driving drunk. Suppose the police officers then stop a driver at random to administer a breathalyzer test. It indicates that the driver is drunk. We assume you do not know anything else about them. How high is the probability they really are drunk?
Many would answer as high as 95%, but the correct probability is about 2%.
An explanation for this is as follows: on average, for every 1,000 drivers tested,
* 1 driver is drunk, and it is 100% certain that for that driver there is a ''true'' positive test result, so there is 1 ''true'' positive test result
* 999 drivers are not drunk, and among those drivers there are 5% ''false'' positive test results, so there are 49.95 ''false'' positive test results
Therefore, the probability that one of the drivers among the 1 + 49.95 = 50.95 positive test results really is drunk is
:<math>\frac{1}{50.95} \approx 0.019627.</math>
The validity of this result does, however, hinge on the validity of the initial assumption that the police officer stopped the driver truly at random, and not because of bad driving. If that or another non-arbitrary reason for stopping the driver was present, then the calculation also involves the probability of a drunk driver driving competently and a non-drunk driver driving (in-)competently.
More formally, the same probability of roughly 0.02 can be established using Bayes's theorem. The goal is to find the probability that the driver is drunk given that the breathalyzer indicated they are drunk, which can be represented as
:<math>p(\mathrm{drunk}\mid D),</math>
where ''D'' means that the breathalyzer indicates that the driver is drunk. Bayes's theorem tells us that
:<math>p(\mathrm{drunk}\mid D) = \frac{p(D \mid \mathrm{drunk})\, p(\mathrm{drunk})}{p(D)}.</math>
We were told the following in the first paragraph:
:<math>p(\mathrm{drunk}) = 0.001,</math>
:<math>p(\mathrm{sober}) = 0.999,</math>
:<math>p(D \mid \mathrm{drunk}) = 1.00,</math>
and
:<math>p(D \mid \mathrm{sober}) = 0.05.</math>
As you can see from the formula, one needs ''p''(''D'') for Bayes' theorem, which one can compute from the preceding values using the law of total probability:
:<math>p(D) = p(D \mid \mathrm{drunk})\, p(\mathrm{drunk}) + p(D \mid \mathrm{sober})\, p(\mathrm{sober}),</math>
which gives
:<math>p(D) = (1.00 \times 0.001) + (0.05 \times 0.999) = 0.05095.</math>
Plugging these numbers into Bayes' theorem, one finds that
:<math>p(\mathrm{drunk}\mid D) = \frac{1.00 \times 0.001}{0.05095} \approx 0.019627.</math>
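The same Bayes'-theorem derivation can be written out programmatically, using the probabilities stated in the problem:

```python
# Bayes' theorem applied to the drunk-driver numbers.
p_drunk = 0.001              # prior: 1 in 1000 drivers is drunk
p_sober = 0.999
p_D_given_drunk = 1.00       # test never misses a drunk driver
p_D_given_sober = 0.05       # 5% false positive rate

# Law of total probability: p(D) = p(D|drunk)p(drunk) + p(D|sober)p(sober)
p_D = p_D_given_drunk * p_drunk + p_D_given_sober * p_sober
p_drunk_given_D = p_D_given_drunk * p_drunk / p_D
print(round(p_D, 5), round(p_drunk_given_D, 4))    # 0.05095 0.0196
```

Confusing ''p''(drunk | ''D'') ≈ 0.02 with ''p''(''D'' | drunk) = 1.00 is exactly the base rate fallacy.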
Example 3: Terrorist identification
In a city of 1 million inhabitants, let there be 100 terrorists and 999,900 non-terrorists. To simplify the example, it is assumed that all people present in the city are inhabitants. Thus, the base rate probability of a randomly selected inhabitant of the city being a terrorist is 0.0001, and the base rate probability of that same inhabitant being a non-terrorist is 0.9999. In an attempt to catch the terrorists, the city installs an alarm system with a surveillance camera and automatic
facial recognition software
A facial recognition system is a technology capable of matching a human face from a digital image or a video frame against a database of faces. Such a system is typically employed to authenticate users through ID verification services, and wo ...
.
The software has two failure rates of 1%:
* The false negative rate: If the camera scans a terrorist, a bell will ring 99% of the time, and it will fail to ring 1% of the time.
* The false positive rate: If the camera scans a non-terrorist, a bell will not ring 99% of the time, but it will ring 1% of the time.
Suppose now that an inhabitant triggers the alarm. What is the probability that the person is a terrorist? In other words, what is P(T | B), the probability that a terrorist has been detected given the ringing of the bell? Someone making the 'base rate fallacy' would infer that there is a 99% probability that the detected person is a terrorist. Although the inference seems to make sense, it is actually bad reasoning, and a calculation below will show that the probability of a terrorist is actually near 1%, not near 99%.
The fallacy arises from confusing the natures of two different failure rates. The 'number of non-bells per 100 terrorists' and the 'number of non-terrorists per 100 bells' are unrelated quantities. One does not necessarily equal the other, and they don't even have to be almost equal. To show this, consider what happens if an identical alarm system were set up in a second city with no terrorists at all. As in the first city, the alarm sounds for 1 out of every 100 non-terrorist inhabitants detected, but unlike in the first city, the alarm never sounds for a terrorist. Therefore, 100% of all occasions of the alarm sounding are for non-terrorists, but a false negative rate cannot even be calculated. The 'number of non-terrorists per 100 bells' in that city is 100, yet P(T | B) = 0%. There is zero chance that a terrorist has been detected given the ringing of the bell.
Imagine that the first city's entire population of one million people pass in front of the camera. About 99 of the 100 terrorists will trigger the alarm—and so will about 9,999 of the 999,900 non-terrorists. Therefore, about 10,098 people will trigger the alarm, among which about 99 will be terrorists. The probability that a person triggering the alarm actually is a terrorist is only about 99 in 10,098, which is less than 1%, and very, very far below our initial guess of 99%.
The base rate fallacy is so misleading in this example because there are many more non-terrorists than terrorists, and the number of false positives (non-terrorists scanned as terrorists) is so much larger than the true positives (terrorists scanned as terrorists).
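The expected alarm counts in this example can be reproduced in a few lines, using the numbers given above:

```python
# Whole city passes the camera: 100 terrorists, 999,900 non-terrorists,
# with a 1% false negative rate and a 1% false positive rate.
terrorists, non_terrorists = 100, 999_900
true_alarms = terrorists * 0.99            # ~99 terrorists trigger the bell
false_alarms = non_terrorists * 0.01       # ~9,999 non-terrorists trigger it
p_terrorist_given_alarm = true_alarms / (true_alarms + false_alarms)
print(f"{p_terrorist_given_alarm:.2%}")    # 0.98%
```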
Multiple practitioners have argued that because the base rate of terrorism is extremely low, using data mining and predictive algorithms to identify terrorists cannot feasibly work due to the false positive paradox. Estimates of the number of false positives for each accurate result vary from over ten thousand to one billion; consequently, investigating each lead would be cost- and time-prohibitive. The level of accuracy required to make these models viable is likely unachievable. Foremost, the low base rate of terrorism also means there is a lack of data with which to build an accurate algorithm. Further, in the context of detecting terrorism, false negatives are highly undesirable and thus must be minimised as much as possible; however, this requires increasing sensitivity at the cost of specificity, increasing false positives. It is also questionable whether the use of such models by law enforcement would meet the requisite burden of proof, given that over 99% of results would be false positives.
Findings in psychology
In experiments, people have been found to prefer individuating information over general information when the former is available.
In some experiments, students were asked to estimate the grade point averages (GPAs) of hypothetical students. When given relevant statistics about GPA distribution, students tended to ignore them if given descriptive information about the particular student, even if the new descriptive information was obviously of little or no relevance to school performance.
This finding has been used to argue that interviews are an unnecessary part of the college admissions process because interviewers are unable to pick successful candidates better than basic statistics.
Psychologists Daniel Kahneman and Amos Tversky attempted to explain this finding in terms of a simple rule or "heuristic" called representativeness. They argued that many judgments relating to likelihood, or to cause and effect, are based on how representative one thing is of another, or of a category.
Kahneman considers base rate neglect to be a specific form of extension neglect.
Richard Nisbett has argued that some attributional biases like the fundamental attribution error are instances of the base rate fallacy: people do not use the "consensus information" (the "base rate") about how others behaved in similar situations and instead prefer simpler dispositional attributions.
There is considerable debate in psychology on the conditions under which people do or do not appreciate base rate information.
Researchers in the heuristics-and-biases program have stressed empirical findings showing that people tend to ignore base rates and make inferences that violate certain norms of probabilistic reasoning, such as
Bayes' theorem. The conclusion drawn from this line of research was that human probabilistic thinking is fundamentally flawed and error-prone.
Other researchers have emphasized the link between cognitive processes and information formats, arguing that such conclusions are not generally warranted.
Consider again Example 2 from above. The required inference is to estimate the (posterior) probability that a (randomly picked) driver is drunk, given that the breathalyzer test is positive. Formally, this probability can be calculated using
Bayes' theorem, as shown above. However, there are different ways of presenting the relevant information. Consider the following, formally equivalent variant of the problem:
: 1 out of 1000 drivers is driving drunk. The breathalyzers never fail to detect a truly drunk person. For 50 out of the 999 drivers who are not drunk, the breathalyzer falsely displays drunkenness. Suppose the policemen then stop a driver at random, and force them to take a breathalyzer test. It indicates that they are drunk. We assume you don't know anything else about them. How high is the probability they really are drunk?
In this case, the relevant numerical information—''p''(drunk), ''p''(''D'' | drunk), ''p''(''D'' | sober)—is presented in terms of natural frequencies with respect to a certain reference class (see
reference class problem). Empirical studies show that people's inferences correspond more closely to Bayes' rule when information is presented this way, helping to overcome base-rate neglect in laypeople
and experts.
As a consequence, organizations like the Cochrane Collaboration recommend using this kind of format for communicating health statistics.
Teaching people to translate these kinds of Bayesian reasoning problems into natural frequency formats is more effective than merely teaching them to plug probabilities (or percentages) into Bayes' theorem.
It has also been shown that graphical representations of natural frequencies (e.g., icon arrays, hypothetical outcome plots) help people to make better inferences.
Why are natural frequency formats helpful? One important reason is that this information format facilitates the required inference because it simplifies the necessary calculations. This can be seen when using an alternative way of computing the required probability ''p''(drunk, ''D''):
:<math>p(\mathrm{drunk}\mid D) = \frac{N(\mathrm{drunk} \cap D)}{N(D)} = \frac{1}{51} \approx 0.0196,</math>
where ''N''(drunk ∩ ''D'') denotes the number of drivers that are drunk and get a positive breathalyzer result, and ''N''(''D'') denotes the total number of cases with a positive breathalyzer result. The equivalence of this equation to the above one follows from the axioms of probability theory, according to which ''N''(drunk ∩ ''D'') = ''N'' × ''p''(''D'' | drunk) × ''p''(drunk). Importantly, although this equation is formally equivalent to Bayes' rule, it is not psychologically equivalent. Using natural frequencies simplifies the inference because the required mathematical operation can be performed on natural numbers instead of normalized fractions (i.e., probabilities), because it makes the high number of false positives more transparent, and because natural frequencies exhibit a "nested-set structure".
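In the natural-frequency format the inference is just a count and a division; this sketch uses the reworded problem's numbers:

```python
# Natural-frequency inference: count cases instead of multiplying probabilities.
n_drunk_and_positive = 1      # the 1 drunk driver in 1000 always tests positive
n_positive = 1 + 50           # plus the 50 sober drivers flagged falsely
p_drunk_given_positive = n_drunk_and_positive / n_positive
print(round(p_drunk_given_positive, 4))    # 0.0196
```

No base rates, likelihoods, or normalization enter the computation explicitly, which is why this format tends to be easier for people to reason with.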
Not every frequency format facilitates Bayesian reasoning.
Natural frequencies refer to frequency information that results from ''natural sampling'',
which preserves base rate information (e.g., the number of drunk drivers when taking a random sample of drivers). This is different from ''systematic sampling'', in which base rates are fixed a priori (e.g., in scientific experiments). In the latter case it is not possible to infer the posterior probability ''p''(drunk | positive test) by comparing the number of drivers who are drunk and test positive with the total number of people who get a positive breathalyzer result, because base rate information is not preserved and must be explicitly re-introduced using Bayes' theorem.
See also
* Base rate
* Bayesian probability
* Bayes' theorem
* Data dredging
* Inductive argument
* List of cognitive biases
* List of paradoxes
* Misleading vividness
* Prevention paradox
* Prosecutor's fallacy, a mistake in reasoning that involves ignoring a low prior probability
* Simpson's paradox, another error in statistical reasoning dealing with comparing groups
* Stereotype
* Intuitive statistics
References
External links
The Base Rate Fallacy – The Fallacy Files