In statistics, a confounder (also confounding variable, confounding factor, extraneous determinant or lurking variable) is a variable that influences both the dependent variable and the independent variable, causing a spurious association. Confounding is a causal concept, and as such, cannot be described in terms of correlations or associations. [Pearl, J. (2009). Simpson's Paradox, Confounding, and Collapsibility. In ''Causality: Models, Reasoning and Inference'' (2nd ed.). New York: Cambridge University Press.] The existence of confounders is an important quantitative explanation of why correlation does not imply causation.
Confounds are threats to internal validity.
Definition
Confounding is defined in terms of the data generating model. Let ''X'' be some independent variable, and ''Y'' some dependent variable. To estimate the effect of ''X'' on ''Y'', the statistician must suppress the effects of extraneous variables that influence both ''X'' and ''Y''. We say that ''X'' and ''Y'' are confounded by some other variable ''Z'' whenever ''Z'' causally influences both ''X'' and ''Y''.
Let $P(y \mid \text{do}(x))$ be the probability of event ''Y'' = ''y'' under the hypothetical intervention ''X'' = ''x''. ''X'' and ''Y'' are not confounded if and only if the following holds:
$$P(y \mid \text{do}(x)) = P(y \mid x) \qquad (1)$$
for all values ''X'' = ''x'' and ''Y'' = ''y'', where $P(y \mid x)$ is the conditional probability upon seeing ''X'' = ''x''. Intuitively, this equality states that ''X'' and ''Y'' are not confounded whenever the observationally witnessed association between them is the same as the association that would be measured in a controlled experiment, with ''x'' randomized.
In principle, the defining equality $P(y \mid \text{do}(x)) = P(y \mid x)$ can be verified from the data generating model, assuming we have all the equations and probabilities associated with the model. This is done by simulating an intervention $\text{do}(X = x)$ (see Bayesian network) and checking whether the resulting probability of ''Y'' equals the conditional probability $P(y \mid x)$. It turns out, however, that graph structure alone is sufficient for verifying the equality $P(y \mid \text{do}(x)) = P(y \mid x)$.
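The check described above can be sketched on a small discrete model. The following is a minimal illustration, not taken from the source: the graph (''Z'' → ''X'', ''Z'' → ''Y'', ''X'' → ''Y''), the probability tables, and all function names are assumptions chosen for the example. The intervention is simulated by replacing the mechanism for ''X'' with a constant while leaving the rest of the model untouched.

```python
from itertools import product

# Hypothetical data generating model with binary variables:
# Z -> X, Z -> Y, X -> Y. All numbers are invented for illustration.
P_Z = {0: 0.5, 1: 0.5}                      # P(Z = z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}             # P(X = 1 | Z = z)
P_Y1_GIVEN_XZ = {(0, 0): 0.3, (1, 0): 0.5,  # P(Y = 1 | X = x, Z = z)
                 (0, 1): 0.6, (1, 1): 0.8}


def bernoulli(p_one, value):
    """Probability that a binary variable with P(1) = p_one takes `value`."""
    return p_one if value == 1 else 1.0 - p_one


def joint(x, y, z, do_x=None):
    """P(x, y, z). If do_x is given, X is forced to do_x regardless of Z."""
    if do_x is None:
        p_x = bernoulli(P_X1_GIVEN_Z[z], x)   # observational mechanism
    else:
        p_x = 1.0 if x == do_x else 0.0       # simulated intervention
    return P_Z[z] * p_x * bernoulli(P_Y1_GIVEN_XZ[(x, z)], y)


def prob_y_given_x(y, x):
    """Observational conditional probability P(Y = y | X = x)."""
    numerator = sum(joint(x, y, z) for z in (0, 1))
    denominator = sum(joint(x, yy, z) for yy, z in product((0, 1), repeat=2))
    return numerator / denominator


def prob_y_do_x(y, x):
    """Interventional probability P(Y = y | do(X = x))."""
    return sum(joint(xx, y, z, do_x=x) for xx, z in product((0, 1), repeat=2))


observational = prob_y_given_x(1, 1)
interventional = prob_y_do_x(1, 1)
print(f"P(y=1 | x=1)     = {observational:.2f}")   # 0.74
print(f"P(y=1 | do(x=1)) = {interventional:.2f}")  # 0.65
```

In this assumed model the two quantities differ (0.74 versus 0.65), so ''X'' and ''Y'' are confounded. Setting both entries of `P_X1_GIVEN_Z` to the same value removes the arrow from ''Z'' to ''X'' and makes the two quantities coincide.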
Control
Consider a researcher attempting to assess the effectiveness of drug ''X'', from population data in which drug usage was a patient's choice. The data show that gender (''Z'') influences a patient's choice of drug as well as their chances of recovery (''Y''). In this scenario, gender ''Z'' confounds the relation between ''X'' and ''Y'' since ''Z'' is a cause of both ''X'' and ''Y'' (''X'' ← ''Z'' → ''Y'').
We have that $P(y \mid \text{do}(x)) \neq P(y \mid x)$ because the observational quantity contains information about the correlation between ''X'' and ''Z'', and the interventional quantity does not (since ''X'' is not correlated with ''Z'' in a randomized experiment). It can be shown [Pearl, J., (1993). "Aspects of Graphical Models Connected With Causality," ''In Proceedings of the 49th Session of the International Statistical Science Institute,'' pp. 391–401.] that, in cases where only observational data are available, an unbiased estimate of the desired quantity $P(y \mid \text{do}(x))$ can be obtained by "adjusting" for all confounding factors, namely, conditioning on their various values and averaging the result. In the case of a single confounder ''Z'', this leads to the "adjustment formula":
$$P(y \mid \text{do}(x)) = \sum_{z} P(y \mid x, z)\, P(z) \qquad (2)$$
which gives an unbiased estimate for the causal effect of ''X'' on ''Y''. The same adjustment formula works when there are multiple confounders except that, in this case, the set ''Z'' of variables that would guarantee unbiased estimates must be chosen with caution. The criterion for a proper choice of variables is called the Back-Door criterion [Pearl, J. (2009). Causal Diagrams and the Identification of Causal Effects. In ''Causality: Models, Reasoning and Inference'' (2nd ed.). New York, NY, USA: Cambridge University Press.] and requires that the chosen set ''Z'' "blocks" (or intercepts) every path between ''X'' and ''Y'' that contains an arrow into ''X''. Such sets are called "Back-Door admissible" and may include variables which are not common causes of ''X'' and ''Y'', but merely proxies thereof.
Returning to the drug use example, since ''Z'' complies with the Back-Door requirement (i.e., it intercepts the one Back-Door path ''X'' ← ''Z'' → ''Y''), the Back-Door adjustment formula is valid:
$$P(y \mid \text{do}(x)) = \sum_{z} P(y \mid x, z)\, P(z) \qquad (3)$$
In this way the physician can predict the likely effect of administering the drug from observational studies in which the conditional probabilities appearing on the right-hand side of the equation can be estimated by regression.
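As a worked illustration of adjusting for a single confounder, the sketch below weights the recovery rate within each gender stratum by that gender's share of the whole population, and compares the result with the unadjusted comparison. The patient counts and the names `COUNTS`, `naive_recovery` and `adjusted_recovery` are hypothetical, invented only for this example.

```python
# Hypothetical observational counts: (gender z, drug x) -> (patients, recovered).
# The numbers are invented so that men mostly decline the drug and recover more often.
COUNTS = {
    ("male", 1): (100, 80),
    ("male", 0): (300, 210),
    ("female", 1): (300, 150),
    ("female", 0): (100, 40),
}
GENDERS = ("male", "female")
TOTAL = sum(patients for patients, _ in COUNTS.values())


def naive_recovery(x):
    """Unadjusted P(recovered | X = x), pooling both genders."""
    patients = sum(COUNTS[(z, x)][0] for z in GENDERS)
    recovered = sum(COUNTS[(z, x)][1] for z in GENDERS)
    return recovered / patients


def adjusted_recovery(x):
    """Adjusted estimate: sum over z of P(recovered | x, z) * P(z)."""
    estimate = 0.0
    for z in GENDERS:
        p_z = (COUNTS[(z, 0)][0] + COUNTS[(z, 1)][0]) / TOTAL
        patients, recovered = COUNTS[(z, x)]
        estimate += (recovered / patients) * p_z
    return estimate


naive_effect = naive_recovery(1) - naive_recovery(0)
adjusted_effect = adjusted_recovery(1) - adjusted_recovery(0)
print(f"naive difference in recovery:    {naive_effect:+.3f}")    # -0.050
print(f"adjusted difference in recovery: {adjusted_effect:+.3f}")  # +0.100
```

With these assumed counts the unadjusted comparison makes the drug look harmful, while the adjusted estimate shows a benefit in line with both strata. The reversal arises only because gender influences both drug choice and recovery in the invented data.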
Contrary to common beliefs, adding covariates to the adjustment set ''Z'' can introduce bias. A typical counterexample occurs when ''Z'' is a common effect of ''X'' and ''Y'', a case in which ''Z'' is not a confounder (i.e., the null set is Back-Door admissible) and adjusting for ''Z'' would create bias known as "collider bias" or "Berkson's paradox."
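Collider bias can be seen exactly in a toy model. In the sketch below, which is an assumed example rather than anything from the source, ''X'' and ''Y'' are independent fair coin flips and ''Z'' is their common effect, defined here as ''Z'' = ''X'' OR ''Y''. Exact fractions are used so that no sampling noise is involved.

```python
from fractions import Fraction
from itertools import product

# Hypothetical model: X and Y are independent fair coins, Z = X OR Y.
# Z is a common effect (collider) of X and Y, not a confounder.
JOINT = {
    (x, y, x | y): Fraction(1, 4)
    for x, y in product((0, 1), repeat=2)
}


def prob_y1(x, z=None):
    """P(Y = 1 | X = x), optionally also conditioning on Z = z."""
    def matches(key):
        return key[0] == x and (z is None or key[2] == z)

    denominator = sum(p for key, p in JOINT.items() if matches(key))
    numerator = sum(p for key, p in JOINT.items() if matches(key) and key[1] == 1)
    return numerator / denominator


# Without adjustment, X carries no information about Y.
print(prob_y1(x=0), prob_y1(x=1))            # 1/2 1/2
# After conditioning on the collider Z = 1, X and Y become associated.
print(prob_y1(x=0, z=1), prob_y1(x=1, z=1))  # 1 1/2
```

Before conditioning, the probability of ''Y'' = 1 is one half whatever the value of ''X''. Within the stratum ''Z'' = 1 it depends on ''X'', even though ''X'' has no effect on ''Y'' in this model, which is the bias that adjusting for a common effect creates.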
In general, confounding can be controlled by adjustment if and only if there is a set of observed covariates that satisfies the Back-Door condition. Moreover, if ''Z'' is such a set, then the adjustment formula of Eq. (3) is valid. [Pearl, J. (2009). Causal Diagrams and the Identification of Causal Effects. In ''Causality: Models, Reasoning and Inference'' (2nd ed.). New York, NY, USA: Cambridge University Press.] Pearl's do-calculus provides all possible conditions under which $P(y \mid \text{do}(x))$ can be estimated, not necessarily by adjustment.
History
According to Morabia (2011), the word derives from the Medieval Latin verb "confundere", which meant "mixing", and was probably chosen to represent the confusion (from Latin: con = with + fusus = mix or fuse together) between the cause one wishes to assess and other causes that may affect the outcome and thus confuse, or stand in the way of, the desired assessment.
Fisher used the word "confounding" in his 1935 book "The Design of Experiments" to denote any source of error in his ideal of randomized experiment. According to Vandenbroucke (2004) it was Kish who used the word "confounding" in the modern sense of the word, to mean "incomparability" of two or more groups (e.g., exposed and unexposed) in an observational study.
Formal conditions defining what makes certain groups "comparable" and others "incomparable" were later developed in epidemiology by Greenland and Robins (1986) using the counterfactual language of Neyman (1935) and Rubin (1974). These were later supplemented by graphical criteria such as the Back-Door condition (Pearl 1993; Greenland, Pearl and Robins, 1999).
Graphical criteria were shown to be formally equivalent to the counterfactual definition but more transparent to researchers relying on process models.
Types
In the case of risk assessments evaluating the magnitude and nature of risk to human health, it is important to control for confounding to isolate the effect of a particular hazard such as a food additive, pesticide, or new drug. For prospective studies, it is difficult to recruit and screen for volunteers with the same background (age, diet, education, geography, etc.), and in historical studies, there can be similar variability. Because the variability of volunteers in human studies cannot be fully controlled, confounding is a particular challenge. For these reasons, experiments offer a way to avoid most forms of confounding.
In some disciplines, confounding is categorized into different types. In epidemiology, one type is "confounding by indication", which relates to confounding from observational studies. Because prognostic factors may influence treatment decisions (and bias estimates of treatment effects), controlling for known prognostic factors may reduce this problem, but it is always possible that a forgotten or unknown factor was not included or that factors interact complexly. Confounding by indication has been described as the most important limitation of observational studies. Randomized trials are not affected by confounding by indication due to random assignment.
Confounding variables may also be categorised according to their source: the choice of measurement instrument (operational confound), situational characteristics (procedural confound), or inter-individual differences (person confound).
*An operational confounding can occur in both experimental and non-experimental research designs. This type of confounding occurs when a measure designed to assess a particular construct inadvertently measures something else as well.
*A procedural confounding can occur in a laboratory experiment or a quasi-experiment. This type of confound occurs when the researcher mistakenly allows another variable to change along with the manipulated independent variable.
*A person confounding occurs when two or more groups of units are analyzed together (e.g., workers from different occupations), despite varying according to one or more other (observed or unobserved) characteristics (e.g., gender).
Examples
Say one is studying the relation between birth order (1st child, 2nd child, etc.) and the presence of Down Syndrome in the child. In this scenario, maternal age would be a confounding variable:
# Higher maternal age is directly associated with Down Syndrome in the child
# Higher maternal age is directly associated with Down Syndrome, regardless of birth order (a mother having her 1st vs 3rd child at age 50 confers the same risk)
# Maternal age is directly associated with birth order (the 2nd child, except in the case of twins, is born when the mother is older than she was for the birth of the 1st child)
# Maternal age is not a consequence of birth order (having a 2nd child does not change the mother's age)
In risk assessments, factors such as age, gender, and educational levels often affect health status and so should be controlled. Beyond these factors, researchers may not consider or have access to data on other causal factors. An example is the study of the effect of smoking tobacco on human health. Smoking, drinking alcohol, and diet are lifestyle activities that are related. A risk assessment that looks at the effects of smoking but does not control for alcohol consumption or diet may overestimate the risk of smoking. Smoking and confounding are reviewed in occupational risk assessments such as the safety of coal mining. When there is not a large sample population of non-smokers or non-drinkers in a particular occupation, the risk assessment may be biased towards finding a negative effect on health.
Decreasing the potential for confounding
A reduction in the potential for the occurrence and effect of confounding factors can be obtained by increasing the types and numbers of comparisons performed in an analysis. If measures or manipulations of core constructs are confounded (i.e. operational or procedural confounds exist), subgroup analysis may not reveal problems in the analysis. Additionally, increasing the number of comparisons can create other problems (see multiple comparisons).
Peer review is a process that can assist in reducing instances of confounding, either before study implementation or after analysis has occurred. Peer review relies on collective expertise within a discipline to identify potential weaknesses in study design and analysis, including ways in which results may depend on confounding. Similarly, replication can test for the robustness of findings from one study under alternative study conditions or alternative analyses (e.g., controlling for potential confounds not identified in the initial study).
Confounding effects may be less likely to occur and act similarly at multiple times and locations. In selecting study sites, the environment can be characterized in detail at the study sites to ensure sites are ecologically similar and therefore less likely to have confounding variables. Lastly, the relationship between the environmental variables that possibly confound the analysis and the measured parameters can be studied. The information pertaining to environmental variables can then be used in site-specific models to identify residual variance that may be due to real effects.
Depending on the type of study design in place, there are various ways to modify that design to actively exclude or control confounding variables:
* Case-control studies assign confounders to both groups, cases and controls, equally. For example, if somebody wanted to study the cause of myocardial infarct and thinks that age is a probable confounding variable, each 67-year-old infarct patient will be matched with a healthy 67-year-old "control" person. In case-control studies, the matched variables are most often age and sex. Drawback: Case-control studies are feasible only when it is easy to find controls, ''i.e.'' persons whose status vis-à-vis all known potential confounding factors is the same as that of the case's patient. Suppose a case-control study attempts to find the cause of a given disease in a person who is 1) 45 years old, 2) African-American, 3) from Alaska, 4) an avid football player, 5) vegetarian, and 6) working in education. A theoretically perfect control would be a person who, in addition to not having the disease being investigated, matches all these characteristics and has no diseases that the patient does not also have, but finding such a control would be an enormous task.
* Cohort studies: A degree of matching is also possible, and it is often done by admitting only certain age groups or a certain sex into the study population, creating a cohort of people who share similar characteristics so that all cohorts are comparable with regard to the possible confounding variable. For example, if age and sex are thought to be confounders, only males aged 40 to 50 would be involved in a cohort study that would assess the myocardial infarct risk in cohorts that are either physically active or inactive. Drawback: In cohort studies, the overexclusion of input data may lead researchers to define too narrowly the set of similarly situated persons for whom they claim the study to be useful, such that other persons to whom the causal relationship does in fact apply may lose the opportunity to benefit from the study's recommendations. Similarly, "over-stratification" of input data within a study may reduce the sample size in a given stratum to the point where generalizations drawn by observing the members of that stratum alone are not statistically significant.
* Double blinding: conceals from the trial population and the observers the experiment group membership of the participants. By preventing the participants from knowing if they are receiving treatment or not, the placebo effect should be the same for the control and treatment groups. By preventing the observers from knowing the participants' group membership, there should be no bias from researchers treating the groups differently or from interpreting the outcomes differently.
* Randomized controlled trial: A method where the study population is divided randomly in order to mitigate the chances of self-selection by participants or bias by the study designers. Before the experiment begins, the testers will assign the members of the participant pool to their groups (control, intervention, parallel), using a randomization process such as the use of a random number generator. For example, in a study on the effects of exercise, the conclusions would be less valid if participants were given a choice of whether to belong to the control group, which would not exercise, or the intervention group, which would be willing to take part in an exercise program. The study would then capture other variables besides exercise, such as pre-experiment health levels and motivation to adopt healthy activities. From the observer's side, the experimenter may choose candidates who are more likely to show the results the study wants to see or may interpret subjective results (more energetic, positive attitude) in a way favorable to their desires.
* Stratification: As in the example above, physical activity is thought to be a behaviour that protects from myocardial infarct, and age is assumed to be a possible confounder. The data sampled is then stratified by age group, meaning that the association between activity and infarct is analyzed within each age group. If the different age groups (or age strata) yield very different risk ratios, age must be viewed as a confounding variable. There exist statistical tools, among them Mantel–Haenszel methods, that account for stratification of data sets.
* Controlling for confounding by measuring the known confounders and including them as covariates is multivariable analysis, such as regression analysis. Multivariate analyses reveal much less information about the ''strength'' or ''polarity'' of the confounding variable than do stratification methods. For example, if multivariate analysis controls for antidepressant use but does not stratify antidepressants into TCA and SSRI, then it will ignore that these two classes of antidepressant have ''opposite'' effects on myocardial infarction, and that one is much ''stronger'' than the other.
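The stratified analysis in the list above can be sketched with the Mantel–Haenszel pooled odds ratio, which sums a·d/n over strata and divides by the sum of b·c/n, where a, b, c, d are the four cells of each stratum's 2×2 table and n is the stratum size. The counts below (physical activity versus infarct, split into two age groups) and the function names are hypothetical, invented only for illustration.

```python
# Hypothetical 2x2 tables per age stratum, in the order
# (active & infarct, active & no infarct, inactive & infarct, inactive & no infarct).
STRATA = {
    "younger": (10, 190, 10, 90),
    "older": (20, 80, 90, 110),
}


def odds_ratio(a, b, c, d):
    """Odds ratio of a single 2x2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio across strata."""
    tables = list(tables)
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Crude table: ignore age and add the strata cell by cell.
crude_table = tuple(sum(cells) for cells in zip(*STRATA.values()))
crude_or = odds_ratio(*crude_table)
mh_or = mantel_haenszel_or(STRATA.values())

for name, table in STRATA.items():
    print(f"{name:8s} odds ratio: {odds_ratio(*table):.3f}")
print(f"crude    odds ratio: {crude_or:.3f}")
print(f"Mantel-Haenszel odds ratio: {mh_or:.3f}")
```

In these invented data the active group is mostly younger, so the crude odds ratio (about 0.22) overstates the protective association compared with the age-adjusted Mantel–Haenszel estimate (about 0.34), which lies between the two stratum-specific odds ratios.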
All these methods have their drawbacks:
# The best available defense against the possibility of spurious results due to confounding is often to dispense with efforts at stratification and instead conduct a randomized study of a sufficiently large sample taken as a whole, such that all potential confounding variables (known and unknown) will be distributed by chance across all study groups and hence will be uncorrelated with the binary variable for inclusion/exclusion in any group.
# Ethical considerations: In double-blind and randomized controlled trials, participants are not aware that they are recipients of sham treatments and may be denied effective treatments. There is a possibility that patients only agree to invasive surgery (which carries real medical risks) under the understanding that they are receiving treatment. Although this is an ethical concern, it is not a complete account of the situation. For surgeries that are currently being performed regularly, but for which there is no concrete evidence of a genuine effect, there may be ethical issues in continuing such surgeries. In such circumstances, many people are exposed to the real risks of surgery, yet these treatments may offer no discernible benefit. Sham-surgery control is a method that may allow medical science to determine whether a surgical procedure is efficacious or not. Given that there are known risks associated with medical operations, it is ethically questionable to allow unverified surgeries to be conducted ad infinitum into the future.
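The first point in the list above, that randomization distributes a confounder by chance across study groups, can be illustrated with a small simulation. This is a sketch under assumed parameters: the binary background variable ''Z'', the selection probabilities, the sample size, and the function names are all hypothetical.

```python
import random


def covariate_imbalance(assign, n=20000, seed=0):
    """Share of Z = 1 among treated units minus the share among untreated units."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0   # background variable Z
        x = assign(z, rng)                   # treatment indicator X
        (treated if x == 1 else untreated).append(z)
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)


def self_selected(z, rng):
    # Hypothetical choice behaviour: units with Z = 1 opt in far more often.
    return 1 if rng.random() < (0.8 if z == 1 else 0.2) else 0


def randomized(z, rng):
    # Coin-flip assignment that ignores Z entirely.
    return 1 if rng.random() < 0.5 else 0


imbalance_self_selected = covariate_imbalance(self_selected)
imbalance_randomized = covariate_imbalance(randomized)
print(f"imbalance in Z, self-selected groups: {imbalance_self_selected:+.3f}")
print(f"imbalance in Z, randomized groups:    {imbalance_randomized:+.3f}")
```

Under self-selection the two groups differ sharply in ''Z'' (the expected difference is 0.6 with these assumed probabilities), whereas under coin-flip assignment the difference is close to zero and shrinks as the sample grows. The same argument applies to confounders that were never measured, which is why the sample must be sufficiently large.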
Artifacts
Artifacts are variables that should have been systematically varied, either within or across studies, but that were accidentally held constant. Artifacts are thus threats to external validity. Artifacts are factors that covary with the treatment and the outcome. Campbell and Stanley identify several artifacts. The major threats to internal validity are history, maturation, testing, instrumentation, statistical regression, selection, experimental mortality, and selection-history interactions.
One way to minimize the influence of artifacts is to use a pretest-posttest control group design. Within this design, "groups of people who are initially equivalent (at the pretest phase) are randomly assigned to receive the experimental treatment or a control condition and then assessed again after this differential experience (posttest phase)". Thus, any effects of artifacts are (ideally) equally distributed in participants in both the treatment and control conditions.
External links
These sites contain descriptions or examples of confounding variables:
Tutorial: Confounding and Effect Measure Modification (Boston University School of Public Health)