Matching is a statistical technique that evaluates the effect of a treatment by comparing treated and non-treated units in an observational study or quasi-experiment (i.e. when the treatment is not randomly assigned). The goal of matching is to reduce bias in the estimated treatment effect by finding, for every treated unit, one or more non-treated units with similar observable characteristics, so that the covariates are balanced between the two groups (the search for similar units resembles the k-nearest neighbors algorithm). Comparing outcomes between treated units and their matched non-treated units then yields an estimate of the treatment effect with reduced bias due to confounding.
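The matching step described above can be illustrated with a minimal sketch. It assumes one-to-one nearest-neighbor matching with replacement on Euclidean distance, which is only one of several possible matching rules; the function names and the toy data are hypothetical and are not taken from any particular library.

```python
import math

def nearest_neighbor_match(treated, controls):
    """Match each treated unit to its closest control (with replacement).

    Each unit is a (covariates, outcome) pair, where covariates is a tuple
    of floats. Returns a list of (treated_index, control_index) pairs.
    """
    pairs = []
    for i, (x_t, _) in enumerate(treated):
        # Pick the control whose covariates are closest in Euclidean distance.
        j = min(range(len(controls)),
                key=lambda k: math.dist(x_t, controls[k][0]))
        pairs.append((i, j))
    return pairs

def att_estimate(treated, controls, pairs):
    """Mean outcome difference between treated units and their matches."""
    diffs = [treated[i][1] - controls[j][1] for i, j in pairs]
    return sum(diffs) / len(diffs)

# Hypothetical toy data: covariates are (age, baseline score).
treated = [((30.0, 1.0), 12.0), ((45.0, 3.0), 15.0)]
controls = [((29.0, 1.2), 10.0), ((60.0, 5.0), 20.0), ((44.0, 2.8), 14.0)]

pairs = nearest_neighbor_match(treated, controls)
att = att_estimate(treated, controls, pairs)
print(pairs, att)
```

In practice the covariates would usually be standardized first (or a Mahalanobis distance used), since raw Euclidean distance lets the covariate with the largest scale dominate the match.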
Propensity score matching, an early matching technique, was developed as part of the Rubin causal model, but has been shown to increase model dependence, bias, and inefficiency, and is no longer recommended compared to other matching methods. Coarsened exact matching (CEM) is a simple, easy-to-understand, and statistically powerful alternative.

Matching has been promoted by Donald Rubin. It was prominently criticized in economics by Robert LaLonde (1986), who compared estimates of treatment effects from an experiment to comparable estimates produced with matching methods and showed that the matching estimates were biased. Rajeev Dehejia and Sadek Wahba (1999) reevaluated LaLonde's critique and argued that matching is a good solution. Similar critiques have been raised in political science and sociology journals.
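As a rough illustration of the idea behind coarsened exact matching mentioned above: each covariate is coarsened into bins, units are matched exactly on the resulting bin signature, and strata lacking either treated or control units are discarded. The sketch below makes several assumptions not stated in the text (hand-picked cutpoints, a treated-weighted average of stratum differences as the estimate, and hypothetical function names and data).

```python
from collections import defaultdict

def coarsen(x, cutpoints):
    """Return the index of the bin that x falls into, given sorted cutpoints."""
    return sum(x >= c for c in cutpoints)

def cem_strata(units, cutpoints_per_covariate):
    """Group units by coarsened covariates; keep strata containing both arms.

    units: list of (covariates, treated_flag, outcome) tuples.
    """
    strata = defaultdict(lambda: {"treated": [], "control": []})
    for covs, treated, outcome in units:
        key = tuple(coarsen(x, cuts)
                    for x, cuts in zip(covs, cutpoints_per_covariate))
        strata[key]["treated" if treated else "control"].append(outcome)
    # Strata without at least one treated and one control unit are pruned.
    return {k: v for k, v in strata.items() if v["treated"] and v["control"]}

def cem_att(strata):
    """Treated-weighted average of within-stratum mean outcome differences."""
    total, n_treated = 0.0, 0
    for arms in strata.values():
        mean_t = sum(arms["treated"]) / len(arms["treated"])
        mean_c = sum(arms["control"]) / len(arms["control"])
        total += len(arms["treated"]) * (mean_t - mean_c)
        n_treated += len(arms["treated"])
    return total / n_treated

# Hypothetical toy data: covariates are (age, years of education).
units = [
    ((25, 10), True, 5.0),
    ((27, 11), False, 4.0),
    ((28, 9), False, 3.0),
    ((40, 14), True, 9.0),
    ((45, 16), False, 7.0),
    ((65, 16), True, 11.0),   # no control in its stratum: pruned
    ((35, 8), False, 2.0),    # no treated unit in its stratum: pruned
]
cutpoints = [[30, 50], [12]]  # assumed bin boundaries for age and education

strata = cem_strata(units, cutpoints)
att = cem_att(strata)
print(sorted(strata), att)
```

The choice of cutpoints controls the trade-off: coarser bins retain more units but allow larger covariate differences within a stratum.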


Analysis

When the outcome of interest is binary, the most general tool for the analysis of matched data is conditional logistic regression, as it handles strata of arbitrary size and continuous or binary treatments (predictors), and can control for covariates. In particular cases, simpler tests such as the paired difference test, the McNemar test, and the Cochran–Mantel–Haenszel test are available. When the outcome of interest is continuous, the average treatment effect is estimated. Matching can also be used to "pre-process" a sample before analysis via another technique, such as regression analysis.
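For the simplest binary-outcome case, one-to-one matched pairs, the McNemar test mentioned above uses only the discordant pairs. The following is a minimal sketch of the uncorrected chi-square version (no continuity correction, which is an assumption of this example); the function name and the counts are hypothetical.

```python
import math

def mcnemar_test(pairs):
    """McNemar chi-square test for 1:1 matched pairs with binary outcomes.

    pairs: list of (treated_outcome, control_outcome), each 0 or 1.
    Returns (statistic, p_value) without continuity correction.
    """
    b = sum(1 for t, c in pairs if t == 1 and c == 0)  # treated-only events
    c_ = sum(1 for t, c in pairs if t == 0 and c == 1)  # control-only events
    if b + c_ == 0:
        return 0.0, 1.0  # no discordant pairs: no evidence of a difference
    stat = (b - c_) ** 2 / (b + c_)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical matched-pair data: 15 + 5 discordant pairs, 30 concordant.
pairs = [(1, 0)] * 15 + [(0, 1)] * 5 + [(1, 1)] * 20 + [(0, 0)] * 10
stat, p_value = mcnemar_test(pairs)
print(stat, p_value)
```

Concordant pairs do not enter the statistic at all, which reflects the fact that within a matched pair only a difference in outcomes is informative about the treatment.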


Overmatching

Overmatching, or post-treatment bias, is matching on an apparent mediator that is actually a result of the exposure. If the mediator itself is stratified, an obscured relation of the exposure to the disease is highly likely to be induced. Overmatching thus causes statistical bias. For example, matching the control group by gestation length and/or the number of multiple births when estimating perinatal mortality and birthweight after in vitro fertilization (IVF) is overmatching, since IVF itself increases the risk of premature birth and multiple birth. Overmatching may also be regarded as a sampling bias that decreases the external validity of a study, because the controls become more similar to the cases with regard to exposure than the general population is.
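The bias can be made concrete with a small simulation. This is a sketch under assumed parameters, not an analysis of real data: the exposure raises the probability of a binary mediator from 0.2 to 0.8, and the outcome depends on the exposure only through the mediator, so the true total effect is 2 × (0.8 − 0.2) = 1.2. Stratifying on the mediator stands in here for matching on it.

```python
import random

def simulate(n=20000, seed=0):
    """Generate (exposed, mediator, outcome) rows under the assumed model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        t = rng.random() < 0.5                    # exposure, assigned at random
        m = rng.random() < (0.8 if t else 0.2)    # mediator caused by exposure
        y = 2.0 * m + rng.gauss(0.0, 1.0)         # outcome depends only on mediator
        rows.append((t, m, y))
    return rows

def mean(xs):
    return sum(xs) / len(xs)

def diff_in_means(rows):
    """Mean outcome of exposed units minus mean outcome of unexposed units."""
    return (mean([y for t, _, y in rows if t])
            - mean([y for t, _, y in rows if not t]))

def stratified_diff(rows):
    """Exposure effect estimated within mediator strata, then averaged."""
    total = 0.0
    for level in (False, True):
        stratum = [r for r in rows if r[1] == level]
        total += len(stratum) / len(rows) * diff_in_means(stratum)
    return total

rows = simulate()
unadjusted = diff_in_means(rows)    # close to the true total effect of 1.2
overmatched = stratified_diff(rows)  # close to 0: the effect is adjusted away
print(round(unadjusted, 2), round(overmatched, 2))
```

Because the exposure acts entirely through the mediator in this toy model, comparing units with equal mediator values removes the very effect being estimated.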


See also

* Propensity score matching

