Difference in differences (DID or DD) is a statistical technique used in econometrics and quantitative research in the social sciences that attempts to mimic an experimental research design using observational study data, by studying the differential effect of a treatment on a 'treatment group' versus a 'control group' in a natural experiment. It calculates the effect of a treatment (i.e., an explanatory variable or an independent variable) on an outcome (i.e., a response variable or dependent variable) by comparing the average change over time in the outcome variable for the treatment group to the average change over time for the control group. Although it is intended to mitigate the effects of extraneous factors and selection bias, depending on how the treatment group is chosen, this method may still be subject to certain biases (e.g., mean regression, reverse causality and omitted variable bias).
In contrast to a time-series estimate of the treatment effect on subjects (which analyzes differences over time) or a cross-section estimate of the treatment effect (which measures the difference between treatment and control groups), difference in differences uses panel data to measure the differences, between the treatment and control group, of the changes in the outcome variable that occur over time.
General definition
Difference in differences requires data measured from a treatment group and a control group at two or more different time periods, specifically at least one time period before "treatment" and at least one time period after "treatment." In the example pictured, the outcome in the treatment group is represented by the line ''P'' and the outcome in the control group is represented by the line ''S''. The outcome (dependent) variable in both groups is measured at time 1, before either group has received the treatment (i.e., the independent or explanatory variable), represented by the points ''P''<sub>1</sub> and ''S''<sub>1</sub>. The treatment group then receives or experiences the treatment and both groups are again measured at time 2. Not all of the difference between the treatment and control groups at time 2 (that is, the difference between ''P''<sub>2</sub> and ''S''<sub>2</sub>) can be explained as being an effect of the treatment, because the treatment group and control group did not start out at the same point at time 1. DID therefore calculates the "normal" difference in the outcome variable between the two groups (the difference that would still exist if neither group experienced the treatment), represented by the dotted line ''Q''. (Notice that the slope from ''P''<sub>1</sub> to ''Q'' is the same as the slope from ''S''<sub>1</sub> to ''S''<sub>2</sub>.) The treatment effect is the difference between the observed outcome (''P''<sub>2</sub>) and the "normal" outcome (''Q''), that is, the difference between ''P''<sub>2</sub> and ''Q''.
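The construction of ''Q'' can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the four outcome values are hypothetical numbers chosen for illustration and are not taken from the figure.

```python
# Hypothetical outcome values for the four observed points (assumed, not from the figure).
p1, p2 = 10.0, 18.0   # treatment group at time 1 and time 2
s1, s2 = 6.0, 9.0     # control group at time 1 and time 2

# Q: where the treatment group would be at time 2 had it followed
# the control group's change (same slope as S1 -> S2).
q = p1 + (s2 - s1)

# Treatment effect: observed outcome minus the "normal" outcome.
effect = p2 - q

# Equivalent form: the difference of the two changes over time.
did = (p2 - p1) - (s2 - s1)

print(q, effect, did)
```

Both expressions give the same number, which is why the method is called a difference in differences.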
Formal definition
Consider the model
:<math>y_{it} ~=~ \gamma_{s(i)} + \lambda_t + \delta \, I(s(i) = \text{treatment},~ t \text{ in after period}) + \varepsilon_{it}</math>
where <math>y_{it}</math> is the dependent variable for individual <math>i</math> and time <math>t</math>, <math>s(i)</math> is the group to which <math>i</math> belongs (i.e. the treatment or the control group), and <math>I(\dots)</math> is short-hand for the dummy variable equal to 1 when the event described in <math>(\dots)</math> is true, and 0 otherwise. In the plot of time versus <math>y</math> by group, <math>\gamma_s</math> is the vertical intercept for the graph for <math>s</math>, and <math>\lambda_t</math> is the time trend shared by both groups according to the parallel trend assumption (see Assumptions below). <math>\delta</math> is the treatment effect, and <math>\varepsilon_{it}</math> is the residual term.
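Consider the average of the dependent variable and dummy indicators by group and time:
:<math>
\begin{align}
n_s & = \text{number of individuals in group } s, \\
\overline{y}_{st} & = \frac{1}{n_s} \sum_{i=1}^n y_{it} \, I(s(i) = s), \\
\overline{\gamma}_s & = \frac{1}{n_s} \sum_{i=1}^n \gamma_{s(i)} \, I(s(i) = s) = \gamma_s, \\
\overline{\lambda}_{st} & = \frac{1}{n_s} \sum_{i=1}^n \lambda_t \, I(s(i) = s) = \lambda_t, \\
D_{st} & = \frac{1}{n_s} \sum_{i=1}^n I(s(i) = \text{treatment},~ t \text{ in after period}) \, I(s(i) = s) = I(s = \text{treatment},~ t \text{ in after period}), \\
\overline{\varepsilon}_{st} & = \frac{1}{n_s} \sum_{i=1}^n \varepsilon_{it} \, I(s(i) = s),
\end{align}
</math>
and suppose for simplicity that <math>s = 1, 2</math> and <math>t = 1, 2</math>. Note that <math>D_{st}</math> is not random; it just encodes how the groups and the periods are labeled. Then
:<math>
\begin{align}
& (\overline{y}_{11} - \overline{y}_{12}) - (\overline{y}_{21} - \overline{y}_{22}) \\
& \quad = \big[ (\gamma_1 + \lambda_1 + \delta D_{11} + \overline{\varepsilon}_{11}) - (\gamma_1 + \lambda_2 + \delta D_{12} + \overline{\varepsilon}_{12}) \big] - \big[ (\gamma_2 + \lambda_1 + \delta D_{21} + \overline{\varepsilon}_{21}) - (\gamma_2 + \lambda_2 + \delta D_{22} + \overline{\varepsilon}_{22}) \big] \\
& \quad = \delta (D_{11} - D_{12}) + \delta (D_{22} - D_{21}) + \overline{\varepsilon}_{11} - \overline{\varepsilon}_{12} + \overline{\varepsilon}_{22} - \overline{\varepsilon}_{21}.
\end{align}
</math>
The strict exogeneity assumption then implies that
:<math>\operatorname{E} \left[ (\overline{y}_{11} - \overline{y}_{12}) - (\overline{y}_{21} - \overline{y}_{22}) \right] ~=~ \delta (D_{11} - D_{12}) + \delta (D_{22} - D_{21}).</math>
Without loss of generality, assume that <math>s = 2</math> is the treatment group and <math>t = 2</math> is the after period. Then <math>D_{22} = 1</math> and <math>D_{11} = D_{12} = D_{21} = 0</math>, giving the DID estimator
:<math>\hat{\delta} ~=~ (\overline{y}_{11} - \overline{y}_{12}) - (\overline{y}_{21} - \overline{y}_{22}),</math>
which can be interpreted as the treatment effect of the treatment indicated by <math>D_{st}</math>. Below it is shown how this estimator can be read as a coefficient in an ordinary least squares regression. The model described in this section is over-parametrized; to remedy that, one of the coefficients for the dummy variables can be set to 0. For example, we may set <math>\gamma_1 = 0</math>.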
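The cancellation of the group intercepts and period effects can be verified numerically. The sketch below is a noise-free illustration in Python: the parameter values are hypothetical, the residual term is set to zero, and group 2 in period 2 is taken to be the treated cell, following the labeling convention above.

```python
# Hypothetical parameter values (assumed for illustration only).
gamma = {1: 2.0, 2: 5.0}   # group intercepts
lam = {1: 1.0, 2: 1.5}     # period effects shared by both groups
delta = 3.0                # treatment effect

def treated(s, t):
    """Treatment dummy: 1 only for group 2 in period 2, otherwise 0."""
    return 1 if (s == 2 and t == 2) else 0

# Noise-free group-period means: intercept + period effect + delta * dummy.
ybar = {(s, t): gamma[s] + lam[t] + delta * treated(s, t)
        for s in (1, 2) for t in (1, 2)}

# DID estimator: change in group 1 minus change in group 2 (before minus after).
delta_hat = (ybar[1, 1] - ybar[1, 2]) - (ybar[2, 1] - ybar[2, 2])

print(delta_hat)
```

With no noise the estimator returns the assumed treatment effect exactly; with noise it would do so only in expectation.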
Assumptions
All the assumptions of the OLS model apply equally to DID. In addition, DID requires a parallel trend assumption. The parallel trend assumption says that <math>\lambda_2 - \lambda_1</math> is the same in both <math>s = 1</math> and <math>s = 2</math>. Provided that the formal definition above accurately represents reality, this assumption automatically holds. However, a model with group-specific period effects <math>\lambda_{st}</math>, where <math>\lambda_{22} - \lambda_{21} \neq \lambda_{12} - \lambda_{11}</math>, may well be more realistic. In order to increase the likelihood of the parallel trend assumption holding, a difference-in-differences approach is often combined with matching. This involves matching known 'treatment' units with simulated counterfactual 'control' units: characteristically equivalent units which did not receive treatment. By defining the outcome variable as a temporal difference (the change in observed outcome between pre- and post-treatment periods), and matching multiple units in a large sample on the basis of similar pre-treatment histories, the resulting estimate of the average treatment effect on the treated (ATT) provides a robust difference-in-differences estimate of treatment effects. This serves two statistical purposes: firstly, conditional on pre-treatment covariates, the parallel trends assumption is likely to hold; and secondly, this approach reduces dependence on associated ignorability assumptions necessary for valid inference.
As illustrated to the right, the treatment effect is the difference between the observed value of ''y'' and what the value of ''y'' would have been with parallel trends, had there been no treatment. The Achilles' heel of DID is when something other than the treatment changes in one group but not the other at the same time as the treatment, implying a violation of the parallel trend assumption.
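The consequence of a violated parallel trend assumption can be illustrated with a noise-free sketch in Python. All numbers are hypothetical, and the period effects are allowed to differ by group, which is an assumption made here to produce the violation: the treated group (group 2) trends upward by 1.0 more than the control group for reasons unrelated to the treatment.

```python
def did(ybar):
    """DID estimator from group-period means keyed by (group, period)."""
    return (ybar[1, 1] - ybar[1, 2]) - (ybar[2, 1] - ybar[2, 2])

# Hypothetical parameters (assumed for illustration only).
gamma = {1: 2.0, 2: 5.0}                       # group intercepts
delta = 3.0                                    # true treatment effect
lam = {(1, 1): 1.0, (1, 2): 1.5,               # control group: +0.5 over time
       (2, 1): 1.0, (2, 2): 2.5}               # treated group: +1.5 over time

# Noise-free means; only group 2 in period 2 receives the treatment effect.
ybar = {(s, t): gamma[s] + lam[s, t] + (delta if (s, t) == (2, 2) else 0.0)
        for s in (1, 2) for t in (1, 2)}

estimate = did(ybar)
bias = estimate - delta
trend_gap = (lam[2, 2] - lam[2, 1]) - (lam[1, 2] - lam[1, 1])

print(estimate, bias, trend_gap)
```

In this sketch the estimate overstates the treatment effect by exactly the gap between the two groups' trends, which the method cannot separate from the treatment.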
To guarantee the accuracy of the DID estimate, the composition of individuals of the two groups is assumed to remain unchanged over time. When using a DID model, various issues that may compromise the results, such as autocorrelation and Ashenfelter dips, must be considered and dealt with.
Implementation
The DID method can be implemented according to the table below, where the lower right cell is the DID estimator.
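A 2×2 layout of group-period means can be computed directly from individual observations. The following Python sketch uses a small hypothetical data set (the values are invented for illustration), with group 2 as the treatment group and period 2 as the after period, and prints the means, their differences, and the DID estimate in the lower right cell.

```python
from statistics import mean

# Hypothetical observations: (group s, period t, outcome y).
data = [
    (1, 1, 4.0), (1, 1, 6.0),
    (1, 2, 6.0), (1, 2, 8.0),
    (2, 1, 9.0), (2, 1, 11.0),
    (2, 2, 15.0), (2, 2, 17.0),
]

def cell_mean(s, t):
    """Average outcome for group s in period t."""
    return mean(y for g, p, y in data if g == s and p == t)

ybar = {(s, t): cell_mean(s, t) for s in (1, 2) for t in (1, 2)}

# Changes over time (before minus after) within each group.
change_treat = ybar[2, 1] - ybar[2, 2]
change_ctrl = ybar[1, 1] - ybar[1, 2]
did = change_ctrl - change_treat

# Columns: treatment group, control group, control minus treatment.
print(f"{'':8}{'s=2':>8}{'s=1':>8}{'diff':>8}")
for t in (2, 1):
    diff = ybar[1, t] - ybar[2, t]
    print(f"t={t:<6}{ybar[2, t]:>8.2f}{ybar[1, t]:>8.2f}{diff:>8.2f}")
print(f"{'change':8}{change_treat:>8.2f}{change_ctrl:>8.2f}{did:>8.2f}")
```

The lower right cell is the same whether it is computed as the difference of the two changes or as the change in the row differences.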
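Running a regression analysis gives the same result. Consider the OLS model
:<math>y ~=~ \beta_0 + \beta_1 T + \beta_2 S + \beta_3 (T \cdot S) + \varepsilon</math>
where <math>T</math> is a dummy variable for the period, equal to <math>1</math> when <math>t = 2</math>, and <math>S</math> is a dummy variable for group membership, equal to <math>1</math> when <math>s = 2</math>. The composite variable <math>(T \cdot S)</math> is a dummy variable indicating when <math>S = T = 1</math>. Although it is not shown rigorously here, this is a proper parametrization of the model in the formal definition. Furthermore, the group and period averages in that section relate to the model parameter estimates as follows:
:<math>
\begin{align}
\hat{\beta}_0 & = \widehat{E}(y \mid T=0,~ S=0) \\
\hat{\beta}_1 & = \widehat{E}(y \mid T=1,~ S=0) - \widehat{E}(y \mid T=0,~ S=0) \\
\hat{\beta}_2 & = \widehat{E}(y \mid T=0,~ S=1) - \widehat{E}(y \mid T=0,~ S=0) \\
\hat{\beta}_3 & = \big[ \widehat{E}(y \mid T=1,~ S=1) - \widehat{E}(y \mid T=0,~ S=1) \big] - \big[ \widehat{E}(y \mid T=1,~ S=0) - \widehat{E}(y \mid T=0,~ S=0) \big],
\end{align}
</math>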
where <math>\widehat{E}(\dots \mid \dots)</math> stands for conditional averages computed on the sample; for example, <math>T = 1</math> is the indicator for the after period and <math>S = 0</math> is an indicator for the control group. Note that <math>\hat{\beta}_1</math> is an estimate of the counterfactual rather than the impact of the control group. The control group is often used as a proxy for the counterfactual (see Synthetic control method for a deeper understanding of this point). Thereby, <math>\hat{\beta}_1</math> can be interpreted as the impact of both the control group and the intervention's (treatment's) counterfactual. Similarly, <math>\hat{\beta}_2</math>, due to the parallel trend assumption, is also the same differential between the treatment and control group in <math>T = 1</math>. The above descriptions should not be construed to imply the (average) effect of only the control group, for <math>\hat{\beta}_1</math>, or only the difference of the treatment and control groups in the pre-period, for <math>\hat{\beta}_2</math>. As in Card and Krueger, below, a first (time) difference of the outcome variable <math>(\Delta Y_i = Y_{i,1} - Y_{i,0})</math> eliminates the need for the time trend (i.e., <math>\hat{\beta}_1</math>) to form an unbiased estimate of <math>\hat{\beta}_3</math>, implying that <math>\hat{\beta}_1</math> is not actually conditional on the treatment or control group.
Similarly, a difference among the treatment and control groups would eliminate the need for treatment differentials (i.e., <math>\hat{\beta}_2</math>) to form an unbiased estimate of <math>\hat{\beta}_3</math>. This nuance is important to understand when the user believes (weak) violations of parallel pre-trends exist, or when the counterfactual approximation assumptions are violated by non-common shocks or confounding events. To see the relation between this notation and the previous section, consider as above only one observation per time period for each group. Then
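:<math>\widehat{E}(y \mid T=1,~ S=0) ~=~ \widehat{E}(y \mid \text{after period, control}) ~=~ \overline{y}_{12}</math>
and so on for other values of <math>T</math> and <math>S</math>, which is equivalent to
:<math>\hat{\beta}_3 ~=~ (\overline{y}_{11} - \overline{y}_{12}) - (\overline{y}_{21} - \overline{y}_{22}).</math>
But this is the expression for the treatment effect that was given in the formal definition and in the above table.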
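The equivalence between the regression coefficient on the interaction term and the difference of conditional averages can be checked numerically. The sketch below uses NumPy's least-squares solver on a small hypothetical data set (invented values); the period dummy, group dummy, and their product are the three regressors alongside a constant.

```python
import numpy as np

# Hypothetical data: T = 1 in the after period, S = 1 for the treatment group.
T = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)
S = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
y = np.array([4.0, 6.0, 6.0, 8.0, 9.0, 11.0, 15.0, 17.0])

# Design matrix: constant, period dummy, group dummy, interaction.
X = np.column_stack([np.ones_like(y), T, S, T * S])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

def cond_mean(t, s):
    """Sample average of y given T = t and S = s."""
    return y[(T == t) & (S == s)].mean()

# Difference in differences of the four conditional averages.
did_means = (cond_mean(1, 1) - cond_mean(0, 1)) - (cond_mean(1, 0) - cond_mean(0, 0))

print(beta)
print(did_means)
```

Because the model is saturated (four parameters for four group-period cells), the fitted coefficients reproduce the cell averages exactly, up to floating-point error, and the last coefficient coincides with the difference in differences of the averages.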
Card and Krueger (1994) example
The Card and Krueger article on minimum wage in New Jersey, published in 1994, is considered one of the most famous DID studies; Card was later awarded the 2021 Nobel Memorial Prize in Economic Sciences in part for this and related work. Card and Krueger compared employment in the fast food sector in New Jersey and in Pennsylvania, in February 1992 and in November 1992, after New Jersey's minimum wage rose from $4.25 to $5.05 in April 1992. Observing a change in employment in New Jersey only, before and after the treatment, would fail to control for omitted variables such as weather and macroeconomic conditions of the region. By including Pennsylvania as a control in a difference-in-differences model, any bias caused by variables common to New Jersey and Pennsylvania is implicitly controlled for, even when these variables are unobserved. Assuming that New Jersey and Pennsylvania have parallel trends over time, Pennsylvania's change in employment can be interpreted as the change New Jersey would have experienced had it not increased the minimum wage, and vice versa. The evidence suggested that the increased minimum wage did not induce a decrease in employment in New Jersey, contrary to what some economic theory would suggest. The table below shows Card and Krueger's estimates of the treatment effect on employment, measured as FTEs (full-time equivalents). Card and Krueger estimate that the $0.80 minimum wage increase in New Jersey led to a 2.75 FTE increase in employment.
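The arithmetic behind the 2.75 FTE figure can be sketched in Python. The four mean employment values below are the figures commonly quoted from Card and Krueger's paper; they do not appear in the text above and are an assumption of this sketch, so they should be checked against the original article before being relied upon.

```python
# Mean FTE employment per store (assumed figures; verify against the original paper).
nj_before, nj_after = 20.44, 21.03   # New Jersey: February and November 1992
pa_before, pa_after = 23.33, 21.17   # Pennsylvania: February and November 1992

# Change over time within each state.
nj_change = nj_after - nj_before
pa_change = pa_after - pa_before

# Difference in differences: New Jersey's change relative to Pennsylvania's.
did = nj_change - pa_change

print(round(nj_change, 2), round(pa_change, 2), round(did, 2))
```

Under these assumed inputs, the positive estimate arises mostly from the decline in Pennsylvania rather than from a large rise in New Jersey, which is why the parallel trend assumption carries so much weight in interpreting the result.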
A software application of this research is available in Stata's command -diff-, authored by Juan Miguel Villa.
See also
* Design of experiments
* Average treatment effect
* Synthetic control method
External links
Difference in Difference Estimation Healthcare Economist website