The average treatment effect (ATE) is a measure used to compare treatments (or interventions) in randomized experiments, evaluation of policy interventions, and medical trials. The ATE measures the difference in

mean There are several kinds of mean in mathematics, especially in statistics. Each mean serves to summarize a given group of data, often to better understand the overall value (magnitude and sign) of a given data set. For a data set, the ''arithme ...

(average) outcomes between units assigned to the treatment and units assigned to the control. In a

randomized trial In science, randomized experiments are the experiments that allow the greatest reliability and validity of statistical estimates of treatment effects. Randomization-based inference is especially important in experimental design and in survey sampl ...

(i.e., an experimental study), the average treatment effect can be

estimated Estimation (or estimating) is the process of finding an estimate or approximation, which is a value that is usable for some purpose even if input data may be incomplete, uncertain, or unstable. The value is nonetheless usable because it is der ...

from a sample using a comparison in mean outcomes for treated and untreated units. However, the ATE is generally understood as a

causal Causality (also referred to as causation, or cause and effect) is influence by which one event, process, state, or object (''a'' ''cause'') contributes to the production of another event, process, state, or object (an ''effect'') where the ca ...

parameter (i.e., an estimate or property of a

population Population typically refers to the number of people in a single area, whether it be a city or town, region, country, continent, or the world. Governments typically quantify the size of the resident population within their jurisdiction using a ...

) that a researcher desires to know, defined without reference to the

study design Clinical study design is the formulation of trials and experiments, as well as observational studies in medical, clinical and other types of research (e.g., epidemiological) involving human beings. The goal of a clinical study is to assess the saf ...

or estimation procedure. Both

observational Observation is the active acquisition of information from a primary source. In living beings, observation employs the senses. In science, observation can also involve the perception and recording of data (information), data via the use of scienti ...

studies and experimental study designs with random assignment may enable one to estimate an ATE in a variety of ways.

General definition

Originating from early statistical analysis in the fields of agriculture and medicine, the term "treatment" is now applied, more generally, to other fields of natural and social science, especially

psychology Psychology is the scientific study of mind and behavior. Psychology includes the study of conscious and unconscious phenomena, including feelings and thoughts. It is an academic discipline of immense scope, crossing the boundaries betwe ...

political science Political science is the scientific study of politics. It is a social science dealing with systems of governance and power, and the analysis of political activities, political thought, political behavior, and associated constitutions and la ...

, and

economics Economics () is the social science that studies the Production (economics), production, distribution (economics), distribution, and Consumption (economics), consumption of goods and services. Economics focuses on the behaviour and intera ...

such as, for example, the evaluation of the impact of public policies. The nature of a treatment or outcome is relatively unimportant in the estimation of the ATE—that is to say, calculation of the ATE requires that a treatment be applied to some units and not others, but the nature of that treatment (e.g., a pharmaceutical, an incentive payment, a political advertisement) is irrelevant to the definition and estimation of the ATE. The expression "treatment effect" refers to the causal effect of a given treatment or intervention (for example, the administering of a drug) on an outcome variable of interest (for example, the health of the patient). In the Neyman-Rubin "potential outcomes framework" of

causality Causality (also referred to as causation, or cause and effect) is influence by which one event, process, state, or object (''a'' ''cause'') contributes to the production of another event, process, state, or object (an ''effect'') where the cau ...

a treatment effect is defined for each individual unit in terms of two "potential outcomes." Each unit has one outcome that would manifest if the unit were exposed to the treatment and another outcome that would manifest if the unit were exposed to the control. The "treatment effect" is the difference between these two potential outcomes. However, this individual-level treatment effect is unobservable because individual units can only receive the treatment or the control, but not both.

Random assignment Random assignment or random placement is an experimental technique for assigning human participants or animal subjects to different groups in an experiment (e.g., a treatment group versus a control group) using randomization, such as by a chan ...

to treatment ensures that units assigned to the treatment and units assigned to the control are identical (over a large number of iterations of the experiment). Indeed, units in both groups have identical distributions of

covariate Dependent and independent variables are variables in mathematical modeling, statistical modeling and experimental sciences. Dependent variables receive this name because, in an experiment, their values are studied under the supposition or deman ...

s and potential outcomes. Thus the average outcome among the treatment units serves as a

counterfactual Counterfactual conditionals (also ''subjunctive'' or ''X-marked'') are conditional sentences which discuss what would have been true under different circumstances, e.g. "If Peter believed in ghosts, he would be afraid to be here." Counterfactual ...

for the average outcome among the control units. The differences between these two averages is the ATE, which is an estimate of the

central tendency In statistics, a central tendency (or measure of central tendency) is a central or typical value for a probability distribution.Weisberg H.F (1992) ''Central Tendency and Variability'', Sage University Paper Series on Quantitative Applications ...

of the distribution of unobservable individual-level treatment effects. If a sample is randomly constituted from a population, the sample ATE (abbreviated SATE) is also an estimate of the population ATE (abbreviated PATE). While an

experiment An experiment is a procedure carried out to support or refute a hypothesis, or determine the efficacy or likelihood of something previously untried. Experiments provide insight into Causality, cause-and-effect by demonstrating what outcome oc ...

ensures, in expectation, that potential outcomes (and all covariates) are equivalently distributed in the treatment and control groups, this is not the case in an

observational study In fields such as epidemiology, social sciences, psychology and statistics, an observational study draws inferences from a sample (statistics), sample to a statistical population, population where the dependent and independent variables, independ ...

. In an observational study, units are not assigned to treatment and control randomly, so their assignment to treatment may depend on unobserved or unobservable factors. Observed factors can be statistically controlled (e.g., through regression or matching), but any estimate of the ATE could be

confounded In statistics, a confounder (also confounding variable, confounding factor, extraneous determinant or lurking variable) is a variable that influences both the dependent variable and independent variable, causing a spurious association. Con ...

by unobservable factors that influenced which units received the treatment versus the control.

Formal definition

In order to define formally the ATE, we define two potential outcomes :

y_(i)

is the value of the outcome variable for individual

i

if they are not treated,

y_(i)

is the value of the outcome variable for individual

i

if they are treated. For example,

y_(i)

is the health status of the individual if they are not administered the drug under study and

y_(i)

is the health status if they are administered the drug. The treatment effect for individual

i

is given by

y_(i)-y_(i)=\beta(i)

. In the general case, there is no reason to expect this effect to be constant across individuals. The average treatment effect is given by :

\text = \frac\sum_i (y_(i)-y_(i))

where the summation occurs over all

N

individuals in the population. If we could observe, for each individual,

y_(i)

and

y_(i)

among a large representative sample of the population, we could estimate the ATE simply by taking the average value of

y_(i)-y_(i)

across the sample. However, we can not observe both

y_(i)

and

y_(i)

for each individual since an individual cannot be both treated and not treated. For example, in the drug example, we can only observe

y_(i)

for individuals who have received the drug and

y_(i)

for those who did not receive it. This is the main problem faced by scientists in the evaluation of treatment effects and has triggered a large body of estimation techniques.

Estimation

Depending on the data and its underlying circumstances, many methods can be used to estimate the ATE. The most common ones are: *

Natural experiment A natural experiment is an empirical study in which individuals (or clusters of individuals) are exposed to the experimental and control conditions that are determined by nature or by other factors outside the control of the investigators. The pro ...

s *

Difference in differences Difference in differences (DID or DD) is a statistical technique used in econometrics and quantitative research in the social sciences that attempts to mimic an experimental research design using observational study data, by studying the differe ...

Regression discontinuity design In statistics, econometrics, political science, epidemiology, and related disciplines, a regression discontinuity design (RDD) is a quasi-experimental pretest-posttest design that aims to determine the causal effects of interventions by assigning a ...

s *

Propensity score matching In the statistical analysis of observational data, propensity score matching (PSM) is a statistical matching technique that attempts to estimate the effect of a treatment, policy, or other intervention by accounting for the covariates that predic ...

Instrumental variables estimation In statistics, econometrics, epidemiology and related disciplines, the method of instrumental variables (IV) is used to estimate causal relationships when controlled experiments are not feasible or when a treatment is not successfully delivered t ...

An example

Consider an example where all units are unemployed individuals, and some experience a policy intervention (the treatment group), while others do not (the control group). The causal effect of interest is the impact a job search monitoring policy (the treatment) has on the length of an unemployment spell: On average, how much shorter would one's unemployment be if they experienced the intervention? The ATE, in this case, is the difference in expected values (means) of the treatment and control groups' length of unemployment. A positive ATE, in this example, would suggest that the job policy increased the length of unemployment. A negative ATE would suggest that the job policy decreased the length of unemployment. An ATE estimate equal to zero would suggest that there was no advantage or disadvantage to providing the treatment in terms of the length of unemployment. Determining whether an ATE estimate is distinguishable from zero (either positively or negatively) requires

statistical inference Statistical inference is the process of using data analysis to infer properties of an underlying probability distribution, distribution of probability.Upton, G., Cook, I. (2008) ''Oxford Dictionary of Statistics'', OUP. . Inferential statistical ...

. Because the ATE is an estimate of the average effect of the treatment, a positive or negative ATE does not indicate that any particular individual would benefit or be harmed by the treatment. Thus the average treatment effect neglects the distribution of the treatment effect. Some parts of the population might be worse off with the treatment even if the mean effect is positive.

Heterogenous treatment effects

Some researchers call a treatment effect "heterogenous" if it affects different individuals differently (heterogeneously). For example, perhaps the above treatment of a job search monitoring policy affected men and women differently, or people who live in different states differently. ATE requires a strong assumption known as the stable unit treatment value assumption (SUTVA) which requires the value of the potential outcome

y(i)

be unaffected by the mechanism used to assign the treatment and the treatment exposure of all other individuals. Let

d

be the treatment, the treatment effect for individual

i

is given by

y_(i,d)-y_(i,d)

. The SUTVA assumption allows us to declare

y_(i,d) = y_(i), y_(i,d)=y_(i)

. One way to look for heterogeneous treatment effects is to divide the study data into subgroups (e.g., men and women, or by state), and see if the average treatment effects are different by subgroup. If the average treatment effects are different, SUTVA is violated. A per-subgroup ATE is called a "conditional average treatment effect" (CATE), i.e. the ATE conditioned on membership in the subgroup. CATE can be used as an estimate if SUTVA does not hold. A challenge with this approach is that each subgroup may have substantially less data than the study as a whole, so if the study has been powered to detect the main effects without subgroup analysis, there may not be enough data to properly judge the effects on subgroups. There is some work on detecting heterogenous treatment effects using random forests. Recently, metalearning approaches have been developed that use arbitrary regression frameworks as base learners to infer the CATE.

Representation learning In machine learning, feature learning or representation learning is a set of techniques that allows a system to automatically discover the representations needed for feature detection or classification from raw data. This replaces manual feature e ...

can be used to further improve the performance of these methods.

General definition

Formal definition

Estimation

An example

Heterogenous treatment effects

References

Further reading