In fields such as epidemiology, the social sciences, psychology and statistics, an observational study draws inferences from a sample to a population where the independent variable is not under the control of the researcher because of ethical concerns or logistical constraints. One common observational study is about the possible effect of a treatment on subjects, where the assignment of subjects into a treated group versus a control group is outside the control of the investigator.
This is in contrast with experiments, such as randomized controlled trials, where each subject is randomly assigned to a treated group or a control group. Because observational studies lack a controlled assignment mechanism, they naturally present difficulties for inferential analysis.
Motivation
The independent variable may be beyond the control of the investigator for a variety of reasons:
* A randomized experiment would violate ethical standards. Suppose one wanted to investigate the abortion–breast cancer hypothesis, which postulates a causal link between induced abortion and the incidence of breast cancer. In a hypothetical controlled experiment, one would start with a large subject pool of pregnant women and divide them randomly into a treatment group (receiving induced abortions) and a control group (not receiving abortions), and then conduct regular cancer screenings for women from both groups. Needless to say, such an experiment would run counter to common ethical principles. (It would also suffer from various confounds and sources of bias, e.g. it would be impossible to conduct it as a blind experiment.) The published studies investigating the abortion–breast cancer hypothesis generally start with a group of women who have already received abortions. Membership in this "treated" group is not controlled by the investigator: the group is formed after the "treatment" has been assigned.
* The investigator may simply lack the requisite influence. Suppose a scientist wants to study the public health effects of a community-wide ban on smoking in public indoor areas. In a controlled experiment, the investigator would randomly pick a set of communities to be in the treatment group. However, it is typically up to each community and/or its legislature to enact a smoking ban. The investigator can be expected to lack the political power to cause precisely those communities in the randomly selected treatment group to pass a smoking ban. In an observational study, the investigator would typically start with a treatment group consisting of those communities where a smoking ban is already in effect.
* A randomized experiment may be impractical. Suppose a researcher wants to study the suspected link between a certain medication and a very rare group of symptoms arising as a side effect. Setting aside any ethical considerations, a randomized experiment would be impractical because of the rarity of the effect. There may not be a subject pool large enough for the symptoms to be observed in at least one treated subject. An observational study would typically start with a group of symptomatic subjects and work backwards to find those who were given the medication and later developed the symptoms. Thus a subset of the treated group was determined based on the presence of symptoms, instead of by random assignment.
* Many randomized controlled trials are not broadly representative of real-world patients, and this may limit their external validity. Patients who are eligible for inclusion in a randomized controlled trial are usually younger, more likely to be male, healthier and more likely to be treated according to recommendations from guidelines. If and when the intervention is later added to routine care, a large portion of the patients who will receive it may be old with many concomitant diseases and drug therapies.
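The impracticality argument for rare side effects in the list above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming each treated subject independently develops the symptoms with the same probability `p`; the function names and the example rate of 1 in 100,000 are hypothetical and chosen only for illustration.

```python
import math


def prob_at_least_one(n_subjects, p):
    """Probability that at least one of n_subjects shows the side effect,
    assuming independent subjects with a common per-subject probability p."""
    return 1.0 - (1.0 - p) ** n_subjects


def subjects_needed(p, confidence=0.95):
    """Smallest number of treated subjects for which at least one case is
    expected to be observed with the given confidence (same assumptions)."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


if __name__ == "__main__":
    rate = 1e-5  # hypothetical side-effect rate: 1 in 100,000
    print(prob_at_least_one(1000, rate))  # chance of seeing any case in 1,000 subjects
    print(subjects_needed(rate))          # treated subjects needed for 95% confidence
```

Under these assumptions a trial with 1,000 treated subjects has less than a 1% chance of observing even a single case, which is why starting from symptomatic subjects is the more practical design.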
Types
* Case-control study: a study design originally developed in epidemiology, in which two existing groups differing in outcome are identified and compared on the basis of some supposed causal attribute.
* Cross-sectional study: involves data collection from a population, or a representative subset, at one specific point in time.
* Longitudinal study: a correlational research study that involves repeated observations of the same variables over long periods of time. Cohort studies and panel studies are particular forms of longitudinal study.
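As an illustration of how the two groups in a case-control study are typically compared, the sketch below computes an odds ratio from a 2x2 table of exposure counts, together with an approximate confidence interval on the log scale. The function names and the counts in the example are hypothetical; the interval uses the common large-sample approximation and is not the only way to quantify uncertainty.

```python
import math


def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def odds_ratio_ci(exposed_cases, unexposed_cases, exposed_controls,
                  unexposed_controls, z=1.96):
    """Approximate confidence interval for the odds ratio (z=1.96 gives about 95%).

    Uses the standard error of the log odds ratio, sqrt(1/a + 1/b + 1/c + 1/d),
    which requires every cell count to be positive.
    """
    log_or = math.log(odds_ratio(exposed_cases, unexposed_cases,
                                 exposed_controls, unexposed_controls))
    se = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                   + 1 / exposed_controls + 1 / unexposed_controls)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


if __name__ == "__main__":
    # Hypothetical counts: 30 of 100 cases and 10 of 100 controls were exposed.
    print(odds_ratio(30, 70, 10, 90))
    print(odds_ratio_ci(30, 70, 10, 90))
```

An odds ratio above 1 indicates that exposure is more common among cases than among controls; it describes an association and does not by itself establish causation.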
Degree of usefulness and reliability
"Although observational studies cannot be used to make definitive statements of fact about the "safety, efficacy, or effectiveness" of a practice, they can:
# provide information on 'real world' use and practice;
# detect signals about the benefits and risks of the use of practices in the general population;
# help formulate hypotheses to be tested in subsequent experiments;
# provide part of the community-level data needed to design more informative pragmatic clinical trials; and
# inform clinical practice."
Source: "Observational Studies and Secondary Data Analyses To Assess Outcomes in Complementary and Integrative Health Care." Richard Nahin, Ph.D., M.P.H., Senior Advisor for Scientific Coordination and Outreach, National Center for Complementary and Integrative Health, June 25, 2012.
Bias and compensating methods
In all of those cases, if a randomized experiment cannot be carried out, the alternative line of investigation suffers from the problem that the decision of which subjects receive the treatment is not entirely random and thus is a potential source of bias. A major challenge in conducting observational studies is to draw inferences that are acceptably free from influences by overt biases, as well as to assess the influence of potential hidden biases. The following is a non-exhaustive set of problems especially common in observational studies.
Matching techniques bias
In lieu of experimental control, multivariate statistical techniques allow the approximation of experimental control with statistical control by using matching methods. Matching methods account for the influences of observed factors that might influence a cause-and-effect relationship. In healthcare and the social sciences, investigators may use matching to compare units that nonrandomly received the treatment and control. One common approach is to use propensity score matching in order to reduce confounding, although this has recently come under criticism for exacerbating the very problems it seeks to solve.
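The idea behind matching can be sketched in a few lines. The example below is a deliberately simplified stand-in, not a full propensity score matching procedure: it assumes each unit already carries a single score (for instance a propensity score estimated elsewhere, or one observed covariate) and pairs every treated unit with the control whose score is closest, with replacement. The function name, the optional `caliper` parameter and the data are hypothetical.

```python
def matched_effect(treated, controls, caliper=None):
    """Average treated-minus-matched-control outcome difference.

    treated, controls: lists of (score, outcome) pairs.
    Each treated unit is matched to the control with the nearest score
    (controls may be reused). If caliper is given, treated units whose
    nearest control is farther away than caliper are dropped.
    """
    differences = []
    for score_t, outcome_t in treated:
        score_c, outcome_c = min(controls, key=lambda unit: abs(unit[0] - score_t))
        if caliper is not None and abs(score_c - score_t) > caliper:
            continue  # no sufficiently similar control for this unit
        differences.append(outcome_t - outcome_c)
    if not differences:
        raise ValueError("no treated unit could be matched")
    return sum(differences) / len(differences)


if __name__ == "__main__":
    # Hypothetical data: (score, outcome)
    treated = [(0.8, 12.0), (0.6, 10.0)]
    controls = [(0.79, 9.0), (0.58, 8.5), (0.1, 3.0)]
    print(matched_effect(treated, controls))
```

In this toy data the unmatched difference in mean outcomes is larger than the matched estimate, because the control with a very different score (0.1) pulls the control mean down. Matching only balances the score it is given, so factors that were never observed remain unaccounted for.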
Multiple comparison bias
Multiple comparison bias can occur when several hypotheses are tested at the same time. As the number of recorded factors increases, the likelihood increases that at least one of the recorded factors will be highly correlated with the data output simply by chance.
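How quickly chance findings accumulate can be illustrated with a short calculation. The sketch below assumes the tests are independent and that every null hypothesis is true; the function names are hypothetical, and the Bonferroni adjustment shown is only one of several common corrections.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Probability of at least one false positive among num_tests
    independent tests, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_alpha(num_tests, alpha=0.05):
    """Per-test significance level that keeps the family-wise error
    rate at or below alpha (Bonferroni correction)."""
    return alpha / num_tests


if __name__ == "__main__":
    for m in (1, 5, 20, 100):
        print(m, round(familywise_error_rate(m), 3))
```

With 20 recorded factors tested at the 0.05 level, the chance of at least one spurious "significant" result under these assumptions is roughly 64%.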
Omitted variable bias
An observer of an uncontrolled experiment (or process) records potential factors and the data output: the goal is to determine the effects of the factors. Sometimes the recorded factors may not be directly causing the differences in the output. There may be more important factors which were not recorded but are, in fact, causal. Also, recorded or unrecorded factors may be correlated with one another, which may yield incorrect conclusions.
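A small simulation can show how an unrecorded factor distorts the apparent effect of a recorded one. The following is a minimal sketch under assumed conditions: an unrecorded factor `z` drives both the recorded factor `x` and the output `y`, while `x` has no effect on `y` at all. All names, coefficients and the sample size are hypothetical.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, seed=0):
    """Return (naive slope, slope after adjusting for z)."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]        # unrecorded causal factor
    x = [zi + rng.gauss(0, 1) for zi in z]         # recorded factor, correlated with z
    y = [2.0 * zi + rng.gauss(0, 1) for zi in z]   # output depends on z only
    naive = ols_slope(x, y)                        # regression that omits z
    # Adjust for z by removing its linear contribution from x and y,
    # then regressing the residuals on each other.
    slope_xz = ols_slope(z, x)
    slope_yz = ols_slope(z, y)
    resid_x = [xi - slope_xz * zi for xi, zi in zip(x, z)]
    resid_y = [yi - slope_yz * zi for yi, zi in zip(y, z)]
    adjusted = ols_slope(resid_x, resid_y)
    return naive, adjusted


if __name__ == "__main__":
    naive, adjusted = simulate()
    print("slope ignoring z:", naive)
    print("slope adjusting for z:", adjusted)
```

In this setup the naive slope comes out close to 1 even though the true effect of `x` is zero, and the adjusted slope comes out close to 0. The adjustment is only possible because the simulation has access to `z`; in a real observational study an omitted variable is, by definition, not available.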
Selection bias
Another difficulty with observational studies is that researchers may themselves be biased in their observational skills. This allows researchers to seek out, consciously or unconsciously, the information they are looking for while conducting their research. For example, researchers may exaggerate the effect of one variable or downplay the effect of another; they may even select subjects that fit their conclusions. This selection bias can happen at any stage of the research process, and it introduces bias into the data because certain variables are systematically measured incorrectly.
Quality
A 2014 (updated in 2024) Cochrane review concluded that observational studies produce results similar to those conducted as randomized controlled trials. The review reported little evidence for significant effect differences between observational studies and randomized controlled trials, regardless of design. Differences need to be evaluated by looking at population, comparator, heterogeneity, and outcomes.
See also
* Observational interpretation fallacy
* Correlation does not imply causation
* Observation
* Difference-in-differences
* Randomized controlled trial (RCT)
* Blinded experiment
* Scientific method
Further reading
* "NIST/SEMATECH Handbook on Engineering Statistics" at NIST