A randomized controlled trial (or randomized control trial;
RCT) is a form of
scientific experiment used to control factors not under direct experimental control. Examples of RCTs are
clinical trials that compare the effects of drugs, surgical techniques, medical devices, diagnostic procedures, or other medical treatments.
Participants who enroll in RCTs differ from one another in known and unknown ways that can influence study outcomes, and yet cannot be directly controlled. By
randomly allocating participants among compared treatments, an RCT enables ''statistical control'' over these influences. Provided it is designed well, conducted properly, and enrolls enough participants, an RCT may achieve sufficient control over these
confounding factors to deliver a useful comparison of the treatments studied.
Definition and examples
An RCT in
clinical research
typically compares a proposed new treatment against an existing
standard of care; these are then termed the 'experimental' and 'control' treatments, respectively. When no such generally accepted treatment is available, a
placebo
may be used in the control group so that participants are
blinded to their treatment allocations. This blinding principle is ideally also extended as much as possible to other parties including researchers, technicians, data analysts, and evaluators. Effective blinding experimentally isolates the physiological effects of treatments from various psychological sources of
bias.
The randomness in the assignment of participants to treatments reduces
selection bias
and allocation bias, balancing both known and unknown prognostic factors, in the assignment of treatments.
Blinding reduces other forms of
experimenter and subject biases.
A well-blinded RCT is considered the
gold standard
for clinical trials. Blinded RCTs are commonly used to test the
efficacy
of medical
interventions and may additionally provide information about adverse effects, such as
drug reactions. A randomized controlled trial can provide compelling evidence that the study treatment causes an effect on human health.
The terms "RCT" and "randomized trial" are sometimes used synonymously, but the latter term omits mention of controls and can therefore describe studies that compare multiple treatment groups with each other in the absence of a control group.
Similarly, the
initialism
is sometimes expanded as "randomized clinical trial" or "randomized comparative trial", leading to ambiguity in the
scientific literature.
Not all RCTs are randomized ''controlled'' trials (and some of them could never be, as in cases where controls would be impractical or unethical to use). The term randomized controlled clinical trial is an alternative term used in clinical research;
however, RCTs are also employed in other research areas, including
many of the social sciences.
History
The first reported
clinical trial
was conducted by
James Lind
in 1747 to identify a treatment for
scurvy. The first blind experiment was conducted by the
French Royal Commission on Animal Magnetism in 1784 to investigate the claims of
mesmerism. An early essay advocating the blinding of researchers came from
Claude Bernard
in the latter half of the 19th century. Bernard recommended that the observer of an experiment should not have knowledge of the hypothesis being tested. This suggestion contrasted starkly with the prevalent
Enlightenment-era attitude that scientific observation can only be objectively valid when undertaken by a well-educated, informed scientist. The first study recorded to have a blinded researcher was conducted in 1907 by
W. H. R. Rivers and H. N. Webber to investigate the effects of caffeine.
Randomized experiments first appeared in
psychology, where they were introduced by
Charles Sanders Peirce
and
Joseph Jastrow
in the 1880s, and in
education.
In the early 20th century, randomized experiments appeared in agriculture, due to
Jerzy Neyman
and
Ronald A. Fisher. Fisher's experimental research and his writings popularized randomized experiments.
The first published randomized controlled trial in medicine appeared in the 1948 paper entitled "Streptomycin treatment of pulmonary tuberculosis", which described a Medical Research Council investigation.
One of the authors of that paper was
Austin Bradford Hill, who is credited as having conceived the modern RCT.
Trial design was further influenced by the large-scale
ISIS
trials on
heart attack
treatments that were conducted in the 1980s.
By the late 20th century, RCTs were recognized as the standard method for "rational therapeutics" in medicine.
As of 2004, more than 150,000 RCTs were in the
Cochrane Library.
To improve the reporting of RCTs in the medical literature, an international group of scientists and editors published
Consolidated Standards of Reporting Trials (CONSORT) Statements in 1996, 2001 and 2010, and these have become widely accepted.
Randomization is the process of assigning trial subjects to treatment or control groups using an element of chance to determine the assignments, in order to reduce bias.
Ethics
Although the principle of
clinical equipoise
("genuine uncertainty within the expert medical community... about the preferred treatment") common to clinical trials
has been applied to RCTs, the
ethics
of RCTs have special considerations. For one, it has been argued that equipoise itself is insufficient to justify RCTs.
For another, "collective equipoise" can conflict with a lack of personal equipoise (e.g., a personal belief that an intervention is effective).
Finally,
Zelen's design, which has been used for some RCTs, randomizes subjects ''before'' they provide informed consent, which may be ethical for RCTs of
screening
and selected therapies, but is likely unethical "for most therapeutic trials."
Although subjects almost always provide
informed consent
for their participation in an RCT, studies since 1982 have documented that RCT subjects may believe that they are certain to receive treatment that is best for them personally; that is, they do not understand the difference between research and treatment.
Further research is necessary to determine the prevalence of and ways to address this "therapeutic misconception".
Variations of the RCT method may also create cultural effects that are not well understood.
For example, patients with terminal illness may join trials in the hope of being cured, even when treatments are unlikely to be successful.
Trial registration
In 2004, the International Committee of Medical Journal Editors (ICMJE) announced that all trials starting enrolment after July 1, 2005, must be registered prior to consideration for publication in one of the 12 member journals of the committee.
However, trial registration may still occur late or not at all.
Medical journals have been slow in adopting policies requiring mandatory clinical trial registration as a prerequisite for publication.
Classifications
By study design
One way to classify RCTs is by
study design. From most to least common in the healthcare literature, the major categories of RCT study designs are:
* Parallel-group – each participant is randomly assigned to a group, and all the participants in the group receive (or do not receive) an intervention.
* Crossover – over time, each participant receives (or does not receive) an intervention in a random sequence.
* Cluster – pre-existing groups of participants (e.g., villages, schools) are randomly selected to receive (or not receive) an intervention.
* Factorial – each participant is randomly assigned to a group that receives a particular combination of interventions or non-interventions (e.g., group 1 receives vitamin X and vitamin Y, group 2 receives vitamin X and placebo Y, group 3 receives placebo X and vitamin Y, and group 4 receives placebo X and placebo Y).
An analysis of the 616 RCTs indexed in
PubMed
during December 2006 found that 78% were parallel-group trials, 16% were crossover, 2% were split-body, 2% were cluster, and 2% were factorial.
By outcome of interest (efficacy vs. effectiveness)
RCTs can be classified as "explanatory" or "pragmatic."
Explanatory RCTs test
efficacy
in a research setting with highly selected participants and under highly controlled conditions.
In contrast, pragmatic RCTs (pRCTs) test
effectiveness
in everyday practice with relatively unselected participants and under flexible conditions; in this way, pragmatic RCTs can "inform decisions about practice."
By hypothesis (superiority vs. noninferiority vs. equivalence)
Another classification of RCTs categorizes them as "superiority trials", "noninferiority trials", and "equivalence trials", which differ in methodology and reporting.
Most RCTs are superiority trials, in which one intervention is hypothesized to be superior to another in a
statistically significant
way.
Some RCTs are noninferiority trials "to determine whether a new treatment is no worse than a reference treatment."
Other RCTs are equivalence trials in which the hypothesis is that two interventions are indistinguishable from each other.
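The three trial types can be contrasted by writing out the hypotheses each one tests. The formulation below is a common textbook sketch, not taken from the text above: the symbols are assumptions, with \delta denoting the true difference in mean outcome (new treatment minus reference, higher being better) and \Delta > 0 a pre-specified margin.

```latex
\begin{align*}
\text{Superiority:}    \quad & H_0 : \delta = 0             && \text{vs.} \quad H_1 : \delta \neq 0 \\
\text{Noninferiority:} \quad & H_0 : \delta \le -\Delta     && \text{vs.} \quad H_1 : \delta > -\Delta \\
\text{Equivalence:}    \quad & H_0 : |\delta| \ge \Delta    && \text{vs.} \quad H_1 : |\delta| < \Delta
\end{align*}
```

In the noninferiority and equivalence cases, rejecting the null hypothesis supports the claim of interest, so the choice of margin \Delta largely determines what the trial can conclude.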
Randomization
The advantages of proper randomization in RCTs include:
* "It eliminates bias in treatment assignment," specifically selection bias and confounding.
* "It facilitates blinding (masking) of the identity of treatments from investigators, participants, and assessors."
* "It permits the use of probability theory to express the likelihood that any difference in outcome between treatment groups merely indicates chance."
There are two processes involved in randomizing patients to different interventions. First is choosing a ''randomization procedure'' to generate an unpredictable sequence of allocations; this may be a simple random assignment of patients to any of the groups at equal probabilities, may be "restricted", or may be "adaptive." A second and more practical issue is ''allocation concealment'', which refers to the stringent precautions taken to ensure that the group assignment of patients is not revealed prior to definitively allocating them to their respective groups. Non-random "systematic" methods of group assignment, such as alternating subjects between one group and the other, can cause "limitless contamination possibilities" and can cause a breach of allocation concealment.
However, empirical evidence that adequate randomization changes outcomes relative to inadequate randomization has been difficult to detect.
Procedures
The treatment allocation is the desired proportion of patients in each treatment arm.
An ideal randomization procedure would achieve the following goals:
* Maximize statistical power, especially in subgroup analyses. Generally, equal group sizes maximize statistical power; however, unequal group sizes may be more powerful for some analyses (e.g., multiple comparisons of placebo versus several doses using Dunnett's procedure), and are sometimes desired for non-analytic reasons (e.g., patients may be more motivated to enroll if there is a higher chance of getting the test treatment, or regulatory agencies may require a minimum number of patients exposed to treatment).
* Minimize selection bias. This may occur if investigators can consciously or unconsciously preferentially enroll patients between treatment arms. A good randomization procedure will be unpredictable so that investigators cannot guess the next subject's group assignment based on prior treatment assignments. The risk of selection bias is highest when previous treatment assignments are known (as in unblinded studies) or can be guessed (perhaps if a drug has distinctive side effects).
* Minimize allocation bias (or confounding). This may occur when covariates that affect the outcome are not equally distributed between treatment groups, and the treatment effect is confounded with the effect of the covariates (i.e., an "accidental bias"). If the randomization procedure causes an imbalance in covariates related to the outcome across groups, estimates of effect may be biased if not adjusted for the covariates (which may be unmeasured and therefore impossible to adjust for).
However, no single randomization procedure meets those goals in every circumstance, so researchers must select a procedure for a given study based on its advantages and disadvantages.
Simple
This is a commonly used and intuitive procedure, similar to "repeated fair coin-tossing."
Also known as "complete" or "unrestricted" randomization, it is
robust
against both selection and accidental biases. However, its main drawback is the possibility of imbalanced group sizes in small RCTs. It is therefore recommended only for RCTs with over 200 subjects.
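As an illustration of the coin-tossing analogy and of why small trials can end up unbalanced, the following minimal Python sketch assigns each subject independently with equal probability. The function names, arm labels, and seed are hypothetical choices made for this example, not part of any standard trial software.

```python
import random


def simple_randomization(n_subjects, arms=("treatment", "control"), seed=None):
    """Assign each subject to an arm independently and with equal probability."""
    rng = random.Random(seed)  # a dedicated generator keeps the sequence reproducible
    return [rng.choice(arms) for _ in range(n_subjects)]


def imbalance(assignments, arm="treatment"):
    """Absolute difference between the size of `arm` and the size of all other arms."""
    in_arm = sum(1 for a in assignments if a == arm)
    return abs(in_arm - (len(assignments) - in_arm))


if __name__ == "__main__":
    # The relative imbalance tends to shrink as the trial grows.
    for n in (20, 200, 2000):
        alloc = simple_randomization(n, seed=1)
        print(n, imbalance(alloc), imbalance(alloc) / n)
```

Because every assignment is independent of the previous ones, nothing in the procedure corrects a run of allocations to the same arm, which is the source of the group-size imbalance described above.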
Restricted
To balance group sizes in smaller RCTs, some form of
"restricted" randomization is recommended.
The major types of restricted randomization used in RCTs are:
* Permuted-block randomization or blocked randomization: a "block size" and "allocation ratio" (number of subjects in one group versus the other group) are specified, and subjects are allocated randomly within each block. For example, a block size of 6 and an allocation ratio of 2:1 would lead to random assignment of 4 subjects to one group and 2 to the other. This type of randomization can be combined with "stratified randomization", for example by center in a multicenter trial, to "ensure good balance of participant characteristics in each group." A special case of permuted-block randomization is ''random allocation'', in which the entire sample is treated as one block. The major disadvantage of permuted-block randomization is that even if the block sizes are large and randomly varied, the procedure can lead to selection bias. Another disadvantage is that "proper" analysis of data from permuted-block-randomized RCTs requires stratification by blocks.
* Adaptive biased-coin randomization methods (of which urn randomization is the most widely known type): In these relatively uncommon methods, the probability of being assigned to a group decreases if the group is overrepresented and increases if the group is underrepresented. The methods are thought to be less affected by selection bias than permuted-block randomization.
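The permuted-block example above (block size 6, allocation ratio 2:1) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the arm labels "A" and "B", and the fixed block size are assumptions, and the sketch omits stratification and randomly varied block sizes.

```python
import random


def permuted_block_randomization(n_subjects, block_size=6, ratio=(2, 1),
                                 arms=("A", "B"), seed=None):
    """Build an allocation sequence from independently shuffled blocks.

    Each complete block contains the arms in exactly the requested ratio,
    so group sizes are rebalanced at the end of every block.
    """
    if block_size % sum(ratio) != 0:
        raise ValueError("block_size must be a multiple of the sum of the ratio")
    rng = random.Random(seed)
    multiplier = block_size // sum(ratio)

    # One unshuffled block, e.g. ['A', 'A', 'A', 'A', 'B', 'B'] for 2:1 and size 6.
    template = []
    for arm, share in zip(arms, ratio):
        template.extend([arm] * (share * multiplier))

    sequence = []
    while len(sequence) < n_subjects:
        block = template[:]   # copy so the template itself is never reordered
        rng.shuffle(block)    # random order within the block
        sequence.extend(block)
    return sequence[:n_subjects]  # the final block may be truncated


if __name__ == "__main__":
    print(permuted_block_randomization(12, seed=0))
```

With a fixed block size, the last assignments in each block become predictable once the earlier ones are known, which is one way the selection bias mentioned above can arise.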
Adaptive
At least two types of "adaptive" randomization procedures have been used in RCTs, but much less frequently than simple or restricted randomization:
* Covariate-adaptive randomization, of which one type is minimization: The probability of being assigned to a group varies in order to minimize "covariate imbalance." Minimization is reported to have "supporters and detractors"; because only the first subject's group assignment is truly chosen at random, the method does not necessarily eliminate bias on unknown factors.
* Response-adaptive randomization, also known as outcome-adaptive randomization: The probability of being assigned to a group increases if the responses of the prior patients in the group were favorable.
Although arguments have been made that this approach is more ethical than other types of randomization when the probability that a treatment is effective or ineffective increases during the course of an RCT, ethicists have not yet studied the approach in detail.
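To make the idea of minimization concrete, here is a deliberately simplified Python sketch. It assumes two arms, equally weighted categorical covariates, and random tie-breaking; the function names and the `counts` data structure are hypothetical, and practical implementations often add a random element to every assignment rather than always choosing the arm that minimizes imbalance.

```python
import random
from collections import defaultdict


def new_counts():
    """Tally of prior subjects, keyed by (arm, factor, level); missing keys count as 0."""
    return defaultdict(int)


def minimization_assign(subject, counts, arms=("A", "B"), rng=random):
    """Assign `subject` (a dict of factor -> level) to the arm that currently
    holds the fewest prior subjects sharing the subject's covariate levels."""
    scores = {
        arm: sum(counts[(arm, factor, level)] for factor, level in subject.items())
        for arm in arms
    }
    best = min(scores.values())
    candidates = [arm for arm in arms if scores[arm] == best]
    chosen = rng.choice(candidates)  # chance only decides ties

    # Record the new subject so later assignments see the updated totals.
    for factor, level in subject.items():
        counts[(chosen, factor, level)] += 1
    return chosen


if __name__ == "__main__":
    counts = new_counts()
    for s in ({"sex": "M", "age": "<50"}, {"sex": "M", "age": ">=50"},
              {"sex": "F", "age": "<50"}, {"sex": "M", "age": "<50"}):
        print(s, "->", minimization_assign(s, counts))
```

In this sketch, chance enters only when the arms are tied, which illustrates the concern noted above that little of the allocation is truly random.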
Allocation concealment
"Allocation concealment" (defined as "the procedure for protecting the randomization process so that the treatment to be allocated is not known before the patient is entered into the study") is important in RCTs.
In practice, clinical investigators in RCTs often find it difficult to maintain impartiality. Stories abound of investigators holding up sealed envelopes to lights or ransacking offices to determine group assignments in order to dictate the assignment of their next patient.
Such practices introduce selection bias and
confounders
(both of which should be minimized by randomization), possibly distorting the results of the study.
Adequate allocation concealment should prevent patients and investigators from discovering treatment allocation once a study is underway and after the study has concluded. Treatment-related side effects or adverse events may be specific enough to reveal allocation to investigators or patients, thereby introducing bias or influencing any subjective parameters collected by investigators or requested from subjects.
Some standard methods of ensuring allocation concealment include sequentially numbered, opaque, sealed envelopes (SNOSE); sequentially numbered containers; pharmacy-controlled randomization; and central randomization.
It is recommended that allocation concealment methods be included in an RCT's
protocol, and that the allocation concealment methods should be reported in detail in a publication of an RCT's results; however, a 2005 study determined that most RCTs have unclear allocation concealment in their protocols, in their publications, or both.
On the other hand, a 2008 study of 146
meta-analyses
concluded that the results of RCTs with inadequate or unclear allocation concealment tended to be biased toward beneficial effects only if the RCTs' outcomes were
subjective as opposed to
objective.
Sample size
The number of treatment units (subjects or groups of subjects) assigned to control and treatment groups affects an RCT's reliability. If the effect of the treatment is small, the number of treatment units in either group may be insufficient for rejecting the null hypothesis in the respective statistical test. The failure to reject the null hypothesis would imply that the treatment shows no statistically significant effect on the treated ''in a given test''. But as the sample size increases, the same RCT may be able to demonstrate a significant effect of the treatment, even if this effect is small.
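The relationship between effect size and required sample size can be illustrated with a rough calculation. The sketch below is not taken from any particular trial; it assumes a two-arm comparison of means, a two-sided test, equal group sizes, and a normal approximation. The function name `n_per_group` and the default values for `alpha` and `power` are illustrative choices.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per arm for a two-sided z-test.

    effect_size is the standardized mean difference (difference / SD).
    This is a normal-approximation sketch, not a full power analysis.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# Smaller effects require many more participants per group.
for d in (0.8, 0.5, 0.2):
    print(f"effect size {d}: about {n_per_group(d)} per group")
```

Because the required number grows with the inverse square of the effect size, halving the effect roughly quadruples the sample needed under these assumptions.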
Blinding
An RCT may be blinded (also called "masked") by "procedures that prevent study participants, caregivers, or outcome assessors from knowing which intervention was received."
Unlike allocation concealment, blinding is sometimes inappropriate or impossible to perform in an RCT; for example, if an RCT involves a treatment in which active participation of the patient is necessary (e.g., physical therapy), participants cannot be blinded to the intervention.
Traditionally, blinded RCTs have been classified as "single-blind", "double-blind", or "triple-blind"; however, in 2001 and 2006 two studies showed that these terms have different meanings for different people.
The 2010 CONSORT Statement specifies that authors and editors should not use the terms "single-blind", "double-blind", and "triple-blind"; instead, reports of blinded RCTs should discuss "If done, who was blinded after assignment to interventions (for example, participants, care providers, those assessing outcomes) and how."
RCTs without blinding are referred to as "unblinded", "open", or (if the intervention is a medication) "open-label".
In 2008 a study concluded that the results of unblinded RCTs tended to be biased toward beneficial effects only if the RCTs' outcomes were subjective as opposed to objective; for example, in an RCT of treatments for multiple sclerosis, unblinded neurologists (but not the blinded neurologists) felt that the treatments were beneficial.
In pragmatic RCTs, although the participants and providers are often unblinded, it is "still desirable and often possible to blind the assessor or obtain an objective source of data for evaluation of outcomes."
Analysis of data
The types of statistical methods used in RCTs depend on the characteristics of the data and include:
* For dichotomous (binary) outcome data, logistic regression (e.g., to predict sustained virological response after receipt of peginterferon alfa-2a for hepatitis C) and other methods can be used.
* For continuous outcome data, analysis of covariance (e.g., for changes in blood lipid levels after receipt of atorvastatin after acute coronary syndrome) tests the effects of predictor variables.
* For time-to-event outcome data that may be censored, survival analysis (e.g., Kaplan–Meier estimators and Cox proportional hazards models for time to coronary heart disease after receipt of hormone replacement therapy in menopause) is appropriate.
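As an illustration of the time-to-event case, the following is a minimal sketch of the Kaplan–Meier product-limit estimate written in plain Python. The function name `kaplan_meier` and the toy follow-up data are hypothetical and chosen only for illustration; a real analysis would normally rely on an established statistics package and would also report confidence intervals.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed event time.

    times  -- follow-up time for each participant
    events -- 1 if the event occurred at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all participants whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_leaving += 1
            i += 1
        if n_events > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored cases leave the risk set.
        at_risk -= n_leaving
    return curve


# Toy data: five participants, two of them censored (event flag 0).
for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
    print(f"t={t}: S(t)={s:.2f}")
```

Censored participants contribute to the number at risk until their follow-up ends but never trigger a drop in the curve, which is how the estimator uses partial information without treating censoring as an event.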
Regardless of the statistical methods used, important considerations in the analysis of RCT data include:
* Whether an RCT should be stopped early due to interim results. For example, RCTs may be stopped early if an intervention produces "larger than expected benefit or harm", or if "investigators find evidence of no important difference between experimental and control interventions."
* The extent to which the groups can be analyzed exactly as they existed upon randomization (i.e., whether a so-called "intention-to-treat analysis" is used). A "pure" intention-to-treat analysis is "possible only when complete outcome data are available" for all randomized subjects; when some outcome data are missing, options include analyzing only cases with known outcomes and using imputed data. Nevertheless, the more that analyses can include all participants in the groups to which they were randomized, the less bias an RCT will be subject to.
* Whether subgroup analyses should be performed. These are "often discouraged" because multiple comparisons may produce false positive findings that cannot be confirmed by other studies.
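The multiple-comparisons concern behind the caution on subgroup analyses can be made concrete with a short calculation. The sketch below assumes that the subgroup tests are independent and that no true effect exists in any subgroup, which is a simplification; the function names are illustrative. It also shows the Bonferroni adjustment, one common (and conservative) correction, which is offered here as an example rather than as the method any particular trial used.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test threshold that keeps the familywise rate at or below alpha."""
    return alpha / n_tests


for k in (1, 5, 10, 20):
    print(
        f"{k:2d} subgroup tests: "
        f"P(at least one false positive) = {familywise_error_rate(k):.2f}, "
        f"Bonferroni threshold = {bonferroni_alpha(k):.4f}"
    )
```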
Reporting of results
The ''CONSORT 2010 Statement'' is "an evidence-based, minimum set of recommendations for reporting RCTs."
The CONSORT 2010 checklist contains 25 items (many with sub-items) focusing on "individually randomised, two group, parallel trials", which are the most common type of RCT.
For other RCT study designs, "CONSORT extensions" have been published; some examples are:
* Consort 2010 Statement: Extension to Cluster Randomised Trials
* Consort 2010 Statement: Non-Pharmacologic Treatment Interventions
Relative importance and observational studies
Two studies published in ''The New England Journal of Medicine'' in 2000 found that observational studies and RCTs overall produced similar results.
The authors of the 2000 findings questioned the belief that "observational studies should not be used for defining evidence-based medical care" and that RCTs' results are "evidence of the highest grade."
However, a 2001 study published in ''Journal of the American Medical Association'' concluded that "discrepancies beyond chance do occur and differences in estimated magnitude of treatment effect are very common" between observational studies and RCTs.
According to a 2014 Cochrane review, there is little evidence for significant effect differences between observational studies and randomized controlled trials, regardless of design, heterogeneity, or inclusion of studies of interventions that assessed drug effects.
Two other lines of reasoning question RCTs' contribution to scientific knowledge beyond other types of studies:
* If study designs are ranked by their potential for new discoveries, then anecdotal evidence would be at the top of the list, followed by observational studies, followed by RCTs.
* RCTs may be unnecessary for treatments that have dramatic and rapid effects relative to the expected stable or progressively worse natural course of the condition treated. One example is combination chemotherapy including cisplatin for metastatic testicular cancer, which increased the cure rate from 5% to 60% in a 1977 non-randomized study.
Interpretation of statistical results
Like all statistical methods, RCTs are subject to both type I ("false positive") and type II ("false negative") statistical errors. Regarding type I errors, a typical RCT will use 0.05 (i.e., 1 in 20) as the probability that the RCT will falsely find two equally effective treatments significantly different. Regarding type II errors, despite the publication of a 1978 paper noting that the sample sizes of many "negative" RCTs were too small to make definitive conclusions about the negative results, by 2005–2006 a sizeable proportion of RCTs still had inaccurate or incompletely reported sample size calculations.
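The meaning of the 0.05 threshold can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not a description of how any trial is analyzed: both arms are drawn from the same normal distribution with a known standard deviation of 1 (so the treatments are equally effective by construction), and a two-sided z-test is applied. The function name, trial counts, and seed are arbitrary choices.

```python
import random
from statistics import NormalDist, fmean


def false_positive_rate(n_trials=2000, n_per_arm=50, alpha=0.05, seed=0):
    """Fraction of simulated null trials that wrongly reach significance."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the difference in means when SD = 1 in both arms.
    se = (2 / n_per_arm) ** 0.5
    false_positives = 0
    for _ in range(n_trials):
        treated = [rng.gauss(0, 1) for _ in range(n_per_arm)]
        control = [rng.gauss(0, 1) for _ in range(n_per_arm)]
        z = (fmean(treated) - fmean(control)) / se
        if abs(z) > z_crit:
            false_positives += 1
    return false_positives / n_trials


print(f"Observed type I error rate: {false_positive_rate():.3f}")
```

Under these assumptions the observed rate should land close to the chosen `alpha`, with some simulation noise that shrinks as `n_trials` grows.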
Peer review
Peer review of results is an important part of the scientific method. Reviewers examine the study results for potential problems with design that could lead to unreliable results (for example by creating a systematic bias
), evaluate the study in the context of related studies and other evidence, and evaluate whether the study can be reasonably considered to have proven its conclusions. To underscore the need for peer review and the danger of overgeneralizing conclusions, two Boston-area medical researchers performed a randomized controlled trial in which they randomly assigned either a parachute or an empty backpack to 23 volunteers who jumped from either a biplane or a helicopter. The study was able to accurately report that parachutes fail to reduce injury compared to empty backpacks. The key context that limited the general applicability of this conclusion was that the aircraft were parked on the ground, and participants had only jumped about two feet.
Advantages
RCTs are considered to be the most reliable form of scientific evidence in the hierarchy of evidence that influences healthcare policy and practice because RCTs reduce spurious causality and bias. Results of RCTs may be combined in systematic reviews, which are increasingly being used in the conduct of evidence-based practice. Some examples of scientific organizations considering RCTs or systematic reviews of RCTs to be the highest-quality evidence available are:
* As of 1998, the National Health and Medical Research Council of Australia designated "Level I" evidence as that "obtained from a systematic review of all relevant randomised controlled trials" and "Level II" evidence as that "obtained from at least one properly designed randomised controlled trial."
* Since at least 2001, in making clinical practice guideline recommendations the United States Preventive Services Task Force has considered both a study's design and its internal validity as indicators of its quality. It has recognized "evidence obtained from at least one properly randomized controlled trial" with good internal validity (i.e., a rating of "I-good") as the highest quality evidence available to it.
* The GRADE Working Group concluded in 2008 that "randomised trials without important limitations constitute high quality evidence."
* For issues involving "Therapy/Prevention, Aetiology/Harm", the Oxford Centre for Evidence-based Medicine as of 2011 defined "Level 1a" evidence as a systematic review of RCTs that are consistent with each other, and "Level 1b" evidence as an "individual RCT (with narrow Confidence Interval)."
Notable RCTs with unexpected results that contributed to changes in clinical practice include:
* After Food and Drug Administration approval, the antiarrhythmic agents flecainide and encainide came to market in 1986 and 1987 respectively. The non-randomized studies concerning the drugs were characterized as "glowing", and their sales increased to a combined total of approximately 165,000 prescriptions per month in early 1989. In that year, however, a preliminary report of an RCT concluded that the two drugs increased mortality. Sales of the drugs then decreased.
* Prior to 2002, based on observational studies, it was routine for physicians to prescribe hormone replacement therapy for post-menopausal women to prevent myocardial infarction. In 2002 and 2004, however, published RCTs from the Women's Health Initiative claimed that women taking hormone replacement therapy with estrogen plus progestin had a higher rate of myocardial infarctions than women on a placebo, and that estrogen-only hormone replacement therapy caused no reduction in the incidence of coronary heart disease. Possible explanations for the discrepancy between the observational studies and the RCTs involved differences in methodology, in the hormone regimens used, and in the populations studied. The use of hormone replacement therapy decreased after publication of the RCTs.
Disadvantages
Many papers discuss the disadvantages of RCTs.
Among the most frequently cited drawbacks are:
Time and costs
RCTs can be expensive; one study found 28 Phase III RCTs funded by the National Institute of Neurological Disorders and Stroke prior to 2000 with a total cost of US$335 million, for a mean cost of US$12 million per RCT. Nevertheless, the return on investment of RCTs may be high, in that the same study projected that the 28 RCTs produced a "net benefit to society at 10-years" of 46 times the cost of the trials program, based on valuing a quality-adjusted life year as equal to the prevailing mean per capita gross domestic product.
An RCT can take several years from its start to publication; its data are therefore unavailable to the medical community for a long time and may be less relevant by the time of publication.
It is costly to maintain RCTs for the years or decades that would be ideal for evaluating some interventions.
Interventions to prevent events that occur only infrequently (e.g., sudden infant death syndrome) and uncommon adverse outcomes (e.g., a rare side effect of a drug) would require RCTs with extremely large sample sizes and may, therefore, best be assessed by observational studies.
Due to the costs of running RCTs, these usually inspect only one variable or very few variables, rarely reflecting the full picture of a complicated medical situation, whereas the case report, for example, can detail many aspects of the patient's medical situation (e.g., patient history, physical examination, diagnosis, psychosocial aspects, follow-up).
Conflict of interest dangers
A 2011 study done to disclose possible conflicts of interest in underlying research studies used for medical meta-analyses reviewed 29 meta-analyses and found that conflicts of interest in the studies underlying the meta-analyses were rarely disclosed. The 29 meta-analyses included 11 from general medicine journals, 15 from specialty medicine journals, and 3 from the Cochrane Database of Systematic Reviews. The 29 meta-analyses reviewed an aggregate of 509 randomized controlled trials (RCTs). Of these, 318 RCTs reported funding sources, with 219 (69%) industry funded. Of the 509 RCTs, 132 reported author conflict of interest disclosures, with 91 studies (69%) disclosing industry financial ties with one or more authors. The information was, however, seldom reflected in the meta-analyses. Only two (7%) reported RCT funding sources and none reported RCT author-industry ties. The authors concluded "without acknowledgment of COI due to industry funding or author industry financial ties from RCTs included in meta-analyses, readers' understanding and appraisal of the evidence from the meta-analysis may be compromised."
Some RCTs are fully or partly funded by the health care industry (e.g., the pharmaceutical industry) as opposed to government, nonprofit, or other sources. A systematic review published in 2003 found four 1986–2002 articles comparing industry-sponsored and nonindustry-sponsored RCTs, and in all the articles there was a correlation of industry sponsorship and positive study outcome.
A 2004 study of 1999–2001 RCTs published in leading medical and surgical journals determined that industry-funded RCTs "are more likely to be associated with statistically significant pro-industry findings."
These results have been mirrored in trials in surgery, where, although industry funding did not affect the rate of trial discontinuation, it was associated with lower odds of publication for completed trials.
One possible reason for the pro-industry results in industry-funded published RCTs is publication bias.
Other authors have cited the differing goals of academic and industry-sponsored research as contributing to the difference. Commercial sponsors may be more focused on performing trials of drugs that have already shown promise in early-stage trials, and on replicating previous positive results to fulfill regulatory requirements for drug approval.
Ethics
If a disruptive innovation in medical technology is developed, it may be difficult to test this ethically in an RCT if it becomes "obvious" that the control subjects have poorer outcomes, either because of earlier testing or within the initial phase of the RCT itself. Ethically it may be necessary to abort the RCT prematurely, and getting ethics approval (and patient agreement) to withhold the innovation from the control group in future RCTs may not be feasible.
Historical control trials (HCT) exploit the data of previous RCTs to reduce the sample size; however, these approaches are controversial in the scientific community and must be handled with care.
In social science
Due to the recent emergence of RCTs in social science, the use of RCTs in social sciences is a contested issue. Some writers from a medical or health background have argued that existing research in a range of social science disciplines lacks rigour, and should be improved by greater use of randomized control trials.
Transport science
Researchers in transport science argue that public spending on programmes such as school travel plans cannot be justified unless their efficacy is demonstrated by randomized controlled trials.
Graham-Rowe and colleagues reviewed 77 evaluations of transport interventions found in the literature, categorising them into 5 "quality levels". They concluded that most of the studies were of low quality and advocated the use of randomized controlled trials wherever possible in future transport research.
Dr. Steve Melia [Melia (2011), ''Randomised Control Trials Offer a Solution to 'low Quality' Transport Research?'', Bristol: University of the West of England] took issue with these conclusions, arguing that claims about the advantages of RCTs in establishing causality and avoiding bias have been exaggerated. He proposed the following eight criteria for the use of RCTs in contexts where interventions must change human behaviour to be effective:
The intervention:
# Has not been applied to all members of a unique group of people (e.g. the population of a whole country, all employees of a unique organisation etc.)
# Is applied in a context or setting similar to that which applies to the control group
# Can be isolated from other activities—and the purpose of the study is to assess this isolated effect
# Has a short timescale between its implementation and maturity of its effects
And the causal mechanisms:
# Are either known to the researchers, or else all possible alternatives can be tested
# Do not involve significant feedback mechanisms between the intervention group and external environments
# Have a stable and predictable relationship to exogenous factors
# Would act in the same way if the control group and intervention group were reversed
Criminology
A 2005 review found 83 randomized experiments in criminology published in 1982–2004, compared with only 35 published in 1957–1981. The authors classified the studies they found into five categories: "policing", "prevention", "corrections", "court", and "community". Focusing only on offending behavior programs, Hollin (2008) argued that RCTs may be difficult to implement (e.g., if an RCT required "passing sentences that would randomly assign offenders to programmes") and therefore that experiments with quasi-experimental design are still necessary.
Education
RCTs have been used to evaluate a number of educational interventions. Between 1980 and 2016, over 1,000 reports of RCTs were published. For example, a 2009 study randomized 260 elementary school teachers' classrooms to receive or not receive a program of behavioral screening, classroom intervention, and parent training, and then measured the behavioral and academic performance of their students. Another 2009 study randomized classrooms for 678 first-grade children to receive a classroom-centered intervention, a parent-centered intervention, or no intervention, and then followed their academic outcomes through age 19.
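Randomizing whole classrooms rather than individual pupils, as in the studies above, can be sketched in a few lines. This is only an illustrative sketch, not the procedure used in either study: the function name `randomize_clusters`, the classroom labels, the arm names, and the seed are all assumptions made for the example.

```python
import random

def randomize_clusters(cluster_ids, arms, seed=None):
    """Assign whole clusters (e.g. classrooms) to arms in near-equal numbers.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)      # a local generator keeps the draw reproducible
    shuffled = list(cluster_ids)
    rng.shuffle(shuffled)          # random order is the only source of allocation
    # Deal the shuffled clusters round-robin so arm sizes differ by at most one.
    return {cluster: arms[i % len(arms)] for i, cluster in enumerate(shuffled)}

# 260 made-up classroom labels, split between two arms.
classrooms = [f"classroom_{i:03d}" for i in range(1, 261)]
assignment = randomize_clusters(classrooms, ["intervention", "control"], seed=42)
```

Because every pupil in a classroom receives the same arm, the classroom, not the pupil, is the unit of randomization. The same function accepts three arm names for a three-arm design.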
Criticism
A 2018 review of the 10 most cited randomised controlled trials noted poor distribution of background traits, difficulties with blinding, and discussed other assumptions and biases inherent in randomised controlled trials. These include the "unique time period assessment bias", the "background traits remain constant assumption", the "average treatment effects limitation", the "simple treatment at the individual level limitation", the "all preconditions are fully met assumption", the "quantitative variable limitation" and the "placebo only or conventional treatment only limitation".
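One common reading of the "average treatment effects limitation" is that an overall average can hide opposite effects in different subgroups. The sketch below illustrates that reading with invented numbers; the records, subgroup labels, and the helper `mean_difference` are assumptions for the example and do not come from the review or from any real trial.

```python
from statistics import mean

# Invented data: (subgroup, arm, outcome). Not from any real trial.
records = [
    ("A", "treatment", 12), ("A", "treatment", 14),
    ("A", "control", 8), ("A", "control", 10),
    ("B", "treatment", 6), ("B", "treatment", 4),
    ("B", "control", 10), ("B", "control", 8),
]

def mean_difference(rows):
    """Difference in mean outcome between the treatment and control arms."""
    treated = [y for _, arm, y in rows if arm == "treatment"]
    control = [y for _, arm, y in rows if arm == "control"]
    return mean(treated) - mean(control)

# Estimated effect over everyone, then separately within each subgroup.
overall = mean_difference(records)
by_subgroup = {
    group: mean_difference([r for r in records if r[0] == group])
    for group in ("A", "B")
}
```

With these numbers the overall difference is zero, while subgroup A shows a difference of +4 and subgroup B a difference of -4, so the average alone would suggest the treatment does nothing.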
See also
* Drug development
* Hypothesis testing
* Impact evaluation
* Jadad scale
* Pipeline planning
* Royal Commission on Animal Magnetism
* Statistical inference
References
External links
* Bland M. Directory of randomisation software and services. University of York, 2008 March 19.
* Evans I, Thornton H, Chalmers I. Testing treatments: better research for better health care. London: Pinter & Martin, 2010.
* Gelband H. The impact of randomized clinical trials on health policy and medical practice: background paper. Washington, DC: U.S. Congress, Office of Technology Assessment, 1983. (Report OTA-BP-H-22.)
* REFLECT (Reporting guidElines For randomized controLled trials for livEstoCk and food safeTy) Statement
* Wathen JK, Cook JD. Power and bias in adaptively randomized clinical trials. M. D. Anderson Cancer Center, University of Texas, 2006 July 12.