science Science is a systematic endeavor that builds and organizes knowledge in the form of testable explanations and predictions about the universe. Science may be as old as the human species, and some of the earliest archeological evidence for ...

, randomized experiments are the

experiment An experiment is a procedure carried out to support or refute a hypothesis, or determine the efficacy or likelihood of something previously untried. Experiments provide insight into Causality, cause-and-effect by demonstrating what outcome oc ...

s that allow the greatest reliability and validity of statistical estimates of treatment effects. Randomization-based inference is especially important in

experimental design The design of experiments (DOE, DOX, or experimental design) is the design of any task that aims to describe and explain the variation of information under conditions that are hypothesized to reflect the variation. The term is generally associ ...

and in

survey sampling In statistics, survey sampling describes the process of selecting a sample of elements from a target population to conduct a survey. The term " survey" may refer to many different types or techniques of observation. In survey sampling it most ofte ...

Overview

In the statistical theory of

design of experiments The design of experiments (DOE, DOX, or experimental design) is the design of any task that aims to describe and explain the variation of information under conditions that are hypothesized to reflect the variation. The term is generally associ ...

, randomization involves randomly allocating the experimental units across the treatment groups. For example, if an experiment compares a new drug against a standard drug, then the patients should be allocated to either the new drug or to the standard drug control using randomization. Randomized experimentation is ''not'' haphazard. Randomization reduces

bias Bias is a disproportionate weight ''in favor of'' or ''against'' an idea or thing, usually in a way that is closed-minded, prejudicial, or unfair. Biases can be innate or learned. People may develop biases for or against an individual, a group, ...

by equalising other factors that have not been explicitly accounted for in the experimental design (according to the law of large numbers). Randomization also produces ignorable designs, which are valuable in

model A model is an informative representation of an object, person or system. The term originally denoted the plans of a building in late 16th-century English, and derived via French and Italian ultimately from Latin ''modulus'', a measure. Models c ...

-based

statistical inference Statistical inference is the process of using data analysis to infer properties of an underlying probability distribution, distribution of probability.Upton, G., Cook, I. (2008) ''Oxford Dictionary of Statistics'', OUP. . Inferential statistical ...

, especially

Bayesian Thomas Bayes (/beɪz/; c. 1701 – 1761) was an English statistician, philosopher, and Presbyterian minister. Bayesian () refers either to a range of concepts and approaches that relate to statistical methods based on Bayes' theorem, or a followe ...

likelihood The likelihood function (often simply called the likelihood) represents the probability of random variable realizations conditional on particular values of the statistical parameters. Thus, when evaluated on a given sample, the likelihood funct ...

-based. In the design of experiments, the simplest design for comparing treatments is the "completely randomized design". Some "restriction on randomization" can occur with blocking and experiments that have hard-to-change factors; additional restrictions on randomization can occur when a full randomization is infeasible or when it is desirable to reduce the

variance In probability theory and statistics, variance is the expectation of the squared deviation of a random variable from its population mean or sample mean. Variance is a measure of dispersion, meaning it is a measure of how far a set of numbers ...

of estimators of selected effects. Randomization of treatment in

clinical trials Clinical trials are prospective biomedical or behavioral research studies on human participants designed to answer specific questions about biomedical or behavioral interventions, including new treatments (such as novel vaccines, drugs, dietar ...

pose ethical problems. In some cases, randomization reduces the therapeutic options for both physician and patient, and so randomization requires

clinical equipoise Clinical equipoise, also known as the principle of equipoise, provides the ethical basis for medical research that involves assigning patients to different treatment arms of a clinical trial. The term was first used by Benjamin Freedman in 1987, al ...

regarding the treatments.

Online randomized controlled experiments

Web sites can run randomized controlled experiments to create a feedback loop. Key differences between offline experimentation and online experiments include: * Logging: user interactions can be logged reliably. * Number of users: large sites, such as Amazon, Bing/Microsoft, and Google run experiments, each with over a million users. * Number of concurrent experiments: large sites run tens of overlapping, or concurrent, experiments. * Robots, whether

web crawlers A Web crawler, sometimes called a spider or spiderbot and often shortened to crawler, is an Internet bot that systematically browses the World Wide Web and that is typically operated by search engines for the purpose of Web indexing (''web spid ...

from valid sources or malicious

internet bots An Internet bot, web robot, robot or simply bot, is a software application that runs automated tasks (scripts) over the Internet, usually with the intent to imitate human activity on the Internet, such as messaging, on a large scale. An Internet b ...

. * Ability to ramp-up experiments from low percentages to higher percentages. * Speed / performance has significant impact on key metrics. * Ability to use the pre-experiment period as an A/A test to reduce variance.

History

A controlled experiment appears to have been suggested in the Old Testament's Book of Daniel. King Nebuchadnezzar proposed that some Israelites eat "a daily amount of food and wine from the king's table." Daniel preferred a vegetarian diet, but the official was concerned that the king would "see you looking worse than the other young men your age? The king would then have my head because of you." Daniel then proposed the following controlled experiment: "Test your servants for ten days. Give us nothing but vegetables to eat and water to drink. Then compare our appearance with that of the young men who eat the royal food, and treat your servants in accordance with what you see". (Daniel 1, 12– 13). Randomized experiments were institutionalized in psychology and education in the late eighteen-hundreds, following the invention of randomized experiments by

C. S. Peirce Charles Sanders Peirce ( ; September 10, 1839 – April 19, 1914) was an American philosopher, logician, mathematician and scientist who is sometimes known as "the father of pragmatism". Educated as a chemist and employed as a scientist for t ...

. Outside of psychology and education, randomized experiments were popularized by

R.A. Fisher Sir Ronald Aylmer Fisher (17 February 1890 – 29 July 1962) was a British polymath who was active as a mathematician, statistician, biologist, geneticist, and academic. For his work in statistics, he has been described as "a genius who ...

in his book '' Statistical Methods for Research Workers'', which also introduced additional principles of experimental design.

Statistical interpretation

The

Rubin Causal Model The Rubin causal model (RCM), also known as the Neyman–Rubin causal model, is an approach to the statistical analysis of cause and effect based on the framework of potential outcomes, named after Donald Rubin. The name "Rubin causal model" was ...

provides a common way to describe a randomized experiment. While the Rubin Causal Model provides a framework for defining the causal parameters (i.e., the effects of a randomized treatment on an outcome), the analysis of experiments can take a number of forms. Most commonly, randomized experiments are analyzed using

ANOVA Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation procedures (such as the "variation" among and between groups) used to analyze the differences among means. ANOVA was developed by the statistician ...

student's t-test A ''t''-test is any statistical hypothesis test in which the test statistic follows a Student's ''t''-distribution under the null hypothesis. It is most commonly applied when the test statistic would follow a normal distribution if the value of ...

regression analysis In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one ...

, or a similar

statistical test A statistical hypothesis test is a method of statistical inference used to decide whether the data at hand sufficiently support a particular hypothesis. Hypothesis testing allows us to make probabilistic statements about population parameters. ...

Empirical evidence that randomization makes a difference

Empirically differences between randomized and non-randomized studies, and between adequately and inadequately randomized trials have been difficult to detect.

References