The design of experiments (DOE, DOX, or experimental design) is the design of any task that aims to describe and explain the variation of information under conditions that are hypothesized to reflect the variation. The term is generally associated with experiments in which the design introduces conditions that directly affect the variation, but may also refer to the design of quasi-experiments, in which natural conditions that influence the variation are selected for observation.
In its simplest form, an experiment aims at predicting the outcome by introducing a change of the preconditions, which is represented by one or more independent variables, also referred to as "input variables" or "predictor variables." The change in one or more independent variables is generally hypothesized to result in a change in one or more dependent variables, also referred to as "output variables" or "response variables." The experimental design may also identify control variables that must be held constant to prevent external factors from affecting the results. Experimental design involves not only the selection of suitable independent, dependent, and control variables, but also planning the delivery of the experiment under statistically optimal conditions given the constraints of available resources. There are multiple approaches for determining the set of design points (unique combinations of the settings of the independent variables) to be used in the experiment.
Main concerns in experimental design include the establishment of validity, reliability, and replicability. For example, these concerns can be partially addressed by carefully choosing the independent variable, reducing the risk of measurement error, and ensuring that the documentation of the method is sufficiently detailed. Related concerns include achieving appropriate levels of statistical power and sensitivity.
Correctly designed experiments advance knowledge in the natural and social sciences and engineering. Other applications include marketing and policy making. The study of the design of experiments is an important topic in metascience.
History
Statistical experiments, following Charles S. Peirce
A theory of statistical inference was developed by Charles S. Peirce in "Illustrations of the Logic of Science" (1877–1878) and "A Theory of Probable Inference" (1883), two publications that emphasized the importance of randomization-based inference in statistics.
Randomized experiments
Charles S. Peirce randomly assigned volunteers to a blinded, repeated-measures design to evaluate their ability to discriminate weights. Peirce's experiment inspired other researchers in psychology and education, who developed a research tradition of randomized experiments in laboratories and specialized textbooks in the 1800s.
Optimal designs for regression models
Charles S. Peirce also contributed the first English-language publication on an optimal design for regression models in 1876. A pioneering optimal design for polynomial regression was suggested by Gergonne in 1815. In 1918, Kirstine Smith published optimal designs for polynomials of degree six (and less).
Sequences of experiments
The use of a sequence of experiments, where the design of each may depend on the results of previous experiments, including the possible decision to stop experimenting, is within the scope of sequential analysis, a field that was pioneered by Abraham Wald in the context of sequential tests of statistical hypotheses. Herman Chernoff wrote an overview of optimal sequential designs, while adaptive designs have been surveyed by S. Zacks. One specific type of sequential design is the "two-armed bandit", generalized to the multi-armed bandit, on which early work was done by Herbert Robbins in 1952.
Fisher's principles
A methodology for designing experiments was proposed by Ronald Fisher, in his innovative books: ''The Arrangement of Field Experiments'' (1926) and ''The Design of Experiments'' (1935). Much of his pioneering work dealt with agricultural applications of statistical methods. As a mundane example, he described how to test the lady tasting tea hypothesis, that a certain lady could distinguish by flavour alone whether the milk or the tea was first placed in the cup. These methods have been broadly adapted in biological, psychological, and agricultural research.
[ Miller, Geoffrey (2000). ''The Mating Mind: how sexual choice shaped the evolution of human nature'', London: Heineman, (also Doubleday, ) "To biologists, he was an architect of the 'modern synthesis' that used mathematical models to integrate Mendelian genetics with Darwin's selection theories. To psychologists, Fisher was the inventor of various statistical tests that are still supposed to be used whenever possible in psychology journals. To farmers, Fisher was the founder of experimental agricultural research, saving millions from starvation through rational crop breeding programs." p.54.]
;Comparison
:In some fields of study it is not possible to have independent measurements to a traceable metrology standard. Comparisons between treatments are much more valuable and are usually preferable, and treatments are often compared against a scientific control or a traditional treatment that acts as a baseline.
;Randomization
:Random assignment is the process of assigning individuals at random to groups, or to different conditions within a group, so that each individual has the same chance of being placed in any given group. The random assignment of individuals to groups (or conditions within a group) distinguishes a rigorous, "true" experiment from an observational study or "quasi-experiment". There is an extensive body of mathematical theory that explores the consequences of making the allocation of units to treatments by means of some random mechanism (such as tables of random numbers, or the use of randomization devices such as playing cards or dice). Assigning units to treatments at random tends to mitigate confounding, which makes effects due to factors other than the treatment appear to result from the treatment.
:The risks associated with random allocation (such as having a serious imbalance in a key characteristic between a treatment group and a control group) are calculable and hence can be managed down to an acceptable level by using enough experimental units. However, if the population is divided into several subpopulations that somehow differ, and the research requires each subpopulation to be equal in size, stratified sampling can be used. In that way, the units in each subpopulation are randomized, but not the whole sample. The results of an experiment can be generalized reliably from the experimental units to a larger statistical population of units only if the experimental units are a random sample from the larger population; the probable error of such an extrapolation depends on the sample size, among other things.
;Statistical replication
:Measurements are usually subject to variation and measurement uncertainty; thus they are repeated and full experiments are replicated to help identify the sources of variation, to better estimate the true effects of treatments, to further strengthen the experiment's reliability and validity, and to add to the existing knowledge of the topic. However, certain conditions must be met before the replication of the experiment is commenced: the original research question has been published in a peer-reviewed journal or widely cited, the researcher is independent of the original experiment, the researcher must first try to replicate the original findings using the original data, and the write-up should state that the study conducted is a replication study that tried to follow the original study as strictly as possible.
;Blocking
:Blocking is the non-random arrangement of experimental units into groups (blocks) consisting of units that are similar to one another. Blocking reduces known but irrelevant sources of variation between units and thus allows greater precision in the estimation of the source of variation under study.
;Orthogonality
:Orthogonality concerns the forms of comparison (contrasts) that can be legitimately and efficiently carried out. Contrasts can be represented by vectors, and sets of orthogonal contrasts are uncorrelated and independently distributed if the data are normal. Because of this independence, each orthogonal contrast provides different information from the others. If there are ''T'' treatments and ''T'' – 1 orthogonal contrasts, all the information that can be captured from the experiment is obtainable from the set of contrasts.
;Factorial experiments
:Factorial experiments are used instead of the one-factor-at-a-time method. They are efficient at evaluating the effects and possible interactions of several factors (independent variables). Analysis of experiment design is built on the foundation of the analysis of variance, a collection of models that partition the observed variance into components, according to what factors the experiment must estimate or test.
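Several of the principles above can be combined in one small plan. The following Python sketch is only an illustration under stated assumptions: the factor names `A` and `B`, the -1/+1 level coding, the number of blocks, and the helper names are all hypothetical. It builds a full factorial set of treatments, replicates it once per block, randomizes the run order within each block, and checks that the main-effect and interaction contrasts are mutually orthogonal (which presumes equal replication of every treatment).

```python
import random
from itertools import combinations, product


def full_factorial(factors):
    """Return every combination of factor levels, one dict per treatment."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


def randomized_block_plan(treatments, n_blocks, seed=None):
    """Replicate each treatment once per block and shuffle the run order
    independently inside every block."""
    rng = random.Random(seed)
    plan = []
    for block in range(1, n_blocks + 1):
        order = list(treatments)
        rng.shuffle(order)
        plan.extend((block, treatment) for treatment in order)
    return plan


def orthogonal(contrasts):
    """True if every pair of contrast vectors has a zero dot product
    (assumes each treatment is replicated equally often)."""
    return all(
        sum(a * b for a, b in zip(u, v)) == 0
        for u, v in combinations(contrasts, 2)
    )


# Hypothetical two-level factors, coded -1 / +1.
factors = {"A": [-1, 1], "B": [-1, 1]}
treatments = full_factorial(factors)  # 4 treatment combinations
plan = randomized_block_plan(treatments, n_blocks=3, seed=7)

# T = 4 treatments give T - 1 = 3 contrasts: A, B, and the A x B interaction.
contrast_a = [t["A"] for t in treatments]
contrast_b = [t["B"] for t in treatments]
contrast_ab = [t["A"] * t["B"] for t in treatments]
contrasts_ok = orthogonal([contrast_a, contrast_b, contrast_ab])

for block, treatment in plan:
    print(block, treatment)
```

In this sketch a block is only a label; in practice the units placed in the same block would be chosen because they are similar with respect to a known nuisance source of variation, such as the same field or the same day.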
Example
This example of the design of experiments is attributed to Harold Hotelling, building on examples from Frank Yates (Herman Chernoff, ''Sequential Analysis and Optimal Design'', SIAM Monograph, 1972). The experiments designed in this example involve combinatorial designs.
Weights of eight objects are measured using a pan balance and a set of standard weights. Each weighing measures the weight difference between objects in the left pan and any objects in the right pan by adding calibrated weights to the lighter pan until the balance is in equilibrium. Each measurement has a random error. The average error is zero; the standard deviation of the probability distribution of the errors is the same number ''σ'' on different weighings; errors on different weighings are independent. Denote the true weights by
:<math>\theta_1, \dots, \theta_8.</math>
We consider two different experiments:
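# Weigh each object in one pan, with the other pan empty. Let ''X''<sub>''i''</sub> be the measured weight of the object, for ''i'' = 1, ..., 8.
# Do the eight weighings according to the following schedule (a weighing matrix), and let ''Y''<sub>''i''</sub> be the measured difference for ''i'' = 1, ..., 8:
::<math>\begin{array}{lcc}
& \text{left pan} & \text{right pan} \\
\hline
\text{1st weighing:} & 1\ 2\ 3\ 4\ 5\ 6\ 7\ 8 & \text{(empty)} \\
\text{2nd:} & 1\ 2\ 3\ 8 & 4\ 5\ 6\ 7 \\
\text{3rd:} & 1\ 4\ 5\ 8 & 2\ 3\ 6\ 7 \\
\text{4th:} & 1\ 6\ 7\ 8 & 2\ 3\ 4\ 5 \\
\text{5th:} & 2\ 4\ 6\ 8 & 1\ 3\ 5\ 7 \\
\text{6th:} & 2\ 5\ 7\ 8 & 1\ 3\ 4\ 6 \\
\text{7th:} & 3\ 4\ 7\ 8 & 1\ 2\ 5\ 6 \\
\text{8th:} & 3\ 5\ 6\ 8 & 1\ 2\ 4\ 7
\end{array}</math>
: Then the estimated value of the weight ''θ''<sub>1</sub> is
::<math>\widehat{\theta}_1 = \frac{Y_1 + Y_2 + Y_3 + Y_4 - Y_5 - Y_6 - Y_7 - Y_8}{8}.</math>
:Similar estimates can be found for the weights of the other items. For example
::<math>\widehat{\theta}_2 = \frac{Y_1 + Y_2 - Y_3 - Y_4 + Y_5 + Y_6 - Y_7 - Y_8}{8}.</math>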
The question of design of experiments is: which experiment is better?
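The variance of the estimate ''X''<sub>1</sub> of ''θ''<sub>1</sub> is ''σ''<sup>2</sup> if we use the first experiment. But if we use the second experiment, the variance of the estimate given above is ''σ''<sup>2</sup>/8. Thus the second experiment gives us 8 times as much precision for the estimate of a single item, and estimates all items simultaneously, with the same precision. What the second experiment achieves with eight weighings would require 64 weighings if the items were weighed separately. Note, however, that every estimate in the second experiment depends on all eight weighings, so a gross error in any single weighing affects all of the estimates; under the stated assumptions (independent errors with a common standard deviation) and a schedule whose columns are mutually orthogonal, the estimation errors of different items are nonetheless uncorrelated.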
Many problems of the design of experiments involve combinatorial designs, as in this example and others.
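The variance comparison in this example can be checked with exact arithmetic. The sketch below is a hedged illustration, not the schedule from the original example: it assumes a Sylvester-type 8 × 8 Hadamard matrix as the weighing schedule (+1 meaning the object is in the left pan, -1 the right pan), and the names `sylvester_hadamard`, `coef`, and `variance_factor` are hypothetical. It verifies that each estimate is unbiased and that its variance is ''σ''<sup>2</sup>/8 when the errors are independent with a common variance.

```python
from fractions import Fraction


def sylvester_hadamard(k):
    """Build a 2^k x 2^k matrix of +1/-1 entries with orthogonal columns."""
    h = [[1]]
    for _ in range(k):
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


# Assumed schedule: H[i][j] = +1 if object j is in the left pan in
# weighing i, and -1 if it is in the right pan.
H = sylvester_hadamard(3)
n = len(H)  # 8 weighings, 8 objects

# The estimate of weight j is a signed average of the measured differences:
# theta_hat_j = sum_i coef[j][i] * Y_i, with coef[j][i] = H[i][j] / 8.
coef = [[Fraction(H[i][j], n) for i in range(n)] for j in range(n)]


def variance_factor(j):
    """Var(theta_hat_j) / sigma^2 for independent, equal-variance errors."""
    return sum(c * c for c in coef[j])


# Unbiasedness: the coefficient of theta_k in E[theta_hat_j] is 1 if j == k
# and 0 otherwise.
unbiased = all(
    sum(coef[j][i] * H[i][k] for i in range(n)) == (1 if j == k else 0)
    for j in range(n)
    for k in range(n)
)

variance_factors = [variance_factor(j) for j in range(n)]
print(variance_factors[0])  # 1/8, i.e. sigma^2 / 8
print(unbiased)
```

Weighing each object once on its own corresponds to a variance factor of 1, so the factor of 1/8 obtained here matches the eightfold gain in precision described above.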
Avoiding false positives
False positive conclusions, often resulting from the pressure to publish or the author's own confirmation bias, are an inherent hazard in many fields. A good way to prevent biases potentially leading to false positives in the data collection phase is to use a double-blind design. When a double-blind design is used, participants are randomly assigned to experimental groups but the researcher is unaware of which participants belong to which group. Therefore, the researcher cannot affect the participants' response to the intervention.
Experimental designs with undisclosed degrees of freedom are a problem. They can lead to conscious or unconscious "p-hacking": trying multiple things until the desired result is obtained. It typically involves the manipulation, perhaps unconscious, of the process of statistical analysis and the degrees of freedom until they return a figure below the p < .05 level of statistical significance. The design of the experiment should therefore include a clear statement proposing the analyses to be undertaken. P-hacking can be prevented by preregistering studies, in which researchers send their data analysis plan to the journal they wish to publish in before they start data collection, so that the analysis cannot be adjusted after the data have been seen (https://osf.io). Another way to prevent it is to take the double-blind design to the data-analysis phase: the data are sent to a data analyst unrelated to the research, who scrambles the data so that there is no way to know which group participants belong to before any of them are removed as outliers.
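The cost of "trying multiple things" can be made concrete with a little arithmetic. The following Python sketch assumes, as a simplification, that the analyses tried are independent tests of true null hypotheses, each at the same significance level; the function name `familywise_error_rate` is hypothetical. It computes the probability that at least one of the tests falls below the threshold purely by chance.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among n_tests independent
    tests, each run at level alpha, when every null hypothesis is true."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return 1 - (1 - alpha) ** n_tests


for n_tests in (1, 5, 10, 20):
    rate = familywise_error_rate(0.05, n_tests)
    print(n_tests, round(rate, 3))
```

Under these assumptions a single test keeps the nominal 5% rate, while twenty undisclosed attempts raise the chance of at least one spurious "significant" result to roughly 64%. Real analyses are rarely independent, so the figure is indicative only.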
Clear and complete documentation of the experimental methodology is also important in order to support replication of results.
Discussion topics when setting up an experimental design
An experimental design or randomized clinical trial requires careful consideration of several factors before actually doing the experiment. An experimental design is the laying out of a detailed experimental plan in advance of doing the experiment. Some of the following topics have already been discussed in the principles of experimental design section:
# How many factors does the design have, and are the levels of these factors fixed or random?
# Are control conditions needed, and what should they be?
# Manipulation checks: did the manipulation really work?
# What are the background variables?
# What is the sample size? How many units must be collected for the experiment to be generalisable and have enough power?
# What is the relevance of interactions between factors?
# What is the influence of delayed effects of substantive factors on outcomes?
# How do response shifts affect self-report measures?
# How feasible is repeated administration of the same measurement instruments to the same units at different occasions, with a post-test and follow-up tests?
# What about using a proxy pretest?
# Are there lurking variables?
# Should the client/patient, researcher or even the analyst of the data be blind to conditions?
# What is the feasibility of subsequent application of different conditions to the same units?
# How many of each control and noise factors should be taken into account?
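The sample-size question in the list above can be approached with a standard normal-approximation formula for comparing two group means. The sketch below is a rough planning aid under stated assumptions: a two-sided test, equal group sizes, equal variances, and an effect expressed as a standardized difference; the function name `units_per_group` and the example values are hypothetical.

```python
import math
from statistics import NormalDist


def units_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate number of units needed in each of two groups.

    effect_size is the difference in means divided by the common standard
    deviation. Uses n = 2 * ((z_{1-alpha/2} + z_{power}) / effect_size)^2.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# A medium standardized effect needs far fewer units than a small one.
medium = units_per_group(0.5)
small = units_per_group(0.2)
print(medium, small)
```

Because the normal approximation ignores the extra uncertainty from estimating the variance, an exact calculation based on the ''t'' distribution gives slightly larger numbers.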
The independent variable of a study often has many levels or different groups. In a true experiment, researchers can have an experimental group, in which the intervention testing the hypothesis is implemented, and a control group, which has all the same elements as the experimental group except the intervention. Thus, when everything except the one intervention is held constant, researchers can conclude with some confidence that this one element is what caused the observed change. In some instances, having a control group is not ethical. This is sometimes solved by using two different experimental groups. In some cases, independent variables cannot be manipulated, for example when testing the difference between two groups who have different diseases, or testing the difference between genders (variables to which it would be hard or unethical to assign participants). In these cases, a quasi-experimental design may be used.
Causal attributions
In the pure experimental design, the independent (predictor) variable is manipulated by the researcher: every participant in the research is chosen randomly from the population, and each participant chosen is assigned randomly to conditions of the independent variable. Only when this is done is it possible to conclude with high probability that the differences in the outcome variables are caused by the different conditions. Therefore, researchers should choose the experimental design over other design types whenever possible. However, the nature of the independent variable does not always allow for manipulation. In those cases, researchers must take care not to make causal attributions when their design does not allow for it. For example, in observational designs, participants are not assigned randomly to conditions, and so if differences in outcome variables are found between conditions, it may be that something other than the differences between the conditions, that is, a third variable, causes the differences in outcomes. The same goes for studies with a correlational design (Adér & Mellenbergh, 2008).
Statistical control
It is best that a process be in reasonable statistical control prior to conducting designed experiments. When this is not possible, proper blocking, replication, and randomization allow for the careful conduct of designed experiments.
To control for nuisance variables, researchers institute control checks as additional measures. Investigators should ensure that uncontrolled influences (e.g., source credibility perception) do not skew the findings of the study. A manipulation check is one example of a control check. Manipulation checks allow investigators to isolate the chief variables to strengthen support that these variables are operating as planned.
One of the most important requirements of experimental research designs is the necessity of eliminating the effects of spurious, intervening, and antecedent variables. In the most basic model, cause (X) leads to effect (Y). But there could be a third variable (Z) that influences Y, and X might not be the true cause at all. Z is said to be a spurious variable and must be controlled for. The same is true for intervening variables (a variable in between the supposed cause (X) and the effect (Y)), and antecedent variables (a variable prior to the supposed cause (X) that is the true cause). When a third variable is involved and has not been controlled for, the relation is said to be a zero-order relationship. In most practical applications of experimental research designs there are several causes (X1, X2, X3). In most designs, only one of these causes is manipulated at a time.
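The effect of an uncontrolled third variable can be illustrated with a small simulation. The sketch below is only an illustration under assumed conditions: Z drives both X and Y with unit-variance Gaussian noise, X has no direct effect on Y, and the helper names `pearson` and `partial_correlation` are hypothetical. It compares the zero-order correlation of X and Y with their first-order partial correlation once Z is controlled for.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_correlation(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(42)  # fixed seed only to make the sketch reproducible
n = 5000

# Assumed data-generating process: Z influences both X and Y; X does not influence Y.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

zero_order = pearson(x, y)                  # Z not controlled for
first_order = partial_correlation(x, y, z)  # Z controlled for
```

Under these assumptions the zero-order correlation is close to its theoretical value of 0.5, while the partial correlation is close to zero, which is the sense in which the apparent X–Y relation is spurious.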
Experimental designs after Fisher
Some efficient designs for estimating several main effects were found independently and in near succession by Raj Chandra Bose and K. Kishen in 1940 at the Indian Statistical Institute, but remained little known until the Plackett–Burman designs were published in ''Biometrika'' in 1946. About the same time, C. R. Rao introduced the concept of orthogonal arrays as experimental designs. This concept played a central role in the development of Taguchi methods by Genichi Taguchi, which took place during his visit to the Indian Statistical Institute in the early 1950s. His methods were successfully applied and adopted by Japanese and Indian industries and subsequently were also embraced by US industry, albeit with some reservations.
In 1950, Gertrude Mary Cox and William Gemmell Cochran published the book ''Experimental Designs'', which became the major reference work on the design of experiments for statisticians for years afterwards.
Developments of the theory of linear models have encompassed and surpassed the cases that concerned early writers. Today, the theory rests on advanced topics in linear algebra, algebra and combinatorics.
As with other branches of statistics, experimental design is pursued using both frequentist and Bayesian approaches: in evaluating statistical procedures like experimental designs, frequentist statistics studies the sampling distribution while Bayesian statistics updates a probability distribution on the parameter space.
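The two viewpoints can be contrasted on a toy problem. The sketch below is a hedged illustration, not a method taken from the text: it assumes a binomial experiment, a true success probability of 0.5 for the frequentist calculation, and a uniform Beta(1, 1) prior for the Bayesian one, and all function names are hypothetical. The frequentist part evaluates a design (here, the number of runs) through the sampling distribution of the estimator; the Bayesian part updates a distribution over the parameter after data are observed.

```python
import math


def sampling_distribution(n, p):
    """Exact sampling distribution of the sample proportion k/n for a fixed true p."""
    return {k / n: math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)}


def standard_error(dist):
    """Standard deviation of an estimator given its sampling distribution."""
    mean = sum(value * prob for value, prob in dist.items())
    var = sum((value - mean) ** 2 * prob for value, prob in dist.items())
    return math.sqrt(var)


def beta_update(alpha, beta, successes, failures):
    """Conjugate update of a Beta(alpha, beta) prior after binomial data."""
    return alpha + successes, beta + failures


# Frequentist view: compare two candidate designs by the estimator's standard error.
se_10 = standard_error(sampling_distribution(10, 0.5))
se_40 = standard_error(sampling_distribution(40, 0.5))

# Bayesian view: update an assumed uniform prior after 7 successes in 10 runs.
posterior = beta_update(1, 1, successes=7, failures=3)
posterior_mean = posterior[0] / sum(posterior)
```

In this sketch, quadrupling the number of runs halves the standard error, while the Bayesian update turns the Beta(1, 1) prior into a Beta(8, 4) posterior on the parameter.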
Some important contributors to the field of experimental designs are C. S. Peirce, R. A. Fisher, F. Yates, R. C. Bose, A. C. Atkinson, R. A. Bailey, D. R. Cox, G. E. P. Box, W. G. Cochran, W. T. Federer, V. V. Fedorov, A. S. Hedayat, J. Kiefer, O. Kempthorne, J. A. Nelder, Andrej Pázman, Friedrich Pukelsheim, D. Raghavarao, C. R. Rao, S. S. Shrikhande, J. N. Srivastava, William J. Studden, G. Taguchi and H. P. Wynn.
The textbooks of D. Montgomery, R. Myers, and G. Box/W. Hunter/J.S. Hunter have reached generations of students and practitioners.
Some discussion of experimental design in the context of
system identification (model building for static or dynamic models) is given in and.
Human participant constraints
Laws and ethical considerations preclude some carefully designed
experiments with human subjects. Legal constraints are dependent on
jurisdiction. Constraints may involve institutional review boards,
informed consent and confidentiality, affecting both clinical (medical)
trials and behavioral and social science experiments.
In the field of toxicology, for example, experimentation is performed
on laboratory ''animals'' with the goal of defining safe exposure limits
for ''humans''. Balancing
the constraints are views from the medical field.
Regarding the randomization of patients,
"... if no one knows which therapy is better, there is no ethical
imperative to use one therapy or another." (p 380) Regarding
experimental design, "...it is clearly not ethical to place subjects
at risk to collect data in a poorly designed study when this situation
can be easily avoided...". (p 393)
See also
* Adversarial collaboration
* Bayesian experimental design
* Block design
* Box–Behnken design
* Central composite design
* Clinical trial
* Clinical study design
* Computer experiment
* Control variable
* Controlling for a variable
* Experimetrics (econometrics-related experiments)
* Factor analysis
* Fractional factorial design
* Glossary of experimental design
* Grey box model
* Industrial engineering
* Instrument effect
* Law of large numbers
* Manipulation checks
* Multifactor design of experiments software
* One-factor-at-a-time method
* Optimal design
* Plackett–Burman design
* Probabilistic design
* Protocol (natural sciences)
* Quasi-experimental design
* Randomized block design
* Randomized controlled trial
* Research design
* Robust parameter design
* Sample size determination
* Supersaturated design
* Royal Commission on Animal Magnetism
* Survey sampling
* System identification
* Taguchi methods
References
Sources
* Peirce, C. S. (1877–1878), "Illustrations of the Logic of Science" (series), ''Popular Science Monthly'', vols. 12–13. Relevant individual papers:
** (1878 March), "The Doctrine of Chances", ''Popular Science Monthly'', v. 12, March issue, pp. 604–615. ''Internet Archive'' Eprint.
** (1878 April), "The Probability of Induction", ''Popular Science Monthly'', v. 12, pp. 705–718. ''Internet Archive'' Eprint.
** (1878 June), "The Order of Nature", ''Popular Science Monthly'', v. 13, pp. 203–217. ''Internet Archive'' Eprint.
** (1878 August), "Deduction, Induction, and Hypothesis", ''Popular Science Monthly'', v. 13, pp. 470–482. ''Internet Archive'' Eprint.
** (1883), "A Theory of Probable Inference", ''Studies in Logic'', pp. 126–181, Little, Brown, and Company. (Reprinted 1983, John Benjamins Publishing Company.)
External links
* from "NIST/SEMATECH Handbook on Engineering Statistics" at NIST
* Box–Behnken designs from "NIST/SEMATECH Handbook on Engineering Statistics" at NIST
* Detailed mathematical developments of most common DoE in the Opera Magistris v3.6 online reference, Chapter 15, section 7.4.