In the design of experiments, optimal designs (or optimum designs) are a class of experimental designs that are optimal with respect to some statistical criterion. The creation of this field of statistics has been credited to the Danish statistician Kirstine Smith.

In the design of experiments for estimating statistical models, optimal designs allow parameters to be estimated without bias and with minimum variance. A non-optimal design requires a greater number of experimental runs to estimate the parameters with the same precision as an optimal design. In practical terms, optimal experiments can reduce the costs of experimentation.

The optimality of a design depends on the statistical model and is assessed with respect to a statistical criterion, which is related to the variance-matrix of the estimator. Specifying an appropriate model and specifying a suitable criterion function both require understanding of statistical theory and practical knowledge of designing experiments.


Advantages

Optimal designs offer three advantages over sub-optimal experimental designs:

1. Optimal designs reduce the costs of experimentation by allowing statistical models to be estimated with fewer experimental runs.
2. Optimal designs can accommodate multiple types of factors, such as process, mixture, and discrete factors.
3. Designs can be optimized when the design-space is constrained, for example, when the mathematical process-space contains factor-settings that are practically infeasible (e.g. due to safety concerns).


Minimizing the variance of estimators

Experimental designs are evaluated using statistical criteria. It is known that the least squares estimator minimizes the variance of mean-unbiased estimators (under the conditions of the Gauss–Markov theorem). In the estimation theory for statistical models with one real parameter, the reciprocal of the variance of an ("efficient") estimator is called the "Fisher information" for that estimator. Because of this reciprocity, ''minimizing'' the ''variance'' corresponds to ''maximizing'' the ''information''.

When the statistical model has several parameters, however, the mean of the parameter-estimator is a vector and its variance is a matrix. The inverse of the variance-matrix is called the "information matrix". Because the variance of the estimator of a parameter vector is a matrix, the problem of "minimizing the variance" is complicated. Using statistical theory, statisticians compress the information-matrix using real-valued summary statistics; being real-valued functions, these "information criteria" can be maximized. The traditional optimality-criteria are invariants of the information matrix; algebraically, the traditional optimality-criteria are functionals of the eigenvalues of the information matrix (a computational sketch of several criteria follows the lists below).

*A-optimality ("average" or trace)
**One criterion is A-optimality, which seeks to minimize the trace of the inverse of the information matrix. This criterion results in minimizing the average variance of the estimates of the regression coefficients.
*C-optimality
**This criterion minimizes the variance of a best linear unbiased estimator of a predetermined linear combination of model parameters.
*D-optimality (determinant)
**A popular criterion is D-optimality, which seeks to minimize |(X'X)⁻¹|, or equivalently maximize the determinant of the information matrix X'X of the design. This criterion results in maximizing the differential Shannon information content of the parameter estimates.
*E-optimality (eigenvalue)
**Another design is E-optimality, which maximizes the minimum eigenvalue of the information matrix.
*S-optimality
**This criterion maximizes a quantity measuring the mutual column orthogonality of X and the determinant of the information matrix.
*T-optimality
**This criterion maximizes the discrepancy between two proposed models at the design locations.

Other optimality-criteria are concerned with the variance of predictions:

*G-optimality
**A popular criterion is G-optimality, which seeks to minimize the maximum entry in the diagonal of the hat matrix X(X'X)⁻¹X'. This has the effect of minimizing the maximum variance of the predicted values.
*I-optimality (integrated)
**A second criterion on prediction variance is I-optimality, which seeks to minimize the average prediction variance ''over the design space''.
*V-optimality (variance)
**A third criterion on prediction variance is V-optimality, which seeks to minimize the average prediction variance over a set of m specific points.
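As a minimal illustration of how such criteria are computed, the sketch below (assuming Python with NumPy; the one-factor quadratic model and the replicated three-level design are hypothetical choices) evaluates the A-, D-, E-, G-, and I-criteria from a model matrix X, whose information matrix is X'X.

```python
import numpy as np

# Hypothetical example: one-factor quadratic model y = b0 + b1*x + b2*x^2,
# observed at the design points -1, 0, +1, each replicated twice.
x = np.array([-1.0, -1.0, 0.0, 0.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x, x**2])    # model matrix

M = X.T @ X                  # information matrix X'X
M_inv = np.linalg.inv(M)     # proportional to the variance matrix of the least-squares estimator

# A-optimality: minimize the trace of the inverse information matrix.
A_value = np.trace(M_inv)

# D-optimality: minimize |(X'X)^{-1}|, i.e. maximize det(X'X).
D_value = np.linalg.det(M)

# E-optimality: maximize the smallest eigenvalue of the information matrix.
E_value = np.linalg.eigvalsh(M).min()

# G-optimality: minimize the largest diagonal entry of the hat matrix X(X'X)^{-1}X'.
G_value = np.max(np.diag(X @ M_inv @ X.T))

# I-optimality: minimize the average prediction variance f(x)'(X'X)^{-1}f(x)
# over the design space, here approximated by a grid on [-1, 1].
grid = np.linspace(-1.0, 1.0, 201)
F = np.column_stack([np.ones_like(grid), grid, grid**2])
I_value = np.mean(np.einsum('ij,jk,ik->i', F, M_inv, F))

print(A_value, D_value, E_value, G_value, I_value)
```

Smaller A-, G-, and I-values and larger D- and E-values are preferred, so comparing such numbers across candidate designs is the basic computation behind criterion-based design selection.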


Contrasts

In many applications, the statistician is most concerned with a "parameter of interest" rather than with "nuisance parameters". More generally, statisticians consider linear combinations of parameters, which are estimated via linear combinations of treatment-means in the design of experiments and in the analysis of variance; such linear combinations are called contrasts. Statisticians can use appropriate optimality-criteria for such parameters of interest and for contrasts.
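As a minimal illustration (assuming Python, a one-way layout with independent errors and unit error variance; the contrast and the run allocations are hypothetical), the sketch below shows how the variance of an estimated contrast of treatment means depends on how runs are allocated, which is the quantity a c-optimality criterion for that contrast would minimize.

```python
import numpy as np

def contrast_variance(c, n_per_treatment, sigma2=1.0):
    """Variance of the estimated contrast c' mu-hat when treatment i receives n_i runs:
    Var = sigma^2 * sum_i c_i^2 / n_i (independent errors, means estimated by sample means)."""
    c = np.asarray(c, dtype=float)
    n = np.asarray(n_per_treatment, dtype=float)
    return sigma2 * np.sum(c**2 / n)

# Hypothetical contrast: compare treatment 1 with the average of treatments 2 and 3.
c = [1.0, -0.5, -0.5]

# Two allocations of 12 runs: equal allocation versus more runs on treatment 1.
print(contrast_variance(c, [4, 4, 4]))   # equal allocation
print(contrast_variance(c, [6, 3, 3]))   # allocation favouring the treatment singled out by the contrast
```

Here the unequal allocation, which places more runs on the treatment singled out by the contrast, yields a smaller variance than the equal allocation.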


Implementation

Catalogs of optimal designs occur in books and in software libraries. In addition, major statistical systems like SAS and R have procedures for optimizing a design according to a user's specification. The experimenter must specify a model for the design and an optimality-criterion before the method can compute an optimal design.
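A sketch of this workflow is given below, assuming Python with NumPy rather than SAS or R; the candidate grid, the quadratic model, and the greedy point-exchange search are illustrative choices, not the procedure of any particular package.

```python
import numpy as np

def model_matrix(points):
    """Hypothetical model specified by the experimenter: f(x) = (1, x, x^2)."""
    x = np.asarray(points, dtype=float)
    return np.column_stack([np.ones_like(x), x, x**2])

def log_det_information(points):
    """Optimality criterion specified by the experimenter: log det(X'X) (D-optimality)."""
    X = model_matrix(points)
    sign, logdet = np.linalg.slogdet(X.T @ X)
    return logdet if sign > 0 else -np.inf

def exchange_search(candidates, n_runs, n_iter=50, seed=0):
    """Greedy point-exchange: start from a random n-run design drawn from the candidate
    grid and repeatedly swap a design point for a candidate if log det(X'X) increases."""
    rng = np.random.default_rng(seed)
    design = list(rng.choice(candidates, size=n_runs, replace=True))
    for _ in range(n_iter):
        improved = False
        for i in range(n_runs):
            best = log_det_information(design)
            for cand in candidates:
                trial = design.copy()
                trial[i] = cand
                if log_det_information(trial) > best + 1e-10:
                    design, best, improved = trial, log_det_information(trial), True
        if not improved:
            break
    return sorted(design)

candidates = np.linspace(-1.0, 1.0, 21)        # candidate factor settings
print(exchange_search(candidates, n_runs=6))
```

For this model the search tends to place the six runs at the three levels −1, 0, and +1, the support of the known D-optimal design for quadratic regression on [−1, 1].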


Practical considerations

Some advanced topics in optimal design require more statistical theory and practical knowledge in designing experiments.


Model dependence and robustness

Since the optimality criterion of most optimal designs is based on some function of the information matrix, the 'optimality' of a given design is ''model dependent'': While an optimal design is best for that model, its performance may deteriorate on other models. On other models, an ''optimal'' design can be either better or worse than a non-optimal design. Therefore, it is important to benchmark the performance of designs under alternative models.
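The sketch below (assuming Python with NumPy; the two polynomial models and the two six-run designs are hypothetical) illustrates such a benchmark: a design concentrated on three levels serves a quadratic model well but becomes singular under a cubic model, while a design with four support points remains usable under both.

```python
import numpy as np

def information_det(points, degree):
    """det(X'X) for a polynomial model of the given degree on the given design points."""
    x = np.asarray(points, dtype=float)
    X = np.vander(x, N=degree + 1, increasing=True)   # columns 1, x, x^2, ...
    return np.linalg.det(X.T @ X)

design_3pt = [-1.0, 0.0, 1.0, -1.0, 0.0, 1.0]        # six runs on three levels
design_4pt = [-1.0, -0.447, 0.447, 1.0, -1.0, 1.0]   # six runs on four support points

for degree, name in [(2, "quadratic"), (3, "cubic")]:
    print(name,
          information_det(design_3pt, degree),
          information_det(design_4pt, degree))
# Under the quadratic model the three-level design gives the larger determinant;
# under the cubic model its information matrix is singular (determinant zero up to
# rounding), while the four-point design remains nonsingular.
```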


Choosing an optimality criterion and robustness

The choice of an appropriate optimality criterion requires some thought, and it is useful to benchmark the performance of designs with respect to several optimality criteria. Indeed, there are several classes of designs for which all the traditional optimality-criteria agree, according to the theory of "universal optimality" of Kiefer. The experience of practitioners like Cornell and the "universal optimality" theory of Kiefer suggest that robustness with respect to changes in the ''optimality-criterion'' is much greater than robustness with respect to changes in the ''model''.


Flexible optimality criteria and convex analysis

High-quality statistical software provides a combination of libraries of optimal designs and iterative methods for constructing approximately optimal designs, depending on the model specified and the optimality criterion. Users may use a standard optimality-criterion or may program a custom-made criterion. All of the traditional optimality-criteria are convex (or concave) functions, and therefore optimal designs are amenable to the mathematical theory of convex analysis, and their computation can use specialized methods of convex minimization. The practitioner need not select ''exactly one'' traditional optimality-criterion, but can specify a custom criterion. In particular, the practitioner can specify a convex criterion using the maxima of convex optimality-criteria and nonnegative combinations of optimality criteria (since these operations preserve convex functions). For ''convex'' optimality criteria, the Kiefer–Wolfowitz equivalence theorem allows the practitioner to verify that a given design is globally optimal. The Kiefer–Wolfowitz equivalence theorem is related to the Legendre–Fenchel conjugacy for convex functions. If an optimality-criterion lacks convexity, then finding a global optimum and verifying its optimality often are difficult.
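For D-optimality, the equivalence theorem states that an approximate design ξ (a probability measure on the design space) is optimal exactly when the standardized prediction variance d(x, ξ) = f(x)' M(ξ)⁻¹ f(x) is at most p, the number of parameters, for every x in the design space, with equality at the support points. The sketch below (assuming Python with NumPy; the quadratic model is a hypothetical example) performs this check numerically on a grid.

```python
import numpy as np

def f(x):
    """Regression functions for a hypothetical quadratic model: f(x) = (1, x, x^2)."""
    return np.array([1.0, x, x * x])

def information_matrix(support, weights):
    """M(xi) = sum_i w_i f(x_i) f(x_i)' for an approximate design xi."""
    return sum(w * np.outer(f(x), f(x)) for x, w in zip(support, weights))

# Candidate approximate design: equal weight 1/3 on -1, 0, +1
# (the known D-optimal design for quadratic regression on [-1, 1]).
support, weights = [-1.0, 0.0, 1.0], [1/3, 1/3, 1/3]
M_inv = np.linalg.inv(information_matrix(support, weights))
p = 3   # number of model parameters

# Kiefer-Wolfowitz check: d(x, xi) = f(x)' M(xi)^{-1} f(x) <= p for all x in [-1, 1].
grid = np.linspace(-1.0, 1.0, 2001)
d = np.array([f(x) @ M_inv @ f(x) for x in grid])
print(d.max())                    # maximum standardized prediction variance, here 3
print(d.max() <= p + 1e-8)        # True, so the design passes the D-optimality check
```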


Model uncertainty and Bayesian approaches


Model selection

When scientists wish to test several theories, a statistician can design an experiment that allows optimal tests between specified models. Such "discrimination experiments" are especially important in the biostatistics supporting pharmacokinetics and pharmacodynamics, following the work of Cox and Atkinson.


Bayesian experimental design

When practitioners need to consider multiple models, they can specify a probability-measure on the models and then select any design maximizing the expected value of the chosen criterion over that probability-measure. Such probability-based optimal designs are called optimal Bayesian designs. Such Bayesian designs are used especially for generalized linear models (where the response follows an exponential-family distribution). The use of a Bayesian design does not force statisticians to use Bayesian methods to analyze the data, however. Indeed, the "Bayesian" label for probability-based experimental designs is disliked by some researchers. Alternative terminology for "Bayesian" optimality includes "on-average" optimality or "population" optimality.
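As a sketch of the "on-average" idea (assuming Python with NumPy; the logistic model, the prior, and the two candidate designs are hypothetical), the code below scores each design by the expected log-determinant of its information matrix, with the expectation taken over draws from a prior on the parameters. For generalized linear models the information matrix depends on the unknown parameters, which is why averaging over a prior is natural.

```python
import numpy as np

def logistic_information(design, beta):
    """Fisher information of a two-parameter logistic model at doses `design`:
    M = sum_i w_i f(x_i) f(x_i)', with w_i = p_i (1 - p_i) and f(x) = (1, x)."""
    x = np.asarray(design, dtype=float)
    eta = beta[0] + beta[1] * x
    p = 1.0 / (1.0 + np.exp(-eta))
    F = np.column_stack([np.ones_like(x), x])
    return F.T @ (F * (p * (1 - p))[:, None])

def expected_log_det(design, prior_draws):
    """'On-average' (pseudo-Bayesian) D-criterion: E_prior[ log det M(design, beta) ]."""
    return np.mean([np.linalg.slogdet(logistic_information(design, b))[1]
                    for b in prior_draws])

rng = np.random.default_rng(1)
# Hypothetical prior on (intercept, slope): independent normals.
prior_draws = rng.normal(loc=[0.0, 1.0], scale=[0.5, 0.25], size=(500, 2))

design_a = [-2.0, -1.0, 1.0, 2.0]     # doses spread over the dose range
design_b = [0.0, 0.0, 0.1, 0.1]       # doses clustered near zero

print(expected_log_det(design_a, prior_draws))
print(expected_log_det(design_b, prior_draws))
```

Whichever design scores higher on this expected criterion is preferred under the "on-average" interpretation; here the spread design wins by a wide margin.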


Iterative experimentation

Scientific experimentation is an iterative process, and statisticians have developed several approaches to the optimal design of sequential experiments.


Sequential analysis

Sequential analysis was pioneered by Abraham Wald. In 1972, Herman Chernoff wrote an overview of optimal sequential designs, while adaptive designs were surveyed later by S. Zacks. Of course, much work on the optimal design of experiments is related to the theory of optimal decisions, especially the statistical decision theory of Abraham Wald.


Response-surface methodology

Optimal designs for response-surface models are discussed in the textbook by Atkinson, Donev and Tobias, in the survey of Gaffke and Heiligers, and in the mathematical text of Pukelsheim. The blocking of optimal designs is discussed in the textbook of Atkinson, Donev and Tobias and also in the monograph by Goos.

The earliest optimal designs were developed to estimate the parameters of regression models with continuous variables, for example, by J. D. Gergonne in 1815 (Stigler). In English, two early contributions were made by Charles S. Peirce and Kirstine Smith.

Pioneering designs for multivariate response-surfaces were proposed by George E. P. Box. However, Box's designs have few optimality properties. Indeed, the Box–Behnken design requires excessive experimental runs when the number of variables exceeds three. Box's "central-composite" designs require more experimental runs than do the optimal designs of Kôno.


System identification and stochastic approximation

The optimization of sequential experimentation is studied also in stochastic programming and in systems and control. Popular methods include stochastic approximation and other methods of stochastic optimization. Much of this research has been associated with the subdiscipline of system identification. In computational optimal control, D. Judin & A. Nemirovskii and Boris Polyak have described methods that are more efficient than the (Armijo-style) step-size rules introduced by G. E. P. Box in response-surface methodology. Adaptive designs are used in clinical trials, and optimal adaptive designs are surveyed in the ''Handbook of Experimental Designs'' chapter by Shelemyahu Zacks.
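As an illustration (assuming Python with NumPy; the noisy response function, step sizes, and perturbation sizes are hypothetical choices), the sketch below applies a Kiefer–Wolfowitz-style stochastic approximation scheme, which climbs a noisy response surface using finite-difference gradient estimates and diminishing step sizes rather than a fixed Armijo-style rule.

```python
import numpy as np

rng = np.random.default_rng(42)

def noisy_response(x):
    """Hypothetical noisy response surface with its maximum at x = 2."""
    return -(x - 2.0) ** 2 + rng.normal(scale=0.5)

def kiefer_wolfowitz(x0, n_iter=2000):
    """Kiefer-Wolfowitz stochastic approximation: ascend a noisy response using
    finite-difference gradient estimates, with step sizes a_n and perturbations c_n
    shrinking at the classical rates a_n ~ 1/n and c_n ~ n^(-1/3)."""
    x = x0
    for n in range(1, n_iter + 1):
        a_n = 1.0 / n
        c_n = 1.0 / n ** (1.0 / 3.0)
        gradient_estimate = (noisy_response(x + c_n) - noisy_response(x - c_n)) / (2.0 * c_n)
        x = x + a_n * gradient_estimate
    return x

print(kiefer_wolfowitz(x0=0.0))   # converges to a value near 2, the maximizer
```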


Specifying the number of experimental runs


Using a computer to find a good design

There are several methods of finding an optimal design, given an ''a priori'' restriction on the number of experimental runs or replications. Some of these methods are discussed by Atkinson, Donev and Tobias and in the paper by Hardin and Sloane. Of course, fixing the number of experimental runs ''a priori'' would be impractical. Prudent statisticians examine the other optimal designs, whose numbers of experimental runs differ.
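One way to examine designs whose numbers of runs differ is to put them on a per-run scale. The sketch below (assuming Python with NumPy; the quadratic model, the scaling det(X'X)^(1/p)/n, and the candidate designs are illustrative choices) compares replicated three-level designs with uniform grids of the same sizes.

```python
import numpy as np

def per_run_d_criterion(points):
    """det(X'X)^(1/p) / n for a quadratic model f(x) = (1, x, x^2): a scale on
    which designs with different numbers of runs n can be compared."""
    x = np.asarray(points, dtype=float)
    X = np.column_stack([np.ones_like(x), x, x**2])
    n, p = X.shape
    return np.linalg.det(X.T @ X) ** (1.0 / p) / n

for n_runs in (6, 12):
    replicated = [-1.0, 0.0, 1.0] * (n_runs // 3)         # three-level design, replicated
    uniform_grid = list(np.linspace(-1.0, 1.0, n_runs))   # same number of runs spread uniformly
    print(n_runs, per_run_d_criterion(replicated), per_run_d_criterion(uniform_grid))
# At each run count the replicated three-level design beats the uniform grid on this scale,
# and the two run counts can also be compared with each other on the same scale.
```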


Discretizing probability-measure designs

In the mathematical theory on optimal experiments, an optimal design can be a probability measure that is supported on an infinite set of observation-locations. Such optimal probability-measure designs solve a mathematical problem that neglects the cost of observations and experimental runs. Nonetheless, such optimal probability-measure designs can be discretized to furnish approximately optimal designs.

In some cases, a finite set of observation-locations suffices to support an optimal design. Such a result was proved by Kôno and Kiefer in their works on response-surface designs for quadratic models. The Kôno–Kiefer analysis explains why optimal designs for response-surfaces can have discrete supports, which are very similar to those of the less efficient designs that have been traditional in response surface methodology.
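A very simple discretization is sketched below (plain Python; this is naive largest-remainder rounding of the weights, not the efficient rounding procedures discussed in the literature): the approximate design's weights are converted into integer replication counts that sum to the requested number of runs.

```python
import math

def round_design(support, weights, n_runs):
    """Convert an approximate design (support points with weights summing to 1) into
    an exact n-run design by largest-remainder rounding of the expected counts n*w_i."""
    expected = [n_runs * w for w in weights]
    counts = [math.floor(e) for e in expected]
    remainders = [e - c for e, c in zip(expected, counts)]
    # Give the remaining runs to the points with the largest fractional parts.
    for i in sorted(range(len(support)), key=lambda i: remainders[i], reverse=True):
        if sum(counts) == n_runs:
            break
        counts[i] += 1
    return dict(zip(support, counts))

# Hypothetical approximate design on four support points.
support = [-1.0, -0.3, 0.3, 1.0]
weights = [0.35, 0.15, 0.15, 0.35]
print(round_design(support, weights, n_runs=11))   # {-1.0: 4, -0.3: 2, 0.3: 1, 1.0: 4}
```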


History

In 1815, an article on optimal designs for polynomial regression was published by Joseph Diez Gergonne, according to Stigler.

Charles S. Peirce proposed an economic theory of scientific experimentation in 1876, which sought to maximize the precision of the estimates. Peirce's optimal allocation immediately improved the accuracy of gravitational experiments and was used for decades by Peirce and his colleagues. In his 1882 published lecture at Johns Hopkins University, Peirce introduced experimental design with these words:

Logic will not undertake to inform you what kind of experiments you ought to make in order best to determine the acceleration of gravity, or the value of the Ohm; but it will tell you how to proceed to form a plan of experimentation.

...Unfortunately practice generally precedes theory, and it is the usual fate of mankind to get things done in some boggling way first, and find out afterward how they could have been done much more easily and perfectly. (Peirce, C. S. (1882), "Introductory Lecture on the Study of Logic", delivered September 1882, published in ''Johns Hopkins University Circulars'', v. 2, n. 19, pp. 11–12, November 1882, see p. 11. Reprinted in ''Collected Papers'' v. 7, paragraphs 59–76, see 59, 63; ''Writings of Charles S. Peirce'' v. 4, pp. 378–82, see 378, 379; and ''The Essential Peirce'' v. 1, pp. 210–14, see 210–11, also lower down on 211.)

Kirstine Smith proposed optimal designs for polynomial models in 1918. (Kirstine Smith had been a student of the Danish statistician Thorvald N. Thiele and was working with Karl Pearson in London.)


See also

* Bayesian experimental design
* Blocking (statistics)
* Computer experiment
* Convex function
* Convex minimization
* Design of experiments
* Efficiency (statistics)
* Entropy (information theory)
* Fisher information
* Glossary of experimental design
* Hadamard's maximal determinant problem
* Information theory
* Kiefer, Jack
* Replication (statistics)
* Response surface methodology
* Statistical model
* Wald, Abraham
* Wolfowitz, Jacob




Further reading


Textbooks for practitioners and students


Textbooks emphasizing regression and response-surface methodology

The textbook by Atkinson, Donev and Tobias has been used for short courses for industrial practitioners as well as university courses.


Textbooks emphasizing block designs

Optimal block designs are discussed by Bailey and by Bapat. The first chapter of Bapat's book reviews the linear algebra used by Bailey (or the advanced books below). Bailey's exercises and discussion of randomization both emphasize statistical concepts (rather than algebraic computations).
* Draft available on-line. (Especially Chapter 11.8, "Optimality".)
* (Chapter 5, "Block designs and optimality", pages 99–111.)
Optimal block designs are discussed in the advanced monograph by Shah and Sinha and in the survey-articles by Cheng and by Majumdar.


Books for professional statisticians and researchers

* Republication, with errata-list and new preface, of the 1993 Wiley edition (0-471-61971-X).



