In statistics, a full factorial experiment is an experiment whose design consists of two or more factors, each with discrete possible values or "levels", and whose experimental units take on all possible combinations of these levels across all such factors. A full factorial design may also be called a fully crossed design. Such an experiment allows the investigator to study the effect of each factor on the response variable, as well as the effects of interactions between factors on the response variable.

For the vast majority of factorial experiments, each factor has only two levels. For example, with two factors each taking two levels, a factorial experiment would have four treatment combinations in total, and is usually called a ''2×2 factorial design''. In such a design, the interaction between the variables is often the most important effect. This applies even to scenarios where a main effect and an interaction are both present.

If the number of combinations in a full factorial design is too high to be logistically feasible, a fractional factorial design may be done, in which some of the possible combinations (usually at least half) are omitted.

History

Factorial designs were used in the 19th century by John Bennet Lawes and Joseph Henry Gilbert of the Rothamsted Experimental Station.

Ronald Fisher argued in 1926 that "complex" designs (such as factorial designs) were more efficient than studying one factor at a time. Nature, he suggested, will best respond to "a logical and carefully thought out questionnaire". A factorial design allows the effect of several factors and even interactions between them to be determined with the same number of trials as are necessary to determine any one of the effects by itself with the same degree of accuracy.

Frank Yates made significant contributions, particularly in the analysis of designs, by the Yates analysis.

The term "factorial" may not have been used in print before 1935, when Fisher used it in his book ''The Design of Experiments''.

Advantages of factorial experiments

Many people examine the effect of only a single factor or variable. Compared to such one-factor-at-a-time (OFAT) experiments, factorial experiments offer several advantages:

* Factorial designs are more efficient than OFAT experiments. They provide more information at similar or lower cost. They can find optimal conditions faster than OFAT experiments.
* Factorial designs allow additional factors to be examined at no additional cost.
* When the effect of one factor is different for different levels of another factor, it cannot be detected by an OFAT experiment design. Factorial designs are required to detect such interactions. Use of OFAT when interactions are present can lead to serious misunderstanding of how the response changes with the factors.
* Factorial designs allow the effects of a factor to be estimated at several levels of the other factors, yielding conclusions that are valid over a range of experimental conditions.
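The point about interactions can be illustrated with a small sketch. The `response` function below is entirely hypothetical (a noise-free toy model with made-up coefficients, not data from any study), chosen so that the two factors interact strongly. Starting from the baseline with both factors low, an OFAT search changes one factor at a time and sees the response get worse each time, while the full 2×2 design also visits the remaining corner.

```python
from itertools import product


def response(a, b):
    """Hypothetical noise-free response; a and b are coded as -1 (low) or +1 (high)."""
    return 50 + 5 * a + 5 * b + 10 * a * b


# OFAT: start at the baseline and change one factor at a time.
baseline = (-1, -1)
ofat_runs = [baseline, (1, -1), (-1, 1)]
ofat_best = max(ofat_runs, key=lambda run: response(*run))

# Full factorial: all four combinations of the two levels.
factorial_runs = list(product([-1, 1], repeat=2))
factorial_best = max(factorial_runs, key=lambda run: response(*run))

# The effect of A differs by level of B, which is what "interaction" means.
effect_a_at_low_b = response(1, -1) - response(-1, -1)
effect_a_at_high_b = response(1, 1) - response(-1, 1)

print("OFAT best:", ofat_best, response(*ofat_best))
print("Factorial best:", factorial_best, response(*factorial_best))
print("Effect of A at low B / high B:", effect_a_at_low_b, effect_a_at_high_b)
```

In this toy model the OFAT search stays at the baseline, because each single change lowers the response, whereas the factorial design finds the better corner where both factors are high. Real experiments have noise, so the comparison there would rest on estimated effects rather than single values.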

Example of advantages of factorial experiments

In his book, ''Improving Almost Anything: Ideas and Essays'', statistician George Box gives many examples of the benefits of factorial experiments. Here is one. Engineers at the bearing manufacturer SKF wanted to know if changing to a less expensive "cage" design would affect bearing life. The engineers asked Christer Hellstrand, a statistician, for help in designing the experiment.

[Figure: cube plot for bearing life]

Box reports the following. "The results were assessed by an accelerated life test. … The runs were expensive because they needed to be made on an actual production line and the experimenters were planning to make four runs with the standard cage and four with the modified cage. Christer asked if there were other factors they would like to test. They said there were, but that making added runs would exceed their budget. Christer showed them how they could test two additional factors "for free" – without increasing the number of runs and without reducing the accuracy of their estimate of the cage effect. In this arrangement, called a 2×2×2 factorial design, each of the three factors would be run at two levels and all the eight possible combinations included. The various combinations can conveniently be shown as the vertices of a cube ... "

"In each case, the standard condition is indicated by a minus sign and the modified condition by a plus sign. The factors changed were heat treatment, outer ring osculation, and cage design. The numbers show the relative lengths of lives of the bearings. If you look at the cube plot you can see that the choice of cage design did not make a lot of difference. … But, if you average the pairs of numbers for cage design, you get the table below which shows what the two other factors did. … It led to the extraordinary discovery that, in this particular application, the life of a bearing can be increased fivefold if the two factor(s) outer ring osculation and inner ring heat treatments are increased together."

"Remembering that bearings like this one have been made for decades, it is at first surprising that it could take so long to discover so important an improvement. A likely explanation is that, because most engineers have, until recently, employed only one factor at a time experimentation, interaction effects have been missed."

Example

The simplest factorial experiment contains two levels for each of two factors. Suppose an engineer wishes to study the total power used by each of two different motors, A and B, running at each of two different speeds, 2000 or 3000 RPM. The factorial experiment would consist of four experimental units: motor A at 2000 RPM, motor B at 2000 RPM, motor A at 3000 RPM, and motor B at 3000 RPM. Each combination of a single level selected from every factor is present once. This experiment is an example of a 2² (or 2×2) factorial experiment, so named because it considers two levels (the base) for each of two factors (the power or superscript), that is, (number of levels)^(number of factors), producing 2² = 4 factorial points.

Designs can involve many independent variables. As a further example, the effects of three input variables can be evaluated in eight experimental conditions shown as the corners of a cube. This can be conducted with or without replication, depending on its intended purpose and available resources. It will provide the effects of the three independent variables on the dependent variable and possible interactions.
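The enumeration of treatment combinations described above can be sketched in a few lines of Python. The helper name `full_factorial` and the factor names `motor` and `rpm` are illustrative choices made here, not part of any standard library; the sketch simply takes the Cartesian product of the level lists.

```python
from itertools import product


def full_factorial(factors):
    """Return every treatment combination, one dict per experimental condition.

    `factors` maps a factor name to the list of its levels.
    """
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


# The 2x2 motor example: two motors, each run at two speeds.
design = full_factorial({"motor": ["A", "B"], "rpm": [2000, 3000]})
for run in design:
    print(run)

# Three two-level factors give the eight corners of a cube.
cube = full_factorial({"x1": [-1, 1], "x2": [-1, 1], "x3": [-1, 1]})
print(len(cube))
```

Replication, if wanted, amounts to repeating the resulting list of conditions; the sketch lists each condition once.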

Notation

The notation used to denote factorial experiments conveys a lot of information. When a design is denoted a 2³ factorial, this identifies the number of factors (3); how many levels each factor has (2); and how many experimental conditions there are in the design (2³ = 8). Similarly, a 2⁵ design has five factors, each with two levels, and 2⁵ = 32 experimental conditions. Factorial experiments can involve factors with different numbers of levels. A 2⁴ × 3 design has five factors, four with two levels and one with three levels, and has 16 × 3 = 48 experimental conditions.

To save space, the points in a two-level factorial experiment are often abbreviated with strings of plus and minus signs. The strings have as many symbols as factors, and their values dictate the level of each factor: conventionally, - for the first (or low) level, and + for the second (or high) level. The points in a 2×2 experiment can thus be represented as --, +-, -+, and ++.

The factorial points can also be abbreviated by (1), a, b, and ab, where the presence of a letter indicates that the specified factor is at its high (or second) level and the absence of a letter indicates that the specified factor is at its low (or first) level (for example, "a" indicates that factor A is at its high setting, while all other factors are at their low (or first) setting). (1) is used to indicate that all factors are at their lowest (or first) values.

In an $s_1 \cdots s_k$ (or $s_1 \times \cdots \times s_k$) factorial experiment, there are $k$ factors, the $i$th factor at $s_i$ levels. If we let $A_i$ be the set of levels of the $i$th factor, then the set of treatment combinations (or "cells") is the Cartesian product $T = A_1 \times \cdots \times A_k$. A treatment combination is thus a $k$-tuple $\mathbf{t} = (t_1, \ldots, t_k)$. If $s_1 = \cdots = s_k \equiv s$, say, the experiment is said to be symmetric, and we use the same set $A$ to denote the set of levels of each factor. In a 2-level experiment, for example, we may take $A = \{-, +\}$, as above; the treatment combination -- is denoted by (1), +- by a, and so on.
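The two abbreviations and the run-count arithmetic can be checked with a short sketch. The function names `sign_strings`, `letter_label`, and `run_count` are hypothetical helpers introduced here for illustration, and the ordering (first factor changing fastest) is one common convention, assumed rather than prescribed by the text.

```python
from itertools import product
from math import prod


def sign_strings(k):
    """All 2^k points as sign strings, with the first factor changing fastest."""
    # itertools.product varies the last position fastest, so reverse each tuple.
    return ["".join(reversed(point)) for point in product("-+", repeat=k)]


def letter_label(signs):
    """Translate a sign string such as '+-' into letter notation such as 'a'."""
    letters = "".join(chr(ord("a") + i) for i, s in enumerate(signs) if s == "+")
    return letters or "(1)"  # no letters means every factor is at its low level


def run_count(levels):
    """Number of experimental conditions for factors with the given level counts."""
    return prod(levels)


print(sign_strings(2))                             # sign notation for a 2x2 design
print([letter_label(s) for s in sign_strings(2)])  # matching letter notation
print(run_count([2, 2, 2, 2, 3]))                  # the mixed-level design in the text
```

The letter mapping assumes at most 26 factors, labelled a, b, c, and so on in factor order.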

Implementation

For more than two factors, a $2^k$ factorial experiment can usually be recursively designed from a $2^{k-1}$ factorial experiment by replicating the $2^{k-1}$ experiment, assigning the first replicate to the first (or low) level of the new factor, and the second replicate to the second (or high) level. This framework can be generalized to, e.g., designing three replicates for three-level factors, etc.

A factorial experiment allows for estimation of experimental error in two ways. The experiment can be replicated, or the sparsity-of-effects principle can often be exploited. Replication is more common for small experiments and is a very reliable way of assessing experimental error. When the number of factors is large (typically more than about 5 factors, but this does vary by application), replication of the design can become operationally difficult. In these cases, it is common to run only a single replicate of the design, and to assume that factor interactions of more than a certain order (say, between three or more factors) are negligible. Under this assumption, estimates of such high-order interactions are estimates of an exact zero, and thus really estimates of experimental error.

When there are many factors, many experimental runs will be necessary, even without replication. For example, experimenting with 10 factors at two levels each produces 2¹⁰ = 1024 combinations. At some point this becomes infeasible due to high cost or insufficient resources. In this case, fractional factorial designs may be used.

As with any statistical experiment, the experimental runs in a factorial experiment should be randomized to reduce the impact that bias could have on the experimental results. In practice, this can be a large operational challenge.

Factorial experiments can be used when there are more than two levels of each factor. However, the number of experimental runs required for three-level (or more) factorial designs will be considerably greater than for their two-level counterparts. Factorial designs are therefore less attractive if a researcher wishes to consider more than two levels.
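The recursive construction and the randomization of run order can be sketched as follows. The names `two_level_design` and `randomized_order` are illustrative, and the sketch represents each run as a string of - and + signs; it is a minimal illustration under those assumptions rather than a complete design tool (it does not handle blocking or restrictions on randomization, for instance).

```python
import random


def two_level_design(k):
    """Build the 2^k design recursively from the 2^(k-1) design."""
    if k == 0:
        return [""]  # one run with no factors
    smaller = two_level_design(k - 1)
    # First replicate: new factor at its low level; second replicate: high level.
    return [run + "-" for run in smaller] + [run + "+" for run in smaller]


def randomized_order(design, seed=None):
    """Return the runs in a random order, leaving the input list unchanged."""
    rng = random.Random(seed)
    order = list(design)
    rng.shuffle(order)
    return order


print(two_level_design(2))
print(len(two_level_design(10)))  # ten two-level factors, unreplicated
print(randomized_order(two_level_design(3), seed=0))
```

Passing a `seed` makes the shuffled order reproducible, which can help when the run sheet has to be regenerated.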

Main effects and interactions

A fundamental concept in experimental design is the contrast. Let $\mu(\mathbf{t})$ be the expected response to treatment combination $\mathbf{t}$, and let $T$ be the set of treatment combinations. A contrast in $\mu$ is a linear expression $\sum_{\mathbf{t} \in T} c(\mathbf{t}) \mu(\mathbf{t})$ such that $\sum_{\mathbf{t} \in T} c(\mathbf{t}) = 0$. The function $c(\mathbf{t})$ is a contrast function, or a contrast vector if we fix an order of the treatment combinations. For example, in a one-factor experiment the expression $2\mu(1) - \mu(2) - \mu(3)$ represents a contrast between level 1 of the factor and the combined impact of levels 2 and 3. The corresponding contrast function $c$ is given by $c(1) = 2, c(2) = c(3) = -1$, and the contrast vector is $(2, -1, -1)$.

It is easy to see that if $c$ and $d$ are contrast functions, so is $c + d$, and so is $rc$ for any real number $r$. We say that $c$ and $d$ are orthogonal (denoted $c \perp d$) if $\sum_{\mathbf{t} \in T} c(\mathbf{t}) d(\mathbf{t}) = 0$.

We say that a contrast function $c(\mathbf{t}) = c(t_1, \ldots, t_k)$ belongs to the main effect of factor $i$ if its value depends only on the value of $t_i$. For example, in the $2 \times 2$ example above, the contrast $-\mu(--) - \mu(+-) + \mu(-+) + \mu(++)$ represents the main effect of factor $B$. The corresponding contrast function is given by $c(--) = c(+-) = -1, c(-+) = c(++) = 1$; as a contrast vector, with the treatment combinations in the order --, +-, -+, ++, it is $(-1, -1, 1, 1)$.

The contrast function $c(t_1, \ldots, t_k)$ belongs to the interaction between factors $i$ and $j$ if (a) the value of $c$ depends only on $t_i$ and $t_j$, and (b) it is orthogonal to the contrast functions for the main effects of factors $i$ and $j$. Similarly, for any subset $I \subset \{1, \ldots, k\}$ having more than two elements, a contrast function $c$ belongs to the interaction between the factors listed in $I$ if (a) its values depend only on the levels $t_i, i \in I$ and (b) it is orthogonal to all contrasts of lower order among those factors.

Let $U_i$ denote the set of contrast functions belonging to the main effect of factor $i$, $U_{ij}$ the set of those belonging to the interaction between factors $i$ and $j$, and more generally $U_I$ the set of contrast functions belonging to the interaction between the factors listed in $I$ for any subset $I \subset \{1, \ldots, k\}$ with $|I| \ge 2$ (here $|I|$ denotes cardinality). We also let $U_\emptyset$ denote the set of constant functions on $T$. We have thus defined a set $U_I$ corresponding to each $I \subset \{1, \ldots, k\}$. It is not hard to see that each $U_I$ is a vector space, a subspace of the vector space $\mathbb{R}^T$ consisting of all real-valued functions on $T$. We have the following well-known, fundamental facts:

# If $I \neq J$ then $U_I \perp U_J$.
# $\mathbb{R}^T = \oplus_I U_I$, the (orthogonal) sum running over all subsets $I \subset \{1, \ldots, k\}$.
# For each $I$, $\dim U_I = \prod_{i \in I} (s_i - 1)$.

These results underpin the usual analysis of variance or ANOVA (see below), in which a total sum of squares is partitioned into the sums of squares for each effect (main effect or interaction), as introduced by Fisher. The dimension of $U_I$ is the degrees of freedom for the corresponding effect. In a two-factor or $a \times b$ experiment, the orthogonal sum reads $\mathbb{R}^T = U_\emptyset \oplus U_1 \oplus U_2 \oplus U_{12}$, and the corresponding dimensions are $ab = 1 + (a-1) + (b-1) + (a-1)(b-1)$, giving the usual formulas for degrees of freedom for main effects and interaction (the total degrees of freedom is $ab - 1$).
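The definitions above can be checked numerically for small designs. The sketch below is only an illustration: the variable and function names are chosen here, contrast vectors are plain Python lists in the order --, +-, -+, ++, and the interaction vector is formed as the elementwise product of the two main-effect vectors, which for a 2×2 design satisfies both conditions of the definition.

```python
from itertools import product
from math import prod

# Treatment combinations of a 2x2 design (factor A first, then factor B).
T = ["--", "+-", "-+", "++"]
SIGN = {"-": -1, "+": 1}

main_a = [SIGN[t[0]] for t in T]                   # depends only on t_1
main_b = [SIGN[t[1]] for t in T]                   # depends only on t_2
inter_ab = [a * b for a, b in zip(main_a, main_b)]


def is_contrast(c):
    """A contrast vector has coefficients summing to zero."""
    return sum(c) == 0


def orthogonal(c, d):
    """Two contrast vectors are orthogonal when their dot product is zero."""
    return sum(x * y for x, y in zip(c, d)) == 0


def effect_dimensions(levels):
    """dim U_I = product of (s_i - 1) over i in I, for every subset I of factors."""
    k = len(levels)
    dims = {}
    for mask in product([False, True], repeat=k):
        subset = tuple(i for i in range(k) if mask[i])
        dims[subset] = prod(levels[i] - 1 for i in subset)  # empty product is 1
    return dims


print(main_b, inter_ab)
print(all(is_contrast(c) for c in (main_a, main_b, inter_ab)))
print(orthogonal(main_a, main_b), orthogonal(main_a, inter_ab), orthogonal(main_b, inter_ab))

dims = effect_dimensions([3, 4])   # a 3 x 4 experiment, as an example
print(dims, sum(dims.values()))    # the dimensions add up to the number of cells
```

In `effect_dimensions`, the empty tuple plays the role of the constant functions, and the remaining keys are the main effects and interactions, indexed from zero.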

Analysis

A factorial experiment can be analyzed using ANOVA or regression analysis. To compute the main effect of a factor "A" in a 2-level experiment, subtract the average response of all experimental runs for which A was at its low (or first) level from the average response of all experimental runs for which A was at its high (or second) level.

Other useful exploratory analysis tools for factorial experiments include main effects plots, interaction plots, Pareto plots, and a normal probability plot of the estimated effects.

When the factors are continuous, two-level factorial designs assume that the effects are linear. If a quadratic effect is expected for a factor, a more complicated experiment should be used, such as a central composite design. Optimization of factors that could have quadratic effects is the primary goal of response surface methodology.
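The main-effect calculation described above (average at the high level minus average at the low level) can be written out directly. The response values below are made up purely for illustration and do not come from any experiment. The `interaction_effect` helper follows one common convention for two-level designs (average of runs where the two factors have the same sign minus average of runs where they differ); that convention is an assumption added here, not something stated in the text.

```python
from statistics import mean

# Hypothetical responses for a 2x2 design (made-up numbers, illustration only).
# Each run is (sign string for factors A and B, observed response).
runs = [
    ("--", 20.0),
    ("+-", 30.0),
    ("-+", 40.0),
    ("++", 52.0),
]


def main_effect(runs, factor_index):
    """Average response at the high level minus average response at the low level."""
    high = [y for signs, y in runs if signs[factor_index] == "+"]
    low = [y for signs, y in runs if signs[factor_index] == "-"]
    return mean(high) - mean(low)


def interaction_effect(runs, i, j):
    """Average where factors i and j share a sign minus average where they differ."""
    same = [y for signs, y in runs if signs[i] == signs[j]]
    differ = [y for signs, y in runs if signs[i] != signs[j]]
    return mean(same) - mean(differ)


print("Main effect of A:", main_effect(runs, 0))
print("Main effect of B:", main_effect(runs, 1))
print("A:B interaction:", interaction_effect(runs, 0, 1))
```

The same functions work for longer sign strings and replicated runs, since they only group observations by the sign at the chosen positions.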

Analysis example

Montgomery gives the following example of analysis of a factorial experiment.

An engineer would like to increase the filtration rate (output) of a process to produce a chemical, and to reduce the amount of formaldehyde used in the process. Previous attempts to reduce the formaldehyde have lowered the filtration rate. The current filtration rate is 75 gallons per hour. Four factors are considered: temperature (A), pressure (B), formaldehyde concentration (C), and stirring rate (D). Each of the four factors will be tested at two levels.

From here on, the minus (−) and plus (+) signs indicate whether the factor is run at a low or high level, respectively.

[Figure: Plot of the main effects showing the filtration rates for the low (−) and high (+) settings for each factor.]

[Figure: Plot of the interaction effects showing the mean filtration rate at each of the four possible combinations of levels for a given pair of factors.]

The non-parallel lines in the A:C interaction plot indicate that the effect of factor A depends on the level of factor C. A similar result holds for the A:D interaction. The graphs indicate that factor B has little effect on filtration rate. The analysis of variance (ANOVA) including all 4 factors and all possible interaction terms between them yields the coefficient estimates shown in the table below.

[Figure: Pareto plot for filtration rate]

Because there are 16 observations and 16 coefficients (intercept, main effects, and interactions), no degrees of freedom remain for estimating error, so p-values cannot be calculated for this model. The coefficient values and the graphs suggest that the important factors are A, C, and D, and the interaction terms A:C and A:D.

The coefficients for A, C, and D are all positive in the ANOVA, which would suggest running the process with all three variables set to the high value. However, the main effect of each variable is the average over the levels of the other variables. The A:C interaction plot above shows that the effect of factor A depends on the level of factor C, and vice versa. Factor A (temperature) has very little effect on filtration rate when factor C (formaldehyde) is at the + level, but it has a large effect when factor C is at the − level. The combination of A at the + level and C at the − level gives the highest filtration rate. This observation indicates how one-factor-at-a-time analyses can miss important interactions. Only by varying both factors A and C at the same time could the engineer discover that the effect of factor A depends on the level of factor C.

[Figure: cube plot of the filtration rates]

The best filtration rate is seen when A and D are at the high level, and C is at the low level. This result also satisfies the objective of reducing formaldehyde (factor C). Because B does not appear to be important, it can be dropped from the model. Performing the ANOVA using factors A, C, and D, and the interaction terms A:C and A:D, gives the result shown in the following table, in which all the terms are significant (p-value < 0.05).
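The degrees-of-freedom bookkeeping behind this example can be verified with a few lines of arithmetic. The sketch assumes, as the text states, a single replicate of the 2⁴ design (16 observations); the variable names are illustrative, and no response data are used.

```python
from math import comb

n_factors = 4
n_runs = 2 ** n_factors  # one replicate of a 2^4 design

# Full model: intercept (order 0), main effects (order 1), and all interactions.
full_model_terms = sum(comb(n_factors, order) for order in range(n_factors + 1))
residual_df_full = n_runs - full_model_terms

# Reduced model described in the text: B and the remaining interactions are dropped.
reduced_terms = ["intercept", "A", "C", "D", "A:C", "A:D"]
residual_df_reduced = n_runs - len(reduced_terms)

print(full_model_terms, residual_df_full)      # terms in the full model, leftover df
print(len(reduced_terms), residual_df_reduced)
```

With the full model there are no residual degrees of freedom, while the reduced model leaves some for estimating error, which is what allows significance tests to be carried out on it.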

External links

* Introduction to Factorial Experimental Designs (The Methodology Center, Penn State University)
* GOV.UK Factorial randomised controlled trials (Public Health England)