In statistics, Tukey's test of additivity, named for John Tukey, is an approach used in two-way ANOVA (regression analysis involving two qualitative factors) to assess whether the factor variables (categorical variables) are additively related to the expected value of the response variable. It can be applied when there are no replicated values in the data set, a situation in which it is impossible to directly estimate a fully general non-additive regression structure and still have information left to estimate the error variance. The test statistic proposed by Tukey has one degree of freedom under the null hypothesis, hence this is often called "Tukey's one-degree-of-freedom test."

Introduction

The most common setting for Tukey's test of additivity is a two-way factorial analysis of variance (ANOVA) with one observation per cell. The response variable ''Y''_''ij'' is observed in a table of cells with the rows indexed by ''i'' = 1,..., ''m'' and the columns indexed by ''j'' = 1,..., ''n''. The rows and columns typically correspond to various types and levels of treatment that are applied in combination. The additive model states that the expected response can be expressed as E(''Y''_''ij'') = ''μ'' + ''α''_''i'' + ''β''_''j'', where the ''α''_''i'' and ''β''_''j'' are unknown constant values. The unknown model parameters are usually estimated as:

\widehat{\mu} = \bar{Y}_{\cdot\cdot}

\widehat{\alpha}_i = \bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}

\widehat{\beta}_j = \bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}

where ''Y''_''i''• is the mean of the ''i''th row of the data table, ''Y''_•''j'' is the mean of the ''j''th column of the data table, and ''Y''_•• is the overall mean of the data table. The additive model can be generalized to allow for arbitrary interaction effects by setting E(''Y''_''ij'') = ''μ'' + ''α''_''i'' + ''β''_''j'' + ''γ''_''ij''. However, after fitting the natural estimator of ''γ''_''ij'',

\widehat{\gamma}_{ij} = Y_{ij} - (\widehat{\mu} + \widehat{\alpha}_i + \widehat{\beta}_j),

the fitted values

\widehat{Y}_{ij} = \widehat{\mu} + \widehat{\alpha}_i + \widehat{\beta}_j + \widehat{\gamma}_{ij} \equiv Y_{ij}

fit the data exactly. Thus there are no remaining degrees of freedom to estimate the variance σ², and no hypothesis tests about the ''γ''_''ij'' can be performed. Tukey therefore proposed a more constrained interaction model of the form:

\operatorname{E} Y_{ij} = \mu + \alpha_i + \beta_j + \lambda\alpha_i\beta_j

By testing the null hypothesis that λ = 0, we are able to detect some departures from additivity based only on the single parameter λ.
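The estimators in this section can be illustrated with a minimal pure-Python sketch. The function name `fit_two_way` and the 3 × 3 example table are assumptions made up for illustration; they do not come from the source. The sketch computes the estimates of ''μ'', ''α''_''i'', ''β''_''j'' and ''γ''_''ij'' and checks that, once the ''γ''_''ij'' terms are included, the fitted values reproduce the data exactly, which is why no degrees of freedom remain for the error variance.

```python
def fit_two_way(table):
    """Estimate mu, alpha_i, beta_j and gamma_ij for an m x n table
    with one observation per cell (hypothetical helper, for illustration)."""
    m, n = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (m * n)           # overall mean
    alpha = [sum(row) / n - grand for row in table]            # row effects
    beta = [sum(table[i][j] for i in range(m)) / m - grand     # column effects
            for j in range(n)]
    gamma = [[table[i][j] - (grand + alpha[i] + beta[j])       # interaction terms
              for j in range(n)] for i in range(m)]
    return grand, alpha, beta, gamma


# Made-up 3 x 3 table, used only as an example.
y = [[0, 0, 3],
     [0, 4, 5],
     [3, 5, 7]]

mu, alpha, beta, gamma = fit_two_way(y)

# Fitted values of the fully general (saturated) interaction model.
fitted = [[mu + alpha[i] + beta[j] + gamma[i][j] for j in range(3)]
          for i in range(3)]

# Largest absolute difference between fitted values and data.
max_abs_residual = max(abs(fitted[i][j] - y[i][j])
                       for i in range(3) for j in range(3))
print(mu, alpha, beta)
print(gamma)
print(max_abs_residual)
```

Because the saturated model leaves no residual, some restriction on the interaction (such as the single parameter λ) is needed before any test can be carried out.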

Method

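To carry out Tukey's test, set:

SS_A \equiv n \sum_{i} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2

SS_B \equiv m \sum_{j} (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot})^2

SS_{AB} \equiv \frac{\left( \sum_{ij} Y_{ij} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})(\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}) \right)^2}{\sum_{i} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 \sum_{j} (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot})^2}

SS_T \equiv \sum_{ij} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2

SS_E \equiv SS_T - SS_A - SS_B - SS_{AB}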

Then use the following test statistic (Alin, A. and Kurt, S. (2006). “Testing non-additivity (interaction) in two-way ANOVA tables with no replication”. ''Statistical Methods in Medical Research'' 15, 63–85):

\frac{SS_{AB}/1}{SS_E/q}.

Under the null hypothesis, the test statistic has an ''F'' distribution with 1, ''q'' degrees of freedom, where ''q'' = ''mn'' − (''m'' + ''n'') is the degrees of freedom for estimating the error variance.
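The steps in this section can be sketched in pure Python as follows. This is an illustrative sketch rather than a reference implementation: the function name `tukey_additivity`, the dictionary keys, and the example table are assumptions introduced here. It returns the sums of squares, the ''F'' statistic and the degrees of freedom (1, ''q''); it does not compute a p-value, which would additionally require the distribution function of the ''F'' distribution.

```python
def tukey_additivity(table):
    """Tukey's one-degree-of-freedom test statistic for an m x n table
    with one observation per cell (hypothetical helper, for illustration)."""
    m, n = len(table), len(table[0])
    if any(len(row) != n for row in table):
        raise ValueError("all rows must have the same length")
    q = m * n - (m + n)                      # error degrees of freedom
    if q <= 0:
        raise ValueError("table too small: need mn - (m + n) > 0")

    grand = sum(sum(row) for row in table) / (m * n)
    row_dev = [sum(row) / n - grand for row in table]
    col_dev = [sum(table[i][j] for i in range(m)) / m - grand
               for j in range(n)]

    sum_row_sq = sum(a * a for a in row_dev)
    sum_col_sq = sum(b * b for b in col_dev)
    if sum_row_sq == 0 or sum_col_sq == 0:
        raise ValueError("row or column means are all equal; SS_AB is undefined")

    ss_a = n * sum_row_sq
    ss_b = m * sum_col_sq
    cross = sum(table[i][j] * row_dev[i] * col_dev[j]
                for i in range(m) for j in range(n))
    ss_ab = cross ** 2 / (sum_row_sq * sum_col_sq)
    ss_t = sum((table[i][j] - grand) ** 2
               for i in range(m) for j in range(n))
    ss_e = ss_t - ss_a - ss_b - ss_ab
    if ss_e <= 0:
        raise ValueError("no residual variation left to estimate the error variance")

    f_stat = (ss_ab / 1) / (ss_e / q)
    return {"ss_a": ss_a, "ss_b": ss_b, "ss_ab": ss_ab,
            "ss_t": ss_t, "ss_e": ss_e, "f": f_stat, "df": (1, q)}


# Made-up 3 x 3 table, used only as an example.
result = tukey_additivity([[0, 0, 3],
                           [0, 4, 5],
                           [3, 5, 7]])
print(result)
```

The value of ''q'' used here follows from the (''m'' − 1)(''n'' − 1) degrees of freedom left after fitting the additive model, minus the one degree of freedom used for λ.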
