Tukey's Range Test

Tukey's range test, also known as Tukey's test, Tukey method, Tukey's honest significance test, or Tukey's HSD (honestly significant difference) test, is a single-step multiple comparison procedure and statistical test. It can be used to find means that are significantly different from each other. Named after John Tukey, it compares all possible pairs of means, and is based on a studentized range distribution (''q''); this distribution is similar to the distribution of ''t'' from the ''t''-test (see below). (Linton, L.R., Harder, L.D. (2007) Biology 315 – Quantitative Biology Lecture Notes. University of Calgary, Calgary, AB.) Tukey's test compares the means of every treatment to the means of every other treatment; that is, it applies simultaneously to the set of all pairwise comparisons

:\mu_i - \mu_j

and identifies any difference between two means that is greater than the expected standard error. The confidence coefficient for the set, when all sample sizes are equal, is exactly 1 - \alpha for any 0 \le \alpha \le 1. For unequal sample sizes, the confidence coefficient is greater than 1 - \alpha; in other words, the Tukey method is conservative when there are unequal sample sizes. A common mistaken belief is that Tukey's HSD should only be used following a significant ANOVA; the ANOVA is not necessary, because the Tukey test controls the Type I error rate on its own. This test is often followed by the Compact Letter Display (CLD) procedure, which renders the output of the test more transparent to non-statistician audiences.
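For illustration, the following minimal Python sketch runs Tukey's HSD on three hypothetical groups using the pairwise_tukeyhsd function from the statsmodels package (one common implementation; the data and group labels are made up for the example):

 import numpy as np
 from statsmodels.stats.multicomp import pairwise_tukeyhsd

 # hypothetical measurements for three treatment groups
 a = [24.5, 23.5, 26.4, 27.1, 29.9]
 b = [28.4, 34.2, 29.5, 32.2, 30.1]
 c = [26.1, 28.3, 24.3, 26.2, 27.8]

 values = np.concatenate([a, b, c])
 groups = ["a"] * 5 + ["b"] * 5 + ["c"] * 5

 # all pairwise comparisons with a family-wise error rate of 0.05
 result = pairwise_tukeyhsd(endog=values, groups=groups, alpha=0.05)
 print(result.summary())

The summary lists, for each pair of groups, the mean difference, the Tukey-adjusted p-value, the simultaneous confidence interval, and whether the null hypothesis of equal means is rejected at the chosen \alpha.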


Assumptions

# The observations being tested are independent within and among the groups.
# The groups associated with each mean in the test are normally distributed.
# There is equal within-group variance across the groups associated with each mean in the test (homogeneity of variance); a simple numerical check of the last two assumptions is sketched below.
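For example, the normality and equal-variance assumptions can be screened with standard tests; the rough sketch below uses hypothetical data and scipy.stats.shapiro and scipy.stats.levene as one possible choice of checks, which are not part of Tukey's procedure itself:

 import numpy as np
 from scipy.stats import shapiro, levene

 # hypothetical groups to be compared with Tukey's test
 rng = np.random.default_rng(42)
 groups = [rng.normal(loc, 1.0, size=12) for loc in (5.0, 5.5, 6.1)]

 # Shapiro-Wilk test of normality within each group
 for i, g in enumerate(groups):
     stat, p = shapiro(g)
     print(f"group {i}: Shapiro-Wilk p = {p:.3f}")

 # Levene's test of equal variances across groups
 stat, p = levene(*groups)
 print(f"Levene p = {p:.3f}")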


The test statistic

Tukey's test is based on a formula very similar to that of the ''t''-test. In fact, Tukey's test is essentially a ''t''-test, except that it corrects for the family-wise error rate. The formula for Tukey's test is

:q_s = \frac{Y_A - Y_B}{SE},

where Y_A is the larger of the two means being compared, Y_B is the smaller of the two means being compared, and SE is the standard error of the means in question (with equal group sizes ''n'', \mathrm{SE} = \sqrt{\mathrm{MS}_{\text{within}}/n}, the estimated standard error of a single group mean). This value q_s can then be compared to a critical value from the studentized range distribution. If q_s is ''larger'' than the critical value obtained from the distribution, the two means are said to be significantly different at level \alpha. Since the null hypothesis for Tukey's test states that all means being compared are from the same population (i.e. \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k), the means should be normally distributed (according to the central limit theorem). This gives rise to the normality assumption of Tukey's test.
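A minimal numerical sketch of this comparison (hypothetical data; equal group sizes; SE taken as \sqrt{\mathrm{MS}_{\text{within}}/n} as above, with the critical value from SciPy's studentized_range distribution):

 import numpy as np
 from scipy.stats import studentized_range

 # hypothetical data: k groups of equal size n
 groups = [np.array([4.2, 5.1, 5.8, 4.9]),
           np.array([6.3, 6.9, 7.1, 6.0]),
           np.array([5.0, 5.4, 4.8, 5.6])]
 k = len(groups)
 n = len(groups[0])           # equal group sizes assumed
 df = k * n - k               # error degrees of freedom N - k

 means = np.array([g.mean() for g in groups])
 ms_within = sum(((g - g.mean()) ** 2).sum() for g in groups) / df
 se = np.sqrt(ms_within / n)  # estimated standard error of a group mean

 alpha = 0.05
 q_crit = studentized_range.ppf(1 - alpha, k, df)  # critical value q_{alpha; k, df}

 # compare the largest and smallest group means, as in the formula above
 q_s = (means.max() - means.min()) / se
 print(f"q_s = {q_s:.3f}, critical value = {q_crit:.3f}, reject: {q_s > q_crit}")

The same statistic can be computed for every pair of means; because all pairs are compared against the single critical value q_{\alpha;k,df}, the family-wise error rate is controlled at \alpha.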


The studentized range (''q'') distribution

The Tukey method uses the studentized range distribution. Suppose that we take a sample of size ''n'' from each of ''k'' populations with the same normal distribution ''N''(''μ'', ''σ''²), that \bar{y}_\min is the smallest of these sample means, that \bar{y}_\max is the largest of these sample means, and that ''S''² is the pooled sample variance from these samples. Then the following random variable has a studentized range distribution:

:q = \frac{\bar{y}_\max - \bar{y}_\min}{S/\sqrt{n}}.

This definition of ''q'' underlies the critical value of ''q'', which depends on three factors:
# ''α'' (the Type I error rate, or the probability of rejecting a true null hypothesis)
# ''k'' (the number of populations)
# ''df'' (the number of degrees of freedom, ''N'' − ''k'', where ''N'' is the total number of observations)

The distribution of ''q'' has been tabulated and appears in many textbooks on statistics. In some tables the distribution of ''q'' has been tabulated without the \sqrt{2} factor, i.e. as q/\sqrt{2}, which for ''k'' = 2 equals the usual two-sample ''t'' statistic. To tell which convention a table uses, compute the entry for ''k'' = 2 and compare it with the Student's ''t''-distribution quantile for the same degrees of freedom and the same ''α'': with the convention used here, q_{\alpha;2,df} = \sqrt{2}\,t_{\alpha/2,df}. In addition, R offers a cumulative distribution function (ptukey) and a quantile function (qtukey) for ''q'' under this convention.
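The check for ''k'' = 2 described above can be done numerically; the sketch below uses SciPy's studentized_range distribution, whose quantile and cdf functions play the role of R's qtukey and ptukey (the values of ''α'' and ''df'' are illustrative only):

 from math import sqrt
 from scipy.stats import studentized_range, t

 alpha, df = 0.05, 20

 # quantile of the studentized range for k = 2 groups
 # (the analogue of R's qtukey(1 - alpha, 2, df))
 q2 = studentized_range.ppf(1 - alpha, 2, df)

 # two-sided Student's t critical value with the same alpha and df
 t2 = t.ppf(1 - alpha / 2, df)

 print(q2, sqrt(2) * t2)  # equal: q_{alpha;2,df} = sqrt(2) * t_{alpha/2,df}

 # the corresponding cdf (the analogue of R's ptukey)
 print(studentized_range.cdf(q2, 2, df))  # 1 - alpha

A table whose ''k'' = 2 column matches t_{\alpha/2,df} directly, without the \sqrt{2}, is using the other convention.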


Confidence limits

The Tukey confidence limits for all pairwise comparisons with confidence coefficient of at least 1 - \alpha are

:\bar{y}_{i\cdot}-\bar{y}_{j\cdot} \pm \frac{q_{\alpha;k;N-k}}{\sqrt{2}}\,\widehat{\sigma}_\varepsilon \sqrt{\frac{2}{n}} \qquad i,j=1,\ldots,k,\quad i\neq j.

Notice that the point estimator and the estimated variance are the same as those for a single pairwise comparison. The only difference between the confidence limits for simultaneous comparisons and those for a single comparison is the multiple of the estimated standard deviation. Also note that the sample sizes must be equal when using the studentized range approach, and that \widehat{\sigma}_\varepsilon is the standard deviation of the entire design, not just that of the two groups being compared. It is possible to work with unequal sample sizes: in this case, one has to calculate the estimated standard deviation for each pairwise comparison, as formalized by Clyde Kramer in 1956, so the procedure for unequal sample sizes is sometimes referred to as the Tukey–Kramer method. Its limits are

:\bar{y}_{i\cdot}-\bar{y}_{j\cdot} \pm \frac{q_{\alpha;k;N-k}}{\sqrt{2}}\,\widehat{\sigma}_\varepsilon \sqrt{\frac{1}{n_i} + \frac{1}{n_j}},

where ''n''''i'' and ''n''''j'' are the sizes of groups ''i'' and ''j'' respectively, and the degrees of freedom for the whole design (''N'' − ''k'') are used.
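A small sketch of the Tukey–Kramer limits for unequal group sizes (hypothetical data; \widehat{\sigma}_\varepsilon estimated as the square root of the pooled within-group mean square with N - k degrees of freedom):

 import numpy as np
 from itertools import combinations
 from scipy.stats import studentized_range

 # hypothetical groups with unequal sizes
 groups = {"A": np.array([4.2, 5.1, 5.8, 4.9, 5.3]),
           "B": np.array([6.3, 6.9, 7.1, 6.0]),
           "C": np.array([5.0, 5.4, 4.8, 5.6, 5.2, 5.5])}
 k = len(groups)
 N = sum(len(g) for g in groups.values())
 df = N - k

 # pooled ("whole design") standard deviation sigma_hat_epsilon
 ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups.values())
 sigma_hat = np.sqrt(ss_within / df)

 alpha = 0.05
 q_crit = studentized_range.ppf(1 - alpha, k, df)

 # simultaneous confidence intervals for every pairwise difference
 for (name_i, y_i), (name_j, y_j) in combinations(groups.items(), 2):
     diff = y_i.mean() - y_j.mean()
     half = (q_crit / np.sqrt(2)) * sigma_hat * np.sqrt(1 / len(y_i) + 1 / len(y_j))
     print(f"{name_i} - {name_j}: {diff:+.3f} +/- {half:.3f}")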


Comparing ANOVA and Tukey–Kramer tests

Both ANOVA and Tukey–Kramer tests are based on the same assumptions. However, these two tests for ''k'' groups (i.e. \mu_1 = \mu_2 = \cdots = \mu_k) may result in logical contradictions when ''k'' > 2, even if the assumptions hold. It is possible to generate a set of pseudorandom samples of strictly positive measure such that the hypothesis \mu_1 = \mu_2 is rejected at confidence level 1 - \alpha > 0.95 while \mu_1 = \mu_2 = \mu_3 is not rejected even at 1 - \alpha = 0.975.
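The two procedures can be run side by side on the same data; the sketch below does so with simulated samples (hypothetical parameters; whether a disagreement actually appears depends on the particular sample drawn):

 import numpy as np
 from itertools import combinations
 from scipy.stats import f_oneway, studentized_range

 # simulated equal-sized samples from three normal populations
 rng = np.random.default_rng(1)
 samples = [rng.normal(loc, 1.0, size=10) for loc in (0.0, 0.7, 0.35)]
 k, n = len(samples), len(samples[0])
 df = k * n - k

 # one-way ANOVA F test of mu_1 = mu_2 = mu_3
 print("ANOVA p-value:", f_oneway(*samples).pvalue)

 # Tukey-adjusted p-values for each pairwise comparison
 ms_within = sum(((s - s.mean()) ** 2).sum() for s in samples) / df
 for i, j in combinations(range(k), 2):
     q_ij = abs(samples[i].mean() - samples[j].mean()) / np.sqrt(ms_within / n)
     print(f"pair ({i}, {j}): Tukey p = {studentized_range.sf(q_ij, k, df):.4f}")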


See also

* Family-wise error rate
* Newman–Keuls method


Further reading

* Montgomery, Douglas C. (2013). ''Design and Analysis of Experiments'' (8th ed.). Wiley. Section 3.5.7.


External links


* NIST/SEMATECH e-Handbook of Statistical Methods: Tukey's method