The Newman–Keuls or Student–Newman–Keuls (SNK) method is a stepwise multiple comparisons procedure used to identify sample means that are significantly different from each other. It was named after Student (1927), D. Newman, and M. Keuls. This procedure is often used as a post-hoc test whenever a significant difference between three or more sample means has been revealed by an analysis of variance (ANOVA).
The Newman–Keuls method is similar to Tukey's range test as both procedures use studentized range statistics. Unlike Tukey's range test, the Newman–Keuls method uses different critical values for different pairs of mean comparisons. Thus, the procedure is more likely to reveal significant differences between group means and to commit type I errors by incorrectly rejecting a null hypothesis when it is true. In other words, the Newman–Keuls procedure is more powerful but less conservative than Tukey's range test.
History and type I error rate control
The Newman–Keuls method was introduced by Newman in 1939 and developed further by Keuls in 1952. This was before Tukey presented various definitions of error rates (1952a, 1952b, 1953).
The Newman–Keuls method controls the family-wise error rate (FWER) in the weak sense but not the strong sense: the Newman–Keuls procedure controls the risk of rejecting the null hypothesis if all means are equal (global null hypothesis) but does not control the risk of rejecting partial null hypotheses. For instance, when four means are compared, under the partial null hypothesis that µ1 = µ2 = µ and µ3 = µ4 = µ + δ with a non-zero δ, the Newman–Keuls procedure has a probability greater than α of rejecting µ1 = µ2 or µ3 = µ4 or both. In that example, if δ is very large, the Newman–Keuls procedure is almost equivalent to two Student t tests testing µ1 = µ2 and µ3 = µ4 at nominal type I error rate α, without any multiple testing procedure; therefore the FWER is almost doubled.
In the worst case, the FWER of the Newman–Keuls procedure is <math>1-(1-\alpha)^{\lfloor J/2 \rfloor}</math>, where <math>\lfloor J/2 \rfloor</math> represents the integer part of the total number of groups <math>J</math> divided by 2. Therefore, with two or three groups, the Newman–Keuls procedure has strong control over the FWER, but not with four or more groups.
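The worst-case bound above is easy to evaluate numerically. The following is a minimal sketch; the function name `worst_case_fwer` and its parameters are illustrative choices, not part of any established library.

```python
def worst_case_fwer(alpha: float, n_groups: int) -> float:
    """Worst-case family-wise error rate of the Newman-Keuls procedure.

    Implements 1 - (1 - alpha)^floor(J / 2), where J is the number of groups.
    """
    return 1.0 - (1.0 - alpha) ** (n_groups // 2)


if __name__ == "__main__":
    # With alpha = 0.05 the bound stays at 0.05 for J = 2 or 3
    # and rises to about 0.0975 for J = 4 or 5.
    for j in range(2, 6):
        print(j, round(worst_case_fwer(0.05, j), 4))
```

For two or three groups the exponent is 1, so the bound equals the nominal α; from four groups onward the exponent is at least 2 and the bound exceeds α.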
In 1995 Benjamini and Hochberg presented a new, more liberal, and more powerful criterion for those types of problems: false discovery rate (FDR) control. In 2006, Shaffer showed (by extensive simulation) that the Newman–Keuls method controls the FDR with some constraints.
Required assumptions
The assumptions of the Newman–Keuls test are essentially the same as for an independent groups t-test: normality, homogeneity of variance, and independent observations. The test is quite robust to violations of normality. Violating homogeneity of variance can be more problematic than in the two-sample case since the mean squared error (MSE) is based on data from all groups. The assumption of independence of observations is important and should not be violated.
Procedures
The Newman–Keuls method employs a stepwise approach when comparing sample means.
Prior to any mean comparison, all sample means are rank-ordered in ascending or descending order, thereby producing an ordered range (''p'') of sample means.
A comparison is then made between the largest and smallest sample means within the largest range.
Assuming that the largest range is four means (or ''p'' = 4), a significant difference between the largest and smallest means as revealed by the Newman–Keuls method would result in a rejection of the null hypothesis for that specific range of means. The next largest comparison of two sample means would then be made within a smaller range of three means (or ''p'' = 3). As long as the largest and smallest means of a range differ significantly, this stepwise comparison continues through smaller ranges until a final comparison is made with the smallest range of just two means. If there is no significant difference between the largest and smallest means of a given range, then all the null hypotheses within that range are retained and no further comparisons within the smaller ranges it contains are necessary.
To determine if there is a significant difference between two means with equal sample sizes, the Newman–Keuls method uses a formula that is identical to the one used in Tukey's range test, which calculates the ''q'' value by taking the difference between two sample means and dividing it by the standard error:
:<math>q = \frac{\bar{X}_A - \bar{X}_B}{\sqrt{\frac{MSE}{n}}},</math>
where <math>q</math> represents the studentized range value, <math>\bar{X}_A</math> and <math>\bar{X}_B</math> are the largest and smallest sample means within a range, <math>MSE</math> is the error variance taken from the ANOVA table, and <math>n</math> is the sample size (number of observations within a sample). If comparisons are made with means of unequal sample sizes (<math>n_A \neq n_B</math>), then the Newman–Keuls formula would be adjusted as follows:
:<math>q = \frac{\bar{X}_A - \bar{X}_B}{\sqrt{\frac{MSE}{2}\left(\frac{1}{n_A} + \frac{1}{n_B}\right)}},</math>
where <math>n_A</math> and <math>n_B</math> represent the sample sizes of the two sample means. In both cases, MSE (mean squared error) is taken from the ANOVA conducted in the first stage of the analysis.
Once calculated, the computed ''q'' value can be compared to a ''q'' critical value (or <math>q_{\alpha,\nu,p}</math>), which can be found in a ''q'' distribution table based on the significance level (<math>\alpha</math>), the error degrees of freedom (<math>\nu</math>) from the ANOVA table, and the range (<math>p</math>) of sample means to be tested.
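As a small illustration of the ''q'' computation, the sketch below covers both the equal and the unequal sample size case with one function. The name `q_statistic` and its argument names are hypothetical; the function uses the general form, which reduces to the equal-size formula when both sample sizes coincide.

```python
import math
from typing import Optional


def q_statistic(mean_a: float, mean_b: float, mse: float,
                n_a: int, n_b: Optional[int] = None) -> float:
    """Studentized range statistic q for two sample means.

    mse is the error variance (mean squared error) from the ANOVA table.
    If n_b is omitted, both samples are assumed to have n_a observations,
    and the standard error simplifies to sqrt(mse / n_a).
    """
    if n_b is None:
        n_b = n_a
    standard_error = math.sqrt((mse / 2.0) * (1.0 / n_a + 1.0 / n_b))
    # abs() makes the result independent of which mean is passed first.
    return abs(mean_a - mean_b) / standard_error


if __name__ == "__main__":
    # Equal sizes: standard error is sqrt(8 / 8) = 1, so q = 4.0.
    print(q_statistic(10.0, 6.0, mse=8.0, n_a=8))
    # Unequal sizes use the adjusted standard error.
    print(q_statistic(10.0, 6.0, mse=8.0, n_a=4, n_b=8))
```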
If the computed ''q'' value is equal to or greater than the ''q'' critical value, then the null hypothesis (''H''<sub>0</sub>: ''μ''<sub>A</sub> = ''μ''<sub>B</sub>) for that specific range of means can be rejected.
Because the number of means within a range changes with each successive pairwise comparison, the critical value of the ''q'' statistic also changes with each comparison, which makes the Newman–Keuls method more lenient and hence more powerful than Tukey's range test. Thus, if a pairwise comparison was found to be significantly different using the Newman–Keuls method, it may not necessarily be significantly different when analyzed with Tukey's range test.
Conversely, if the pairwise comparison was found not to be significantly different using the Newman–Keuls method, it cannot be significantly different with Tukey's range test either.
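The stepwise logic described in this section can be sketched as follows for the equal sample size case. This is a rough illustration under stated assumptions, not a reference implementation: the function name `newman_keuls`, its arguments, and the `q_crit` callable (which is assumed to return the critical value for a range of ''p'' means at the chosen significance level and error degrees of freedom) are all hypothetical.

```python
import math
from typing import Callable, Dict, List, Tuple


def newman_keuls(
    means: Dict[str, float],
    n_per_group: int,
    mse: float,
    q_crit: Callable[[int], float],
) -> List[Tuple[str, str, int, float, bool]]:
    """Stepwise Newman-Keuls comparisons (equal sample sizes assumed).

    Returns tuples (smaller group, larger group, p, q, rejected) for every
    range that was actually tested. Ranges nested inside a retained range
    are not tested, and their null hypotheses are implicitly retained.
    """
    ordered = sorted(means.items(), key=lambda item: item[1])  # ascending
    k = len(ordered)
    standard_error = math.sqrt(mse / n_per_group)
    retained: List[Tuple[int, int]] = []  # index ranges found non-significant
    results = []
    for p in range(k, 1, -1):  # widest range first, down to p = 2
        for lo in range(0, k - p + 1):
            hi = lo + p - 1
            # Skip any range contained in a range that was already retained.
            if any(r_lo <= lo and hi <= r_hi for r_lo, r_hi in retained):
                continue
            q = (ordered[hi][1] - ordered[lo][1]) / standard_error
            rejected = q >= q_crit(p)
            if not rejected:
                retained.append((lo, hi))
            results.append((ordered[lo][0], ordered[hi][0], p, q, rejected))
    return results
```

The critical values can be read from a ''q'' distribution table and passed in as a lookup, for example `lambda p: table[p]`. Assuming SciPy 1.7 or later is available, one alternative is `scipy.stats.studentized_range.ppf(1 - alpha, p, df)`. Passing the same critical value for every ''p'' (the one for the full number of groups) would make each test as strict as Tukey's, which illustrates why the Newman–Keuls method is the more lenient of the two.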
Limitations
Because of its sequential nature, the Newman–Keuls procedure cannot produce a confidence interval for each mean difference, nor multiplicity-adjusted exact p-values. Results are also somewhat difficult to interpret, since it is hard to articulate which null hypotheses were tested.
See also
* Multiple comparisons
* Post-hoc analysis
* Tukey's range test