In statistics, the studentized range, denoted ''q'', is the difference between the largest and smallest data in a sample normalized by the sample standard deviation. It is named after William Sealy Gosset (who wrote under the pseudonym "''Student''"), and was introduced by him in 1927. The concept was later discussed by Newman (1939), Keuls (1952), and John Tukey in some unpublished notes. Its statistical distribution is the ''studentized range distribution'', which is used for multiple comparison procedures, such as the single-step procedure Tukey's range test, the Newman–Keuls method, and Duncan's step-down procedure, and for establishing confidence intervals that are still valid after data snooping has occurred.

Description

The value of the studentized range, most often represented by the variable ''q'', can be defined based on a random sample ''x''₁, ..., ''x''_''n'' from the ''N''(0, 1) distribution and another random variable ''s'' that is independent of all the ''x_i'' and for which ''νs''² has a ''χ''² distribution with ''ν'' degrees of freedom. Then

q_{n,\nu} = \frac{x_{\max} - x_{\min}}{s} = \max_{i,j=1,\dots,n} \left\{ \frac{x_i - x_j}{s} \right\}

has the studentized range distribution for ''n'' groups and ''ν'' degrees of freedom. In applications, the ''x_i'' are typically the means of samples each of size ''m'', ''s''² is the pooled variance, and the degrees of freedom are ''ν'' = ''n''(''m'' − 1).

The critical value of ''q'' is based on three factors:
# ''α'' (the probability of rejecting a true null hypothesis)
# ''n'' (the number of observations or groups)
# ''ν'' (the degrees of freedom used to estimate the sample variance)
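As an illustration of the definition, the following minimal Python sketch computes ''q'' both as the range divided by ''s'' and as the maximum of (''x_i'' − ''x_j'')/''s'' over all pairs, and checks that the two forms agree. The function names, the input values, and the group size ''m'' are hypothetical and chosen only for the example; ''s'' is simply supplied as a number and is assumed to be independent of the ''x_i''.

```python
from itertools import product

def studentized_range(x, s):
    """Return (max(x) - min(x)) / s for values x and a scale estimate s."""
    if s <= 0:
        raise ValueError("s must be positive")
    return (max(x) - min(x)) / s

def studentized_range_pairwise(x, s):
    """Same quantity written as the maximum over all pairs of (x_i - x_j) / s."""
    if s <= 0:
        raise ValueError("s must be positive")
    return max((xi - xj) / s for xi, xj in product(x, repeat=2))

# Hypothetical inputs, chosen only for illustration.
x = [1.0, 3.0, 2.0]   # n = 3 values
s = 0.5               # scale estimate, assumed independent of x
n, m = len(x), 6      # m is an assumed common sample size per group
nu = n * (m - 1)      # degrees of freedom, nu = n(m - 1)

q_range = studentized_range(x, s)
q_pairs = studentized_range_pairwise(x, s)
print(q_range, q_pairs, nu)  # 4.0 4.0 15
```

The pairwise form is slower (it visits all ''n''² pairs) and is shown only because it mirrors the second expression in the definition.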

Distribution

If ''X''₁, ..., ''X''_''n'' are independent identically distributed random variables that are normally distributed, the probability distribution of their studentized range is what is usually called the ''studentized range distribution''. Note that the definition of ''q'' does not depend on the expected value or the standard deviation of the distribution from which the sample is drawn, and therefore its probability distribution is the same regardless of those parameters.
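The invariance noted above can be checked numerically. The sketch below, which uses only the Python standard library, draws one standard normal sample, shifts and rescales it, and compares the two studentized ranges computed with the sample standard deviation. The seed, the sample size, and the shift and scale (10 and 3) are arbitrary assumptions made for the example.

```python
import random
import statistics

def sample_studentized_range(sample):
    """(max - min) divided by the sample standard deviation (n - 1 denominator)."""
    return (max(sample) - min(sample)) / statistics.stdev(sample)

rng = random.Random(0)                        # fixed seed, chosen arbitrarily
z = [rng.gauss(0.0, 1.0) for _ in range(8)]   # draws from N(0, 1)
x = [10.0 + 3.0 * v for v in z]               # the same draws viewed as N(10, 3^2)

q_z = sample_studentized_range(z)
q_x = sample_studentized_range(x)
print(q_z, q_x)  # the two values agree up to floating-point rounding
```

The shift cancels in the range and in the standard deviation, and the positive scale factor cancels in the ratio, which is why the distribution of ''q'' is free of both parameters.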

''Studentization''

Generally, the term ''studentized'' means that the variable's scale was adjusted by dividing by an estimate of a population standard deviation (see also studentized residual). The fact that the standard deviation is a ''sample'' standard deviation rather than the ''population'' standard deviation, and thus something that differs from one random sample to the next, is essential to the definition and the distribution of the ''studentized'' data. The variability in the value of the ''sample'' standard deviation adds uncertainty to the calculated values, which complicates the problem of finding the probability distribution of any statistic that is ''studentized''.
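The effect of estimating the standard deviation can be seen in a small Monte Carlo experiment. The following sketch is an illustration under stated assumptions, not a method for computing critical values: it simulates the range of ''n'' = 3 standard normal values divided either by the known standard deviation 1 or by an independent estimate ''s'' with ''νs''² distributed as ''χ''² with ''ν'' = 5 degrees of freedom, and compares the empirical 95th percentiles. The function name, seed, and number of replications are arbitrary choices.

```python
import math
import random

def simulate_upper_quantile(n, nu, reps, prob, rng, studentize):
    """Empirical quantile of range/s (studentize=True) or range/1 (studentize=False)."""
    values = []
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_range = max(x) - min(x)
        if studentize:
            # nu * s^2 is a sum of nu squared standard normals, i.e. chi-squared(nu),
            # generated independently of x.
            chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(nu))
            s = math.sqrt(chi2 / nu)
        else:
            s = 1.0  # known population standard deviation
        values.append(sample_range / s)
    values.sort()
    return values[int(prob * reps)]

rng = random.Random(12345)  # fixed seed, chosen arbitrarily
q_known = simulate_upper_quantile(3, 5, 20000, 0.95, rng, studentize=False)
q_student = simulate_upper_quantile(3, 5, 20000, 0.95, rng, studentize=True)
print(q_known, q_student)  # the studentized quantile is the larger of the two
```

With few degrees of freedom the studentized quantile is noticeably larger, reflecting the extra variability contributed by ''s''; as ''ν'' grows, the two quantiles approach each other.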


See also

References

Further reading