Range (statistics)

In statistics, the range of a set of data is the difference between the largest and smallest values, that is, the sample maximum minus the sample minimum. It is expressed in the same units as the data. In descriptive statistics, the range is the size of the smallest interval that contains all the data, and it provides an indication of statistical dispersion. Since it depends on only two of the observations, it is most useful in representing the dispersion of small data sets.
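The definition can be illustrated with a minimal Python sketch. The function name `data_range` and the sample values are made up for illustration only.

```python
def data_range(values):
    """Return the sample maximum minus the sample minimum."""
    values = list(values)
    if not values:
        raise ValueError("the range is undefined for an empty data set")
    return max(values) - min(values)


# Hypothetical sample of heights in centimetres.
heights_cm = [162, 175, 158, 181, 169]
print(data_range(heights_cm))  # 181 - 158 = 23, in the same units as the data
```

Only the two extreme observations enter the result, which is why a single outlier can change the range substantially.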


For continuous IID random variables

For ''n'' independent and identically distributed continuous random variables ''X''<sub>1</sub>, ''X''<sub>2</sub>, ..., ''X''<sub>''n''</sub> with cumulative distribution function ''G''(''x'') and probability density function ''g''(''x''), let ''T'' denote their range, that is, ''T'' = max(''X''<sub>1</sub>, ''X''<sub>2</sub>, ..., ''X''<sub>''n''</sub>) - min(''X''<sub>1</sub>, ''X''<sub>2</sub>, ..., ''X''<sub>''n''</sub>).


Distribution

The range, ''T'', has the cumulative distribution function
::F(t) = n \int_{-\infty}^\infty g(x) [G(x+t)-G(x)]^{n-1} \, \text{d}x.
Gumbel notes that the "beauty of this formula is completely marred by the facts that, in general, we cannot express ''G''(''x'' + ''t'') by ''G''(''x''), and that the numerical integration is lengthy and tiresome." If the distribution of each ''X''<sub>''i''</sub> is limited to the right (or left), then the asymptotic distribution of the range is equal to the asymptotic distribution of the largest (smallest) value. For more general distributions the asymptotic distribution can be expressed as a Bessel function.
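As a rough illustration of the numerical integration Gumbel mentions, the sketch below evaluates the formula for F(t) with a simple midpoint rule. The function names, the integration limits and the step count are assumptions made for this example. The check uses the uniform distribution on [0, 1], for which the integral can be worked out by hand as n t^(n-1) - (n-1) t^n.

```python
def range_cdf(t, n, pdf, cdf, lo, hi, steps=20_000):
    """Midpoint-rule approximation of n * integral g(x) [G(x+t) - G(x)]^(n-1) dx.

    pdf and cdf play the roles of g and G; [lo, hi] is assumed to cover
    the support of g, and n is assumed to be at least 2.
    """
    if t < 0:
        return 0.0
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += pdf(x) * (cdf(x + t) - cdf(x)) ** (n - 1)
    return n * total * h


def uniform_pdf(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def uniform_cdf(x):
    return min(max(x, 0.0), 1.0)


n, t = 5, 0.6
numeric = range_cdf(t, n, uniform_pdf, uniform_cdf, 0.0, 1.0)
closed_form = n * t ** (n - 1) - (n - 1) * t ** n
print(numeric, closed_form)  # the two values should agree closely
```

For distributions with unbounded support, `lo` and `hi` would have to be chosen wide enough that the neglected tails are negligible.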


Moments

The mean range is given by
::n \int_0^1 x(G) [G^{n-1}-(1-G)^{n-1}] \,\text{d}G
where ''x''(''G'') is the inverse function of ''G''. In the case where each of the ''X''<sub>''i''</sub> has a standard normal distribution, the mean range is given by
::\int_{-\infty}^\infty (1-(1-\Phi(x))^n-\Phi(x)^n ) \,\text{d}x.
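The standard normal integral can be evaluated numerically. The following sketch uses a midpoint rule on the truncated interval [-10, 10]; the truncation, the step count and the function names are assumptions for illustration. For ''n'' = 2 the range is |''X''<sub>1</sub> - ''X''<sub>2</sub>|, whose mean is 2/sqrt(pi), which gives an independent check.

```python
import math


def std_normal_cdf(x):
    """Phi(x), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mean_range_std_normal(n, lo=-10.0, hi=10.0, steps=40_000):
    """Midpoint-rule approximation of integral (1 - (1 - Phi)^n - Phi^n) dx."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        p = std_normal_cdf(x)
        total += 1.0 - (1.0 - p) ** n - p ** n
    return total * h


print(mean_range_std_normal(2), 2.0 / math.sqrt(math.pi))
```

The integrand decays extremely fast in the tails, so the truncation to [-10, 10] has no visible effect at this precision.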


For continuous non-IID random variables

For ''n'' nonidentically distributed independent continuous random variables ''X''<sub>1</sub>, ''X''<sub>2</sub>, ..., ''X''<sub>''n''</sub> with cumulative distribution functions ''G''<sub>1</sub>(''x''), ''G''<sub>2</sub>(''x''), ..., ''G''<sub>''n''</sub>(''x'') and probability density functions ''g''<sub>1</sub>(''x''), ''g''<sub>2</sub>(''x''), ..., ''g''<sub>''n''</sub>(''x''), the range has cumulative distribution function
::F(t) = \sum_{i=1}^n \int_{-\infty}^\infty g_i(x) \prod_{j=1, j \neq i}^n [G_j(x+t)-G_j(x)] \, \text{d}x.
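A minimal sketch of the non-IID formula follows, again using a midpoint rule. The helper `make_exponential`, the rates 1 and 2, and the truncation of the integral to [0, 40] are assumptions chosen for the example. For two independent exponential variables with rates a and b, working the formula through by hand gives F(t) = 1 - (a e^(-bt) + b e^(-at)) / (a + b), which the code uses as a check.

```python
import math


def range_cdf_non_iid(t, pdfs, cdfs, lo, hi, steps=40_000):
    """Approximate sum_i integral g_i(x) * prod_{j != i} [G_j(x+t) - G_j(x)] dx."""
    if t < 0:
        return 0.0
    n = len(pdfs)
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        for i in range(n):
            prod = 1.0
            for j in range(n):
                if j != i:
                    prod *= cdfs[j](x + t) - cdfs[j](x)
            total += pdfs[i](x) * prod
    return total * h


def make_exponential(rate):
    """Return (pdf, cdf) of an exponential distribution with the given rate."""
    def pdf(x):
        return rate * math.exp(-rate * x) if x >= 0 else 0.0

    def cdf(x):
        return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0

    return pdf, cdf


a, b, t = 1.0, 2.0, 0.7
pdf_a, cdf_a = make_exponential(a)
pdf_b, cdf_b = make_exponential(b)
numeric = range_cdf_non_iid(t, [pdf_a, pdf_b], [cdf_a, cdf_b], 0.0, 40.0)
closed_form = 1.0 - (a * math.exp(-b * t) + b * math.exp(-a * t)) / (a + b)
print(numeric, closed_form)
```

When all the distributions coincide, each of the ''n'' terms of the sum is identical and the expression reduces to the IID formula.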


For discrete IID random variables

For ''n'' independent and identically distributed discrete random variables ''X''<sub>1</sub>, ''X''<sub>2</sub>, ..., ''X''<sub>''n''</sub> with cumulative distribution function ''G''(''x'') and probability mass function ''g''(''x''), the range of the ''X''<sub>''i''</sub> is the range of a sample of size ''n'' from a population with distribution function ''G''(''x''). We can assume without loss of generality that the support of each ''X''<sub>''i''</sub> is {1, 2, 3, ..., ''N''}, where ''N'' is a positive integer or infinity.


Distribution

The range has probability mass function
::f(t)=\begin{cases} \sum_{x=1}^N [g(x)]^n & t=0 \\[6pt] \sum_{x=1}^{N-t}\left(\begin{aligned} & [G(x+t)-G(x-1)]^n\\ {}-{} & [G(x+t)-G(x)]^n\\ {}-{} & [G(x+t-1)-G(x-1)]^n\\ {}+{} & [G(x+t-1)-G(x)]^n \end{aligned}\right) & t=1,2,3\ldots,N-1. \end{cases}
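The probability mass function can be checked against direct enumeration for a small finite support. In the sketch below, the function names and the example probabilities are made up; `pmf[x - 1]` is assumed to hold g(x) for x = 1, ..., N with finite N.

```python
from itertools import product


def range_pmf(t, n, pmf):
    """P(range = t) for n IID draws, where pmf[x - 1] = g(x) on {1, ..., N}."""
    N = len(pmf)
    # G[x] = P(X <= x) for x = 0, ..., N, with G[0] = 0.
    G = [0.0]
    for p in pmf:
        G.append(G[-1] + p)
    if t == 0:
        return sum(p ** n for p in pmf)
    if not 1 <= t <= N - 1:
        return 0.0
    total = 0.0
    for x in range(1, N - t + 1):
        total += ((G[x + t] - G[x - 1]) ** n
                  - (G[x + t] - G[x]) ** n
                  - (G[x + t - 1] - G[x - 1]) ** n
                  + (G[x + t - 1] - G[x]) ** n)
    return total


def range_pmf_brute_force(t, n, pmf):
    """Same quantity, obtained by enumerating all N^n possible samples."""
    N = len(pmf)
    total = 0.0
    for sample in product(range(1, N + 1), repeat=n):
        if max(sample) - min(sample) == t:
            prob = 1.0
            for x in sample:
                prob *= pmf[x - 1]
            total += prob
    return total


example_pmf = [0.2, 0.3, 0.5]  # hypothetical g(1), g(2), g(3)
for t in range(3):
    print(t, range_pmf(t, 3, example_pmf), range_pmf_brute_force(t, 3, example_pmf))
```

The four bracketed terms are an inclusion-exclusion count: all samples in {x, ..., x+t}, minus those that miss x, minus those that miss x+t, plus those that miss both.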


Example

If we suppose that ''g''(''x'') = 1/''N'', the discrete uniform distribution for all ''x'', then we find
::f(t)=\begin{cases} \frac{1}{N^{n-1}} & t=0 \\[6pt] \sum_{x=1}^{N-t}\left(\left[\frac{t+1}{N}\right]^n -2\left[\frac{t}{N}\right]^n +\left[\frac{t-1}{N}\right]^n \right) & t=1,2,3\ldots ,N-1. \end{cases}
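Because the summand in the uniform case does not depend on ''x'', the sum collapses to (''N'' - ''t'') times the bracketed expression. The sketch below uses that simplification with exact fractions. The function name and the two-dice example (''N'' = 6, ''n'' = 2) are chosen purely for illustration.

```python
from fractions import Fraction


def uniform_range_pmf(t, n, N):
    """P(range = t) for n IID draws from the uniform distribution on {1, ..., N}."""
    if t == 0:
        return Fraction(1, N ** (n - 1))
    if not 1 <= t <= N - 1:
        return Fraction(0)
    bracket = (t + 1) ** n - 2 * t ** n + (t - 1) ** n
    return Fraction((N - t) * bracket, N ** n)


# Range (absolute difference) of two fair six-sided dice.
dice = [uniform_range_pmf(t, 2, 6) for t in range(6)]
print(dice)       # 1/6, 5/18, 2/9, 1/6, 1/9, 1/18
print(sum(dice))  # 1
```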


Derivation

For continuous IID random variables, the probability density of a specific range value, ''t'', can be determined by adding the probabilities of having two samples differing by ''t'', and every other sample having a value between the two extremes. The probability density of any one of the ''n'' samples having a value of ''x'' is
:n g(x).
The probability density of one of the remaining samples having a value ''t'' greater than ''x'' is
:(n-1)g(x+t).
The probability of all other values lying between these two extremes is
:\left(\int_x^{x+t} g(u)\,\text{d}u\right)^{n-2} = \left(G(x+t)-G(x)\right)^{n-2}.
Combining the three together yields
:f(t)= n(n-1)\int_{-\infty}^\infty g(x)g(x+t)[G(x+t)-G(x)]^{n-2} \, \text{d}x.
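The resulting density can be checked numerically on a case where the integral is easy to do by hand. For exponential variables with rate 1, substituting g(x) = e^(-x) and G(x) = 1 - e^(-x) gives f(t) = (n-1) e^(-t) (1 - e^(-t))^(n-2). The sketch below compares this with a midpoint-rule evaluation; the function names, the truncation to [0, 40] and the step count are assumptions for illustration.

```python
import math


def range_density(t, n, pdf, cdf, lo, hi, steps=40_000):
    """Midpoint-rule approximation of
    n (n - 1) * integral g(x) g(x+t) [G(x+t) - G(x)]^(n-2) dx, for n >= 2 and t >= 0.
    """
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += pdf(x) * pdf(x + t) * (cdf(x + t) - cdf(x)) ** (n - 2)
    return n * (n - 1) * total * h


def exp_pdf(x):
    return math.exp(-x) if x >= 0 else 0.0


def exp_cdf(x):
    return 1.0 - math.exp(-x) if x >= 0 else 0.0


n, t = 4, 0.8
numeric = range_density(t, n, exp_pdf, exp_cdf, 0.0, 40.0)
closed_form = (n - 1) * math.exp(-t) * (1.0 - math.exp(-t)) ** (n - 2)
print(numeric, closed_form)
```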


Related quantities

The range is a function of order statistics: it is the difference between the largest and the smallest order statistic of the sample. Because this is a linear function of order statistics, the range falls within the scope of L-estimation.
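To make the L-estimation view concrete, the sketch below writes the range as a weighted sum of the sorted sample, with weight -1 on the smallest value, +1 on the largest and 0 elsewhere. The names `l_estimate` and `range_weights` are hypothetical helpers, not part of any library.

```python
def l_estimate(sample, weights):
    """Linear combination of order statistics: sum of w_k * x_(k)."""
    ordered = sorted(sample)
    if len(ordered) != len(weights):
        raise ValueError("need exactly one weight per observation")
    return sum(w * x for w, x in zip(weights, ordered))


def range_weights(n):
    """Weights that turn l_estimate into the sample range (requires n >= 2)."""
    if n < 2:
        raise ValueError("need at least two observations")
    return [-1] + [0] * (n - 2) + [1]


sample = [4, 9, 1, 7]
print(l_estimate(sample, range_weights(len(sample))))  # 9 - 1 = 8
```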


See also

* Interquartile range
* Studentized range

