In probability theory and statistics, a scale parameter is a special kind of numerical parameter of a parametric family of probability distributions. The larger the scale parameter, the more spread out the distribution.

Definition

If a family of probability distributions is such that there is a parameter ''s'' (and other parameters ''θ'') for which the cumulative distribution function satisfies

F(x;s,\theta) = F(x/s;1,\theta),

then ''s'' is called a scale parameter, since its value determines the "scale" or statistical dispersion of the probability distribution. If ''s'' is large, then the distribution will be more spread out; if ''s'' is small, then it will be more concentrated.

[Figure: Effects of a scale parameter on a positive-support probability distribution]

[Figure: Effect of a scale parameter over a mixture of two normal probability distributions]
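The defining identity can be checked numerically for a concrete family. The following is a minimal sketch using the Weibull distribution, whose shape parameter ''k'' plays the role of the "other parameters" ''θ''. The function names are illustrative and not part of the article.

```python
import math

def weibull_cdf(x: float, s: float, k: float) -> float:
    """CDF of a Weibull distribution with scale s > 0 and shape k > 0."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / s) ** k))

def has_scale_property(x: float, s: float, k: float) -> bool:
    """Check F(x; s, k) == F(x / s; 1, k) up to floating-point error."""
    lhs = weibull_cdf(x, s, k)
    rhs = weibull_cdf(x / s, 1.0, k)
    return math.isclose(lhs, rhs, rel_tol=1e-12)

# Try several evaluation points and scales with a fixed shape k = 1.5.
checks = [
    has_scale_property(x, s, 1.5)
    for x in (0.1, 1.0, 7.3)
    for s in (0.5, 2.0, 10.0)
]
print(all(checks))
```

The check passes because ''x'' and ''s'' enter the CDF only through the ratio ''x''/''s'', which is what the definition requires.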

If the probability density exists for all values of the complete parameter set, then the density (as a function of the scale parameter only) satisfies

f_s(x) = f(x/s)/s,

where ''f'' is the density of the standardized version of the distribution, i.e.

f(x) \equiv f_{s=1}(x).

An estimator of a scale parameter is called an estimator of scale.
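The density relation can be obtained by differentiating the CDF identity from the definition with respect to ''x''. A short sketch, assuming ''s'' > 0 and a differentiable CDF, with the other parameters ''θ'' suppressed and F' denoting the derivative with respect to the first argument:

```latex
\begin{aligned}
f_s(x) &= \frac{d}{dx} F(x; s) = \frac{d}{dx} F(x/s; 1) \\
       &= F'(x/s; 1) \cdot \frac{d}{dx}\left(\frac{x}{s}\right)
        = \frac{1}{s}\, f(x/s).
\end{aligned}
```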

Families with Location Parameters

In the case where a parametrized family has a location parameter, a slightly different definition is often used, as follows. If we denote the location parameter by ''m'' and the scale parameter by ''s'', then we require that

F(x;s,m,\theta) = F((x-m)/s;1,0,\theta),

where F(x;s,m,\theta) is the CDF of the parametrized family. This modification is necessary in order for the standard deviation of a non-central Gaussian to be a scale parameter, since otherwise the mean would change when we rescale ''x''. However, this alternative definition is not consistently used.
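To illustrate why the modified definition is needed, the sketch below compares both identities for a normal distribution with nonzero mean, using `statistics.NormalDist` from the Python standard library. The helper name `normal_cdf` and the chosen numbers are assumptions made for this example.

```python
import math
from statistics import NormalDist

def normal_cdf(x: float, s: float, m: float) -> float:
    """CDF F(x; s, m) of a normal distribution with mean m and standard deviation s."""
    return NormalDist(mu=m, sigma=s).cdf(x)

m, s, x = 2.0, 3.0, 1.0

# Location-scale identity: F(x; s, m) = F((x - m) / s; 1, 0).
location_scale_lhs = normal_cdf(x, s, m)
location_scale_rhs = normal_cdf((x - m) / s, 1.0, 0.0)

# Pure scale identity applied naively: F(x / s; 1, m). The mean is left untouched.
pure_scale_rhs = normal_cdf(x / s, 1.0, m)

print(math.isclose(location_scale_lhs, location_scale_rhs, rel_tol=1e-9))
print(math.isclose(location_scale_lhs, pure_scale_rhs, rel_tol=1e-9))
```

With a nonzero mean, only the location-scale form holds. The pure scale form would hold only if the mean were rescaled along with ''x''.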

Simple manipulations

We can write f_s in terms of g(x) = x/s (with ''s'' > 0) as follows:

f_s(x) = f\left(\frac{x}{s}\right) \cdot \frac{1}{s} = f(g(x))\,g'(x).

Because ''f'' is a probability density function, it integrates to unity:

1 = \int_{-\infty}^{\infty} f(x)\,dx = \int_{g(-\infty)}^{g(\infty)} f(x)\,dx.

By the substitution rule of integral calculus, we then have

1 = \int_{-\infty}^{\infty} f(g(x))\,g'(x)\,dx = \int_{-\infty}^{\infty} f_s(x)\,dx.

So f_s is also properly normalized.
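The normalization argument can also be checked numerically. The following sketch, under the assumption that ''f'' is the standard normal density, integrates f_s with a simple midpoint rule. The helper names and the integration range of ten scale units on either side are choices made for this example.

```python
from statistics import NormalDist

def scaled_density(x: float, s: float) -> float:
    """f_s(x) = f(x / s) / s, with f the standard normal density."""
    return NormalDist().pdf(x / s) / s

def integrate_midpoint(func, lo: float, hi: float, n: int = 20_000) -> float:
    """Approximate the integral of func over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))

s = 2.5
# The tails beyond 10 scale units contribute a negligible amount.
total = integrate_midpoint(lambda x: scaled_density(x, s), -10 * s, 10 * s)
print(abs(total - 1.0) < 1e-6)
```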

Rate parameter

Some families of distributions use a rate parameter (or "inverse scale parameter"), which is simply the reciprocal of the ''scale parameter''. So, for example, the exponential distribution with scale parameter β and probability density

f(x;\beta) = \frac{1}{\beta} e^{-x/\beta},\; x \ge 0

could equivalently be written with rate parameter λ = 1/β as

f(x;\lambda) = \lambda e^{-\lambda x},\; x \ge 0.
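The equivalence of the two parameterizations can be sketched in a few lines. The density helpers below are illustrative names. `random.expovariate` is the Python standard library sampler, which takes the rate λ rather than the scale β.

```python
import math
import random

def exp_pdf_scale(x: float, beta: float) -> float:
    """Exponential density written with scale parameter beta."""
    return math.exp(-x / beta) / beta if x >= 0 else 0.0

def exp_pdf_rate(x: float, lam: float) -> float:
    """Exponential density written with rate parameter lam = 1 / beta."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

beta = 4.0
lam = 1.0 / beta

# Both forms give the same density at every test point.
same_density = all(
    math.isclose(exp_pdf_scale(x, beta), exp_pdf_rate(x, lam))
    for x in (0.0, 0.5, 3.0, 20.0)
)

# Sampling with the rate: the sample mean should be close to the scale beta.
rng = random.Random(0)
n = 100_000
sample_mean = sum(rng.expovariate(lam) for _ in range(n)) / n
print(same_density, sample_mean)
```

When calling a library it is worth checking which convention it expects, since passing a scale where a rate is required (or the reverse) silently inverts the spread.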

Examples

* The uniform distribution can be parameterized with a location parameter (a+b)/2 and a scale parameter |b-a|.
* The normal distribution has two parameters: a location parameter \mu and a scale parameter \sigma. In practice the normal distribution is often parameterized in terms of the ''squared'' scale \sigma^2, which corresponds to the variance of the distribution.
* The gamma distribution is usually parameterized in terms of a scale parameter \theta or its inverse.
* Special cases of distributions where the scale parameter equals unity may be called "standard" under certain conditions. For example, if the location parameter equals zero and the scale parameter equals one, the normal distribution is known as the ''standard'' normal distribution, and the Cauchy distribution as the ''standard'' Cauchy distribution.
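The uniform example can be sketched as follows, assuming ''a'' < ''b'' and taking the "standard" member of the family to be the uniform distribution on [-1/2, 1/2] (location 0, scale 1). The function names and test values are illustrative.

```python
import math

def uniform_cdf(x: float, a: float, b: float) -> float:
    """CDF of the uniform distribution on [a, b], with a < b."""
    return min(1.0, max(0.0, (x - a) / (b - a)))

def standard_uniform_cdf(z: float) -> float:
    """CDF of the uniform distribution on [-1/2, 1/2] (location 0, scale 1)."""
    return min(1.0, max(0.0, z + 0.5))

a, b = 2.0, 10.0
m = (a + b) / 2   # location parameter
s = abs(b - a)    # scale parameter

# Location-scale identity: F(x; s, m) = F((x - m) / s; 1, 0).
matches = all(
    math.isclose(uniform_cdf(x, a, b), standard_uniform_cdf((x - m) / s), abs_tol=1e-12)
    for x in (0.0, 2.0, 3.5, 6.0, 9.9, 10.0, 12.0)
)
print(matches)
```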

Estimation

A statistic can be used to estimate a scale parameter so long as it:
* Is location-invariant,
* Scales linearly with the scale parameter, and
* Converges as the sample size grows.

Various measures of statistical dispersion satisfy these. In order to make the statistic a consistent estimator for the scale parameter, one must in general multiply the statistic by a constant scale factor. This scale factor is defined as the required scale parameter divided by the asymptotic value of the statistic. Note that the scale factor depends on the distribution in question.

For instance, in order to use the median absolute deviation (MAD) to estimate the standard deviation of the normal distribution, one must multiply it by the factor

1/\Phi^{-1}(3/4) \approx 1.4826,

where Φ⁻¹ is the quantile function (inverse of the cumulative distribution function) for the standard normal distribution. (See MAD for details.) That is, the MAD is not a consistent estimator for the standard deviation of a normal distribution, but 1.4826 × MAD is a consistent estimator. Similarly, the average absolute deviation needs to be multiplied by approximately 1.2533 to be a consistent estimator for the standard deviation. Different factors would be required to estimate the standard deviation if the population did not follow a normal distribution.
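The two factors quoted above can be reproduced and tried on simulated data. This is a minimal sketch under stated assumptions: the data are normal, the average absolute deviation is taken about the mean (for which the factor is √(π/2) ≈ 1.2533), and the helper names, seed, and sample size are arbitrary choices for the example.

```python
import math
import random
from statistics import NormalDist, fmean, median

# Scale factors for normally distributed data.
MAD_FACTOR = 1.0 / NormalDist().inv_cdf(0.75)  # about 1.4826
AAD_FACTOR = math.sqrt(math.pi / 2.0)          # about 1.2533

def mad(data: list[float]) -> float:
    """Median absolute deviation about the median."""
    center = median(data)
    return median([abs(x - center) for x in data])

def aad(data: list[float]) -> float:
    """Average absolute deviation about the mean."""
    center = fmean(data)
    return fmean([abs(x - center) for x in data])

rng = random.Random(42)
sigma = 3.0
sample = [rng.gauss(10.0, sigma) for _ in range(50_000)]

sigma_from_mad = MAD_FACTOR * mad(sample)
sigma_from_aad = AAD_FACTOR * aad(sample)
print(MAD_FACTOR, AAD_FACTOR, sigma_from_mad, sigma_from_aad)
```

Both rescaled statistics should land near the true standard deviation of 3, while the raw MAD and AAD would underestimate it. The location of 10 does not matter, since both statistics are location-invariant.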
