In statistics and applications of statistics, normalization can have a range of meanings. In the simplest cases, normalization of ratings means adjusting values measured on different scales to a notionally common scale, often prior to averaging. In more complicated cases, normalization may refer to more sophisticated adjustments where the intention is to bring the entire probability distributions of adjusted values into alignment. In the case of normalization of scores in educational assessment, there may be an intention to align distributions to a normal distribution. A different approach to normalization of probability distributions is quantile normalization, where the quantiles of the different measures are brought into alignment.

In another usage in statistics, normalization refers to the creation of shifted and scaled versions of statistics, where the intention is that these normalized values allow the comparison of corresponding normalized values for different datasets in a way that eliminates the effects of certain gross influences, as in an anomaly time series. Some types of normalization involve only a rescaling, to arrive at values relative to some size variable. In terms of levels of measurement, such ratios only make sense for ''ratio'' measurements (where ratios of measurements are meaningful), not ''interval'' measurements (where only distances are meaningful, but not ratios).

In theoretical statistics, parametric normalization can often lead to pivotal quantities (functions whose sampling distribution does not depend on the parameters) and to ancillary statistics (pivotal quantities that can be computed from observations, without knowing parameters).
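As an illustration of the quantile normalization mentioned above, the following is a minimal sketch in plain Python, not a reference implementation. The function name `quantile_normalize`, the sample data, and the tie-breaking rule (ties are ordered by position) are assumptions made for this example. Each value is replaced by the mean, across samples, of the values that hold the same rank.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length samples (hypothetical helper).

    Each value is replaced by the mean, across samples, of the values
    holding the same rank. Ties are ordered by position, a simplification.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all samples must have the same length")

    # Reference distribution: mean of the k-th smallest values across samples.
    sorted_cols = [sorted(col) for col in columns]
    reference = [
        sum(col[k] for col in sorted_cols) / len(columns) for k in range(n)
    ]

    normalized = []
    for col in columns:
        # Indices of col ordered from smallest to largest value.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


# Illustrative data only: three measures recorded on different scales.
a = [5.0, 2.0, 3.0, 4.0]
b = [4.0, 1.0, 4.5, 2.0]
c = [3.0, 4.0, 6.0, 8.0]
qa, qb, qc = quantile_normalize([a, b, c])
print(sorted(qa) == sorted(qb) == sorted(qc))  # the quantiles now coincide
```

After normalization the three samples contain exactly the same set of values, so their empirical quantiles are aligned, while the rank order within each sample is preserved.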


Examples

There are different types of normalizations in statistics: nondimensional ratios of errors, residuals, means and standard deviations, which are hence scale invariant. Note that in terms of levels of measurement, these ratios only make sense for ''ratio'' measurements (where ratios of measurements are meaningful), not ''interval'' measurements (where only distances are meaningful, but not ratios). See also the category of statistical ratios.

Note that some other ratios, such as the variance-to-mean ratio \left(\frac{\sigma^2}{\mu}\right), are also used for normalization, but are not nondimensional: the units do not cancel, and thus the ratio has units and is not scale-invariant.
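The difference between nondimensional and dimensional ratios can be checked numerically. The sketch below uses only Python's standard `statistics` module; the helper names and the sample data are hypothetical, and population (rather than sample) variance is an assumption made for simplicity. It expresses the same lengths in metres and in centimetres and compares the standard scores, the coefficient of variation (standard deviation divided by mean) and the variance-to-mean ratio.

```python
import statistics


def standard_scores(xs):
    """Shift by the mean and scale by the (population) standard deviation."""
    mu = statistics.fmean(xs)
    sigma = statistics.pstdev(xs)
    return [(x - mu) / sigma for x in xs]


def coefficient_of_variation(xs):
    """Standard deviation over mean: the units cancel."""
    return statistics.pstdev(xs) / statistics.fmean(xs)


def variance_to_mean_ratio(xs):
    """Variance over mean: the result still carries the unit of xs."""
    return statistics.pvariance(xs) / statistics.fmean(xs)


# Illustrative data only: the same lengths in metres and in centimetres.
lengths_m = [1.0, 2.0, 3.0, 4.0]
lengths_cm = [100.0 * x for x in lengths_m]

z_m, z_cm = standard_scores(lengths_m), standard_scores(lengths_cm)
cv_m, cv_cm = (coefficient_of_variation(lengths_m),
               coefficient_of_variation(lengths_cm))
vmr_m, vmr_cm = (variance_to_mean_ratio(lengths_m),
                 variance_to_mean_ratio(lengths_cm))

print(cv_m, cv_cm)    # equal up to rounding: scale-invariant
print(vmr_m, vmr_cm)  # differ by the factor 100: not scale-invariant
```

Changing the unit leaves the standard scores and the coefficient of variation unchanged, whereas the variance-to-mean ratio is multiplied by the conversion factor.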


Other types

Other non-dimensional normalizations that can be used with no assumptions on the distribution include:
* Assignment of percentiles. This is common on standardized tests. See also quantile normalization.
* Normalization by adding and/or multiplying by constants so values fall between 0 and 1. This is used for probability density functions, with applications in fields such as physical chemistry in assigning probabilities.
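Both distribution-free normalizations listed above can be sketched in a few lines of plain Python. The function names and the scores are hypothetical. The percentile here is taken as the percentage of values strictly below a given value, which is one of several conventions in use and is an assumption of this example.

```python
def percentile_ranks(xs):
    """Percentage of values strictly below each value (one convention)."""
    n = len(xs)
    return [100.0 * sum(1 for y in xs if y < x) / n for x in xs]


def min_max_scale(xs):
    """Shift and scale by constants so the values fall between 0 and 1."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        raise ValueError("cannot rescale a constant sample")
    return [(x - lo) / (hi - lo) for x in xs]


# Illustrative test scores only.
scores = [55, 70, 70, 85, 100]
ranks = percentile_ranks(scores)
scaled = min_max_scale(scores)
print(ranks)
print(scaled)
```

Neither function assumes anything about the shape of the distribution: the first uses only the ordering of the values, and the second uses only the observed minimum and maximum.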


See also

* Normal score
* Ratio distribution
* Standard score
* Feature scaling


References

Dodge, Y. (2003) ''The Oxford Dictionary of Statistical Terms'', OUP. ISBN 0-19-920613-9 (entry for normalization of scores).