In statistics and applications of statistics, normalization can have a range of meanings. In the simplest cases, normalization of ratings means adjusting values measured on different scales to a notionally common scale, often prior to averaging. In more complicated cases, normalization may refer to more sophisticated adjustments where the intention is to bring the entire probability distributions of adjusted values into alignment. In the case of normalization of scores in educational assessment, there may be an intention to align distributions to a normal distribution. A different approach to normalization of probability distributions is quantile normalization, where the quantiles of the different measures are brought into alignment.

In another usage in statistics, normalization refers to the creation of shifted and scaled versions of statistics, where the intention is that these normalized values allow the comparison of corresponding normalized values for different datasets in a way that eliminates the effects of certain gross influences, as in an anomaly time series. Some types of normalization involve only a rescaling, to arrive at values relative to some size variable. In terms of levels of measurement, such ratios only make sense for ''ratio'' measurements (where ratios of measurements are meaningful), not ''interval'' measurements (where only distances are meaningful, but not ratios).

In theoretical statistics, parametric normalization can often lead to pivotal quantities (functions whose sampling distribution does not depend on the parameters) and to ancillary statistics (pivotal quantities that can be computed from observations, without knowing parameters).
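
As an illustration of quantile normalization, the following is a minimal sketch in Python, assuming equal-length samples and breaking ties by position. The function name `quantile_normalize` and the sample data are hypothetical, chosen only for this example. Each value is replaced by the mean of the values that share its rank across the samples, so that all samples end up with the same set of quantiles.

```python
def quantile_normalize(columns):
    """Quantile-normalize several equal-length samples.

    Each value is replaced by the mean of the values holding the same rank
    across all samples. Ties are broken by position, which is a simplifying
    assumption of this sketch; other tie-handling rules exist.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all samples must have the same length")

    # Reference quantiles: mean of the k-th smallest values across samples.
    sorted_cols = [sorted(col) for col in columns]
    reference = [sum(col[k] for col in sorted_cols) / len(columns) for k in range(n)]

    result = []
    for col in columns:
        # order[k] is the index of the k-th smallest value in this sample.
        order = sorted(range(n), key=lambda i: col[i])
        normalized = [0.0] * n
        for rank, idx in enumerate(order):
            normalized[idx] = reference[rank]
        result.append(normalized)
    return result


a = [5.0, 2.0, 3.0]
b = [4.0, 1.0, 6.0]
normalized_a, normalized_b = quantile_normalize([a, b])
print(normalized_a)  # [5.5, 1.5, 3.5]
print(normalized_b)  # [3.5, 1.5, 5.5]
```

After normalization both samples contain the same values (1.5, 3.5, 5.5), while each sample keeps its original ordering.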

History

Standard score (Z-score)

The concept of normalization emerged alongside the study of the normal distribution by Abraham de Moivre, Pierre-Simon Laplace, and Carl Friedrich Gauss from the 18th to the 19th century. As the name “standard” refers to the particular normal distribution with expectation zero and standard deviation one, that is, the standard normal distribution, normalization, in this case “standardization”, was then used to refer to the rescaling of any distribution or data set to have mean zero and standard deviation one. While the study of the normal distribution structured the process of standardization, the result of this process, also known as the Z-score, was not formalized and popularized until Ronald Fisher and Karl Pearson elaborated the concept as part of the broader framework of statistical inference and hypothesis testing in the early 20th century. The Z-score is given by the difference between the sample value and the population mean, divided by the population standard deviation, and measures the number of standard deviations of a value from its population mean.
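
The Z-score described above, z = (x − μ) / σ, can be sketched in Python as follows. This is a minimal illustration that assumes the data are the whole population, so that the population mean and population standard deviation apply. The function name `z_scores` and the data are hypothetical.

```python
from statistics import fmean, pstdev


def z_scores(values):
    """Return (x - mu) / sigma for each x, treating `values` as the whole population."""
    mu = fmean(values)
    sigma = pstdev(values, mu)  # population standard deviation
    if sigma == 0:
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [(x - mu) / sigma for x in values]


scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # mean 5, standard deviation 2
z = z_scores(scores)
print(z)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

The standardized values have mean zero and standard deviation one, which is the defining property of standardization.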

Student’s t-Statistic

William Sealy Gosset initiated the adjustment of the normal distribution and the standard score for small sample sizes. Educated in chemistry and mathematics at Winchester and Oxford, Gosset was employed by Guinness Brewery, the biggest brewer in Ireland at the time, and was tasked with precise quality control. It was through small-sample experiments that Gosset discovered that the distribution of means computed from small samples deviated slightly from the distribution of means computed from large samples (the normal distribution) and appeared “taller and narrower” in comparison. This finding was later written up in a Guinness internal report titled ''The application of the “Law of Error” to the work of the brewery'' and was sent to Karl Pearson for further discussion, which later led to a formal publication titled ''The probable error of a mean'' in 1908. Under Guinness Brewery’s confidentiality restrictions, Gosset published the paper under the pseudonym “Student”. Gosset’s work was later enhanced and transformed by Ronald Fisher into the form that is used today. It was popularized, alongside the names “Student’s t distribution” (referring to the adjusted normal distribution Gosset proposed) and “Student’s t-statistic” (referring to the test statistic that measures the departure of the estimated value of a parameter from its hypothesized value, divided by its standard error), through Fisher’s publication titled ''Applications of “Student’s” distribution''.
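
For the common case where the parameter is a population mean, the t-statistic takes the form t = (x̄ − μ₀) / (s / √n), where s is the sample standard deviation and s / √n is the standard error of the mean. The following Python sketch illustrates this one-sample case only. The function name `one_sample_t` and the data are hypothetical.

```python
from math import sqrt
from statistics import fmean, stdev


def one_sample_t(sample, hypothesized_mean):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are required")
    # stdev() uses the n - 1 denominator (sample standard deviation).
    # A constant sample gives a zero standard error and is not handled here.
    standard_error = stdev(sample) / sqrt(n)
    return (fmean(sample) - hypothesized_mean) / standard_error


sample = [4.0, 5.0, 6.0, 5.0]
t = one_sample_t(sample, hypothesized_mean=4.0)
print(round(t, 4))  # 2.4495
```

Unlike the Z-score, the denominator is estimated from the sample itself, which is why the statistic follows Student’s t distribution rather than the normal distribution when samples are small.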

Feature Scaling

The rise of computers and multivariate statistics in the mid-20th century made normalization necessary for processing data with different units, giving rise to feature scaling, a family of methods used to rescale data to a common range, such as min-max scaling and robust scaling. This modern normalization process, aimed especially at large-scale data, became more formalized in fields including machine learning, pattern recognition, and neural networks in the late 20th century.
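
The two scaling methods named above can be sketched as follows. The definitions used here are common conventions rather than ones given in this article: min-max scaling maps values to [0, 1] via (x − min) / (max − min), and robust scaling is taken to be (x − median) / IQR. The function names, the choice of the inclusive quartile method, and the data are assumptions of this example.

```python
from statistics import median, quantiles


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; the range is zero")
    return [(x - lo) / (hi - lo) for x in values]


def robust_scale(values):
    """Center on the median and divide by the interquartile range (IQR)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("interquartile range is zero")
    med = median(values)
    return [(x - med) / iqr for x in values]


heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
scaled = min_max_scale(heights_cm)
robust = robust_scale(heights_cm)
print(scaled)  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(robust)  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```

Because the median and the IQR are less affected by extreme values than the minimum and maximum, robust scaling is usually preferred when the data contain outliers.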

Batch Normalization

Batch normalization was proposed by Sergey Ioffe and Christian Szegedy in 2015 to enhance the efficiency of training in neural networks.
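
As a rough illustration, the training-time transform standardizes each feature over a mini-batch and then applies a learned scale and shift. The following is a minimal pure-Python sketch under simplifying assumptions: the name `batch_norm`, the list-of-lists data layout, and the value of `eps` are choices made for this example, and the running averages used at inference time are omitted.

```python
from math import sqrt


def batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize each feature over a mini-batch, then scale by gamma and shift by beta.

    batch: list of examples, each a list of feature values.
    gamma, beta: per-feature scale and shift (learned parameters in practice).
    """
    n = len(batch)
    n_features = len(batch[0])
    out = [[0.0] * n_features for _ in range(n)]
    for j in range(n_features):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n  # mini-batch (biased) variance
        for i, x in enumerate(column):
            x_hat = (x - mean) / sqrt(var + eps)  # eps guards against division by zero
            out[i][j] = gamma[j] * x_hat + beta[j]
    return out


batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = batch_norm(batch, gamma=[1.0, 1.0], beta=[0.0, 0.0])
```

With `gamma` set to one and `beta` to zero, each feature column of `normalized` has mean close to zero and variance close to one, even though the two input features differ in scale by a factor of ten.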

Examples

There are different types of normalizations in statistics: nondimensional ratios of errors, residuals, means and standard deviations, which are hence scale invariant. Note that in terms of levels of measurement, these ratios only make sense for ''ratio'' measurements (where ratios of measurements are meaningful), not ''interval'' measurements (where only distances are meaningful, but not ratios).

Note that some other ratios, such as the variance-to-mean ratio \left(\frac{\sigma^2}{\mu}\right), are also used for normalization, but are not nondimensional: the units do not cancel, and thus the ratio has units and is not scale-invariant.
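
The difference between a nondimensional ratio and one that carries units can be checked numerically. The sketch below compares the coefficient of variation (σ/μ, used here as a representative nondimensional ratio) with the variance-to-mean ratio (σ²/μ) when the same lengths are expressed in metres and in centimetres. The function names and the data are hypothetical.

```python
from statistics import fmean, pstdev, pvariance


def coefficient_of_variation(values):
    """sigma / mu: the units cancel, so the ratio is nondimensional."""
    return pstdev(values) / fmean(values)


def variance_to_mean_ratio(values):
    """sigma^2 / mu: the result carries the units of the data."""
    return pvariance(values) / fmean(values)


lengths_m = [1.0, 2.0, 3.0, 4.0]
lengths_cm = [100.0 * x for x in lengths_m]  # same lengths, different unit

cv_m = coefficient_of_variation(lengths_m)
cv_cm = coefficient_of_variation(lengths_cm)
vmr_m = variance_to_mean_ratio(lengths_m)
vmr_cm = variance_to_mean_ratio(lengths_cm)

print(cv_m, cv_cm)    # equal up to rounding: unchanged by the change of unit
print(vmr_m, vmr_cm)  # 0.5 versus 50.0: scaled by the unit factor of 100
```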

Other types

Other non-dimensional normalizations that can be used with no assumptions on the distribution include:
* Assignment of percentiles. This is common on standardized tests. See also quantile normalization.
* Normalization by adding and/or multiplying by constants so values fall between 0 and 1. This is used for probability density functions, with applications in fields such as quantum mechanics in assigning probabilities to |ψ|².
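
The assignment of percentiles can be sketched as follows, using the "exclusive" convention in which a score's percentile rank is the percentage of scores strictly below it. This is one of several conventions in use. The function name `percentile_ranks` and the data are hypothetical.

```python
def percentile_ranks(values):
    """Percentage of values strictly below each value ("exclusive" convention)."""
    n = len(values)
    return [100.0 * sum(1 for other in values if other < x) / n for x in values]


test_scores = [55, 70, 70, 85, 90]
ranks = percentile_ranks(test_scores)
print(ranks)  # [0.0, 20.0, 20.0, 60.0, 80.0]
```

The ranks depend only on the ordering of the scores, not on their distribution, which is why this normalization requires no distributional assumptions.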
