Level of measurement or scale of measure is a classification that describes the nature of information within the values assigned to variables. Psychologist Stanley Smith Stevens developed the best-known classification with four levels, or scales, of measurement: nominal, ordinal, interval, and ratio. This framework of distinguishing levels of measurement originated in psychology and is widely criticized by scholars in other disciplines. Other classifications include those by Mosteller and Tukey, and by Chrisman.

Stevens's typology

Overview

Stevens proposed his typology in a 1946 ''Science'' article titled "On the theory of scales of measurement". In that article, Stevens claimed that all measurement in science was conducted using four different types of scales that he called "nominal", "ordinal", "interval", and "ratio", unifying both "qualitative" (which are described by his "nominal" type) and "quantitative" (to a different degree, all the rest of his scales). The concept of scale types later received the mathematical rigour that it lacked at its inception with the work of mathematical psychologists Theodore Alper (1985, 1987), Louis Narens (1981a, b), and R. Duncan Luce (1986, 1987, 2001). As Luce (1997, p. 395) wrote:

Comparison

Nominal level

The nominal type differentiates between items or subjects based only on their names or (meta-)categories and other qualitative classifications they belong to; thus dichotomous data involves the construction of classifications as well as the classification of items. Discovery of an exception to a classification can be viewed as progress. Numbers may be used to represent the variables but the numbers do not have numerical value or relationship: for example, a globally unique identifier. Examples of these classifications include gender, nationality, ethnicity, language, genre, style, biological species, and form. In a university one could also use hall of affiliation as an example. Other concrete examples are
* in grammar, the parts of speech: noun, verb, preposition, article, pronoun, etc.
* in politics, power projection: hard power, soft power, etc.
* in biology, the domains of life: Archaea, Bacteria, and Eukarya
* in software engineering, type of faults: specification faults, design faults, and code faults

Nominal scales were often called qualitative scales, and measurements made on qualitative scales were called qualitative data. However, the rise of qualitative research has made this usage confusing. If numbers are assigned as labels in nominal measurement, they have no specific numerical value or meaning. No form of arithmetic computation (+, −, ×, etc.) may be performed on nominal measures. The nominal level is the lowest measurement level used from a statistical point of view.

Mathematical operations

Equality and other operations that can be defined in terms of equality, such as inequality and set membership, are the only non-trivial operations that generically apply to objects of the nominal type.

Central tendency

The mode, i.e. the ''most common'' item, is allowed as the measure of central tendency for the nominal type. On the other hand, the median, i.e. the ''middle-ranked'' item, makes no sense for the nominal type of data since ranking is meaningless for the nominal type.
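The operations permitted at the nominal level can be illustrated with a minimal Python sketch. The label list and the helper name `mode_of` are hypothetical and chosen only for illustration; the sketch uses nothing beyond equality, set membership, and counting.

```python
from collections import Counter

# Hypothetical nominal data: category labels with no order or magnitude.
parts_of_speech = ["noun", "verb", "noun", "article", "noun", "verb"]


def mode_of(labels):
    """Return the most frequent label; ties go to the label seen first."""
    counts = Counter(labels)
    return counts.most_common(1)[0][0]


# Equality and set membership are the operations the nominal type supports.
same_category = parts_of_speech[0] == parts_of_speech[2]
is_known_label = "verb" in set(parts_of_speech)

# The mode only requires counting how often each label occurs.
most_common_label = mode_of(parts_of_speech)
```

Sorting these labels alphabetically would be possible in code, but the resulting order carries no meaning for the data, which is why a median is not defined here.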

Ordinal scaling

The ordinal type allows for rank order (1st, 2nd, 3rd, etc.) by which data can be sorted but still does not allow for a relative ''degree of difference'' between them. Examples include, on one hand, dichotomous data with dichotomous (or dichotomized) values such as 'sick' vs. 'healthy' when measuring health, 'guilty' vs. 'not-guilty' when making judgments in courts, 'wrong/false' vs. 'right/true' when measuring truth value, and, on the other hand, non-dichotomous data consisting of a spectrum of values, such as 'completely agree', 'mostly agree', 'mostly disagree', 'completely disagree' when measuring opinion.

The ordinal scale places events in order, but there is no attempt to make the intervals of the scale equal in terms of some rule. Rank orders represent ordinal scales and are frequently used in research relating to qualitative phenomena. A student's rank in their graduating class involves the use of an ordinal scale. One has to be very careful in making a statement about scores based on ordinal scales. For instance, if Devi's position in his class is 10 and Ganga's position is 40, it cannot be said that Devi's position is four times as good as that of Ganga; the statement would make no sense at all. Ordinal scales only permit the ranking of items from highest to lowest. Ordinal measures have no absolute values, and the real differences between adjacent ranks may not be equal. All that can be said is that one person is higher or lower on the scale than another, but more precise comparisons cannot be made. Thus, the use of an ordinal scale implies a statement of 'greater than' or 'less than' (an equality statement is also acceptable) without our being able to state how much greater or less. The real difference between ranks 1 and 2, for instance, may be more or less than the difference between ranks 5 and 6.

Since the numbers of this scale have only a rank meaning, the appropriate measure of central tendency is the median. A percentile or quartile measure is used for measuring dispersion. Correlations are restricted to various rank order methods. Measures of statistical significance are restricted to the non-parametric methods (R. M. Kothari, 2004).

Central tendency

The median, i.e. the ''middle-ranked'' item, is allowed as the measure of central tendency; however, the mean (or average) as the measure of central tendency is not allowed. The mode is allowed. In 1946, Stevens observed that psychological measurement, such as measurement of opinions, usually operates on ordinal scales; thus means and standard deviations have no validity, but they can be used to get ideas for how to improve operationalization of variables used in questionnaires. Most psychological data collected by psychometric instruments and tests, measuring cognitive and other abilities, are ordinal, although some theoreticians have argued they can be treated as interval or ratio scales. However, there is little prima facie evidence to suggest that such attributes are anything more than ordinal (Cliff, 1996; Cliff & Keats, 2003; Michell, 2008). In particular, IQ scores reflect an ordinal scale, in which all scores are meaningful for comparison only. There is no absolute zero, and a 10-point difference may carry different meanings at different points of the scale.
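As a rough illustration of a median on ordinal data, the following Python sketch ranks hypothetical survey responses and picks the middle-ranked category. The category order in `LEVELS`, the sample responses, and the helper `ordinal_median` are assumptions made for this example only.

```python
import statistics

# Assumed ordering of the response categories, lowest to highest agreement.
LEVELS = [
    "completely disagree",
    "mostly disagree",
    "mostly agree",
    "completely agree",
]
RANK = {label: position for position, label in enumerate(LEVELS)}


def ordinal_median(responses):
    """Median by rank; median_low always returns an observed category."""
    ranks = sorted(RANK[r] for r in responses)
    return LEVELS[statistics.median_low(ranks)]


# Hypothetical responses from five participants.
responses = [
    "mostly agree",
    "completely agree",
    "mostly disagree",
    "mostly agree",
    "completely disagree",
]
middle = ordinal_median(responses)
```

The sketch uses `statistics.median_low` rather than `statistics.median` so that an even number of responses never produces an average of two ranks, which would implicitly treat the gaps between categories as equal.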

Interval scale

The interval type allows for the ''degree of difference'' between items, but not the ratio between them. Examples include ''temperature'' on the Celsius scale, which has two defined points (the freezing and boiling points of water under specific conditions) separated into 100 intervals; ''date'' when measured from an arbitrary epoch (such as AD); ''location'' in Cartesian coordinates; and ''direction'' measured in degrees from true or magnetic north. Ratios are not meaningful since 20 °C cannot be said to be "twice as hot" as 10 °C (unlike temperature in kelvins), nor can multiplication/division be carried out between any two dates directly. However, ''ratios of differences'' can be expressed; for example, one difference can be twice another. Interval type variables are sometimes also called "scaled variables", but the formal mathematical term is an affine space (in this case an affine line).
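The difference between plain ratios and ratios of differences can be checked numerically. The sketch below converts three hypothetical Celsius readings to Fahrenheit, which is an affine change of scale (a new unit and a new zero point), and compares both kinds of ratio before and after the conversion.

```python
def celsius_to_fahrenheit(c):
    """Affine change of interval scale: new unit and new zero point."""
    return c * 9 / 5 + 32


# Hypothetical readings in degrees Celsius.
a, b, c = 10.0, 20.0, 40.0
fa, fb, fc = (celsius_to_fahrenheit(t) for t in (a, b, c))

# Plain ratios depend on the arbitrary zero point, so they disagree.
ratio_celsius = b / a        # 2.0
ratio_fahrenheit = fb / fa   # 68 / 50, about 1.36

# Ratios of differences survive the change of scale.
diff_ratio_celsius = (c - b) / (b - a)           # 2.0
diff_ratio_fahrenheit = (fc - fb) / (fb - fa)    # 2.0
```

Because the same two temperatures give a ratio of 2 on one scale and about 1.36 on the other, "twice as hot" is not a property of the temperatures themselves, whereas "this difference is twice that one" holds on both scales.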

Central tendency and statistical dispersion

The mode, median, and arithmetic mean are allowed to measure central tendency of interval variables, while measures of statistical dispersion include range and standard deviation. Since one can only divide by ''differences'', one cannot define measures that require some ratios, such as the coefficient of variation. More subtly, while one can define moments about the origin, only central moments are meaningful, since the choice of origin is arbitrary. One can define standardized moments, since ratios of differences are meaningful, but one cannot define the coefficient of variation, since the mean is a moment about the origin, unlike the standard deviation, which is (the square root of) a central moment.
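A small Python sketch can make the role of the origin concrete. It takes hypothetical temperature readings, shifts the zero point by a constant (here the Celsius-to-kelvin offset, used purely as an example of a different origin), and compares the standard deviation, a standardized moment (skewness), and the coefficient of variation. The helper names are illustrative.

```python
import statistics


def coefficient_of_variation(xs):
    """Standard deviation divided by the mean (a moment about the origin)."""
    return statistics.pstdev(xs) / statistics.fmean(xs)


def skewness(xs):
    """Third standardized moment: built only from differences from the mean."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


# Hypothetical readings, then the same readings with a shifted zero point.
celsius = [10.0, 20.0, 25.0, 45.0]
shifted = [t + 273.15 for t in celsius]

sd_original, sd_shifted = statistics.pstdev(celsius), statistics.pstdev(shifted)
skew_original, skew_shifted = skewness(celsius), skewness(shifted)
cv_original, cv_shifted = (
    coefficient_of_variation(celsius),
    coefficient_of_variation(shifted),
)
```

The standard deviation and the skewness come out the same for both lists up to floating-point rounding, while the coefficient of variation changes by roughly an order of magnitude, which is why it is not meaningful when the origin is arbitrary.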

Ratio scale

The ratio type takes its name from the fact that measurement is the estimation of the ratio between a magnitude of a continuous quantity and a unit of measurement of the same kind (Michell, 1997, 1999). Most measurement in the physical sciences and engineering is done on ratio scales. Examples include mass, length, duration, plane angle, energy and electric charge. In contrast to interval scales, ratios can be compared using division. Very informally, many ratio scales can be described as specifying "how much" of something (i.e. an amount or magnitude). Ratio scales are often used to express orders of magnitude, as for temperature in Orders of magnitude (temperature).

Central tendency and statistical dispersion

The geometric mean and the harmonic mean are allowed to measure the central tendency, in addition to the mode, median, and arithmetic mean. The studentized range and the coefficient of variation are allowed to measure statistical dispersion. All statistical measures are allowed because all necessary mathematical operations are defined for the ratio scale.
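For ratio-scale data, the additional means can be computed directly with Python's standard `statistics` module. The masses below are hypothetical values chosen so the results are easy to check by hand.

```python
import statistics

# Hypothetical ratio-scale data: masses in kilograms.
masses_kg = [1.0, 2.0, 4.0]

arithmetic = statistics.fmean(masses_kg)
geometric = statistics.geometric_mean(masses_kg)   # cube root of 1 * 2 * 4
harmonic = statistics.harmonic_mean(masses_kg)     # 3 / (1/1 + 1/2 + 1/4)

# Changing the unit (kg -> g) multiplies every value by the same factor,
# and the geometric mean is rescaled by that factor as well.
masses_g = [m * 1000 for m in masses_kg]
geometric_g = statistics.geometric_mean(masses_g)
```

Both means rely on multiplication or division of the values themselves, which is only meaningful when zero on the scale is a true zero. For positive data the harmonic mean never exceeds the geometric mean, which in turn never exceeds the arithmetic mean.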

Debate on Stevens's typology

While Stevens's typology is widely adopted, it is still being challenged by other theoreticians, particularly in the cases of the nominal and ordinal types (Michell, 1986). Some, however, have argued that the degree of discord can be overstated. Hand says, "Basic psychology texts often begin with Stevens's framework and the ideas are ubiquitous. Indeed, the essential soundness of his hierarchy has been established for representational measurement by mathematicians, determining the invariance properties of mappings from empirical systems to real number continua. Certainly the ideas have been revised, extended, and elaborated, but the remarkable thing is his insight given the relatively limited formal apparatus available to him and how many decades have passed since he coined them."

Duncan (1986) objected to the use of the word ''measurement'' in relation to the nominal type, but Stevens (1975) said of his own definition of measurement that "the assignment can be any consistent rule. The only rule not allowed would be random assignment, for randomness amounts in effect to a nonrule".

The use of the mean as a measure of the central tendency for the ordinal type is still debatable among those who accept Stevens's typology. Many behavioural scientists use the mean for ordinal data anyway. This is often justified on the basis that the ordinal type in behavioural science is in fact somewhere between the true ordinal and interval types; although the interval difference between two ordinal ranks is not constant, it is often of the same order of magnitude. For example, applications of measurement models in educational contexts often indicate that total scores have a fairly linear relationship with measurements across the range of an assessment. Thus, some argue that so long as the unknown interval difference between ordinal scale ranks is not too variable, interval scale statistics such as means can meaningfully be used on ordinal scale variables.

Statistical analysis software such as SPSS requires the user to select the appropriate measurement class for each variable. This is intended to prevent users from inadvertently performing meaningless analyses (for example, correlation analysis with a variable on a nominal level).

L. L. Thurstone made progress toward developing a justification for obtaining the interval type, based on the law of comparative judgment. A common application of the law is the analytic hierarchy process. Further progress was made by Georg Rasch (1960), who developed the probabilistic Rasch model that provides a theoretical basis and justification for obtaining interval-level measurements from counts of observations such as total scores on assessments.
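The idea of declaring a measurement class for each variable and checking analyses against it can be sketched in a few lines of Python. This is not how SPSS or any particular package works internally; the `Level` enumeration, the `MINIMUM_LEVEL` table, and `is_permitted` are hypothetical names, and the table simply encodes the permitted statistics described for Stevens's four levels in this article.

```python
from enum import IntEnum


class Level(IntEnum):
    """Stevens's four levels, ordered from least to most structure."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Lowest level at which each statistic is permitted, following the
# strict reading of the typology summarized in this article.
MINIMUM_LEVEL = {
    "mode": Level.NOMINAL,
    "median": Level.ORDINAL,
    "arithmetic mean": Level.INTERVAL,
    "standard deviation": Level.INTERVAL,
    "geometric mean": Level.RATIO,
    "coefficient of variation": Level.RATIO,
}


def is_permitted(statistic, level):
    """Return True if the statistic is allowed for a variable at this level."""
    return level >= MINIMUM_LEVEL[statistic]


# Hypothetical variable declarations, as a user might enter them.
variables = {"nationality": Level.NOMINAL, "agreement": Level.ORDINAL, "mass": Level.RATIO}
mean_of_agreement_ok = is_permitted("arithmetic mean", variables["agreement"])
mode_of_nationality_ok = is_permitted("mode", variables["nationality"])
```

Given the debate over means of ordinal data described above, a real tool might reasonably warn instead of refusing in that case; the strict table here is only one possible design choice.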

Other proposed typologies

Typologies aside from Stevens's typology have been proposed. For instance, Mosteller and Tukey (1977) proposed their own set of data types, and Nelder (1990) described continuous counts, continuous ratios, count ratios, and categorical modes of data. See also Chrisman (1998), van den Berg (1991).

Mosteller and Tukey's typology (1977)

Mosteller and Tukey noted that the four levels are not exhaustive and proposed:
# Names
# Grades (ordered labels like beginner, intermediate, advanced)
# Ranks (orders with 1 being the smallest or largest, 2 the next smallest or largest, and so on)
# Counted fractions (bound by 0 and 1)
# Counts (non-negative integers)
# Amounts (non-negative real numbers)
# Balances (any real number)

For example, percentages (a variation on fractions in the Mosteller–Tukey framework) do not fit well into Stevens's framework: No transformation is fully admissible.

Chrisman's typology (1998)

Nicholas R. Chrisman introduced an expanded list of levels of measurement to account for various measurements that do not necessarily fit with the traditional notions of levels of measurement. Measurements bound to a range and repeating (like degrees in a circle, clock time, etc.), graded membership categories, and other types of measurement do not fit Stevens's original work, leading to the introduction of six new levels of measurement, for a total of ten:
# Nominal
# Gradation of membership
# Ordinal
# Interval
# Log-interval
# Extensive ratio
# Cyclical ratio
# Derived ratio
# Counts
# Absolute

While some claim that the extended levels of measurement are rarely used outside of academic geography, graded membership is central to fuzzy set theory, while absolute measurements include probabilities and the plausibility and ignorance in Dempster–Shafer theory. Cyclical ratio measurements include angles and times. Counts appear to be ratio measurements, but the scale is not arbitrary and fractional counts are commonly meaningless. Log-interval measurements are commonly displayed in stock market graphics. All these types of measurements are commonly used outside academic geography, and do not fit well with Stevens's original work.
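One way to see why cyclical measurements such as angles sit awkwardly in the four-level scheme is to average two compass headings on either side of north. The sketch below compares the ordinary arithmetic mean with a circular mean computed from unit vectors; the function name and the sample headings are illustrative assumptions, not part of Chrisman's proposal.

```python
import math


def circular_mean_degrees(angles):
    """Mean direction: average the unit vectors, then convert back to degrees."""
    sin_sum = sum(math.sin(math.radians(a)) for a in angles)
    cos_sum = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0


# Two hypothetical headings, 10 degrees either side of north.
headings = [350.0, 10.0]

naive = sum(headings) / len(headings)        # 180.0, i.e. due south
circular = circular_mean_degrees(headings)   # close to 0 (equivalently 360)
```

The arithmetic mean points in the opposite direction to both headings, while the circular mean lands at north up to floating-point rounding. Treating the angle as an ordinary interval or ratio variable ignores the fact that the scale wraps around.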

Scale types and Stevens's "operational theory of measurement"

The theory of scale types is the intellectual handmaiden to Stevens's "operational theory of measurement", which was to become definitive within psychology and the behavioral sciences, despite Michell's characterization of it as being quite at odds with measurement in the natural sciences (Michell, 1999). Essentially, the operational theory of measurement was a reaction to the conclusions of a committee established in 1932 by the British Association for the Advancement of Science to investigate the possibility of genuine scientific measurement in the psychological and behavioral sciences. This committee, which became known as the ''Ferguson committee'', published a Final Report (Ferguson, et al., 1940, p. 245) in which Stevens's sone scale (Stevens & Davis, 1938) was an object of criticism:

That is, if Stevens's ''sone'' scale genuinely measured the intensity of auditory sensations, then evidence for such sensations as being quantitative attributes needed to be produced. The evidence needed was the presence of ''additive structure'', a concept comprehensively treated by the German mathematician Otto Hölder (Hölder, 1901). Given that the physicist and measurement theorist Norman Robert Campbell dominated the Ferguson committee's deliberations, the committee concluded that measurement in the social sciences was impossible due to the lack of concatenation operations. This conclusion was later rendered false by the discovery of the theory of conjoint measurement by Debreu (1960) and independently by Luce & Tukey (1964). However, Stevens's reaction was not to conduct experiments to test for the presence of additive structure in sensations, but instead to render the conclusions of the Ferguson committee null and void by proposing a new theory of measurement:

Stevens was greatly influenced by the ideas of another Harvard academic, the Nobel laureate physicist Percy Bridgman (1927), whose doctrine of ''operationism'' Stevens used to define measurement. In Stevens's definition, for example, it is the use of a tape measure that defines length (the object of measurement) as being measurable (and so by implication quantitative). Critics of operationism object that it confuses the relations between two objects or events for properties of one of those objects or events (Hardcastle, 1995; Michell, 1999; Moyer, 1981a,b; Rogers, 1989). The Canadian measurement theorist William Rozeboom (1966) was an early and trenchant critic of Stevens's theory of scale types.

Same variable may be different scale type depending on context

Another issue is that the same variable may be a different scale type depending on how it is measured and on the goals of the analysis. For example, hair color is usually thought of as a nominal variable, since it has no apparent ordering. However, it is possible to order colors (including hair colors) in various ways, including by hue; this is known as colorimetry. Hue is an interval level variable.
