Data are individual facts, statistics, or items of information, often numeric. In a more technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects, while a datum (singular of ''data'') is a single value of a single variable.

Although the terms "data" and "information" are often used interchangeably, they have distinct meanings. In some popular publications, data are said to be transformed into information when they are viewed in context or after analysis. In academic treatments of the subject, however, data are simply units of information.

Data are used in business management (e.g., sales data, revenue, profits), governance, and virtually every other form of human organizational activity (e.g., censuses conducted by non-profit organizations). Data are measured, collected, reported, and analyzed, and used to create data visualizations such as graphs, tables, or images. Data as a general concept refers to the fact that some existing information or knowledge is ''represented'' or ''coded'' in some form suitable for better usage or processing.

''Raw data'' ("unprocessed data") is a collection of numbers or characters before it has been "cleaned" and corrected by researchers. Raw data needs to be corrected to remove outliers or obvious instrument or data entry errors (e.g., a thermometer reading from an outdoor Arctic location recording a tropical temperature). Data processing commonly occurs in stages, and the "processed data" from one stage may be considered the "raw data" of the next stage. Field data is raw data that is collected in an uncontrolled "in situ" environment.
Experimental data is data that is generated within the context of a scientific investigation by observation and recording. Data has been described as the new oil of the digital economy.
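
The cleaning step described above can be sketched in code. The following is a minimal illustration, not a standard procedure: the function name `clean_readings`, the sample readings, and the 30-degree threshold are hypothetical, and the distance-from-median rule is only one simple way to flag implausible values.

```python
from statistics import median


def clean_readings(readings, max_deviation=30.0):
    """Drop missing readings and readings implausibly far from the median.

    `max_deviation` is an assumed threshold in degrees Celsius.
    """
    present = [r for r in readings if r is not None]
    centre = median(present)
    return [r for r in present if abs(r - centre) <= max_deviation]


# Hypothetical raw thermometer readings (degrees Celsius) from an Arctic site.
# None marks a missing entry; 27.0 is an obvious instrument or entry error.
raw = [-31.5, -29.8, None, -30.2, 27.0, -32.1]
processed = clean_readings(raw)
print(processed)  # [-31.5, -29.8, -30.2, -32.1]
```

The cleaned list could then serve as the "raw data" of a later processing stage, such as averaging.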

# Etymology and terminology

The first English use of the word "data" is from the 1640s. The word was first used to mean "transmissible and storable computer information" in 1946, and the expression "data processing" was first used in 1954.

The Latin word ''data'' is the plural of ''datum'', "(thing) given", the neuter past participle of ''dare'', "to give". In English, the word ''data'' may be used as a plural noun in this sense, with some writers, usually those working in the natural sciences, life sciences, and social sciences, using ''datum'' in the singular and ''data'' for the plural, especially in the 20th century and in many cases also the 21st (for example, APA style as of the 7th edition still requires "data" to be plural). However, in everyday language and in much of the usage of software development and computer science, "data" is most commonly used in the singular as a mass noun (like "sand" or "rain"). The term ''big data'' takes the singular.

# Meaning

Data, information, knowledge, and wisdom are closely related concepts, but each has its own role in relation to the others, and each term has its own meaning. According to a common view, data are collected and analyzed; data only becomes information suitable for making decisions once it has been analyzed in some fashion. One can say that the extent to which a set of data is informative to someone depends on the extent to which it is unexpected by that person. The amount of information contained in a data stream may be characterized by its Shannon entropy.
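
One widely used measure of the information in a data stream is Shannon entropy, which grows as the symbols become less predictable. The sketch below estimates it from observed symbol frequencies; the function name `shannon_entropy` and the sample strings are illustrative assumptions.

```python
from collections import Counter
from math import log2


def shannon_entropy(stream):
    """Return the empirical Shannon entropy of a sequence, in bits per symbol."""
    counts = Counter(stream)
    total = len(stream)
    # Sum p * log2(1/p) over the observed symbols.
    return sum((n / total) * log2(total / n) for n in counts.values())


print(shannon_entropy("aaaa"))  # 0.0 (fully predictable, nothing unexpected)
print(shannon_entropy("abab"))  # 1.0 (two equally likely symbols)
print(shannon_entropy("abcd"))  # 2.0 (four equally likely symbols)
```

A stream that never surprises its reader scores zero, which matches the idea that data is informative only to the extent that it is unexpected.
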
Knowledge is the understanding based on extensive experience dealing with information on a subject. For example, the height of Mount Everest is generally considered data. The height can be measured precisely with an altimeter and entered into a database. This data may be included in a book along with other data on Mount Everest to describe the mountain in a manner useful for those who wish to decide on the best method to climb it. An understanding based on experience climbing mountains that could advise persons on the way to reach Mount Everest's peak may be seen as "knowledge". The practical climbing of Mount Everest's peak based on this knowledge may be seen as "wisdom". In other words, wisdom refers to the practical application of a person's knowledge in those circumstances where good may result. Thus wisdom complements and completes the series "data", "information", and "knowledge" of increasingly abstract concepts.

Data are often assumed to be the least abstract concept, information the next least, and knowledge the most abstract. In this view, data becomes information by interpretation; e.g., the height of Mount Everest is generally considered "data", a book on Mount Everest's geological characteristics may be considered "information", and a climber's guidebook containing practical information on the best way to reach Mount Everest's peak may be considered "knowledge". This view, however, has also been argued to reverse how data emerges from information, and information from knowledge.

"Information" bears a diversity of meanings that ranges from everyday usage to technical use. Generally speaking, the concept of information is closely related to notions of constraint, communication, control, data, form, instruction, knowledge, meaning, mental stimulus, pattern, perception, and representation. Beynon-Davies uses the concept of a sign to differentiate between data and information; data are a series of symbols, while information occurs when the symbols are used to refer to something.

Before the development of computing devices and machines, people had to manually collect data and impose patterns on it. Since the development of computing devices and machines, these devices can also collect data. In the 2010s, computers were widely used in many fields to collect data and sort or process it, in disciplines ranging from marketing and the analysis of social services usage by citizens to scientific research. These patterns in data are seen as information that can be used to enhance knowledge. They may be interpreted as "truth" (though "truth" can be a subjective concept) and may be authorized as aesthetic and ethical criteria in some disciplines or cultures. Events that leave behind perceivable physical or virtual remains can be traced back through data. Marks are no longer considered data once the link between the mark and the observation is broken.

Mechanical computing devices are classified according to how they represent data. An analog computer represents a datum as a voltage, distance, position, or other physical quantity. A digital computer represents a piece of data as a sequence of symbols drawn from a fixed alphabet. The most common digital computers use a binary alphabet, that is, an alphabet of two characters typically denoted "0" and "1". More familiar representations, such as numbers or letters, are then constructed from the binary alphabet.

Some special forms of data are distinguished. A computer program is a collection of data that can be interpreted as instructions. Most computer languages make a distinction between programs and the other data on which programs operate, but in some languages, notably Lisp and similar languages, programs are essentially indistinguishable from other data. It is also useful to distinguish metadata, that is, a description of other data. A similar yet earlier term for metadata is "ancillary data". The prototypical example of metadata is the library catalog, which is a description of the contents of books.
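
As a rough illustration of how familiar representations are built from a binary alphabet, the sketch below renders letters and a number as sequences of "0" and "1". The helper names `to_bits` and `from_bits` are hypothetical, and UTF-8 is an assumed encoding chosen for the example; the text above does not prescribe one.

```python
def to_bits(text):
    """Encode text as UTF-8 (an assumed encoding) and show each byte as 8 binary digits."""
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))


def from_bits(bits):
    """Invert to_bits: parse space-separated 8-digit groups back into text."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")


print(to_bits("Hi"))                  # 01001000 01101001
print(from_bits("01001000 01101001"))  # Hi
print(format(42, "b"))                # 101010 (the number 42 in binary)
```

The same two-character alphabet carries both text and numbers; what distinguishes them is the convention used to interpret the symbols.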

# Data documents

Whenever data needs to be registered, it exists in the form of a data document. Kinds of data documents include:

* data repository
* data study
* data set
* software
* data paper
* database
* data handbook
* data journal

Some of these data documents (data repositories, data studies, data sets, and software) are indexed in data citation indexes, while data papers are indexed in traditional bibliographic databases.

## Data collection

Gathering data can be accomplished through a primary source (the researcher is the first person to obtain the data) or a secondary source (the researcher obtains data that has already been collected by other sources, such as data disseminated in a scientific journal). Data analysis methodologies vary and include data triangulation and data percolation. The latter offers an articulate method of collecting, classifying, and analyzing data using up to five angles of analysis (and at least three) to maximize the research's objectivity and to permit as complete an understanding as possible of the phenomena under investigation: qualitative methods, quantitative methods, literature reviews (including scholarly articles), interviews with experts, and computer simulation. The data are thereafter "percolated" using a series of pre-determined steps so as to extract the most relevant information.

# In other fields

Although data are also increasingly used in other fields, it has been suggested that their highly interpretive nature might be at odds with the ethos of data as "given". The term ''capta'' (from the Latin ''capere'', "to take") was introduced to distinguish between an immense number of possible data and the subset of them to which attention is oriented. It has also been argued that, since the humanities affirm knowledge production as "situated, partial, and constitutive," using ''data'' may introduce counterproductive assumptions, for example that phenomena are discrete or observer-independent. The term ''capta'', which emphasizes the act of observation as constitutive, is offered as an alternative to ''data'' for visual representations in the humanities.

