
In
information systems, a tag is a
keyword or term assigned to a piece of information (such as an
Internet bookmark,
multimedia, database
record, or
computer file). This kind of
metadata helps describe an item and allows it to be found again by browsing or searching. Tags are generally chosen informally and personally by the item's creator or by its viewer, depending on the system, although they may also be chosen from a
controlled vocabulary.
Tagging was popularized by
websites associated with
Web 2.0 and is an important feature of many Web 2.0 services.
It is now also part of other
database systems,
desktop applications, and
operating systems.
Overview
People use tags to aid
classification, mark ownership, note
boundaries, and indicate
online identity. Tags may take the form of words, images, or other identifying marks. An analogous example of tags in the physical world is
museum object tagging. People were using textual
keywords to
classify information and objects long before computers. Computer-based
search algorithms made the use of such keywords a rapid way of exploring records.
Tagging gained popularity due to the growth of
social bookmarking,
image sharing, and
social networking websites.
These sites allow users to create and manage labels (or "tags") that categorize content using simple keywords. Websites that include tags often display collections of tags as
tag clouds (for example, Blogger and WordPress can display tag clouds), as do some desktop applications (for example, Leap is a macOS application that features a clickable tag cloud of macOS tags, and TaggTool is a Windows application that permits tagging files and displaying a tag cloud). On websites that aggregate the tags of all users, an individual user's tags can be useful both to them and to the larger community of the website's users.
Tagging systems have sometimes been classified into two kinds: ''top-down'' and ''bottom-up''.
Top-down
taxonomies are created by an authorized group of designers (sometimes in the form of a
controlled vocabulary), whereas bottom-up taxonomies (called
folksonomies) are created by all users.
This definition of "top down" and "bottom up" should not be confused with the distinction between a ''single hierarchical''
tree structure (in which there is one correct way to classify each item) versus ''multiple non-hierarchical''
sets (in which there are multiple ways to classify an item); the structure of both top-down and bottom-up taxonomies may be either hierarchical, non-hierarchical, or a combination of both.
Some researchers and applications have experimented with combining hierarchical and non-hierarchical tagging to aid in information retrieval. Others are combining top-down and bottom-up tagging, including in some large library catalogs (
OPACs) such as
WorldCat.
When tags or other taxonomies have further properties (or
semantics) such as
relationships and
attributes, they constitute an
ontology.
Metadata tags as described in this article should not be confused with the use of the word "tag" in some software to refer to an automatically generated
cross-reference; examples of the latter are ''tags tables'' in
Emacs and
''smart tags'' in
Microsoft Office.
History
The use of keywords as part of an identification and classification system long predates computers.
Paper data storage devices, notably
edge-notched cards, that permitted classification and sorting by multiple criteria were already in use prior to the twentieth century, and
faceted classification has been used by libraries since the 1930s.
In the late 1970s and early 1980s, the
Unix text editor Emacs
offered a companion software program called ''Tags'' that could automatically build a table of cross-references called a ''tags table'' that Emacs could use to jump between a
function call and that function's definition. This use of the word "tag" did not refer to metadata tags, but was an early use of the word "tag" in software to refer to a
word index.
Online databases and early websites deployed keyword tags as a way for publishers to help users find content. In the early days of the
World Wide Web, the
keywords
meta element was used by
web designers to tell
web search engines what the web page was about, but these keywords were only visible in a web page's
source code and were not modifiable by users.

In 1997, the collaborative portal "A Description of the Equator and Some ØtherLands" produced by
documenta X, Germany, used the
folksonomic term ''Tag'' for its co-authors and guest authors on its Upload page. In "The Equator" the term ''Tag'' for user-input was described as an ''abstract literal or keyword'' to aid the user. However, users defined singular ''Tags'', and did not share ''Tags'' at that point.
In 2003, the
social bookmarking website
Delicious provided a way for its users to add "tags" to their bookmarks (as a way to help find them later);
Delicious also provided browseable aggregated views of the bookmarks of all users featuring a particular tag. Within a couple of years, the
photo sharing website
Flickr allowed its users to add their own text tags to each of their pictures, constructing flexible and easy metadata that made the pictures highly searchable. The success of Flickr and the influence of Delicious popularized the concept, and other
social software websites—such as
YouTube,
Technorati, and
Last.fm—also implemented tagging. In 2005, the
Atom web syndication standard provided a "category" element for inserting subject categories into
web feeds, and in 2007
Tim Bray proposed a "tag"
URN.
Examples
Within a blog
Many blog systems (and other web
content management systems) allow authors to add free-form tags to a post, along with (or instead of) placing the post into a predetermined category.
For example, a post may display that it has been tagged with
baseball
and
tickets
. Each of those tags is usually a
web link leading to an index page listing all of the posts associated with that tag. The blog may have a sidebar listing all the tags in use on that blog, with each tag leading to an index page. To reclassify a post, an author edits its list of tags. All connections between posts are automatically tracked and updated by the blog software; there is no need to relocate the page within a complex hierarchy of categories.
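The tag index pages described above can be pictured as an inverted index from tags to posts. The following is a minimal sketch in Python, not the implementation of any particular blog system; the post titles and the function name `build_tag_index` are hypothetical.

```python
from collections import defaultdict

# Hypothetical posts: each title maps to its set of free-form tags.
posts = {
    "Opening day recap": {"baseball", "tickets"},
    "How to buy season passes": {"tickets"},
    "Spring training notes": {"baseball"},
}


def build_tag_index(posts):
    """Return a mapping from each tag to the sorted titles of posts carrying it."""
    index = defaultdict(list)
    for title, tags in posts.items():
        for tag in tags:
            index[tag].append(title)
    return {tag: sorted(titles) for tag, titles in index.items()}


index = build_tag_index(posts)

# Reclassifying a post only means editing its tag set; the index is rebuilt
# from the posts, so no page has to be moved within a category hierarchy.
posts["Spring training notes"] = {"baseball", "travel"}
updated_index = build_tag_index(posts)
```

Each key of `index` corresponds to one tag index page, and the keys taken together correspond to the sidebar list of tags.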
Within application software
Some
desktop applications and
web applications feature their own tagging systems, such as email tagging in
Gmail and
Mozilla Thunderbird,
bookmark tagging in
Firefox, audio tagging in
iTunes or
Winamp, and photo tagging in various applications. Some of these applications display collections of tags as
tag clouds.
Assigned to computer files
There are various systems for applying tags to the files in a computer's
file system.
In
Apple's
Mac System 7, released in 1991, users could assign one of
seven colored labels (with editable names such as "Essential", "Hot", and "In Progress") to each file and folder. Since
OS X 10.9, released in 2013, users have been able to assign multiple arbitrary tags as
extended file attributes to any file or folder; before that, the
open-source OpenMeta standard provided similar tagging functionality for
Mac OS X.
Several
semantic file systems that implement tags are available for the
Linux kernel, including
Tagsistant.
Microsoft Windows allows users to set tags only on
Microsoft Office documents and some kinds of picture files.
Cross-platform file tagging standards include
Extensible Metadata Platform (XMP), an
ISO standard for embedding metadata into popular image, video and document file formats, such as
JPEG and
PDF, without breaking their readability by applications that do not support XMP. XMP largely supersedes the earlier
IPTC Information Interchange Model.
Exif is a standard that specifies the image and audio
file formats used by
digital cameras, including some metadata tags.
TagSpaces is an open-source cross-platform application for tagging files; it inserts tags into the
filename.
For an event
An ''official tag'' is a keyword adopted by events and conferences for participants to use in their web publications, such as blog entries, photos of the event, and presentation slides. Search engines can then index them to make relevant materials related to the event searchable in a uniform way. In this case, the tag is part of a
controlled vocabulary.
In research
A researcher may work with a large collection of items (e.g. press quotes, a bibliography, images) in digital form. To associate each item with a small number of themes (e.g. chapters of a book, or sub-themes of the overall subject), the researcher can attach a group of tags for these themes to each item in the larger collection. In this way, freeform
classification allows the author to manage what would otherwise be unwieldy amounts of information.
Special types
Triple tags
A triple tag or machine tag uses a special
syntax to define extra
semantic information about the tag, making it easier or more meaningful for interpretation by a computer program. Triple tags comprise three parts: a
namespace, a
predicate, and a value. For example,
geo:long=50.123456
is a tag for the geographical
longitude coordinate whose value is 50.123456. This triple structure is similar to the
Resource Description Framework model for information.
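The three-part structure can be illustrated with a small parser. This is a sketch in Python under a simplifying assumption: the namespace and predicate consist only of word characters, and everything after the first `=` is the value. The names `TripleTag` and `parse_triple_tag` are hypothetical, and real services may accept a different grammar.

```python
import re
from typing import NamedTuple, Optional


class TripleTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


# Assumed grammar: namespace ":" predicate "=" value.
_TRIPLE_TAG = re.compile(r"^(?P<namespace>\w+):(?P<predicate>\w+)=(?P<value>.+)$")


def parse_triple_tag(text: str) -> Optional[TripleTag]:
    """Split a triple tag into its parts, or return None for an ordinary tag."""
    match = _TRIPLE_TAG.match(text.strip())
    if match is None:
        return None
    return TripleTag(**match.groupdict())


tag = parse_triple_tag("geo:long=50.123456")
plain = parse_triple_tag("baseball")
```

The value is kept as a string; a program that knows the `geo` namespace could then convert it to a number with `float(tag.value)`.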
The triple tag format was first devised for geolicious in November 2004, to map
Delicious bookmarks, and gained wider acceptance after its adoption by Mappr and GeoBloggers to map
Flickr photos. In January 2007, Aaron Straup Cope at Flickr introduced the term ''machine tag'' as an alternative name for the triple tag, adding some questions and answers on purpose, syntax, and use.
Specialized metadata for geographical identification is known as ''geotagging''; machine tags are also used for other purposes, such as identifying photos taken at a specific event or naming species using
binomial nomenclature.
Hashtags
A hashtag is a kind of metadata tag marked by the prefix
#
, sometimes known as a "hash" symbol. This form of tagging is used on
microblogging and
social networking services such as
Twitter,
Facebook,
Google+,
VK and
Instagram. The hash distinguishes tag text from other text in the post.
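How the prefix separates tag text from ordinary text can be sketched with a regular expression. The Python example below assumes that a hashtag is a `#` followed by word characters and not preceded by a word character; individual services apply their own, more detailed rules, and the function name `extract_hashtags` is hypothetical.

```python
import re
from typing import List

# Assumed rule: "#" followed by one or more word characters, where the "#"
# does not directly follow a word character (so "issue#12" is not a hashtag).
_HASHTAG = re.compile(r"(?<!\w)#(\w+)")


def extract_hashtags(post: str) -> List[str]:
    """Return the hashtags in a post, lower-cased and without the prefix."""
    return [name.lower() for name in _HASHTAG.findall(post)]


found = extract_hashtags("Great game tonight #Baseball #tickets")
none_found = extract_hashtags("issue#12 is fixed")
```

Lower-casing is a design choice of this sketch so that `#Baseball` and `#baseball` are treated as the same tag.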
Knowledge tags
A knowledge tag is a type of
meta-information that describes or defines some aspect of a piece of information (such as a
document,
digital image,
database table, or
web page).
Knowledge tags are more than traditional non-hierarchical
keywords or terms; they are a type of
metadata that captures knowledge in the form of descriptions, categorizations, classifications,
semantics, comments, notes, annotations,
hyperdata,
hyperlinks, or references that are collected in tag profiles (a kind of
ontology).
These tag profiles reference an information resource that resides in a distributed, and often heterogeneous, storage repository.
Knowledge tags are part of a
knowledge management discipline that leverages
Enterprise 2.0 methodologies for users to capture insights, expertise, attributes, dependencies, or relationships associated with a data resource.
Different kinds of knowledge can be captured in knowledge tags, including factual knowledge (that found in books and data), conceptual knowledge (found in perspectives and concepts), expectational knowledge (needed to make judgments and hypotheses), and methodological knowledge (derived from reasoning and strategies).
These forms of
knowledge often exist outside the data itself and are derived from personal experience, insight, or expertise. Knowledge tags are considered an expansion of the information itself that adds additional value, context, and meaning to the information. Knowledge tags are valuable for preserving organizational intelligence that is often lost due to
turnover, for sharing knowledge stored in the minds of individuals that is typically isolated and unharnessed by the organization, and for connecting knowledge that is often lost or disconnected from an information resource.
Advantages and disadvantages
In a typical tagging system, there is no explicit information about the meaning or
semantics of each tag, and a user can apply new tags to an item as easily as applying older tags.
Hierarchical classification systems can be slow to change, and are rooted in the culture and era that created them; in contrast, the flexibility of tagging allows users to classify their collections of items in the ways that they find useful, but the personalized variety of terms can present challenges when searching and browsing.
When users can freely choose tags (creating a
folksonomy, as opposed to selecting terms from a
controlled vocabulary), the resulting metadata can include
homonyms (the same tags used with different meanings) and
synonyms (multiple tags for the same concept), which may lead to inappropriate connections between items and inefficient searches for information about a subject. For example, the tag "orange" may refer to the
fruit or the color, and items related to a version of the
Linux kernel may be tagged "Linux", "kernel", "Penguin", "software", or a variety of other terms. Users can also choose tags that are different
inflections of words (such as singular and plural), which can contribute to navigation difficulties if the system does not include
stemming
of tags when searching or browsing. Larger-scale folksonomies address some of the problems of tagging, in that users of tagging systems tend to notice the current use of "tag terms" within these systems, and thus use existing tags in order to easily form connections to related items. In this way, folksonomies may collectively develop a partial set of tagging conventions.
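A system can reduce some of these problems by normalizing tags before comparing them. The Python sketch below is deliberately naive: `naive_stem` only strips simple English plural endings and is not a real stemming algorithm, and the `SYNONYMS` table is a hypothetical, hand-made mapping. It cannot resolve homonyms such as "orange", which require context.

```python
# Hypothetical synonym table mapping a variant tag to a preferred tag.
SYNONYMS = {"penguin": "linux"}


def naive_stem(tag: str) -> str:
    """Strip simple plural endings; a rough stand-in for a real stemmer."""
    if len(tag) > 3 and tag.endswith("ies"):
        return tag[:-3] + "y"
    if len(tag) > 3 and tag.endswith("s") and not tag.endswith("ss"):
        return tag[:-1]
    return tag


def normalize_tag(tag: str, synonyms=SYNONYMS) -> str:
    """Lower-case, trim, and stem a tag, then map it through the synonym table."""
    key = naive_stem(tag.strip().lower())
    return synonyms.get(key, key)


normalized = [normalize_tag(t) for t in ["Oranges", "orange", " Kernel ", "Penguin"]]
```

With this normalization, "Oranges" and "orange" lead to the same index entry, while unrelated tags are left unchanged.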
Complex system dynamics
Despite the apparent lack of control, research has shown that a simple form of shared vocabulary emerges in social bookmarking systems. Collaborative tagging exhibits a form of
complex systems dynamics (or
self-organizing dynamics).
Thus, even if no central controlled vocabulary constrains the actions of individual users, the distribution of tags converges over time to stable
power law distributions.
Once such stable distributions form, simple
folksonomic vocabularies can be extracted by examining the
correlations that form between different tags. In addition, research has suggested that it is easier for
machine learning
algorithms to learn tag semantics when users tag "verbosely"—when they annotate resources with a wealth of freely associated, descriptive keywords.
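As a rough illustration of examining relationships between tags, the Python sketch below counts how often two tags are applied to the same item. Raw co-occurrence counts are a simplified stand-in for the correlation analysis mentioned above, not the method used in the research; the bookmark data and the function name `tag_cooccurrence` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def tag_cooccurrence(tagged_items):
    """Count, for every pair of tags, the number of items carrying both."""
    counts = Counter()
    for tags in tagged_items:
        # Sorting makes each pair appear in one canonical order.
        for pair in combinations(sorted(set(tags)), 2):
            counts[pair] += 1
    return counts


# Hypothetical bookmarks, each represented only by its set of tags.
bookmarks = [
    {"linux", "kernel", "software"},
    {"linux", "kernel"},
    {"linux", "software"},
    {"orange", "fruit"},
]
pairs = tag_cooccurrence(bookmarks)
```

Pairs with high counts relative to the individual tag frequencies suggest related terms, which is one simple starting point for extracting a shared vocabulary.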
Spamming
Tagging systems open to the public are also open to tag spam, in which people apply an excessive number of tags or unrelated tags to an item (such as a
YouTube
video) in order to attract viewers. This abuse can be mitigated using human or statistical identification of spam items. The number of tags allowed may also be limited to reduce spam.
Syntax
Some tagging systems provide a single
text box to enter tags, so a
separator must be used to
tokenize the string. Two popular separators are the
space character and the
comma. To enable the use of separators in the tags, a system may allow for higher-level separators (such as
quotation marks) or
escape characters. Systems can avoid the use of separators by allowing only one tag to be added to each input
widget at a time, although this makes adding multiple tags more time-consuming.
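The separator conventions above can be sketched in Python. For space-separated input, the standard-library function `shlex.split` happens to implement shell-style quoting and backslash escapes, which serves here as one possible convention and not as the rule of any particular tagging system. The function names are hypothetical.

```python
import shlex
from typing import List


def split_space_separated(text: str) -> List[str]:
    """Split on spaces; quotation marks or a backslash keep a space inside a tag.

    Note: shell-style rules treat an apostrophe as a quotation mark, so input
    such as "don't" raises ValueError in this sketch.
    """
    return shlex.split(text)


def split_comma_separated(text: str) -> List[str]:
    """Split on commas, trimming whitespace and dropping empty entries."""
    return [tag.strip() for tag in text.split(",") if tag.strip()]


quoted = split_space_separated('baseball "season tickets" travel')
escaped = split_space_separated(r"new\ york baseball")
by_comma = split_comma_separated("baseball, season tickets,, travel ")
```

With commas as the separator, multi-word tags need no quoting, which is one reason a system might prefer commas over spaces.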
A syntax for use within
HTML is to use the rel-tag
microformat which uses the
''rel'' attribute with value "tag" (i.e.,
rel="tag"
) to indicate that the linked-to page acts as a tag for the current context.
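A consumer of such markup could collect the tags from a page roughly as follows. This Python sketch uses the standard-library `html.parser` module and assumes that the tag name is the last path segment of the linked URL, which is how the microformat is commonly described; the class and function names and the example URLs are hypothetical.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import unquote, urlparse


class RelTagParser(HTMLParser):
    """Collect tag names from <a> elements whose rel attribute includes "tag"."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "tag" in rel_values and href:
            # Assumption: the last path segment of the URL names the tag.
            path = urlparse(href).path.rstrip("/")
            self.tags.append(unquote(path.rsplit("/", 1)[-1]))


def extract_rel_tags(html: str) -> List[str]:
    parser = RelTagParser()
    parser.feed(html)
    return parser.tags


page = (
    '<a href="https://example.com/tags/baseball" rel="tag">baseball</a> '
    '<a href="/tags/season%20tickets/" rel="nofollow tag">season tickets</a> '
    '<a href="/about">About</a>'
)
rel_tags = extract_rel_tags(page)
```

Links without `rel="tag"`, such as the "About" link, are ignored, so ordinary navigation is not mistaken for tagging.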
See also
*
Annotation
*
Collective intelligence
*
Concept map
*
Enterprise bookmarking
*
Enterprise social software
*
Expert system
*
Explicit knowledge
*
Human–computer interaction
*
Information ecology
*
Knowledge transfer
*
Knowledge worker
*
Management information system
*
Metaknowledge
*
Organisational memory
*
RRID
*
Semantics
*
Semantic web
*
Social network aggregation
*
Subject (documents)
*
Subject indexing