In library and information science, documents (such as books, articles and pictures) are classified and searched by subject, as well as by other attributes such as author, genre and document type. This makes "subject" a fundamental term in this field. Library and information specialists assign subject labels to documents to make them findable. There are many ways to do this, and there is not always consensus about which subject should be assigned to a given document. To optimize subject indexing and searching, we need a deeper understanding of what a subject is. The question "what is to be understood by the statement 'document A belongs to subject category X'?" has been debated in the field for more than 100 years (see below).


Hjørland defined subjects as the ''epistemological potentials of documents''. This definition is in line with the request-oriented understanding of indexing quoted below. The idea is that a document is assigned a subject to ease retrieval and findability. The criteria for what should be found, that is, for what constitutes knowledge, are in the end an epistemological question.

Theoretical view

Charles Ammi Cutter (1837–1903)

For Cutter, the stability of subjects depends on a social process in which their meaning is stabilized in a name or a designation. A subject "referred ... to those intellections ... that had received a name that itself represented a distinct consensus in usage" (Miksa, 1983a, p. 60), and the "systematic structure of established subjects" is "resident in the public realm" (Miksa, 1983a, p. 69); "[s]ubjects are by their very nature locations in a classificatory structure of publicly accumulated knowledge" (Miksa, 1983a, p. 61).

Bernd Frohmann adds: "The stability of the public realm in turn relies upon natural and objective mental structures which, with proper education, govern a natural progression from particular to general concepts. Since for Cutter, mind, society, and SKO [Systems of Knowledge Organization] stand one behind the other, each supporting each, all manifesting the same structure, his discursive construction of subjects invites connections with discourses of mind, education, and society. The Dewey Decimal Classification (DDC), by contrast, severs those connections. Melvil Dewey emphasized more than once that his system maps no structure beyond its own; there is neither a "transcendental deduction" of its categories nor any reference to Cutter's objective structure of social consensus. It is content-free: Dewey disdained any philosophical excogitation of the meaning of his class symbols, leaving the job of finding verbal equivalents to others. His innovation and the essence of the system lay in the notation. The DDC is a purely semiotic system of expanding nests of ten digits, lacking any referent beyond itself. In it, a subject is wholly constituted in terms of its position in the system. The essential characteristic of a subject is a class symbol which refers only to other symbols. Its verbal equivalent is accidental, a merely pragmatic characteristic. ... The conflict of interpretations over "subjects" became explicit in the battles between "bibliography" (an approach to subjects having much in common with Cutter's) and Dewey's "close classification". William Fletcher spoke for the scholarly bibliographer. ... Fletcher's "subjects", like Cutter's, referred to the categories of a fantasized, stable social order, whereas Dewey's subjects were elements of a semiological system of standardized, techno-bureaucratic administrative software for the library in its corporate, rather than high culture, incarnation" (Frohmann, 1994, pp. 112–113).

Cutter's early view of what a subject is was probably wiser than most of the understandings that dominated the 20th century, including the understanding reflected in the ISO standard quoted below. The early statements quoted by Frohmann indicate that subjects are somehow shaped in social processes. That said, these statements are not particularly detailed or clear, and they give only a vague idea of the social nature of subjects.


A system with an explicit theoretical foundation is Ranganathan's Colon Classification. Ranganathan provided an explicit definition of the concept of "subject", and a related definition is given by one of his students. Ranganathan's definition of "subject" is strongly influenced by his Colon Classification system. The colon system builds subject designations by combining single elements drawn from facets, which is why the combined nature of subjects is emphasized so strongly. It leads, however, to absurdities such as the claim that gold cannot be a subject (it is instead termed "an isolate"). This aspect of the theory has been criticized by Metcalfe (1973, p. 318). Metcalfe's skepticism regarding Ranganathan's theory is formulated in harsh words (op. cit., p. 317): "This pseudo-science imposed itself on British disciples from about 1950 on...".

It seems unacceptable that Ranganathan defines the word subject in a way that favors his own system. A scientific concept like "subject" should make it possible to compare different ways of establishing access to information. Whether or not subjects are combined should be examined once their definition has been given; it should not be determined a priori, in the definition. Besides the emphasis on the combined, organizing and systematizing nature of subjects, Ranganathan's definition contains the pragmatic demand that a subject should be determined in a way that suits a normal person's competency or specialization. Again we see a strange kind of wishful thinking that mixes a general understanding of a concept with demands made by his own specific system. What the word subject means is one thing; how to provide subject descriptions that meet the demands placed on a given information retrieval language, such as specificity, or on the system, such as precision and recall, is quite another.

If researchers too often define terms in ways that favor specific kinds of systems, then such definitions are not useful for building more general theories about subjects, subject analysis and IR. Among other things, comparative studies of different kinds of systems are made difficult. Based on these arguments (as well as additional arguments that have been used in the literature) we may conclude that Ranganathan's definition of the concept "subject" is not suited for scientific use. Like the definition of "subject" given by the ISO standard for topic maps, Ranganathan's definition may be useful within his own closed system. The purpose of a scientific and scholarly field is, however, to examine the relative fruitfulness of systems such as topic maps and Colon Classification. For that purpose, another understanding of "subject" is necessary.

Patrick Wilson (1927–2003)

In his book, Wilson (1968) examined, in particular by thought experiments, the suitability of different methods of determining the subject of a document. The methods were:
* identifying the author's purpose for writing the document,
* weighing the relative dominance and subordination of different elements in the picture that the reading imposes on the reader,
* grouping or counting the document's use of concepts and references,
* construing a set of rules for selecting elements deemed necessary (as opposed to unnecessary) for the work as a whole.

Patrick Wilson shows convincingly that each of these methods is insufficient to determine the subject of a document, and he is led to conclude (p. 89): "The notion of the subject of a writing is indeterminate..." or, on p. 92 (about what users may expect to find using a particular position in a library classification system): "For nothing definite can be expected of the things found at any given position". In connection with the last quote, Wilson has an interesting footnote in which he writes that authors of documents often use terms in ambiguous ways ("hostility" is used as an example). Even if the librarian could personally develop a very precise understanding of a concept, he would be unable to use it in his classification, because none of the documents use the term in the same precise way. On the basis of this argument, Wilson concludes: "If people write on what are for them ill-defined phenomena, a correct description of their subjects must reflect the ill-definedness".

Wilson's concept of subject was discussed by Hjørland (1992), who found it problematic to give up a precise understanding of such a basic term in LIS. Wilson's arguments led him to an agnostic position, which Hjørland found unacceptable and unnecessary. Concerning authors' use of ambiguous terms, the role of subject analysis is to determine which documents it would be fruitful for users to identify, regardless of whether the documents use one term or another, or whether a given term in a document is used in one meaning or another. Clear and relevant concepts and distinctions in classification systems and controlled vocabularies may be fruitful even if they are applied to documents with ambiguous terminology.

"Content oriented" versus "request oriented" views

Request-oriented indexing is indexing in which the anticipated requests from users influence how documents are indexed. The indexer asks himself: "Under which descriptors should this entity be found?" and "think of all the possible queries and decide for which ones the entity at hand is relevant" (Soergel, 1985, p. 230). Request-oriented indexing may be indexing that is targeted towards a particular audience or user group. For example, a library or a database for feminist studies may index documents differently from a historical library. It is probably better, however, to understand request-oriented indexing as policy-based indexing: the indexing is done according to some ideals and reflects the purpose of the library or database doing the indexing. Understood this way, it is not necessarily a kind of indexing based on user studies. Only if empirical data about use or users are applied should request-oriented indexing be regarded as a user-based approach.
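As an illustration only, the policy-based reading of request-oriented indexing can be sketched in a few lines of Python. Everything in the sketch is an assumption made for the example: the `Document` type, the two policies, the descriptors and the naive word tests are hypothetical and are not taken from Soergel or from any real indexing system. Each policy maps a descriptor to a test that stands in for the indexer's question "should a user searching under this descriptor find this document?".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Document:
    title: str
    text: str


# Hypothetical: a policy maps each descriptor to a test that answers
# "should a user searching under this descriptor find this document?"
Policy = Dict[str, Callable[[Document], bool]]


def words(doc: Document) -> set:
    """Lower-cased words of the document text (a deliberately naive stand-in
    for the indexer's reading of the document)."""
    return set(doc.text.lower().split())


def request_oriented_index(doc: Document, policy: Policy) -> List[str]:
    """Return, in alphabetical order, the descriptors under which the
    collection's policy says the document should be found."""
    return sorted(d for d, should_find in policy.items() if should_find(doc))


doc = Document(
    title="Mill wages",
    text="Wages of women in textile mills during the 1890s",
)

# Two invented collections with different purposes index the same document.
feminist_policy: Policy = {
    "gender pay gap": lambda d: {"women", "wages"} <= words(d),
    "industrial history": lambda d: False,  # outside this collection's purpose
}
historical_policy: Policy = {
    "industrial history": lambda d: "mills" in words(d),
    "labour history": lambda d: "wages" in words(d),
}

feminist_terms = request_oriented_index(doc, feminist_policy)
historical_terms = request_oriented_index(doc, historical_policy)
```

The same document receives different descriptors in the two collections because the policies differ, not because any user data were consulted; on the account given above, the approach would count as user-based only if the tests were derived from empirical data about use or users.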

The subject knowledge view

Rowley & Hartley (2008, p. 109) wrote: "In order to achieve good consistent indexing, the indexer must have a thorough appreciation of the structure of the subject and the nature of the contribution that the document is making to the advancement of knowledge within a particular discipline". This is in accordance with Hjørland's definition given above.

Other views and definitions

In the ISO standard for topic maps, the concept of subject is defined this way: "Subject: Anything whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever." (ISO 13250-1, here cited from draft: http://www1.y12.doe.gov/capabilities/sgml/sc34/document/0446.htm#overview) This definition may work well within the closed system of concepts provided by the topic maps standard. In broader contexts, however, it is not fruitful, because it does not specify what to identify in a document or in a discourse when ascribing subject identification terms or symbols to it. If different methods of subject analysis imply different results, which of these results can then be said to reflect the (true) subject? (This assumes that the expression "a true subject assignment" is meaningful at all, which is an important part of the problem.) Different persons may have different opinions about what the subject of a specific document is. How can a theoretical understanding of the term "subject" help in deciding principles of subject analysis?

Related concepts

Indexing words versus concepts versus subjects

A proposal for differentiating between concept indexing and subject indexing was given by Bernier (1980). In his opinion, subject indexes are different from, and can be contrasted with, indexes to concepts, topics and words. Subjects are what authors are working and reporting on. A document has chromatography as its subject if that is what the author wishes to inform readers about. Papers that use chromatography as a research method, or discuss it in a subsection, do not have chromatography as their subject. Indexers can easily drift into indexing concepts and words rather than subjects, but this is not good indexing. Bernier does not, however, differentiate the author's subjects from those of the information seeker. A user may want a document about a subject that is different from the one intended by its author. From the point of view of information systems, the subject of a document is related to the questions that the document can answer for the users (cf. the distinction between a content-oriented and a request-oriented approach).

Hjørland & Nicolaisen (2005) investigated the concept of subject in relation to Bradford's law of scattering and made a distinction between three kinds of scattering:
* lexical scattering – the scattering of words in texts and in collections of texts,
* semantic scattering – the scattering of concepts in texts and in collections of texts,
* subject scattering – the scattering of items useful to a given task or problem.
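The difference between the three kinds of scattering can be made concrete with a toy collection. The following Python sketch is illustrative only and is not taken from Hjørland & Nicolaisen: the four document texts, the synonym table and the usefulness judgments are all invented for the example, and real semantic or subject analysis would of course not reduce to string matching.

```python
from typing import Dict, List, Set

# Hypothetical four-document collection.
docs: Dict[str, str] = {
    "d1": "chromatography of plant pigments",
    "d2": "column separation methods for pigments",
    "d3": "chromatography as a method in doping analysis",
    "d4": "leaf colour change in autumn",
}

# Hypothetical: expressions assumed to name the same concept.
synonyms: Dict[str, Set[str]] = {
    "chromatography": {"chromatography", "column separation"},
}

# Hypothetical human judgments of which items are useful for a task.
useful_for_task: Dict[str, Set[str]] = {
    "explain autumn leaf colours": {"d1", "d4"},
}


def lexical_scatter(word: str) -> List[str]:
    """Documents in which the word itself occurs."""
    return sorted(k for k, text in docs.items() if word in text.split())


def semantic_scatter(concept: str) -> List[str]:
    """Documents in which any expression of the concept occurs."""
    return sorted(
        k for k, text in docs.items()
        if any(expr in text for expr in synonyms[concept])
    )


def subject_scatter(task: str) -> List[str]:
    """Documents judged useful for the task, whatever words they use."""
    return sorted(useful_for_task[task])


lexical = lexical_scatter("chromatography")
semantic = semantic_scatter("chromatography")
subject = subject_scatter("explain autumn leaf colours")
```

In this toy data the three sets differ: the word picks out d1 and d3, the concept adds d2, and the task picks out d1 and d4, the latter containing neither the word nor the concept. This mirrors the point that the items useful to a task need not coincide with the items that share a word or a concept.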


"The FRSAR Working Group is aware that some controlled vocabularies provide terminology to express other aspects of works in addition to subject (such as form, genre, and target audience of resources). While very important and the focus of many user queries, these aspects describe isness or what class the work belongs to based on form or genre (e.g., novel, play, poem, essay, biography, symphony, concerto, sonata, map, drawing, painting, photograph, etc.) rather than what the work is about." (IFLA, 2010, p. 10).


"Those LIS authors who have focused on the subjects of visual resources, such as artworks and photographs, have often been concerned with how to distinguish between the "aboutness" and the "ofness" (both specific and generic depiction or representation) of such works (Shatford, 1986). In this sense, "aboutness" has a narrower meaning than that used above. A painting of a sunset over San Francisco, for instance, might be analyzed as being (generically) "of" sunsets and (specifically) "of" San Francisco, but also "about" the passage of time." (IFLA, 2010, p. 11). See also: Baca & Harpring (2000) and Shatford (1986).

Shatford, S. (1986). Analyzing the subject of a picture: A theoretical approach. Cataloging & Classification Quarterly, 6(3), 39–62.

See also

* Aboutness
* Document classification
* Subject indexing
* Subject access
* Subject term
* Topic-comment


Further reading

* Drake, C. L. (1960). What is a subject? Australian Library Journal, 9, 34–41.
* Englebretsen, George (1987). Subjects. Studia Leibnitiana, Bd. 19, H. 1, pp. 85–90. Published by: Franz Steiner Verlag. JSTOR 40694071.
* Hjørland, Birger (1997). Information Seeking and Subject Representation. An Activity-theoretical approach to Information Science. Westport & London: Greenwood Press.
* Hjørland, Birger (2009). Book review of: Rowley, Jennifer & Hartley, Richard (2008). Organizing Knowledge. An Introduction to Managing Access to Information. Aldershot: Ashgate Publishing Limited. In: Journal of Documentation, 65(1), 166–169. Manuscript retrieved 2011-10-15 from: http://arizona.openrepository.com/arizona/bitstream/10150/106533/1/Book_review_Rowley_&_Hartley.doc
* Hjørland, Birger (2017). Subject (of documents). Knowledge Organization, vol. 44, issue 1, pp. 55–64. Also in ISKO Encyclopedia of Knowledge Organization: https://www.isko.org/cyclo/subject
* IFLA (2010). Functional Requirements for Subject Authority Data (FRSAD): A Conceptual Model. By IFLA Working Group on the Functional Requirements for Subject Authority Records (FRSAR). Edited by Marcia Lei Zeng, Maja Žumer, Athena Salaba. International Federation of Library Associations and Institutions. Berlin: De Gruyter. Retrieved 2011-09-14 from: http://www.ifla.org/files/classification-and-indexing/functional-requirements-for-subject-authority-data/frsad-final-report.pdf
* Miksa, F. (1983b). The Subject in the Dictionary Catalog from Cutter to the Present. Chicago: American Library Association.
* Welty, C. A. (1998). The Ontological Nature of Subject Taxonomies. In: N. Guarino (ed.), Proceedings of the First Conference on Formal Ontology and Information Systems. Amsterdam: IOS Press. http://www.cs.vassar.edu/faculty/welty/papers/fois-98/fois-98-1.html