In morphology and lexicography, a lemma (plural ''lemmas'' or ''lemmata'') is the canonical form, dictionary form, or citation form (headword) of a set of words. In English, for example, ''break'', ''breaks'', ''broke'', ''broken'' and ''breaking'' are forms of the same lexeme, with ''break'' as the lemma by which they are indexed. ''Lexeme'', in this context, refers to the set of all the forms that have the same meaning, and ''lemma'' refers to the particular form that is chosen by convention to represent the lexeme. Lemmas have special significance in highly inflected languages such as Arabic, Turkish and Russian. The process of determining the ''lemma'' for a given word is called lemmatisation. The lemma can be viewed as the chief of the principal parts, although lemmatisation is at least partly arbitrary.
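As a minimal sketch of lemmatisation by table lookup, the Python snippet below maps the forms of ''break'' listed above to their lemma. The table `LEMMA_OF` and the function `lemmatise` are hypothetical names introduced here for illustration; practical lemmatisers typically rely on much larger lexicons or on morphological rules.

```python
# Hypothetical hand-written table: inflected form -> lemma.
# It covers only the English example from the text.
LEMMA_OF = {
    "break": "break",
    "breaks": "break",
    "broke": "break",
    "broken": "break",
    "breaking": "break",
}


def lemmatise(word: str) -> str:
    """Return the lemma of a known form; return unknown words unchanged."""
    return LEMMA_OF.get(word.lower(), word)


print(lemmatise("Broken"))  # prints "break"
```

Returning unknown words unchanged is only one possible design choice; a stricter version could raise an error or return `None` instead.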


Morphology

The form of a word that is chosen to serve as the lemma is usually the least marked form, but there are exceptions: several languages, for instance, use the infinitive for verbs. For English, the citation form of a noun is the singular: ''mouse'' rather than ''mice''. For multiword lexemes that contain possessive adjectives or reflexive pronouns, the citation form uses a form of the indefinite pronoun ''one'': ''do one's best'', ''perjure oneself''. In European languages with grammatical gender, the citation form of regular adjectives and nouns is usually the masculine singular. If the language also has cases, the citation form is often the masculine singular nominative. For many languages, the citation form of a verb is the infinitive, as in French, German, Hindustani and Spanish. For English, that usually coincides with the uninflected, least marked form of the verb (that is, "break", not "breaks" or "breaking"), but the present tense is used for some defective verbs (''shall'', ''can'', and ''must'' have only one form). For Latin, Ancient Greek, Modern Greek, and Bulgarian, however, the first person singular present tense is traditionally used, but some modern dictionaries use the infinitive instead (except for Bulgarian, which lacks infinitives). For contracted verbs in Ancient Greek, an uncontracted first person singular present tense is used to reveal the contract vowel: ''philéō'' for ''philō'' "I love" (implying affection), ''agapáō'' for ''agapō'' "I love" (implying regard).
Finnish dictionaries list verbs not under their root, but under the first infinitive, marked with ''-(t)a'', ''-(t)ä''. For Japanese, the non-past (present and future) tense is used. For Arabic, which has no infinitives, the third-person singular masculine of the past tense is the least-marked form and is used for entries in modern dictionaries. In older dictionaries, which are still commonly used, the triliteral root of the word, either a verb or a noun, is used. The modern practice is similar to that of Hebrew, which also uses the third-person singular masculine past-tense (perfect) form, e.g. ברא ''bara' '' "create", כפר ''kaphar'' "deny". Georgian uses the verbal noun. For Korean, ''-da'' is attached to the stem. In Tamil, an agglutinative language, the verb stem is often cited.

In Irish, words are highly inflected by case (genitive, nominative, dative and vocative) and by their place within a sentence because of initial mutations. The noun ''cainteoir'', the lemma for the noun meaning "speaker", has a variety of forms: ''chainteoir'', ''gcainteoir'', ''cainteora'', ''chainteora'', ''cainteoirí'', ''chainteoirí'' and ''gcainteoirí''.

Some phrases are cited in a sort of lemma: ''Carthago delenda est'' (literally, "Carthage must be destroyed") is a common way of citing Cato, but what he said was nearer to ''censeo Carthaginem esse delendam'' ("I hold Carthage to be in need of destruction").


Lexicography

In a dictionary, the lemma "go" represents the inflected forms "go", "goes", "going", "went", and "gone". The relationship between an inflected form and its lemma is usually denoted by an angle bracket, e.g., "went" < "go". Of course, the disadvantage of such simplifications is the inability to look up a declined or conjugated form of the word, but some dictionaries, like Webster's Dictionary, list "went". Multilingual dictionaries vary in how they deal with this issue: the Langenscheidt dictionary of German does not list ''ging'' (< ''gehen''), but the Cassell does.

Lemmas or word stems are used often in corpus linguistics for determining word frequency. In that usage, the specific definition of "lemma" is flexible depending on the task it is being used for.
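The use of lemmas for word-frequency counts can be sketched as follows. This is an illustrative Python example, not a description of any particular corpus tool: the table `FORM_TO_LEMMA`, the function `lemma_frequencies` and the sample sentence are all assumptions made up for the example, and the whitespace tokenisation is deliberately naive.

```python
from collections import Counter

# Hypothetical table built from the "go" example in the text:
# inflected form -> lemma.
FORM_TO_LEMMA = {
    "go": "go",
    "goes": "go",
    "going": "go",
    "went": "go",
    "gone": "go",
}


def lemma_frequencies(tokens):
    """Count tokens by lemma; tokens missing from the table count as themselves."""
    return Counter(FORM_TO_LEMMA.get(t.lower(), t.lower()) for t in tokens)


# Made-up sample text, split on whitespace only.
tokens = "She went home and he goes home but they had gone".split()
counts = lemma_frequencies(tokens)

print(counts["go"])    # prints 3 ("went", "goes" and "gone" are pooled)
print(counts["home"])  # prints 2
```

Counting by lemma pools "went", "goes" and "gone" under a single entry, whereas counting raw tokens would treat them as three unrelated words.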


Pronunciation

A word may have different pronunciations, depending on its phonetic environment (the neighbouring sounds) or on the degree of stress in a sentence. An example of the latter is the weak and strong forms of certain English function words like ''some'' and ''but'', which are pronounced one way when stressed and another when unstressed. Dictionaries usually give the pronunciation used when the word is pronounced alone (its isolation form) and with stress, but they may also note common weak forms of pronunciation.


Difference between stem and lemma

The stem is the part of the word that never changes even when morphologically inflected; a lemma is the base form of the word. For example, from "produced", the lemma is "produce", but the stem is "produc-". This is because there are words such as "production" and "producing". In linguistic analysis, the stem is defined more generally as the analyzed base form from which all inflected forms can be formed. When phonology is taken into account, the definition of the unchangeable part of the word is not useful, as can be seen in the phonological forms of the words in the preceding example: the shared "produc-" is pronounced differently in "produced" and in "production". Some lexemes have several stems but one lemma. For instance, the verb "to go" has the stems "go" and "went" due to suppletion: the past tense was co-opted from a different verb, "to wend".
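The distinction between stem and lemma can be illustrated with a small data structure. The Python sketch below is only an assumption about how one might model it: the `Lexeme` class, its field names and the helper `lemma_for` are hypothetical, and the stems are recorded exactly as given in the text above.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Lexeme:
    """Hypothetical record: one lemma, one or more stems, a set of inflected forms."""
    lemma: str
    stems: Tuple[str, ...]
    forms: FrozenSet[str]


# One stem, one lemma.
PRODUCE = Lexeme(
    lemma="produce",
    stems=("produc-",),
    forms=frozenset({"produce", "produces", "produced", "producing"}),
)

# Several stems (through suppletion) but still a single lemma.
GO = Lexeme(
    lemma="go",
    stems=("go", "went"),
    forms=frozenset({"go", "goes", "going", "went", "gone"}),
)


def lemma_for(form: str, lexemes: Iterable[Lexeme]) -> Optional[str]:
    """Return the lemma of the first lexeme that lists the form, else None."""
    for lexeme in lexemes:
        if form in lexeme.forms:
            return lexeme.lemma
    return None


print(lemma_for("went", [PRODUCE, GO]))  # prints "go"
print(GO.stems)                          # prints ('go', 'went')
```

Keeping the stems as a tuple rather than a single string is what allows a suppletive verb such as "to go" to fit the same record as a regular verb.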


See also

* Principal parts
* Root (linguistics)
* Null morpheme
* Uninflected word
* Lexical Markup Framework

