In morphology and lexicography, a lemma (plural ''lemmas'' or ''lemmata'') is the canonical form, dictionary form, or citation form of a set of word forms. In English, for example, ''break'', ''breaks'', ''broke'', ''broken'' and ''breaking'' are forms of the same lexeme, with ''break'' as the lemma by which they are indexed. ''Lexeme'', in this context, refers to the set of all the inflected or alternating forms in the paradigm of a single word, and ''lemma'' refers to the particular form that is chosen by convention to represent the lexeme. Lemmas have special significance in highly inflected languages such as Arabic, Turkish and Russian. The process of determining the ''lemma'' for a given lexeme is called lemmatisation. The lemma can be viewed as the chief of the principal parts, although lemmatisation is at least partly arbitrary.
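The relationship between a lexeme, its forms and its lemma can be sketched as a simple lookup table. This is only a minimal illustration: the table `FORM_TO_LEMMA` and the functions `lemmatise` and `lexeme_of` are hypothetical names invented here, and real lemmatisers rely on much larger lexicons and morphological rules.

```python
# Hypothetical hand-built table mapping each inflected form to its lemma.
# It covers only the English example forms of "break" given in the text.
FORM_TO_LEMMA = {
    "break": "break",
    "breaks": "break",
    "broke": "break",
    "broken": "break",
    "breaking": "break",
}


def lemmatise(form: str) -> str:
    """Return the lemma for a known form, or the lowercased form if unknown."""
    key = form.lower()
    return FORM_TO_LEMMA.get(key, key)


def lexeme_of(lemma: str) -> set:
    """Collect every form in the table that is indexed under `lemma`."""
    return {form for form, lem in FORM_TO_LEMMA.items() if lem == lemma}


if __name__ == "__main__":
    print(lemmatise("Broke"))
    print(sorted(lexeme_of("break")))
```

Falling back to the input form for unknown words is a design choice made for this sketch; a stricter version might raise an error or return `None` instead.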


Morphology

The form of a word that is chosen to serve as the lemma is usually the least marked form, but there are several exceptions such as the use of the infinitive for verbs in some languages.

For English, the citation form of a noun is the singular (and non-possessive) form: ''mouse'' rather than ''mice''. For multiword lexemes that contain possessive adjectives or reflexive pronouns, the citation form uses a form of the indefinite pronoun ''one'': ''do one's best'', ''perjure oneself''. In European languages with grammatical gender, the citation form of regular adjectives and nouns is usually the masculine singular. If the language also has cases, the citation form is often the masculine singular nominative.

For many languages, the citation form of a verb is the infinitive, as in French, German, Hindustani and Spanish. English verbs usually have an infinitive, which in its bare form (without the particle ''to'') is the least marked (for example, ''break'' is chosen over ''to break'', ''breaks'', ''broke'', ''breaking'', and ''broken''); for defective verbs with no infinitive, the present tense is used (for example, ''must'' has only one form while ''shall'' has no infinitive, and both lemmas are their lexemes' present tense forms). For Latin, Ancient Greek, Modern Greek, and Bulgarian, the first person singular present tense is traditionally used, but some modern dictionaries use the infinitive instead (except for Bulgarian, which lacks infinitives). For contracted verbs in Ancient Greek, an uncontracted first person singular present tense is used to reveal the contract vowel: ''philéō'' for ''philō'' "I love" (implying affection); ''agapáō'' for ''agapō'' "I love" (implying regard). Finnish dictionaries list verbs not under their root, but under the first infinitive, marked with ''-(t)a'', ''-(t)ä''. For Japanese, the non-past (present and future) tense is used. For Arabic, the third-person singular masculine of the past/perfect tense is the least-marked form and is used for entries in modern dictionaries. In older dictionaries, which are still commonly used, the triliteral root of the word, either a verb or a noun, is used. This is similar to Hebrew, which also uses the third-person singular masculine perfect form, e.g. ברא ''bara' '' "create", כפר ''kaphar'' "deny". Georgian uses the verbal noun. For Korean, ''-da'' is attached to the stem. In Tamil, an agglutinative language, the verb stem (which is also the imperative form, the least marked one) is often cited, e.g., ''இரு''.

In Irish, words are highly inflected by case (genitive, nominative, dative and vocative) and by their place within a sentence because of initial mutations. The noun ''cainteoir'', the lemma for the noun meaning "speaker", has a variety of forms: ''chainteoir'', ''gcainteoir'', ''cainteora'', ''chainteora'', ''cainteoirí'', ''chainteoirí'' and ''gcainteoirí''.

Some phrases are cited in a sort of lemma: ''Carthago delenda est'' (literally, "Carthage must be destroyed") is a common way of citing Cato, but what he said was nearer to ''censeo Carthaginem esse delendam'' ("I hold Carthage to be in need of destruction").


Lexicography

In a dictionary, the lemma "go" represents the inflected forms "go", "goes", "going", "went", and "gone". The relationship between an inflected form and its lemma is usually denoted by an angle bracket, e.g., "went" < "go". The disadvantage of such simplifications is the inability to look up a declined or conjugated form of the word, but some dictionaries, like Webster's Dictionary, list "went". Multilingual dictionaries vary in how they deal with this issue: the Langenscheidt dictionary of German does not list ''ging'' (< ''gehen''), but the Cassell does.

Lemmas or word stems are often used in corpus linguistics for determining word frequency. In that usage, the specific definition of "lemma" is flexible depending on the task it is being used for.
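Counting word frequency by lemma rather than by surface form might look roughly like the following. The mapping `FORM_TO_LEMMA`, the function `lemma_frequencies` and the sample sentence are all assumptions made for illustration; a real corpus study would use a full lemmatiser and a tokeniser suited to the language.

```python
from collections import Counter

# Hypothetical toy table for the forms of "go" listed in the text.
# Forms that are not in the table are counted under themselves.
FORM_TO_LEMMA = {"goes": "go", "going": "go", "went": "go", "gone": "go"}


def lemma_frequencies(tokens):
    """Count tokens after mapping each one (lowercased) to its lemma."""
    return Counter(FORM_TO_LEMMA.get(t.lower(), t.lower()) for t in tokens)


# Invented sample sentence, split on whitespace for simplicity.
tokens = "She went home and he goes out while they go on going".split()
counts = lemma_frequencies(tokens)

if __name__ == "__main__":
    print(counts.most_common(3))
```

Here "went", "goes", "go" and "going" are all tallied under the single lemma "go", which is the grouping that a surface-form count would miss.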


Pronunciation

A word may have different pronunciations, depending on its phonetic environment (the neighbouring sounds) or on the degree of stress in a sentence. An example of the latter is the weak and strong forms of certain English function words like ''some'' and ''but'', which are pronounced one way when stressed and another when unstressed. Dictionaries usually give the pronunciation used when the word is pronounced alone (its isolation form) and with stress, but they may also note common weak forms of pronunciation.


Difference between stem and lemma

The stem is the part of the word that never changes even when morphologically inflected; a lemma is the least marked form of the word. For example, from "produced", the lemma is "produce", but the stem is "produc-". This is because there are words such as ''production'' and ''producing''. In linguistic analysis, the stem is defined more generally as the analyzed base form from which all inflected forms can be formed. When phonology is taken into account, the definition of the unchangeable part of the word is not useful, as can be seen by comparing the pronunciations of "produced" and "production". Some lexemes have several stems but one lemma. For instance, the verb "to go" has the stems "go" and "went" due to suppletion: the past tense was co-opted from a different verb, "to wend".
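The idea of the stem as "the part that never changes" can be approximated, for written forms only, by taking the longest prefix shared by a set of related words. The sketch below uses the standard-library function `os.path.commonprefix`, which compares strings character by character; the helper name `shared_prefix` and the word lists are assumptions chosen to match the examples in the text. It is a spelling-based heuristic and not a morphological analysis.

```python
from os.path import commonprefix


def shared_prefix(forms):
    """Longest character prefix shared by every written form in `forms`."""
    return commonprefix(list(forms))


# Word lists assumed for illustration, based on the examples in the text.
produce_forms = ["produce", "produces", "produced", "producing", "production"]
go_forms = ["go", "goes", "going", "went", "gone"]

produce_stem = shared_prefix(produce_forms)  # orthographic approximation only
go_stem = shared_prefix(go_forms)            # suppletion leaves nothing shared

if __name__ == "__main__":
    print(repr(produce_stem), repr(go_stem))
```

For the "produce" family the shared prefix is "produc", matching the stem given above, while for "go" the suppletive form "went" leaves an empty prefix, which is one way to see why such a lexeme is described as having several stems but a single lemma.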


Headword

A headword, lemma, or catchword is the word under which a set of related dictionary or encyclopedia entries appears. The headword is used to locate the entry, and dictates its alphabetical position. Depending on the size and nature of the dictionary or encyclopedia, the entry may include alternative meanings of the word, its etymology, pronunciation and inflections, compound words or phrases that contain the headword, and encyclopedic information about the concepts represented by the word. For example, the headword ''bread'' may contain the following (simplified) definitions:

:Bread
:''(noun)''
:* A common food made from the combination of flour, water and yeast
:* Money ''(slang)''
:''(verb)''
:* To coat in breadcrumbs
:''to know which side your bread is buttered'': to know how to act in your own best interests.

The ''Academic Dictionary of Lithuanian'' contains around 500,000 headwords. The ''Oxford English Dictionary'' (''OED'') has around 273,000 headwords along with 220,000 lemmas, while ''Webster's Third New International Dictionary'' has about 470,000. The ''Deutsches Wörterbuch'' (''DWB''), the largest and most comprehensive dictionary of the German language, has around 330,000 headwords (The Deutsches Wörterbuch at the BBAW, retrieved 22 June 2012). These values are cited by the dictionary makers and may not use exactly the same definition of a headword. In addition, headword counts may not accurately reflect a dictionary's physical size. The ''OED'' and the ''DWB'', for instance, include exhaustive historical reviews and exact citations from source documents not usually found in standard dictionaries.

The term "lemma" comes from the practice in Greco-Roman antiquity of using the word to refer to the headwords of marginal glosses in scholia; for this reason, the Ancient Greek plural form is sometimes used, namely ''lemmata'' (Greek λῆμμα, pl. λήμματα).


See also

* Lexeme
* Lexical Markup Framework
* Null morpheme
* Principal parts
* Root (linguistics)
* Uninflected word

