Language complexity

Language complexity is a topic in linguistics which can be divided into several sub-topics such as phonological, morphological, syntactic, and semantic complexity. The subject also carries importance for language evolution. Language complexity has been studied less than many other traditional fields of linguistics. While the consensus is turning towards recognizing that complexity is a suitable research area, a central focus has been on methodological choices. Some languages, particularly pidgins and creoles, are considered simpler than most other languages, but there is no direct ranking and no universal method of measurement, although several possibilities have been proposed within different schools of analysis.


History

Throughout the 19th century, differential complexity was taken for granted. The classical languages Latin and Greek, as well as Sanskrit, were considered to possess qualities which the rising European national languages could achieve only through an elaboration that would give them the structural and lexical complexity required by an advanced civilization. At the same time, languages described as 'primitive' were naturally considered to reflect the simplicity of their speakers. On the other hand, Friedrich Schlegel noted that the languages of some nations "which appear to be at the very lowest grade of intellectual culture", such as Basque, Sámi and some Native American languages, possess a striking degree of elaborateness.


Equal complexity hypothesis

During the 20th century, linguists and anthropologists adopted a standpoint that rejected any nationalist ideas about the superiority of the languages of the establishment. The first known statement of the idea that all languages are equally complex comes from Rulon S. Wells III in 1954, who attributes it to Charles F. Hockett. While laymen never ceased to consider certain languages as simple and others as complex, such a view was erased from official contexts. For instance, the 1971 edition of the Guinness Book of World Records featured Saramaccan, a creole language, as "the world's least complex language". According to linguists, this claim was "not founded on any serious evidence", and it was removed from later editions. Apparent complexity differences in certain areas were explained by a balancing force by which simplicity in one area would be compensated by complexity in another, a view expressed for example by David Crystal in 1987.

In 2001 the compensation hypothesis was eventually refuted by the creolist John McWhorter, who pointed out the absurdity of the idea that, as languages change, each would have to include a mechanism that calibrates it according to the complexity of all the other 6,000 or so languages around the world. He underscored that linguistics has no knowledge of any such mechanism. Revisiting the idea of differential complexity, McWhorter argued that it is indeed creole languages, such as Saramaccan, that are structurally "much simpler than all but very few older languages". In McWhorter's view this is not problematic for the equality of creole languages, because simpler structures convey logical meanings in the most straightforward manner, while increased language complexity is largely a matter of features which may not add much to the functionality or usefulness of the language. Examples of such features are inalienable possessive marking, switch-reference marking, syntactic asymmetries between matrix and subordinate clauses, grammatical gender, and other secondary features which are typically absent in creoles. McWhorter's notion that "unnatural" language contact in pidgins, creoles and other contact varieties inevitably destroys "natural" accretions in complexity perhaps represents a recapitulation of 19th-century ideas about the relationship between language contact and complexity.

During the years following McWhorter's article, several books and dozens of articles were published on the topic. To date, there have been research projects on language complexity, and several workshops for researchers have been organised by various universities.


Complexity metrics

At a general level, language complexity can be characterized as the number and variety of elements, and the elaborateness of their interrelational structure. This general characterisation can be broken down into sub-areas:

* Syntagmatic complexity: number of parts, such as word length in terms of phonemes, syllables etc.
* Paradigmatic complexity: variety of parts, such as phoneme inventory size or the number of distinctions in a grammatical category, e.g. aspect.
* Organizational complexity: e.g. ways of arranging components, phonotactic restrictions, variety of word orders.
* Hierarchic complexity: e.g. recursion, lexical–semantic hierarchies.

Measuring complexity is considered difficult, and the comparison of whole natural languages is regarded as a daunting task. On a more detailed level, it is possible to demonstrate that some structures are more complex than others. Phonology and morphology are areas where such comparisons have traditionally been made. For instance, linguistics has tools for the assessment of the phonological system of any given language. As for the study of syntactic complexity, grammatical rules have been proposed as a basis, but generative frameworks, such as the minimalist program and the Simpler Syntax framework, have been less successful in defining complexity and its predictions than non-formal ways of description.

Many researchers suggest that several different concepts may be needed when approaching complexity: entropy, size, description length, effective complexity, information, connectivity, irreducibility, low probability, syntactic depth etc. Research suggests that while methodological choices affect the results, even rather crude analytic tools may provide a feasible starting point for measuring grammatical complexity.
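A few of the concepts named above (size, entropy, description length) can be approximated with very crude tools. The following Python sketch is only an illustration under stated assumptions and is not a method taken from the literature discussed here: it treats orthographic characters as a stand-in for phonemes, whitespace-separated tokens as words, and the zlib compression ratio as a rough proxy for description length. All function names are hypothetical.

```python
import math
import zlib
from collections import Counter


def mean_word_length(words):
    """Syntagmatic proxy: average number of characters per word token.
    Assumption: characters stand in for phonemes."""
    return sum(len(w) for w in words) / len(words)


def inventory_size(words):
    """Paradigmatic proxy: number of distinct symbols used in the sample.
    Assumption: the character set stands in for a phoneme inventory."""
    return len(set("".join(words)))


def shannon_entropy(symbols):
    """Shannon entropy, in bits per symbol, of the observed distribution."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def compression_ratio(text):
    """Description-length proxy: compressed size divided by raw size.
    Lower values mean the text is more regular (more compressible)."""
    raw = text.encode("utf-8")
    return len(zlib.compress(raw, 9)) / len(raw)


if __name__ == "__main__":
    sample = "the cat sat on the mat and the dog sat on the log"
    words = sample.split()
    print("mean word length:", mean_word_length(words))
    print("symbol inventory:", inventory_size(words))
    print("word entropy (bits):", shannon_entropy(words))
    print("compression ratio:", compression_ratio(sample))
```

Such figures depend heavily on the sample, the writing system and the tokenization, so they are at best comparable across parallel texts of similar size and say nothing about a language as a whole.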


Computational tools

* Coh-Metrix: a computational tool that produces indices of the linguistic and discourse representations of a text.
* L2 Syntactic Complexity Analyzer (L2SCA): a computational tool, developed by Xiaofei Lu at the Pennsylvania State University, which produces syntactic complexity indices of written English language texts.

