Glottochronology (from Attic Greek γλῶττα ''tongue, language'' and χρόνος ''time'') is the part of lexicostatistics which involves comparative linguistics and deals with the chronological relationship between languages (Sheila Embleton (1992). Historical Linguistics: Mathematical concepts. In W. Bright (Ed.), ''International Encyclopedia of Linguistics'').

The idea was developed by Morris Swadesh in the 1950s in his article on Salish internal relationships. He developed it under two assumptions: that there exists a relatively stable ''basic vocabulary'' (referred to as ''Swadesh lists'') in all languages of the world, and that any replacements happen in a way analogous to radioactive decay, at a constant percentage per unit of time elapsed. Using mathematics and statistics, Swadesh developed an equation that gives an approximate date for when two languages separated. His methods aimed to aid linguistic anthropologists by giving them a definitive way to determine a separation date between two languages. The formula provides an approximate number of centuries since two languages are supposed to have separated from a single common ancestor. His methods also purported to provide information on when ancient languages may have existed.

Despite multiple studies and a sizeable literature on glottochronology, it is not widely used today and is surrounded by controversy. Glottochronology dates language separations from thousands of years ago, but many linguists are skeptical of the concept because it yields a 'probability' rather than a 'certainty'. On the other hand, some linguists say that glottochronology is gaining traction because its results can be related to archaeological dates. Glottochronology is not as accurate as archaeological data, but some linguists still believe that it can provide a solid estimate.

Over time many different extensions of the Swadesh method evolved; however, Swadesh's original method is so well known that 'glottochronology' is usually associated with him.

Methodology

The original method of glottochronology presumed that the core vocabulary of a language is replaced at a constant (or constant average) rate across all languages and cultures and so can be used to measure the passage of time. The process makes use of a list of lexical terms and morphemes that are comparable across multiple languages. The lists were compiled by Morris Swadesh and were assumed to be resistant to borrowing. The list was originally designed in 1952 with 200 items, but the refined 100-word list in Swadesh (1955) (Swadesh, Morris. (1955). Towards greater accuracy in lexicostatistic dating. ''International Journal of American Linguistics'', ''21'', 121–137) is much more common among modern-day linguists.

The core vocabulary was designed to encompass concepts common to every human language, such as personal pronouns, body parts, heavenly bodies and living beings, verbs of basic actions, numerals, basic adjectives, kin terms, and natural occurrences and events. A basic word list is meant to eliminate concepts that are specific to a particular culture or time period. Comparing word lists has shown, however, that this ideal is unattainable in practice and that the meaning set may need to be tailored to the languages being compared. Word lists are not homogeneous across studies; they are often changed and redesigned to suit the languages being studied. Linguists find it difficult to compile a word list in which every item is culturally unbiased. Many alternative word lists have been compiled by other linguists, often with fewer meaning slots.

The percentage of cognates (words with a common origin) in the word lists is then measured. The larger the percentage of cognates, the more recently the two languages being compared are presumed to have separated.
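The counting step can be illustrated with a minimal Python sketch. It assumes that a linguist has already judged each meaning slot of the word list as cognate or not; the function name `cognate_proportion`, the use of `None` for slots that cannot be compared, and the sample judgments are hypothetical and serve only as illustration.

```python
from typing import Optional, Sequence


def cognate_proportion(judgments: Sequence[Optional[bool]]) -> float:
    """Return the share of comparable meaning slots judged to be cognate.

    Each entry is True (cognate), False (not cognate) or None
    (slot missing in one language, so it is left out of the count).
    """
    scored = [j for j in judgments if j is not None]
    if not scored:
        raise ValueError("no comparable meaning slots")
    return sum(scored) / len(scored)


# Hypothetical judgments for five meaning slots; the fourth is not comparable.
example = [True, True, False, None, True]
share = cognate_proportion(example)  # 3 cognates out of 4 comparable slots
```

Leaving incomparable slots out of the denominator is one possible convention; studies that treat a missing item as a non-cognate would obtain a lower percentage from the same data.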

Glottochronologic constant

Dating from word lists relies on morpheme decay, that is, change in vocabulary. The rate of morpheme decay must stay constant for glottochronology to be applied to a language. This leads to a critique of the glottochronologic formula, because some linguists argue that the morpheme decay rate is not guaranteed to stay the same throughout history. The American linguist Robert Lees obtained a value for the "glottochronological constant" (''r''), the proportion of words retained per millennium, by considering the known changes in 13 pairs of languages using the 200-word list. He obtained a value of 0.8048 ± 0.0176 with 90% confidence. For his 100-word list, Swadesh obtained a value of 0.86, the higher value reflecting the elimination of semantically unstable words.

Divergence time

The basic formula of glottochronology proposed by Morris Swadesh is:

t = -\frac{\ln(c)}{L}

''t'' = a given period of time from one stage of the language to another (measured in millennia), ''c'' = proportion of wordlist items retained at the end of that period and ''L'' = rate of replacement for that word list. By testing historically verifiable cases in which ''t'' is known from nonlinguistic data (such as the approximate distance from Classical Latin to modern Romance languages), Swadesh arrived at the empirical value of approximately 0.14 for ''L'', which means that the rate of replacement constitutes around 14 words from the 100-wordlist per millennium. This is represented in the table below.
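As an illustration, the calculation can be sketched in a few lines of Python. The sketch assumes the exponential-decay form ''t'' = −ln(''c'')/''L'', with ''L'' = 0.14 as the default taken from the text; the function name `divergence_time` is hypothetical.

```python
import math


def divergence_time(c: float, L: float = 0.14) -> float:
    """Elapsed time in millennia for a retained proportion c.

    Assumes t = -ln(c) / L, where L is the replacement rate per
    millennium (0.14 is the empirical value quoted in the text).
    """
    if not 0.0 < c <= 1.0:
        raise ValueError("c must lie in (0, 1]")
    if L <= 0.0:
        raise ValueError("L must be positive")
    return -math.log(c) / L


# Half of the list retained gives roughly 4.95 millennia under these assumptions.
t_half = divergence_time(0.5)
```

Because ''c'' enters through a logarithm, small errors in the cognate count matter more for distantly related languages, where ''c'' is small, than for closely related ones.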

Results

Glottochronology was applied to a range of language families, including Salishan, Indo-European, Japonic, Afro-Asiatic, Chinese, Mayan and other American languages. For Amerind, correlations have been obtained with radiocarbon dating and blood groups as well as archaeology.

Example Wordlist

Below is an example of a basic word list composed of basic Turkish words and their English translations.

Discussion

The concept of language change is old, and its history is reviewed in Hymes (1973) and Wells (1973). In some sense, glottochronology is a reconstruction of history and can often be closely related to archaeology. Many linguistic studies find that glottochronology succeeds where it is supported by archaeological data. Glottochronology itself dates back to the mid-20th century (Lees, Robert. (1953). The basis of glottochronology. ''Language'', ''29'' (2), 113–127). An introduction to the subject is given in Embleton (1986) and in McMahon and McMahon (2005).

Glottochronology has been controversial ever since, partly because of issues of accuracy but also because of the question of whether its basis is sound (for example, Bergsland 1958; Bergsland and Vogt 1962; Fodor 1961; Chrétien 1962; Guy 1980). The concerns have been addressed by Dobson et al. (1972), Dyen (1973) (Dyen, Isidore, ed. (1973). ''Lexicostatistics in genetic linguistics: Proceedings of the Yale conference, April 3–4, 1971''. La Haye: Mouton) and Kruskal, Dyen and Black (1973) (Kruskal, Joseph B.; Dyen, Isidore; & Black, Paul. Some results from the vocabulary method of reconstructing language trees. In Isidore Dyen (ed.), ''Lexicostatistics in Genetic Linguistics'', The Hague: Mouton, 1973, pp. 30–55). The assumption of a single-word replacement rate can distort the divergence-time estimate when borrowed words are included (Thomason and Kaufman 1988). The presentations vary from "Why linguists don't do dates" to the one by Starostin discussed below.

Since its inception, glottochronology has been rejected by many linguists, mostly Indo-Europeanists of the school of the traditional comparative method. Criticisms have been answered in particular around three points of discussion:
* Criticism levelled against the higher stability of lexemes in Swadesh lists alone (Haarmann 1990) misses the point because a certain amount of losses only enables the computations (Sankoff 1970). The non-homogeneity of word lists often leads to a lack of understanding between linguists. Linguists also have difficulty finding a completely unbiased list of basic cultural words. It can take a long time, and several test lists, before a usable word list is found.
* Traditional glottochronology presumes that language changes at a stable rate.
:Thus, in Bergsland & Vogt (1962), the authors make an impressive demonstration, on the basis of actual language data verifiable by extralinguistic sources, that the "rate of change" for Icelandic constituted around 4% per millennium, but for closely connected Riksmal (Literary Norwegian), it would amount to as much as 20% (Swadesh's proposed "constant rate" was supposed to be around 14% per millennium).
:That and several other similar examples effectively proved that Swadesh's formula would not work on all available material, which is a serious accusation since evidence that can be used to "calibrate" the meaning of ''L'' (language history recorded during prolonged periods of time) is not overwhelmingly large in the first place.
:It is highly likely that the chance of replacement is different for every word or feature ("each word has its own history", among hundreds of other sources).
:That global assumption has been modified and downgraded to single words, even in single languages, in many newer attempts (see below).
:There is a lack of understanding of Swadesh's mathematical/statistical methods. Some linguists reject the methods in full because the statistics lead to 'probabilities' when linguists trust 'certainties' more.
* A serious argument is that language change arises from socio-historical events that are, of course, unforeseeable and, therefore, uncomputable.
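The practical effect of the unequal rates reported by Bergsland & Vogt can be illustrated with a short calculation. The following Python sketch assumes the exponential-decay form ''t'' = −ln(''c'')/''L'' with Swadesh's ''L'' = 0.14, and treats the 4% and 20% figures as the share of the list lost over exactly one millennium; the function name `apparent_age` and that reading of the figures are assumptions made for illustration.

```python
import math

ASSUMED_RATE = 0.14  # Swadesh's proposed replacement rate per millennium


def apparent_age(loss_per_millennium: float,
                 elapsed_millennia: float = 1.0,
                 assumed_rate: float = ASSUMED_RATE) -> float:
    """Age in millennia that the constant-rate formula would report.

    The language really loses `loss_per_millennium` of its list each
    millennium, but the date is computed with `assumed_rate`.
    """
    retained = (1.0 - loss_per_millennium) ** elapsed_millennia
    return -math.log(retained) / assumed_rate


icelandic = apparent_age(0.04)  # about 0.29 millennia for one real millennium
riksmal = apparent_age(0.20)    # about 1.59 millennia for one real millennium
```

Under these assumptions the same real time span of one millennium would be dated to roughly 290 years for Icelandic and roughly 1,590 years for Riksmal, which is the kind of discrepancy the criticism points to.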

Modifications

Somewhere in between the original concept of Swadesh and the rejection of glottochronology in its entirety lies the idea that glottochronology as a formal method of linguistic analysis becomes valid with the help of several important modifications. Thus, inhomogeneities in the replacement rate were dealt with by Van der Merwe (1966) (van der Merwe, N. J. 1966 "New mathematics for glottochronology", ''Current Anthropology'' 7: 485–500) by splitting the word list into classes, each with its own rate, while Dyen, James and Cole (1967) (Dyen, I., James, A. T., & J. W. L. Cole 1967 "Language divergence and estimated word retention rate") allowed each meaning to have its own rate. Simultaneous estimation of divergence time and replacement rate was studied by Kruskal, Dyen and Black.

Brainerd (1970) allowed for chance cognation, and drift effects were introduced by Gleason (1959). Sankoff (1973) suggested introducing a borrowing parameter and allowed synonyms. A combination of the various improvements is given in Sankoff's "Fully Parameterised Lexicostatistics". In 1972, Sankoff developed a model of genetic divergence of populations in a biological context. Embleton (1981) derives a simplified version of that model in a linguistic context and carries out a number of simulations with it, which are shown to give good results.

Improvements in statistical methodology in a completely different branch of science, phylogenetics (the study of changes in DNA over time), sparked a recent renewed interest. The new methods are more robust than the earlier ones because they calibrate points on the tree with known historical events and smooth the rates of change across them. As such, they no longer require the assumption of a constant rate of change (Gray & Atkinson 2003).

Starostin's method

Another attempt to introduce such modifications was made by the Russian linguist Sergei Starostin, who proposed the following:
* Systematic loanwords, borrowed from one language into another, are a disruptive factor and must be eliminated from the calculations; the one thing that really matters is the "native" replacement of items by items from the same language. The failure to notice that factor was a major reason for Swadesh's original estimation of the replacement rate at under 14 words from the 100-wordlist per millennium, but the real rate is much slower (around 5 or 6). Introducing that correction effectively cancels out the "Bergsland & Vogt" argument, since a thorough analysis of the Riksmal data shows that its basic wordlist includes about 15 to 16 borrowings from other Germanic languages (mostly Danish), and the exclusion of those elements from the calculations brings the rate down to the expected rate of 5 to 6 "native" replacements per millennium.
* The rate of change is not really constant but depends on the time period during which the word has existed in the language (the chance of lexeme X being replaced by lexeme Y increases in direct proportion to the time elapsed; the so-called "aging of words" is empirically understood as gradual "erosion" of the word's primary meaning under the weight of acquired secondary ones).
* Individual items on the 100-word list have different stability rates (for instance, the word "I" generally has a much lower chance of being replaced than the word "yellow").

The resulting formula, taking into account both the time dependence and the individual stability quotients, looks as follows:

t = \sqrt{\frac{\ln(c)}{-Lc}}

In that formula, −''Lc'' reflects the gradual slowing down of the replacement process because of different individual rates, since the least stable elements are the first and the quickest to be replaced, and the square root represents the reverse trend, the acceleration of replacement as items in the original wordlist "age" and become more prone to shifting their meaning. This formula is obviously more complicated than Swadesh's original one, but it yields, as shown by Starostin, more credible results than the former and more or less agrees with all the cases of language separation that can be confirmed by historical knowledge. On the other hand, it shows that glottochronology can be used as a serious scientific tool only on language families whose historical phonology has been meticulously elaborated (at least to the point of being able to distinguish clearly between cognates and loanwords).
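The modified formula can be sketched in the same way. The Python example below assumes the form ''t'' = √(ln(''c'') / (−''Lc'')) and takes ''L'' = 0.05 from the "around 5 or 6" native replacements per 100 words per millennium mentioned above; the function name `starostin_time` and the choice of 0.05 rather than 0.06 are assumptions made for illustration.

```python
import math


def starostin_time(c: float, L: float = 0.05) -> float:
    """Elapsed time in millennia under the modified formula.

    Assumes t = sqrt(ln(c) / (-L * c)), where c is the retained
    proportion (loanwords excluded) and L the native replacement rate.
    """
    if not 0.0 < c <= 1.0:
        raise ValueError("c must lie in (0, 1]")
    if L <= 0.0:
        raise ValueError("L must be positive")
    return math.sqrt(math.log(c) / (-L * c))


# 90% of the list retained gives about 1.53 millennia under these assumptions.
t_ninety = starostin_time(0.9)
```

Since ln(''c'') is negative or zero for ''c'' between 0 and 1, the quotient under the square root is never negative, so the result is always a real number.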

References

Bibliography

* Bergsland, Knut; & Vogt, Hans. (1962). On the validity of glottochronology. ''Current Anthropology'', ''3'', 115–153.
* Brainerd, Barron (1970). A Stochastic Process related to Language Change. ''Journal of Applied Probability'' 7, 69–78.
* Callaghan, Catherine A. (1991). Utian and the Swadesh list. In J. E. Redden (Ed.), ''Papers for the American Indian language conference, held at the University of California, Santa Cruz, July and August, 1991'' (pp. 218–237). Occasional papers on linguistics (No. 16). Carbondale: Department of Linguistics, Southern Illinois University.
* Campbell, Lyle. (1998). ''Historical Linguistics; An Introduction'' [Chapter 6.5]. Edinburgh: Edinburgh University Press.
* Chrétien, Douglas (1962). The Mathematical Models of Glottochronology. ''Language'' 38, 11–37.
* Crowley, Terry (1997). ''An introduction to historical linguistics''. 3rd ed. Auckland: Oxford Univ. Press. pp. 171–193.
* Dyen, Isidore (1965). "A Lexicostatistical classification of the Austronesian languages." ''International Journal of American Linguistics'', Memoir 19.
* Gray, R.D. & Atkinson, Q.D. (2003). "Language-tree divergence times support the Anatolian theory of Indo-European origin." ''Nature'' 426, 435–439.
* Gudschinsky, Sarah. (1956). The ABC's of lexicostatistics (glottochronology). ''Word'', ''12'', 175–210.
* Haarmann, Harald. (1990). "Basic vocabulary and language contacts; the disillusion of glottochronology." In ''Indogermanische Forschungen'' 95:7ff.
* Hockett, Charles F. (1958). ''A course in modern linguistics'' (Chap. 6). New York: Macmillan.
* Hoijer, Harry. (1956). Lexicostatistics: A critique. ''Language'', ''32'', 49–60.
* Holm, Hans J. (2003). The Proportionality Trap. Or: What is wrong with lexicostatistical Subgrouping. ''Indogermanische Forschungen'', ''108'', 38–46.
* Holm, Hans J. (2005). Genealogische Verwandtschaft. Kap. 45 in ''Quantitative Linguistik; ein internationales Handbuch. Herausgegeben von R. Köhler, G. Altmann, R. Piotrowski'', Berlin: Walter de Gruyter.
* Holm, Hans J. (2007). The new Arboretum of Indo-European 'Trees'; Can new algorithms reveal the Phylogeny and even Prehistory of IE?. ''Journal of Quantitative Linguistics'' 14-2:167–214.
* Hymes, Dell H. (1960). Lexicostatistics so far. ''Current Anthropology'', ''1'' (1), 3–44.
* McWhorter, John. (2001). ''The power of Babel''. New York: Freeman.
* Nettle, Daniel. (1999). Linguistic diversity of the Americas can be reconciled with a recent colonization. In ''PNAS'' 96(6):3325–9.
* Sankoff, David (1970). "On the Rate of Replacement of Word-Meaning Relationships." ''Language'' 46.564–569.
* Sjoberg, Andree; & Sjoberg, Gideon. (1956). Problems in glottochronology. ''American Anthropologist'', ''58'' (2), 296–308.
* Starostin, Sergei. (2002). Methodology Of Long-Range Comparison.
* Thomason, Sarah Grey, and Kaufman, Terrence. (1988). ''Language Contact, Creolization, and Genetic Linguistics''. Berkeley: University of California Press.
* Tischler, Johann. (1973). Glottochronologie und Lexikostatistik [Innsbrucker Beiträge zur Sprachwissenschaft 11]. Innsbruck.
* Wittmann, Henri (1969). "A lexico-statistic inquiry into the diachrony of Hittite." ''Indogermanische Forschungen'' 74.1–1
* Wittmann, Henri (1973). "The lexicostatistical classification of the French-based Creole languages." ''Lexicostatistics in genetic linguistics: Proceedings of the Yale conference, April 3–4, 1971'', dir. Isidore Dyen, 89–99. La Haye: Mouton.
* Zipf, George K. (1965). ''The Psychobiology of Language: an Introduction to Dynamic Philology.'' Cambridge, MA: M.I.T. Press.

External links

* Discussion with some statistics
* http://www.elinguistics.net/ Queryable experiment: quantification of the genetic proximity between 110 languages with trees and discussion