The Indo-European languages are a language family native to western and southern Eurasia. It comprises most of the languages of Europe together with those of the northern Indian subcontinent and the Iranian Plateau. A few of these languages, such as English and Spanish, have expanded through colonialism in the modern period and are now spoken across several continents. The Indo-European family is divided into several branches or sub-families: 8 groups, among them Albanian, have languages alive today, and a further 7 subdivisions are extinct.
Today, the most populous individual languages include Spanish and Russian, each with over 100 million native speakers. Several other languages have more than 50 million native speakers each. The Cornish language, by contrast, has fewer than 3,000 speakers. In total, 46% of the world's population (3.2 billion) speaks an Indo-European language as a first language, by far the highest proportion of any language family. There are about 445 living Indo-European languages, according to the estimate by ''Ethnologue'', with over two thirds (313) of them belonging to the Indo-Iranian branch.
All Indo-European languages have descended from a single prehistoric language, reconstructed as Proto-Indo-European, spoken sometime in the Neolithic era. Its precise geographical location, the Indo-European ''urheimat'', is unknown and has been the object of many competing hypotheses; the most widely accepted is the Kurgan hypothesis, which posits the ''urheimat'' to be the Pontic–Caspian steppe, associated with the Yamnaya culture around 3000 BC. By the time the first written records appeared, Indo-European had already evolved into numerous languages spoken across much of Europe and south-west Asia. Written evidence of Indo-European appeared during the Bronze Age in the form of Mycenaean Greek and the Anatolian languages. The oldest records are isolated Hittite words and names found in the texts of the Assyrian colony of Kültepe in eastern Anatolia, dating to the 20th century BC; these texts are otherwise written in the unrelated Old Assyrian Akkadian language, a Semitic language. Although no older written records of the original Proto-Indo-Europeans remain, some aspects of their culture and religion can be reconstructed from later evidence in the daughter cultures.
The Indo-European family is significant to the field of historical linguistics as it possesses the second-longest recorded history of any known family, after the Afroasiatic family in the form of the Egyptian language and the Semitic languages. The analysis of the family relationships between the Indo-European languages and the reconstruction of their common source was central to the development of the methodology of historical linguistics as an academic discipline in the 19th century.
The Indo-European family is not known to be linked to any other language family through any more distant genetic relationship, although several disputed proposals to that effect have been made.
During the nineteenth century, the linguistic concept of Indo-European languages was frequently used interchangeably with the racial concept of Aryan and the Biblical concept of Japhetite.
History of Indo-European linguistics
During the 16th century, European visitors to the Indian subcontinent began to notice similarities among Indo-Aryan and European languages. In 1583, English Jesuit missionary and Konkani scholar Thomas Stephens wrote a letter from Goa to his brother (not published until the 20th century) in which he noted similarities between Indian languages and Greek.
Another account was made by Filippo Sassetti, a merchant born in Florence in 1540, who travelled to the Indian subcontinent. Writing in 1585, he noted some word similarities between Sanskrit and Italian (these included ''devaḥ''/''dio'' "God", ''sarpaḥ''/''serpe'' "serpent", ''sapta''/''sette'' "seven", ''aṣṭa''/''otto'' "eight", and ''nava''/''nove'' "nine").
However, neither Stephens' nor Sassetti's observations led to further scholarly inquiry.
In 1647, Dutch linguist and scholar Marcus Zuerius van Boxhorn noted the similarity among certain Asian and European languages and theorized that they were derived from a primitive common language, which he called Scythian.
His hypothesis included Dutch and German, to which he later added Slavic and Baltic languages. However, Van Boxhorn's suggestions did not become widely known and did not stimulate further research.
Ottoman Turkish traveler Evliya Çelebi visited Vienna in 1665–1666 as part of a diplomatic mission and noted a few similarities between words in German and in Persian.
Coeurdoux and others made observations of the same type. Coeurdoux made a thorough comparison of Sanskrit, Latin and Greek conjugations in the late 1760s to suggest a relationship among them. Meanwhile, Mikhail Lomonosov compared different language groups, including Slavic, Baltic ("Kurlandic"), Iranian ("Medic"), "Hottentot" (Khoekhoe), and others, noting that related languages (including Latin, Greek, German and Russian) must have separated in antiquity from common ancestors.
[M.V. Lomonosov (drafts for ''Russian Grammar'', published 1755). In: Complete Edition, Moscow, 1952, vol. 7, pp. 652–59]
Представимъ долготу времени, которою сіи языки раздѣлились. ... Польской и россійской языкъ коль давно раздѣлились! Подумай же, когда курляндской! Подумай же, когда латинской, греч., нѣм., росс. О глубокая древность! [Imagine the depth of time when these languages separated! ... Polish and Russian separated so long ago! Now think how long ago [this happened to] Kurlandic! Think when [this happened to] Latin, Greek, German, and Russian! Oh, great antiquity!]
The hypothesis reappeared in 1786 when Sir William Jones first lectured on the striking similarities among three of the oldest languages known in his time: Latin, Greek, and Sanskrit, to which he tentatively added Gothic, Celtic, and Persian, though his classification contained some inaccuracies and omissions. In one of the most famous quotations in linguistics, Jones made a prescient statement in a lecture to the Asiatic Society of Bengal in 1786, conjecturing the existence of an earlier ancestor language, which he called "a common source" but did not name.
Thomas Young first used the term ''Indo-European'' in 1813, deriving it from the geographical extremes of the language family: from Western Europe to North India. A synonym is Indo-Germanic (''Idg.'' or ''IdG.''), specifying the family's southeasternmost and northwesternmost branches. This first appeared in French (''indo-germanique'') in 1810 in the work of Conrad Malte-Brun; in most languages this term is now dated or less common than ''Indo-European'', although in German ''indogermanisch'' remains the standard scientific term. A number of other synonymous terms have also been used.
Franz Bopp wrote ''On the conjugational system of the Sanskrit language compared with that of Greek, Latin, Persian and Germanic'' in 1816, and between 1833 and 1852 he wrote ''Comparative Grammar''. This marks the beginning of Indo-European studies as an academic discipline. The classical phase of Indo-European comparative linguistics leads from this work to August Schleicher's 1861 ''Compendium'' and up to Karl Brugmann's ''Grundriss'', published in the 1880s. Brugmann's neogrammarian reevaluation of the field and Ferdinand de Saussure's development of the laryngeal theory may be considered the beginning of "modern" Indo-European studies. The generation of Indo-Europeanists active in the last third of the 20th century (such as Calvert Watkins, Jochem Schindler, and Helmut Rix) developed a better understanding of morphology and of ablaut in the wake of the 1956 ''Apophony in Indo-European'' by Kuryłowicz, who in 1927 had pointed out the existence of the Hittite consonant ḫ. Kuryłowicz's discovery supported Ferdinand de Saussure's 1879 proposal of the existence of ''coefficients sonantiques'', elements de Saussure reconstructed to account for vowel length alternations in Indo-European languages. This led to the so-called laryngeal theory, a major step forward in Indo-European linguistics and a confirmation of de Saussure's theory.
The various subgroups of the Indo-European language family include ten major branches, listed below in alphabetical order:
* Albanian, attested from the 13th century AD; Proto-Albanian evolved from an ancient Paleo-Balkan language, traditionally thought to be Illyrian; however, the evidence supporting this is insufficient.
* Anatolian, extinct by Late Antiquity, spoken in Anatolia, attested in isolated Luwian terms mentioned in Semitic Old Assyrian texts from the 20th and 19th centuries BC, and in Hittite texts from about 1650 BC.
* Armenian, attested from the early 5th century AD.
* Balto-Slavic, believed by most Indo-Europeanists to form a phylogenetic unit, while a minority ascribes similarities to prolonged language contact.
** Slavic, attested from the 9th century AD (possibly earlier), earliest texts in Old Church Slavonic. Slavic languages include Bulgarian and Rusyn.
** Baltic, attested from the 14th century AD; although attested relatively recently, they retain many archaic features attributed to Proto-Indo-European (PIE). Living examples include Lithuanian.
* Celtic, attested since the 6th century BC; Lepontic inscriptions date as early as the 6th century BC; Celtiberian from the 2nd century BC; Primitive Irish Ogham inscriptions from the 4th or 5th century AD, earliest inscriptions in Old Welsh from the 7th century AD. Modern Celtic languages include Welsh and Scottish Gaelic.
* Germanic, earliest attestations in runic inscriptions from around the 2nd century AD, earliest coherent texts in Gothic, 4th century AD. Old English manuscript tradition from about the 8th century AD. Includes English and Low German.
* Hellenic (see also History of Greek); fragmentary records in Mycenaean Greek from between 1450 and 1350 BC have been found. Homeric texts date to the 8th century BC.
* Indo-Iranian, attested circa 1400 BC, descended from Proto-Indo-Iranian (dated to the late 3rd millennium BC).
** Indo-Aryan, attested from around 1400 BC in Hittite texts from Anatolia, showing traces of Indo-Aryan words. Epigraphically from the 3rd century BC in the form of Prakrit (Edicts of Ashoka). The Rigveda is assumed to preserve intact records via oral tradition dating from about the mid-second millennium BC in the form of Vedic Sanskrit. Includes a wide range of modern languages from Northern India and Southern Pakistan, as well as Sinhala of Sri Lanka and Dhivehi of the Maldives.
** Iranian or Iranic, attested from roughly 1000 BC in the form of Avestan. Epigraphically from 520 BC in the form of Old Persian. Includes Persian.
** Nuristani, which includes Zemiaki.
* Italic, attested from the 7th century BC. Includes the ancient Osco-Umbrian languages, as well as Latin and its descendants, the Romance languages, such as Italian and Catalan.
* Tocharian, with proposed links to the Afanasevo culture of Southern Siberia. Attested in two dialects (Turfanian and Kuchean, or Tocharian A and B) from roughly the 6th to the 9th century AD. Marginalized by the Old Turkic Uyghur Khaganate and probably extinct by the 10th century.
In addition to the classical ten branches listed above, several extinct and little-known languages and language-groups have existed or are proposed to have existed:
* Ancient Belgian: hypothetical language associated with the proposed Nordwestblock cultural area. Speculated to be connected to Italic or Venetic, and to have certain phonological features in common with Lusitanian.
* Cimmerian: possibly Iranic, Thracian, or Celtic
* Dacian: possibly very close to Thracian
* Elymian: poorly attested language spoken by the Elymians, one of the three indigenous (i.e. pre-Greek and pre-Punic) tribes of Sicily. Indo-European affiliation uncertain, but relationships to Italic or Anatolian have been proposed.
* Illyrian: possibly related to Albanian, Messapian, or both
* Liburnian: doubtful affiliation, features shared with Venetic, Illyrian, and Indo-Hittite, significant transition of the Pre-Indo-European elements
* Ligurian: possibly close to or part of Celtic.
* Lusitanian: possibly related to (or part of) Celtic, Ligurian, or Italic
* Ancient Macedonian: proposed relationship to Greek.
* Messapian: not conclusively deciphered
* Paionian: extinct language once spoken north of Macedon
* Phrygian: language of the ancient Phrygians
* Sicel: an ancient language spoken by the Sicels (Greek Sikeloi, Latin Siculi), one of the three indigenous (i.e. pre-Greek and pre-Punic) tribes of Sicily. Proposed relationship to Latin or proto-Illyrian (Pre-Indo-European) at an earlier stage.
* Sorothaptic: proposed, pre-Celtic, Iberian language
* Thracian: possibly including Dacian
* Venetic: shares several similarities with Latin and the Italic languages, but also has some affinities with other IE languages, especially Germanic and Celtic.
Membership of languages in the Indo-European language family is determined by genealogical relationships, meaning that all members are presumed descendants of a common ancestor, Proto-Indo-European. Membership in the various branches, groups and subgroups of Indo-European is also genealogical, but here the defining factors are ''shared innovations'' among various languages, suggesting a common ancestor that split off from other Indo-European groups. For example, what makes the Germanic languages a branch of Indo-European is that much of their structure and phonology can be stated in rules that apply to all of them. Many of their common features are presumed innovations that took place in Proto-Germanic, the source of all the Germanic languages.
In the 21st century, several attempts have been made to model the phylogeny of Indo-European languages using Bayesian methodologies similar to those applied to problems in biological phylogeny.
Although there are differences in absolute timing between the various analyses, there is much commonality between them, including the result that the first known language groups to diverge were the Anatolian and Tocharian language families, in that order.
Tree versus wave model
The "tree model" is considered an appropriate representation of the genealogical history of a language family if communities do not remain in contact after their languages have started to diverge. In this case, subgroups defined by shared innovations form a nested pattern. The tree model is not appropriate in cases where languages remain in contact as they diversify; in such cases subgroups may overlap, and the "wave model" is a more accurate representation. Most approaches to Indo-European subgrouping to date have assumed that the tree model is by and large valid for Indo-European; however, there is also a long tradition of wave-model approaches.
In addition to genealogical changes, many of the early changes in Indo-European languages can be attributed to language contact. It has been asserted, for example, that many of the more striking features shared by Italic languages (Latin, Oscan, Umbrian, etc.) might well be areal features. More certainly, very similar-looking alterations in the systems of long vowels in the West Germanic languages greatly postdate any possible notion of a proto-language innovation (and cannot readily be regarded as "areal", either, because English and continental West Germanic were not a linguistic area). In a similar vein, there are many similar innovations in Germanic and Balto-Slavic that are far more likely areal features than traceable to a common proto-language, such as the uniform development of a high vowel (*''u'' in the case of Germanic, *''i/u'' in the case of Baltic and Slavic) before the PIE syllabic resonants *''ṛ, *ḷ, *ṃ, *ṇ'', unique to these two groups among IE languages, which is in agreement with the wave model. The Balkan sprachbund even features areal convergence among members of very different branches.
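The contrast between the tree and wave models comes down to whether subgroups defined by shared innovations nest cleanly or overlap. A minimal Python sketch of that check follows, assuming each subgroup is given as a set of language labels. The labels `A` to `D` and the function name `is_tree_like` are hypothetical placeholders, not data or terminology from any published analysis.

```python
from itertools import combinations


def is_tree_like(subgroups):
    """Return True if every pair of subgroups is either disjoint or nested.

    A nested (tree-compatible) pattern allows no partial overlap: two
    subgroups may share members only if one fully contains the other.
    """
    sets = [frozenset(group) for group in subgroups]
    for a, b in combinations(sets, 2):
        overlap = a & b
        # Partial overlap: some shared members, but neither contains the other.
        if overlap and overlap != a and overlap != b:
            return False
    return True


# Hypothetical innovation-defined subgroups over placeholder languages.
nested = [{"A", "B"}, {"A", "B", "C"}, {"D"}]
overlapping = [{"A", "B"}, {"B", "C"}]

print(is_tree_like(nested))       # True: compatible with a tree
print(is_tree_like(overlapping))  # False: the wave model fits better
```

The check is deliberately binary; real subgrouping work weighs many innovations of uneven reliability, which this sketch does not attempt to model.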
An extension to the Ringe–Warnow model of language evolution suggests that early IE featured limited contact between distinct lineages, with only the Germanic subfamily exhibiting less treelike behaviour, as it acquired some characteristics from neighbours early in its evolution. The internal diversification of West Germanic in particular is said to have been radically non-treelike.
Specialists have postulated the existence of higher-order subgroups such as Italo-Celtic, Graeco-Armeno-Aryan, and Balto-Slavo-Germanic. However, unlike the ten traditional branches, these are all controversial to a greater or lesser degree.
The Italo-Celtic subgroup was at one point uncontroversial, considered by Antoine Meillet to be even better established than Balto-Slavic. The main lines of evidence included the genitive suffix ''-ī''; the superlative suffix ''-m̥mo''; the change of /p/ to /kʷ/ before another /kʷ/ in the same word (as in ''penkʷe'' > ''*kʷenkʷe'' > Latin ''quīnque'', Old Irish ''cóic''); and the subjunctive morpheme ''-ā-''. This evidence was prominently challenged by Calvert Watkins, while Michael Weiss has argued for the subgroup.
Evidence for a relationship between Greek and Armenian includes the regular change of the second laryngeal to ''a'' at the beginnings of words, as well as terms for "woman" and "sheep". Greek and Indo-Iranian share innovations mainly in verbal morphology and patterns of nominal derivation. Relations have also been proposed between Phrygian and Greek, and between Thracian and Armenian. Some fundamental shared features, like the aorist (a verb form denoting action without reference to duration or completion) having the perfect active particle -s fixed to the stem, link this group closer to Anatolian languages and Tocharian. Shared features with Balto-Slavic languages, on the other hand (especially present and preterit formations), might be due to later contacts.
The Indo-Hittite hypothesis proposes that the Indo-European language family consists of two main branches: one represented by the Anatolian languages and another encompassing all other Indo-European languages. Features that separate Anatolian from all other branches of Indo-European (such as the gender or the verb system) have been interpreted alternately as archaic debris or as innovations due to prolonged isolation. Points proffered in favour of the Indo-Hittite hypothesis are the (non-universal) Indo-European agricultural terminology in Anatolia and the preservation of laryngeals. However, in general this hypothesis is considered to attribute too much weight to the Anatolian evidence. According to another view, the Anatolian subgroup left the Indo-European parent language comparatively late, approximately at the same time as Indo-Iranian and later than the Greek or Armenian divisions. A third view, especially prevalent in the so-called French school of Indo-European studies, holds that extant similarities in non-satem
languages in general—including Anatolian—might be due to their peripheral location in the Indo-European language-area and to early separation, rather than indicating a special ancestral relationship. Hans J. Holm, based on lexical calculations, arrives at a picture roughly replicating the general scholarly opinion and refuting the Indo-Hittite hypothesis.
Satem and centum languages
The division of the Indo-European languages into satem and centum groups was put forward by Peter von Bradke in 1890, although Karl Brugmann did propose a similar type of division in 1886. In the satem languages, which include the Balto-Slavic and Indo-Iranian branches, as well as (in most respects) Albanian and Armenian, the reconstructed Proto-Indo-European palatovelars remained distinct and were fricativized, while the labiovelars merged with the 'plain velars'. In the centum languages, the palatovelars merged with the plain velars, while the labiovelars remained distinct. The results of these alternative developments are exemplified by the words for "hundred" in Avestan (''satem'') and Latin (''centum''): the initial palatovelar developed into a fricative in the former, but became an ordinary velar in the latter.
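The two merger patterns described above can be laid out side by side. The following Python sketch is only an illustration of the description in this section: the outcome labels and the names `velar_outcomes` and `merged_series` are hypothetical, and the real developments differ in detail from branch to branch.

```python
# The three reconstructed PIE dorsal series named in the text.
SERIES = ("palatovelar", "plain velar", "labiovelar")


def velar_outcomes(group):
    """Map each PIE dorsal series to its broad outcome in a satem or centum language."""
    if group == "satem":
        return {
            "palatovelar": "fricative",  # remained distinct, fricativized
            "plain velar": "velar",
            "labiovelar": "velar",       # merged with the plain velars
        }
    if group == "centum":
        return {
            "palatovelar": "velar",      # merged with the plain velars
            "plain velar": "velar",
            "labiovelar": "labiovelar",  # remained distinct
        }
    raise ValueError(f"unknown group: {group!r}")


def merged_series(group):
    """List the series that fall together as plain velars in the given group."""
    outcomes = velar_outcomes(group)
    return sorted(s for s in SERIES if outcomes[s] == "velar")


print(merged_series("satem"))   # ['labiovelar', 'plain velar']
print(merged_series("centum"))  # ['palatovelar', 'plain velar']
```

In both groups the three-way contrast is reduced to two; the groups differ only in which series joins the plain velars.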
Rather than being a genealogical separation, the centum–satem division is commonly seen as resulting from innovative changes that spread across PIE dialect-branches over a particular geographical area; the centum–satem isogloss intersects a number of other isoglosses that mark distinctions between features in the early IE branches. It may be that the centum branches in fact reflect the original state of affairs in PIE, and only the satem branches shared a set of innovations, which affected all but the peripheral areas of the PIE dialect continuum. Kortlandt proposes that the ancestors of Balts and Slavs took part in satemization before being drawn later into the western Indo-European sphere.
Some linguists propose that Indo-European languages form part of one of several hypothetical macrofamilies. However, these theories remain highly controversial and are not accepted by most linguists in the field. Some of the smaller proposed macrofamilies include:
* Indo-Uralic, joining Indo-European with Uralic
* Pontic, postulated by John Colarusso, which joins Indo-European with Northwest Caucasian
Other, larger proposed families that include the Indo-European languages are:
* Eurasiatic, a theory championed by Joseph Greenberg, comprising the Uralic and various 'Paleosiberian' families (such as Ainu) and possibly others
* Nostratic, comprising all or some of the Eurasiatic languages as well as the Kartvelian, Dravidian (or wider, Elamo-Dravidian) and Afroasiatic families
* Borean languages, separately proposed by Harold C. Fleming and Sergei Starostin, a language family encompassing almost all of the world's natural languages with the exception of those native to sub-Saharan Africa, New Guinea, and the Andaman Islands
Objections to such groupings are not based on any theoretical claim about the likely historical existence or non-existence of such macrofamilies; it is entirely reasonable to suppose that they might have existed. The serious difficulty lies in identifying the details of actual relationships between language families, because it is very hard to find concrete evidence that transcends chance resemblance, or is not equally likely explained as being due to borrowing (including loanwords, which can travel very long distances). Because the signal-to-noise ratio in historical linguistics declines over time, at great enough time-depths it becomes open to reasonable doubt that one can even distinguish between signal and noise.
The proposed Proto-Indo-European language (PIE) is the reconstructed common ancestor of the Indo-European languages, spoken by the Proto-Indo-Europeans. From the 1960s, knowledge of Anatolian became certain enough to establish its relationship to PIE. Using the method of internal reconstruction, an earlier stage, called Pre-Proto-Indo-European, has been proposed.
PIE was an inflected language, in which the grammatical relationships between words were signaled through inflectional morphemes (usually endings). The roots of PIE are basic morphemes carrying a lexical meaning. By addition of suffixes, they form stems, and by addition of endings, these form grammatically inflected words (such as nouns). The reconstructed Indo-European verb system is complex and, like the noun, exhibits a system of ablaut.
The diversification of the parent language into the attested branches of daughter languages is historically unattested. The timeline of the evolution of the various daughter languages, on the other hand, is mostly undisputed, quite regardless of the question of Indo-European origins.
Using a mathematical analysis borrowed from evolutionary biology, Don Ringe and Tandy Warnow propose the following evolutionary tree of Indo-European branches:
* Pre-Anatolian (before 3500 BC)
* Pre-Italic and Pre-Celtic (before 2500 BC)
* Pre-Armenian and Pre-Greek (after 2500 BC)
* Pre-Germanic and Pre-Balto-Slavic; proto-Germanic c. 500 BC
David Anthony proposes the following sequence:
* Pre-Italic and Pre-Celtic (3000 BC)
* Pre-Armenian (2800 BC)
* Pre-Balto-Slavic (2800 BC)
* Pre-Greek (2500 BC)
* Proto-Indo-Iranian (2200 BC); split between Iranian and Old Indic 1800 BC
From 1500 BC the following sequence may be given:
* 1500–1000 BC: The Nordic Bronze Age, and the (pre-)Proto-Celtic Urnfield cultures emerge in Central Europe, introducing the Iron Age. Migration of the Proto-Italic speakers into the Italian peninsula (Bagnolo stele). Redaction of the Rigveda and rise of the Vedic civilization in the Punjab. The Mycenaean civilization gives way to the Greek Dark Ages. Hittite goes extinct.
* 1000–500 BC: The Celtic languages spread over Central and Western Europe. Baltic languages are spoken in a huge area from present-day Poland to the Ural Mountains. Proto-Germanic, and the beginning of Classical Antiquity. The Vedic Civilization gives way to the Mahajanapadas. Siddhartha Gautama preaches Buddhism. Zoroaster composes the Gathas; rise of the Achaemenid Empire, replacing the Elamites. Separation of Proto-Italic into Osco-Umbrian and Latin-Faliscan. Genesis of the Greek and Old Italic alphabets. A variety of Paleo-Balkan languages are spoken in Southern Europe.
* 500 BC – 1 BC/AD: Classical Antiquity: spread of Greek throughout the Mediterranean and, during the Hellenistic period, to Central Asia and the Hindukush. Kushan Empire, Mauryan Empire.
* 1 BC – AD 500: Late Antiquity, Gupta period; attestation of Armenian. The Roman Empire and then the Migration period marginalize the Celtic languages to the British Isles. Sogdian, an Eastern Iranian language, becomes the ''lingua franca'' of the Silk Road in Central Asia leading to China, due to the proliferation of Sogdian merchants there. The last of the Anatolian languages become extinct.
* 500–1000: Early Middle Ages. The Viking Age forms an Old Norse koine spanning Scandinavia, the British Isles and Iceland. The Islamic conquest and the Turkic expansion result in the Arabization of significant areas where Indo-European languages were spoken. Tocharian becomes extinct in the course of the Turkic expansion, while Northeastern Iranian is reduced to small refugia. Slavic languages spread over wide areas in central, eastern and southeastern Europe, largely replacing Romance in the Balkans (with the exception of Romanian) and whatever was left of the Paleo-Balkan languages, with the exception of Albanian.
* 1000–1500: Late Middle Ages: attestation of Albanian.
* 1500–2000: Early Modern period to present: Colonialism results in the spread of Indo-European languages to every continent, most notably Romance (North, Central and South America, North and Sub-Saharan Africa, West Asia), West Germanic (English in North America, Sub-Saharan Africa, East Asia and Australia; to a lesser extent Dutch and German), and Russian to Central Asia and North Asia.
Important languages for reconstruction
In reconstructing the history of the Indo-European languages and the form of the Proto-Indo-European language, some languages have been of particular importance. These generally include the ancient Indo-European languages that are both well-attested and documented at an early date, although some languages from later periods are important if they are particularly linguistically conservative (most notably, Lithuanian). Early poetry is of special significance because of the rigid poetic meter normally employed, which makes it possible to reconstruct a number of features (e.g. vowel length) that were either unwritten or corrupted in the process of transmission down to the earliest extant written manuscripts.
Most notable of all:
* Vedic Sanskrit (c. 1500–500 BC). This language is unique in that its source documents were all composed orally, and were passed down through oral tradition for c. 2,000 years before ever being written down. The oldest documents are all in poetic form; oldest and most important of all is the Rigveda (c. 1500 BC).
* Ancient Greek (c. 750–400 BC). Mycenaean Greek (c. 1450 BC) is the oldest recorded form, but its value is lessened by the limited material, restricted subject matter, and highly ambiguous writing system. More important is Ancient Greek, documented extensively beginning with the two Homeric poems (the ''Iliad'' and the ''Odyssey'', c. 750 BC).
* Hittite (c. 1700–1200 BC). This is the earliest-recorded of all Indo-European languages, and highly divergent from the others due to the early separation of the Anatolian languages from the remainder. It possesses some highly archaic features found only fragmentarily, if at all, in other languages. At the same time, however, it appears to have undergone many early phonological and grammatical changes which, combined with the ambiguities of its writing system, hinder its usefulness somewhat.
Other primary sources:
* Latin, attested in a huge amount of poetic and prose material in the Classical period (c. 200 BC – 100 AD) and limited older material from as early as c. 600 BC.
* Gothic (the most archaic well-documented Germanic language, c. 350 AD), along with the combined witness of the other old Germanic languages: most importantly, Old English (c. 800–1000 AD), Old High German (c. 750–1000 AD) and Old Norse (c. 1100–1300 AD, with limited earlier sources dating all the way back to c. 200 AD).
* Old Avestan (c. 1700–1200 BC) and Younger Avestan (c. 900 BC). Documentation is sparse, but nonetheless quite important due to its highly archaic nature.
* Modern Lithuanian, with limited records in Old Lithuanian (c. 1500–1700 AD).
* Old Church Slavonic (c. 900–1000 AD).
Other secondary sources, of lesser value due to poor attestation:
* Luwian and other Anatolian languages (c. 1400–400 BC).
* Oscan, Umbrian and other Old Italic languages (c. 600–200 BC).
* Old Persian (c. 500 BC).
* Old Prussian (c. 1350–1600 AD); even more archaic than Lithuanian.
Other secondary sources, of lesser value due to extensive phonological changes and relatively limited attestation:
* Old Irish (c. 700–850 AD).
* Tocharian (c. 500–800 AD), underwent large phonetic shifts and mergers in the proto-language, and has an almost entirely reworked declension system.
* Classical Armenian (c. 400–1000 AD).
* Albanian (c. 1450–current time).
As the Proto-Indo-European (PIE) language broke up, its sound system diverged as well, changing according to various sound laws evidenced in the daughter languages.
PIE is normally reconstructed with a complex system of 15 stop consonants, including an unusual three-way phonation (voicing) distinction between voiceless, voiced and "voiced aspirated" (i.e. breathy voiced) stops, and a three-way distinction among velar consonants (''k''-type sounds) between "palatal" ''ḱ ǵ ǵh'', "plain velar" ''k g gh'' and labiovelar ''kʷ gʷ gʷh''. (The correctness of the terms ''palatal'' and ''plain velar'' is disputed; see Proto-Indo-European phonology.) All daughter languages have reduced the number of distinctions among these sounds, often in divergent ways.
As an example, in English, one of the Germanic languages, the following are some of the major changes that happened:
None of the daughter-language families (except possibly Anatolian, particularly Luvian) reflects the plain velar stops differently from the other two series, and there is even a certain amount of dispute whether this series existed at all in PIE. The major distinction between ''centum'' and ''satem'' languages corresponds to the outcome of the PIE plain velars:
* The "central" ''satem'' languages (Indo-Iranian, Balto-Slavic, Albanian, and Armenian) reflect both "plain velar" and labiovelar stops as plain velars, often with secondary palatalization before a front vowel (''e i ē ī''). The "palatal" stops are palatalized and often appear as sibilants (usually but not always distinct from the secondarily palatalized stops).
* The "peripheral" ''centum'' languages (such as Germanic) reflect both "palatal" and "plain velar" stops as plain velars, while the labiovelars continue unchanged, often with later reduction into plain labial or velar consonants.
The three-way PIE distinction between voiceless, voiced and voiced aspirated stops is considered extremely unusual from the perspective of linguistic typology
—particularly in the existence of voiced aspirated stops without a corresponding series of voiceless aspirated stops. None of the various daughter-language families continue it unchanged, with numerous "solutions" to the apparently unstable PIE situation:
* The Indo-Aryan languages preserve the three series unchanged but have evolved a fourth series of voiceless aspirated consonants.
* The Iranian languages probably passed through the same stage, subsequently changing the aspirated stops into fricatives.
* Greek converted the voiced aspirates into voiceless aspirates.
* Italic probably passed through the same stage, but reflects the voiced aspirates as voiceless fricatives, especially ''f'' (or sometimes plain voiced stops in Latin).
* Celtic, Balto-Slavic, Anatolian, and Albanian merge the voiced aspirated into plain voiced stops.
* Germanic and Armenian change all three series in a chain shift (e.g. with ''bh b p'' becoming ''b p f'', known as ''Grimm's law'' in Germanic).
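The chain shift in the last item only works if all three changes apply at once. If they applied one after another, the output of one change would feed the next. The Python sketch below illustrates this with the labial series given in the text (''bh b p'' becoming ''b p f''). The function names and the plain ASCII spelling of the sounds are assumptions made for the example, and the other places of articulation are left out.

```python
import re

# Labial part of the chain shift as stated in the text: bh > b, b > p, p > f.
SHIFT = {"bh": "b", "b": "p", "p": "f"}

# Longest keys first, so that "bh" is matched before "b".
_PATTERN = re.compile("|".join(sorted(map(re.escape, SHIFT), key=len, reverse=True)))


def chain_shift(form: str) -> str:
    """Apply all three changes simultaneously, in a single pass."""
    return _PATTERN.sub(lambda m: SHIFT[m.group(0)], form)


def naive_sequential(form: str) -> str:
    """Apply the changes one after another (wrong: the series collapse)."""
    for old, new in SHIFT.items():
        form = form.replace(old, new)
    return form


print(chain_shift("bh b p"))       # b p f
print(naive_sequential("bh b p"))  # f f f
```

The single-pass version keeps the three series distinct, which is what makes this a chain shift and not a merger.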
Among the other notable changes affecting consonants are:
* The Ruki sound law (''s'' becomes /ʃ/ after ''r, u, k, i'') in the ''satem'' languages.
* Loss of prevocalic ''p'' in Proto-Celtic.
* Development of prevocalic ''s'' to ''h'' in Proto-Greek, with later loss of ''h'' between vowels.
* Verner's law in Proto-Germanic.
* Grassmann's law (dissimilation of aspirates) independently in Proto-Greek and Proto-Indo-Iranian.
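Dissimilation of aspirates, the last item above, can be sketched as a rule over a list of segments: an aspirated stop loses its aspiration when another aspirated stop begins the next syllable. The Python sketch below is a deliberate simplification. The function name, the ASCII digraph spelling, and the crude syllable model (everything up to and including the next run of vowels counts as the current syllable) are all assumptions made for this example. The test forms follow the Greek alternation between ''thrik-s'' "hair" and ''trikh-os'' "of hair".

```python
# Aspirated stops and their plain counterparts (ASCII digraphs, for illustration).
ASPIRATES = {"ph": "p", "th": "t", "kh": "k", "bh": "b", "dh": "d", "gh": "g"}
VOWELS = set("aeiou")


def grassmann(segments):
    """Deaspirate an aspirate if an aspirate begins the next syllable (simplified)."""
    out = list(segments)
    n = len(segments)
    for i, seg in enumerate(segments):
        if seg not in ASPIRATES:
            continue
        j = i + 1
        # Skip the rest of the current onset ...
        while j < n and segments[j] not in VOWELS:
            j += 1
        # ... and the vowel(s) of the current syllable.
        while j < n and segments[j] in VOWELS:
            j += 1
        # Scan the consonants that follow, up to the next vowel.
        while j < n and segments[j] not in VOWELS:
            if segments[j] in ASPIRATES:
                out[i] = ASPIRATES[seg]
                break
            j += 1
    return out


print(grassmann(["th", "r", "i", "kh", "o", "s"]))  # ['t', 'r', 'i', 'kh', 'o', 's']
print(grassmann(["th", "r", "i", "k", "s"]))        # ['th', 'r', 'i', 'k', 's']
```

Only the first of two aspirates is affected. Where the second stop is not aspirated, as in the second form, the first keeps its aspiration.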
The following table shows the basic outcomes of PIE consonants in some of the most important daughter languages for the purposes of reconstruction. For a fuller table, see Indo-European sound laws.
* C- At the beginning of a word.
* -C- Between vowels.
* -C At the end of a word.
* `-C- Following an unstressed vowel (Verner's law).
* Between vowels, or between a vowel and a neighbouring sound (on either side).
* Before a (PIE) stop.
* After a (PIE) obstruent.
* Before or after an obstruent.
* Before an original laryngeal.
* Before a (PIE) front vowel.
* Before secondary (post-PIE) front vowels.
* Before or after the (PIE) sound that triggers the boukólos rule.
* Before a sonorant.
* Before or after a sonorant.
* After ''r, u, k, i'' (Ruki sound law).
* Before an aspirated consonant in the next syllable (Grassmann's law, also known as dissimilation of aspirates).
* Before a (PIE) front vowel, as well as before an aspirated consonant in the next syllable (Grassmann's law, also known as dissimilation of aspirates).
* Before or after a particular (PIE) sound, as well as before an aspirated consonant in the next syllable (Grassmann's law, also known as dissimilation of aspirates).
Comparison of conjugations
The following table presents a comparison of conjugations of the thematic present indicative of the verbal root *''bʰer-'', the source of the English verb ''to bear'', and its reflexes in various early attested IE languages and their modern descendants or relatives, showing that all of these languages had an inflectional verb system in their early stage.
While similarities are still visible between the modern descendants and relatives of these ancient languages, the differences have increased over time. Some IE languages have moved from synthetic verb systems to largely periphrastic systems. In addition, the pronouns of periphrastic forms are in brackets when they appear. Some of these verbs have undergone a change in meaning as well.
* In Modern Irish, ''beir'' usually only carries the meaning ''to bear'' in the sense of bearing a child; its common meanings are ''to catch, grab''.
* The Hindustani (Hindi and Urdu) verb ''bʰarnā'', the continuation of the Sanskrit verb, can have a variety of meanings, but the most common is "to fill". The forms given in the table, although etymologically derived from the present indicative, now have the meaning of future subjunctive. The loss of the present indicative in Hindustani is roughly compensated by the periphrastic habitual indicative construction, using the habitual participle (etymologically from the Sanskrit present participle ''bʰarant-'') and an auxiliary: ''ma͠i bʰartā hū̃, tū bʰartā hai, vah bʰartā hai, ham bʰarte ha͠i, tum bʰarte ho, ve bʰarte ha͠i'' (masculine forms).
* German is not directly descended from Gothic, but the Gothic forms are a close approximation of what the early West Germanic forms of c. 400 AD would have looked like. The cognate of Germanic ''beranan'' (English ''bear'') survives in German only in the compound ''gebären'', meaning "bear (a child)".
* The Latin verb ''ferre'' is irregular, and not a good representative of a normal thematic verb. In most Romance languages, such as French, other verbs now mean "to carry" (e.g. Fr. ''porter'' < Lat. ''portare''), and ''ferre'' was borrowed and nativized only in compounds such as ''souffrir'' "to suffer" (from Latin ''sub-'' and ''ferre'') and ''conférer'' "to confer" (from Latin ''con-'' and ''ferre'').
* In Modern Greek, ''phero'' φέρω (modern transliteration ''fero'') "to bear" is still used, but only in specific contexts, and is most common in compounds such as αναφέρω, διαφέρω, εισφέρω, εκφέρω, καταφέρω, προφέρω, προαναφέρω, προσφέρω etc. The form that is (very) common today is ''pherno'' φέρνω (modern transliteration ''ferno''), meaning "to bring". Additionally, the perfective form of ''pherno'' (used for the subjunctive mood and also for the future tense) is also ''phero''.
* In Modern Russian, ''брать'' (brat') carries the meaning ''to take''. ''Бремя'' (br'em'a) means ''burden'', as something heavy to bear, and the derivative ''беременность'' (b'er'em'ennost') means ''pregnancy''.
Comparison of cognates
Today, Indo-European languages are spoken by 3.2 billion native speakers across all inhabited continents, the largest number by far for any recognised language family. Of the 20 languages with the largest numbers of native speakers according to ''Ethnologue'', 10 are Indo-European, among them Spanish and Marathi; together these 10 account for over 1.7 billion native speakers.
Additionally, hundreds of millions of people worldwide study Indo-European languages as secondary or tertiary languages, including in cultures which have completely different language families and historical backgrounds; there are between 600 million and one billion L2 learners of English alone.
The success of the language family, including the large number of speakers and the vast portions of the Earth that they inhabit, is due to several factors. The ancient Indo-European migrations and widespread dissemination of Indo-European culture, including that of the Proto-Indo-Europeans themselves, and that of their daughter cultures including the Indo-Aryans, Iranian peoples, Germanic peoples, and Slavs, led to these peoples' branches of the language family already taking a dominant foothold in virtually all of Eurasia except for swathes of the Near East and East Asia, replacing many (but not all) of the previously spoken pre-Indo-European languages of this extensive area. However, Semitic languages remain dominant in much of the Middle East and North Africa, and Caucasian languages in much of the Caucasus region. Similarly, in Europe and the Urals the Uralic languages (such as Hungarian, Finnish, Estonian etc.) remain, as does Basque, a pre-Indo-European isolate.
Despite being unaware of their common linguistic origin, diverse groups of Indo-European speakers continued to culturally dominate and often replace the indigenous languages of the western two-thirds of Eurasia. By the beginning of the Common Era, Indo-European peoples controlled almost the entirety of this area: the Celts held western and central Europe, the Romans southern Europe, the Germanic peoples northern Europe, the Slavs eastern Europe, the Iranian peoples most of western and central Asia and parts of eastern Europe, and the Indo-Aryan peoples the Indian subcontinent, with the Tocharians inhabiting the Indo-European frontier in western China. By the medieval period, only the Semitic, Dravidian, Caucasian, and Uralic languages, and the language isolate Basque, remained of the (relatively) indigenous languages of Europe and the western half of Asia.
Despite medieval invasions by Eurasian nomads, a group to which the Proto-Indo-Europeans had once belonged, Indo-European expansion reached another peak in the early modern period with the dramatic increase in the population of the Indian subcontinent and European expansionism throughout the globe during the Age of Discovery, as well as the continued replacement and assimilation of surrounding non-Indo-European languages and peoples due to increased state centralization and nationalism. These trends compounded throughout the modern period due to the general global population growth and the results of European colonization of the Western Hemisphere, leading to an explosion in the number of Indo-European speakers as well as the territories inhabited by them.
Due to colonization and the modern dominance of Indo-European languages in the fields of politics, global science, technology, education, finance, and sports, even many modern countries whose populations largely speak non-Indo-European languages have Indo-European languages as official languages, and the majority of the global population speaks at least one Indo-European language. The overwhelming majority of languages used on the Internet are Indo-European, with English continuing to lead the group; English in general has in many respects become the ''lingua franca'' of global communication.
* Grammatical conjugation
* ''The Horse, the Wheel, and Language''
* Indo-European copula
* Indo-European sound laws
* Indo-European studies
* Indo-Semitic languages
* Indo-Uralic languages
* Eurasiatic languages
* Language family
* Languages of Asia
* Languages of Europe
* Languages of India
* List of Indo-European languages
* Proto-Indo-European root
* Proto-Indo-European religion