The Romance languages (sometimes called the Romanic languages, Latin
languages, or Neo-Latin languages) are the modern languages that
evolved from Vulgar Latin between the sixth and ninth centuries and
that thus form a branch of the Italic languages within the
Indo-European language family.
Today, around 800 million people are native speakers worldwide, mainly
in Europe, Africa, and the Americas, but also elsewhere. Additionally,
Romance languages have many non-native speakers and are in
widespread use as lingua francas. This is especially the case for
French, which is in widespread use throughout Central and West Africa,
Mauritius and the Maghreb.
The five most widely spoken
Romance languages by number of native
speakers are Spanish (470 million), Portuguese
(250 million), French (150 million), Italian
(60 million), and Romanian (25 million).
Because of the difficulty of imposing boundaries on a continuum,
various counts of the modern
Romance languages are given; for example,
Dalby lists 23 based on mutual intelligibility. The following more
extensive list includes 35 current, living
languages, and one recently extinct language, Dalmatian:
Iberian Romance: Portuguese, Galician, Mirandese, Asturian, Leonese,
Spanish (Castilian), Aragonese, Ladino;
Occitan (langue d'oc), Gascon;
Gallo-Romance: French/Oïl languages,
Rhaeto-Romance: Romansh, Ladin, Friulian;
Gallo-Italic: Piedmontese, Ligurian, Lombard, Emilian-Romagnol;
Italo-Dalmatian: Italian, Tuscan, Corsican, Sassarese, Sicilian,
Neapolitan, Dalmatian (extinct in 1898), Venetian, Istriot;
Eastern Romance: Daco-Romanian, Istro-Romanian, Aromanian,
Romance languages are the continuation of Vulgar Latin, the popular
and colloquial sociolect of
Latin spoken by soldiers, settlers, and
merchants of the Roman Empire, as distinguished from the classical
form of the language spoken by the Roman upper classes, the form in
which the language was generally written. Between 350 BC and 150
AD, the expansion of the Empire, together with its administrative and
educational policies, made
Latin the dominant native language in
continental Western Europe.
Latin also exerted a strong influence in
southeastern Britain, the Roman province of Africa, western Germany,
Pannonia and the
Balkans north of the Jireček Line.
During the Empire's decline, and after its fragmentation and collapse
in the fifth century, varieties of
Latin began to diverge within each
local area at an accelerated rate and eventually evolved into a
continuum of recognizably different typologies. The colonial empires
established by Portugal, Spain, and France from the fifteenth century
onward spread their languages to the other continents to such an
extent that about two-thirds of all Romance language speakers today
live outside Europe.
Despite other influences (e.g. substratum from pre-Roman languages,
especially Continental Celtic languages; and superstratum from later
Germanic or Slavic invasions), the phonology, morphology, and lexicon
of all Romance languages consist mainly of evolved forms of Vulgar
Latin. However, some notable differences occur between today's Romance
languages and their Roman ancestor. With only one or two exceptions,
Romance languages have lost the declension system of Latin and, as a
result, have SVO sentence structure and make extensive use of
prepositions.
The term Romance comes from the Vulgar Latin adverb romanice, "in
Roman", derived from Romanicus: for instance, in the expression
romanice loqui, "to speak in Roman" (that is, the Latin vernacular),
contrasted with latine loqui, "to speak in Latin" (Medieval Latin, the
conservative version of the language used in writing and formal
contexts or as a lingua franca), and with barbarice loqui, "to speak
in Barbarian" (the non-Latin languages of the peoples living outside
the Roman Empire). From this adverb the noun romance originated,
which applied initially to anything written romanice, or "in the Roman
vernacular".
The word 'romance' with the modern sense of romance novel or love
affair has the same origin. In the medieval literature of Western
Europe, serious writing was usually in Latin, while popular tales,
often focusing on heroic adventures and courtly love, were composed in
the vernacular and came to be called "romances".
Lexical and grammatical similarities among the Romance languages, and
between Latin and each of them, are apparent from the following
examples having the same meaning in various Romance lects:
English: She always closes the window before she dines / before
dining.
(Ea) semper antequam cenat fenestram claudit.
(Ea) claudi[t] semper illa fenestra antequam de cenare
(Jèdde) akjude sèmbe la fenèstre prime de mangè.
(Ella) zarra siempre a finestra antes de cenar.
(Ea/Nâsa) ãncljidi/nkidi totna firida/fireastra ninti di tsinã.
(Ella) pieslla siempres la ventana enantes de cenar.
(Lî) la sèra sänper la fnèstra prémma ed dsnèr.
(Ella) sempre tanca/clou la finestra abans de sopar.
Ella chjode/chjude sempre u purtellu nanzu di cenà.
Edda/Idda sarra sempri u purteddu nanzu/prima di cinà.
(Lē) la sèra sèmpar sù la fnèstra prima ad snàr.
(Ella) afecha siempri la ventana antis de cenal.
(Le) sarre toltin/tojor la fenétra avan de goutâ/dinar/sopar.
Elle ferme toujours la fenêtre avant de dîner/souper.
(Jê) e siere simpri il barcon prin di cenâ.
(Ela) pecha/fecha sempre a fiestra/xanela antes de cear.
Idda chjude sempri lu balconi primma di cinà.
(Ella/Lei) chiude sempre la finestra prima di cenare.
.אֵלייה סֵירּה סײֵמפּרֵי לה בֵֿינטאנה
אנטֵיס דֵי סֵינאר; Ella cerra siempre la ventana
antes de cenar.
(Ëra) stlüj dagnora la finestra impröma de cenè. (badiot) (Ëila)
stluj for l viere dan maië da cëina. (gherdëina)
Centro Cadore: La sera sempre la fenestra gnante de disna. Auronzo di
Cadore: La sera sempro la fenestra davoi de disnà.
(Eilla) pecha siempre la ventana primeiru de cenare.
(Le) a saera sempre u barcun primma de cenà.
(Lé) la sèra sèmper sö la finèstra prima de senà.
(Lee) la sara sù semper la finestra primma de disnà/scenà.
(Elle) à fàrm toujour là fnèt àvan k'à manj.
(Eilha) cerra siempre la bentana/jinela atrás de jantar.
إليا كلودت سامبرا لا فينسترا أبنتا دا
Ella cloudet sempre la fainestra abante da cenare. (reconstructed)
Essa 'nzerra sempe 'a fenesta primma 'e cenà.
Lli barre tréjous la crouésie devaunt de daîner.
(Ela) barra/tanca sempre/totjorn la fenèstra abans de sopar.
Ale frunme tojours l' creusèe édvint éd souper.
Chila a sara sèmper la fnestra dnans ëd fé sin-a/dnans ëd siné.
(Ela) fecha sempre a janela antes de jantar.
(Lia) la ciud sëmpra la fnèstra prëma ad magnè.
Ea închide întotdeauna fereastra înainte de a cina.
Ella clauda/serra adina la fanestra avant ch'ella tschainia.
Issa serrat semp(i)ri sa bentana in antis de cenai
Issa serrat semper sa bentana in antis de chenàre.
Edda sarra sempri lu balchoni primma di zinà.
Iḍḍa chiui sempri la finesṭṛa anti ca pistìa/mancia.
(Ella) siempre cierra la ventana antes de cenar/comer.
Lei serra sempre la finestra avanti cena.
Essa chjude sempre la finestra prima de cena'.
Eła ła sara/sera sempre ła fenestra vanti de xenàr/disnar.
Ele sere todi li finiesse divant di soper.
Romance-based creoles and pidgins
Li toujou' fèmen fenêt'-la avant li manger.
Li pou touzour ferm lafnet la avan (li) manze.
Y pou touzour ferm lafnet aven y manze.
Ta cerrá él con el puerta antes de cená.
E muhe closes e porta promé na dine.
Cape Verdean Creole
Êl fechâ porta antes de jantâ.
Some of the divergence comes from semantic change: where the same root
word has developed different meanings. For example, the Portuguese
word fresta is descended from
Latin fenestra "window" (and is thus
cognate to French fenêtre, Italian finestra, Romanian fereastră and
so on), but now means "skylight" and "slit". Cognates may exist but
have become rare, such as finiestra in Spanish, or dropped out of use
entirely. The Spanish and Portuguese terms defenestrar meaning "to
throw through a window" and fenestrado meaning "replete with windows"
also have the same root, but are later borrowings from Latin.
Likewise, Portuguese also has the word cear, a cognate of Italian
cenare and Spanish cenar, but uses it in the sense of "to have a late
supper" in most varieties, while the preferred word for "to dine" is
jantar (related to archaic Spanish yantar "to eat") because of
semantic changes in the 19th century. Galician has both fiestra (from
medieval fẽestra, the ancestor of standard Portuguese fresta) and
the less frequently used ventá and xanela.
As an alternative to lei (originally the genitive form), Italian has
the pronoun ella, a cognate of the other words for "she", but it is
hardly ever used in speaking.
Spanish, Asturian, and Leonese ventana and Mirandese and Sardinian
bentana come from Latin ventus "wind" (cf. English window,
etymologically 'wind eye'), and Portuguese janela, Galician xanela,
Mirandese jinela from Latin *ianuella "small opening", a derivative of
ianua "door".
Sardinian balcone (alternative for ventàna/bentàna) comes from Old
Italian and is similar to words in other Romance languages such as French
balcon (from Italian balcone), Portuguese balcão, Romanian balcon,
Spanish balcón, Catalan balcó and Corsican balconi (alternative for
purtellu).
Vulgar Latin
Documentary evidence about Vulgar Latin is too limited for
comprehensive research, and the literature is often hard to interpret
or generalize. Many of its speakers were soldiers, slaves, displaced
peoples, and forced resettlers, more likely to be natives of conquered
lands than natives of Rome. In Western Europe, Latin
replaced Celtic and Italic languages, which were related to it by a
shared Indo-European origin. Commonalities in syntax and vocabulary
facilitated the adoption of Latin.
Vulgar Latin is believed to have already had most of the features
shared by all Romance languages, which distinguish them from Classical
Latin, such as the almost complete loss of the
Latin grammatical case
system and its replacement by prepositions; the loss of the neuter
grammatical gender and comparative inflections; replacement of some
verb paradigms by innovations (e.g. the synthetic future gave way to
an originally analytic strategy now typically formed by infinitive +
evolved present indicative forms of 'have'); the use of articles; and
the initial stages of the palatalization of the plosives /k/, /g/, and
/t/.
To some scholars, this suggests the form of
Vulgar Latin that evolved
into the Romance languages was around during the time of the Roman
Empire (from the end of the first century BC), and was spoken
alongside the written Classical
Latin which was reserved for official
and formal occasions. Other scholars argue that the distinctions are
more rightly viewed as indicative of sociolinguistic and register
differences normally found within any language. Both were mutually
intelligible as one and the same language, which was true until very
approximately the second half of the 7th century. However, within two
centuries Latin became a dead language since "the Romanized people
of Europe could no longer understand texts that were read aloud or
recited to them," i.e.
Latin had ceased to be a first language and
became a foreign language that had to be learned, if the label Latin
is constrained to refer to a state of the language frozen in past time
and restricted to linguistic features for the most part typical of
higher registers.
Fall of the Western Roman Empire
During the political decline of the Western Roman Empire in the fifth
century, there were large-scale migrations into the empire, and the
Latin-speaking world was fragmented into several independent states.
Central Europe and the Balkans were occupied by Germanic and
Slavic tribes, as well as by the Huns, which isolated the Vlachs from
the rest of Romance-speaking Europe.
British and African Romance, the forms of
Vulgar Latin used in
southeastern Britain and the Roman province of Africa, where it had
been spoken by much of the urban population, disappeared in the Middle
Ages (as did
Pannonian Romance in what is now Hungary and Moselle
Romance in Germany). But the Germanic tribes that had penetrated Roman
Italy, Gaul, and
Hispania eventually adopted Latin/Romance and the
remnants of the culture of ancient Rome alongside existing inhabitants
of those regions, and so
Latin remained the dominant language there.
Over the course of the fourth to eighth centuries, Vulgar Latin, by
this time highly dialectalized, broke up into discrete languages that
were no longer mutually intelligible. Clear evidence of
Latin change comes from the Reichenau Glosses, an
eighth-century compilation of about 1,200 words from the
fourth-century Vulgate of Jerome that were no longer intelligible,
along with their eighth-century equivalents in
proto-Franco-Provençal. The following are some examples with reflexes
in several modern, closely related
Romance languages for comparison:
[Table: Classical / 4th-century Vulgate words with their 8th-century
glosses and modern Romance reflexes. Only fragments survive, e.g. the
reflexes of "in the mouth": en la boçhe, dans la bouche, in la bucca,
en la boca, a la boca, in sa buca, dins la boca.]
In all of the above examples, the words appearing in the fourth-century
Vulgate are the same words as would have been used in Classical
Latin of c. 50 BC. It is likely that some of these words had
already disappeared from casual speech; but if so, they must have been
still widely understood, as there is no recorded evidence that the
common people of the time had difficulty understanding the language.
By the 8th century, the situation was very different. During the late
8th century, Charlemagne, holding that "Latin of his age was by
classical standards intolerably corrupt", successfully imposed
Classical Latin as an artificial written vernacular for Western
Europe. Unfortunately, this meant that parishioners could no longer
understand the sermons of their priests, forcing the Council of Tours
in 813 to issue an edict that priests needed to translate their
speeches into the rustica romana lingua, an explicit acknowledgement
of the reality of the
Romance languages as separate languages from
Latin. By this time, and possibly as early as the 6th century
according to Price (1984), the Romance lects had split apart
enough to be able to speak of separate Gallo-Romance, Ibero-Romance,
Italo-Romance, and Eastern Romance languages. Some researchers
have postulated that the major divergences in the spoken dialects
began in the 5th century, as the formerly widespread and efficient
communication networks of the Western
Roman Empire rapidly broke down,
leading to the total disappearance of the Western
Roman Empire by the
end of the century. The critical period from the 5th to the 10th
centuries AD is poorly documented because little or no writing from
the chaotic "Dark Ages" of the 5th–8th centuries has survived, and
writing after that time was in consciously classicized Medieval Latin,
with vernacular writing only beginning in earnest in the 11th or 12th
centuries.
A language that was closely related to medieval Romanian was spoken
during the Dark Ages by
Vlachs in the Balkans, Herzegovina, Dalmatia
(Morlachs), Ukraine (Hutsuls), Poland (Gorals), Slovakia, and Czech
Moravia, but gradually these communities lost their maternal
language.
Recognition of the vernaculars
Between the 10th and 13th centuries, some local vernaculars developed
a written form and began to supplant
Latin in many of its roles. In
some countries, such as Portugal, this transition was expedited by
force of law; whereas in others, such as Italy, many prominent poets
and writers used the vernacular of their own accord; among
the most famous in Italy was Giacomo da Lentini.
Uniformization and standardization
The invention of the printing press brought a tendency towards greater
uniformity of standard languages within political boundaries, at the
expense of other
Romance languages and dialects less favored
politically. In France, for instance, the dialect spoken in the region
of Paris gradually spread to the entire country, and the Occitan of
the south lost ground.
Modern status
Number of native speakers of each Romance language, as fractions of
the total 690 million (2007)
The Romance language most widely spoken natively today is Spanish
(Castilian), followed by Portuguese, French, Italian and Romanian,
which together cover a vast territory in
Europe and beyond, and work
as official and national languages in dozens of countries.
French, Italian, Portuguese, Spanish, and Romanian are also official
languages of the European Union. Spanish, Portuguese, French, Italian,
Romanian, and Catalan are the official languages of the Latin Union;
and French and Spanish are two of the six official languages of the
United Nations. Outside Europe, French, Portuguese and Spanish are
spoken and enjoy official status in various countries that emerged
from the respective colonial empires. Spanish is an official language
in nine countries of South America, home to about half that
continent's population; in six countries of
Central America (all
except Belize); and in Mexico. In the Caribbean, it is official in
Cuba, the Dominican Republic, and Puerto Rico. In all these countries,
Latin American Spanish is the vernacular language of the majority of
the population, giving Spanish the most native speakers of any Romance
language. In Africa it is the official language of Equatorial Guinea,
but has few native speakers there.
Portuguese, in its original homeland, Portugal, is spoken by virtually
the entire population of 10 million. As the official language of
Brazil, it is spoken by more than 200 million people in that country,
as well as by neighboring residents of eastern Paraguay and northern
Uruguay, accounting for a little more than half the population of
South America. It is the official language of six African countries
(Angola, Cape Verde, Guinea-Bissau, Mozambique, Equatorial Guinea, and
São Tomé and Príncipe), and is spoken as a first language by
perhaps 30 million residents of that continent. In Asia,
Portuguese is co-official with other languages in
East Timor and
Macau, while most Portuguese-speakers in Asia—some 400,000—are
in Japan due to return immigration of Japanese Brazilians. In North
America 1,000,000 people speak Portuguese as their home language.
In Oceania, Portuguese is the second most spoken Romance language,
after French, due mainly to the number of speakers in East Timor. Its
closest relative, Galician, has official status in the autonomous
community of Galicia in Spain, together with Spanish.
Outside Europe, French is spoken natively most in the Canadian
province of Quebec, and in parts of
New Brunswick and Ontario. Canada
is officially bilingual, with French and English being the official
languages. In parts of the Caribbean, such as Haiti, French has
official status, but most people speak creoles such as Haitian Creole
as their native language. French also has official status in much of
Africa, but relatively few native speakers. In France's overseas
possessions, native use of French is increasing.
Italy also had some colonial possessions before World War II, but
its language did not remain official after the end of the colonial
domination. As a result, Italian outside of
Italy and Switzerland is
now spoken only as a minority language by immigrant communities in
South America and Australia. In some former Italian colonies
in Africa—namely Libya,
Eritrea and Somalia—it is spoken by a few
educated people in commerce and government.
Romania did not establish a colonial empire, but beyond its native
territory in southeastern Europe, the Romanian language is spoken as a
minority language by autochthonous populations in Serbia, Bulgaria,
and Hungary, and in some parts of the former Greater Romania (before
1945), as well as in Ukraine (Bukovina, Budjak) and in some villages
between the Dniester and Bug rivers. The Aromanian language, often
called a dialect of Romanian, is spoken today by Aromanians in
Macedonia, Albania, Kosovo, and Greece. Romanian also spread to
other countries on the Mediterranean (especially the other
Romance-speaking countries, most notably Italy and Spain), and
elsewhere such as Israel, where it is the native language of five
percent of the population, and is spoken by many more as a
secondary language. This is due to the large number of Romanian-born
Jews who moved to Israel after World War II. Finally, some
2.6 million people in the former Soviet republic of Moldova speak
a variety of Romanian, called variously Moldovan or Romanian by them.
The native speakers of Romance languages are divided as follows
(with their ranking within the languages of the world in
parentheses):
Spanish (Hispanosphere) 49% (2nd)
Portuguese (Lusosphere) 26% (6th)
French (Francophonie) 8.6% (18th)
Italian 7.7% (23rd)
Romanian 3.0% (49th)
Catalan 0.9% (not in the top 100)
Catalan is the official language of Andorra. In Spain, it is
co-official with Spanish (Castilian) in Catalonia, the Valencian
Community, and the Balearic Islands, and it is recognized, but not
official, in La Franja, and in Aragon. In addition, it is spoken by
many residents of Alghero, on the island of Sardinia, and it is
co-official in that city. Galician, with more than a million native
speakers, is official together with Spanish in Galicia, and has legal
recognition in neighbouring territories in Castilla y León. A few
other languages have official recognition on a regional or otherwise
limited level; for instance, Asturian and Aragonese in Spain;
Mirandese in Portugal; Friulian and Sardinian in
Italy; and Romansh in Switzerland.
The remaining Romance languages survive mostly as spoken languages for
informal contact. National governments have historically viewed
linguistic diversity as an economic, administrative or military
liability, as well as a potential source of separatist movements;
therefore, they have generally fought to eliminate it, by extensively
promoting the use of the official language, restricting the use of the
"other" languages in the media, characterizing them as mere
"dialects", or even persecuting them. As a result, all of these
languages are considered endangered to varying degrees according to
the UNESCO Red Book of Endangered Languages, ranging from "vulnerable"
(e.g. Sicilian and Venetian) to "severely endangered" (Arpitan, most
Occitan varieties). Since the late twentieth and early
twenty-first centuries, increased sensitivity to the rights of
minorities has allowed some of these languages to start recovering
their prestige and lost rights. Yet it is unclear whether these
political changes will be enough to reverse the decline of minority
Romance languages.
Classification and related languages
The classification of the
Romance languages is inherently difficult,
because most of the linguistic area is a dialect continuum, and in
some cases political biases can come into play. Along with Latin
(which is not included among the Romance languages) and a few extinct
languages of ancient Italy, they make up the Italic branch of the
Indo-European family.
There are various schemes used to subdivide the Romance languages.
Three of the most common schemes are as follows:
Italo-Western vs. Eastern vs. Southern. This is the scheme followed by
Ethnologue, and is based primarily on the outcome of the ten
monophthong vowels in Classical Latin. This is discussed more below.
West vs. East. This scheme divides the various languages along the La
Spezia–Rimini Line, which runs across north-central
Italy just to
the north of the city of
Florence (whose speech forms the basis of
standard Italian). In this scheme, "East" includes the languages of
central and southern Italy, and the Balkan Romance (or "Eastern
Romance") languages in Romania, Greece, and elsewhere in the Balkans;
"West" includes the languages of Portugal, Spain, France, northern
Italy and Switzerland. Sardinian does not easily fit in this scheme.
"Conservative" vs. "innovatory". This is a non-genetic division whose
precise boundaries are subject to debate. Generally, the Gallo-Romance
languages (discussed further below) form the core "innovatory"
languages, with standard French generally considered the most
innovatory of all, while the languages near the periphery (which
include Spanish, Portuguese, Italian and Romanian) are "conservative".
Sardinian is generally acknowledged to be the most conservative Romance
language, and was also the first language to split off genetically
from the rest, possibly as early as the first century BC. Dante
famously denigrated the Sardinians for the conservativeness of their
speech, remarking that they imitate
Latin "like monkeys imitate men".
Italo-Western vs. Eastern vs. Sardinian
The main subfamilies that have been proposed by
Ethnologue within the
various classification schemes for
Romance languages are:
Italo-Western, the largest group, which includes languages such as
Catalan, Portuguese, Italian, Spanish, and French.
Eastern Romance, which includes the
Romance languages of Eastern
Europe, such as Romanian.
Southern Romance, which includes a few languages with particularly
archaic features, such as Sardinian and, partially, Corsican. This
family is thought to have included the now-vanished Romance languages
of North Africa (or at least, they appear to have evolved their vowels in
the same way).
This controversial three-way division is made primarily based on the
outcome of the Vulgar Latin (Proto-Romance) vowels.
Italo-Western is in turn split along the so-called La Spezia–Rimini
Line in northern Italy, which divides the central and southern Italian
languages from the so-called
Western Romance languages to the north
and west. The primary characteristics dividing the two are:
Phonemic lenition of intervocalic stops, which happens to the
northwest but not to the southeast.
Degemination of geminate stops (producing new intervocalic single
voiceless stops, after the old ones were lenited), which again happens
to the northwest but not to the southeast.
Deletion of intertonic vowels (between the stressed syllable and
either the first or last syllable), again in the northwest but not the
southeast.
Use of plurals in /s/ in the northwest vs. plurals using vowel change
in the southeast.
Development of palatalized /k/ before /e,i/ to /(t)s/ in the northwest
vs. /tʃ/ in the southeast.
Development of /kt/, which develops to /xt/ > /it/ (sometimes
progressing further to /tʃ/) in the northwest but /tt/ in the
southeast.
In fact, the reality is somewhat more complex. All of the "southeast"
characteristics apply to all languages southeast of the line, and all
of the "northwest" characteristics apply to all languages in France
and (most of) Spain. However, the Gallo-Italic languages are somewhere
in between. All of these languages do have the "northwest"
characteristics of lenition and loss of gemination. However:
The Gallo-Italic languages have vowel-changing plurals rather than
/s/ plurals.
The Lombard language in north-central Italy and the Rhaeto-Romance
languages have the "southeast" characteristic of /tʃ/ instead of
/(t)s/ for palatalized /k/.
The Venetian language in northeast Italy and some of the
Rhaeto-Romance languages have the "southeast" characteristic of
developing /kt/ to /tt/.
Lenition of post-vocalic /p t k/ is widespread as an allophonic
phonetic realization in
Italy below the La Spezia-Rimini line,
including Corsica and most of Sardinia.
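The six isoglosses and the mixed behaviour of the Gallo-Italic languages described above can be pictured as a small lookup table. The following Python sketch is only an illustration: the feature names, the `tally` helper and the data layout are invented here, and the Lombard profile is deliberately partial, covering only the four features the text states for it (lenition, degemination, vowel-changing plurals, /tʃ/ for palatalized /k/).

```python
# Hypothetical encoding of the six isoglosses listed above. Each feature name
# (invented for this sketch) maps to (northwest outcome, southeast outcome).
ISOGLOSSES = {
    "intervocalic_lenition": ("yes", "no"),
    "degemination": ("yes", "no"),
    "intertonic_vowel_deletion": ("yes", "no"),
    "plural_marking": ("/s/", "vowel change"),
    "palatalized_k": ("/(t)s/", "/tʃ/"),
    "kt_outcome": ("/it/", "/tt/"),
}


def tally(profile):
    """Count the features in `profile` with a northwest vs. southeast outcome."""
    northwest = southeast = 0
    for feature, outcome in profile.items():
        nw_outcome, se_outcome = ISOGLOSSES[feature]
        if outcome == nw_outcome:
            northwest += 1
        elif outcome == se_outcome:
            southeast += 1
        else:
            raise ValueError(f"unknown outcome {outcome!r} for {feature}")
    return northwest, southeast


# Partial profile for Lombard, restricted to what the text states:
# lenition and degemination (northwest), vowel-changing plurals and
# /tʃ/ for palatalized /k/ (southeast). Other features are left out.
LOMBARD = {
    "intervocalic_lenition": "yes",
    "degemination": "yes",
    "plural_marking": "vowel change",
    "palatalized_k": "/tʃ/",
}

print(tally(LOMBARD))  # (2, 2)
```

An even split such as this is one way to see why a single dividing line does not classify the Gallo-Italic languages cleanly.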
On top of this, the ancient
Mozarabic language in southern Spain, at
the far end of the "northwest" group, had the "southeast"
characteristics of lack of lenition and palatalization of /k/ to
/tʃ/. Certain languages around the
Pyrenees (e.g. some highland
Aragonese dialects) also lack lenition, and northern French dialects
such as Norman and Picard have palatalization of /k/ to /tʃ/
(although this is possibly an independent, secondary development,
since /k/ between vowels, i.e. when subject to lenition, developed to
/dz/ rather than /dʒ/, as would be expected for a primary
development).
The usual solution to these issues is to create various nested
subgroups. Western Romance is split into the Gallo-Iberian languages,
in which lenition happens and which include nearly all the Western
Romance languages, and the Pyrenean-Mozarabic group, which includes
the remaining languages without lenition (and is unlikely to be a
valid clade; probably at least two clades, one for Mozarabic and one
for Pyrenean). Gallo-Iberian is split in turn into the Iberian
languages (e.g. Spanish and Portuguese), and the larger Gallo-Romance
languages (stretching from eastern
Spain to northeast Italy).
Probably a more accurate description, however, would be to say that
there was a focal point of innovation located in central France, from
which a series of innovations spread out as areal changes. The La
Spezia–Rimini Line represents the farthest point to the southeast
that these innovations reached, corresponding to the northern chain of
the Apennine Mountains, which cuts straight across northern
Italy and forms a major geographic barrier to further language spread.
This would explain why some of the "northwest" features (almost all of
which can be characterized as innovations) end at differing points in
northern Italy, and why some of the languages in geographically remote
parts of Spain (in the south, and high in the Pyrenees) are lacking
some of these features. It also explains why the languages in France
(especially standard French) seem to have innovated earlier and more
completely than other
Western Romance languages.
Many of the "southeast" features also apply to the Eastern Romance
languages (particularly, Romanian), despite the geographic
discontinuity. Examples are lack of lenition, maintenance of
intertonic vowels, use of vowel-changing plurals, and palatalization
of /k/ to /tʃ/. (Gemination is missing, which may be an independent
development, and /kt/ develops into /pt/ rather than either of the
normal Italo-Western developments.) This has led some researchers to
postulate a basic two-way East-West division, with the "Eastern"
languages including Romanian and central and southern Italian.
Despite being the first Romance language to evolve from Vulgar
Latin, Sardinian does not fit well at all into this sort of
division. It is clear that Sardinian became linguistically independent
from the remainder of the
Romance languages at an extremely early
date, possibly already by the first century BC. Sardinian contains a
large number of archaic features, including total lack of
palatalization of /k/ and /g/ and a large amount of vocabulary
preserved nowhere else, including some items already archaic by the
time of Classical
Latin (first century BC). Sardinian has plurals in
/s/ but post-vocalic lenition of voiceless consonants is normally
limited to the status of an allophonic rule (e.g. [k]ane 'dog' but su
[g]ane or su [ɣ]ane 'the dog'), and there are a few innovations
unseen elsewhere, such as a change of /au/ to /a/. Use of su <
ipsum as an article is a retained archaic feature that also exists in
the Catalan of the
Balearic Islands and that used to be more
widespread in Occitano-Romance, and is known as article salat
(literally the "salted article"), while Sardinian shares
delabialization of earlier /kw/ and /gw/ with Romanian: Sard. abba,
Rom. apă 'water'; Sard. limba, Rom. limbă 'language' (cf. Italian
acqua, lingua).
Gallo-Romance can be divided into the following subgroups:
The Langues d'oïl, including French and closely related languages.
The Franco-Provençal language (also known as Arpitan) of southeastern
France, western Switzerland, and the
Aosta Valley region of northwestern Italy.
The following groups are also sometimes considered part of
Gallo-Romance:
The Occitano-Romance languages of southern France.
The Catalan language of eastern Iberia is also sometimes included in
Gallo-Romance. This is however disputed by some linguists who prefer
to group it with Iberian Romance, since although Old Catalan is close
to Old Occitan, it later adjusted its lexicon to some degree to align
with Spanish. In general however, modern Catalan, especially
grammatically, remains closer to modern
Occitan than to either Spanish or Portuguese.
The Gallo-Italian languages of northern Italy, including Piedmontese,
Ligurian, Western Lombard, Eastern Lombard, Emilian and Romagnol.
Eastern Lombard retain the final -o, being the exception
The Rhaeto-Romance languages, including Romansh, Friulian, and Ladin.
Gallo-Romance languages are generally considered the most
innovative (least conservative) among the Romance languages.
Gallo-Romance features generally developed earliest and
appear in their most extreme manifestation in the Langue d'oïl,
gradually spreading out along riverways and transalpine roads.
In some ways, however, the
Gallo-Romance languages are conservative.
The older stages of many of the languages preserved a two-case system
consisting of nominative and oblique, fully marked on nouns,
adjectives and determiners, inherited almost directly from the Latin
nominative and accusative and preserving a number of different
declensional classes and irregular forms. The languages closest to the
oïl epicenter preserve the case system the best, while languages at
the periphery lose it early.
Notable characteristics of the
Gallo-Romance languages are:
Early loss of unstressed final vowels other than /a/—a defining
characteristic of the group.
Further reductions of final vowels in
Langue d'oïl and many
Gallo-Italic languages, with the feminine /a/ and prop vowel /e/
merging into /ə/, which is often subsequently dropped.
Early, heavy reduction of unstressed vowels in the interior of a word
(another defining characteristic).
Loss of final vowels phonemicized the long vowels that used to be
automatic concomitants of stressed open syllables. These phonemic long
vowels are maintained directly in many Northern Italian dialects;
elsewhere, phonemic length was lost, but in the meantime many of the
long vowels diphthongized, resulting in a maintenance of the original
distinction. The langue d'oïl branch is again at the forefront of
innovation, with no fewer than five of the seven long vowels
diphthongizing (only high vowels were spared).
Front rounded vowels are present in all four branches. /u/ usually
fronts to /y/, and secondary mid front rounded vowels often develop
from long /oː/ or /ɔː/.
Extreme lenition (i.e. multiple rounds of lenition) occurs in many
languages, especially in
Langue d'oïl and many Gallo-Italian languages.
The Langue d'oïl, Swiss
Rhaeto-Romance languages and many of the
northern dialects of
Occitan have a secondary palatalization of /k/
and /ɡ/ before /a/, producing different results from the primary
Romance palatalization: e.g. centum "hundred" > cent /sɑ̃/,
cantum "song" > chant /ʃɑ̃/.
Other than the Occitano-Romance languages, most Gallo-Romance
languages are subject-obligatory (whereas all the rest of the Romance
languages are pro-drop languages). This is a late development
triggered by progressive phonetic erosion:
Old French was still a
null-subject language, and this only changed upon loss of secondarily
final consonants in Middle French.
Pidgins, creoles, and mixed languages
Romance languages have developed varieties which seem
dramatically restructured as to their grammars or to be mixtures with
other languages. It is not always clear whether they should be
classified as Romance, pidgins, creole languages, or mixed languages.
Some other languages, such as Modern English, are sometimes thought of
as creoles of semi-Romance ancestry. There are several dozen
creoles of French, Spanish, and Portuguese origin, some of them spoken
as national languages in former European colonies.
Creoles of French:
Antillean (French Antilles, Saint Lucia, Dominica)
Haitian (one of Haiti's two official languages)
Mauritian (lingua franca of Mauritius)
Réunion (native language of Réunion)
Seychellois (Seychelles' official language)
Creoles of Spanish:
Chavacano (in part of Philippines)
Palenquero (in part of Colombia)
Creoles of Portuguese:
Angolar (regional language in São Tomé and Príncipe)
Cape Verdean (Cape Verde's national language; includes several
Forro (regional language in São Tomé and Príncipe)
Papiamento (Dutch Antilles official language)
Upper Guinea (Guinea-Bissau's national language)
Auxiliary and constructed languages
Latin and the
Romance languages have also served as the inspiration
and basis of numerous auxiliary and constructed languages, so-called
The concept was first developed in 1903 by Italian mathematician
Giuseppe Peano, under the title Latino sine flexione. He wanted to
create a naturalistic international language, as opposed to an
autonomous constructed language like
Volapük, which was
designed for maximal simplicity of lexicon and derivation of words.
He used Latin as the base of his language, because at the time of
his flourishing it was the de facto international language of
Other languages developed since include Idiom Neutral, Occidental,
Lingua Franca Nova, and most famously and successfully, Interlingua.
Each of these languages has attempted to varying degrees to achieve a
Latin vocabulary as common as possible to living Romance languages.
There are also languages created for artistic purposes only, such as
Because Latin is a very well attested ancient language, some
amateur linguists have even constructed
Romance languages that mirror
real languages that developed from other ancestral languages. These
include Brithenig (which mirrors Welsh), Breathanach (mirrors
Scottish Gaelic), Wenedyk (mirrors Polish), Þrjótrunn (mirrors Icelandic),
and Helvetian (mirrors German).
Romance languages have a number of shared features across all
languages:
Romance languages are moderately inflecting, i.e. there is a
moderately complex system of affixes (primarily suffixes) that are
attached to words to convey grammatical information such as number,
gender, person, tense, etc. Verbs have much more inflection than
nouns. The amount of synthesis is significantly more than in English, but
less than Classical
Latin and much less than the oldest Indo-European
languages (e.g. Ancient Greek, Sanskrit).
Inflection is fusional, with
a single affix representing multiple features (as contrasted with
agglutinative languages such as Turkish or Japanese). For example,
Portuguese amei "I loved" is composed of am- "love" and the fusional
suffix -ei "first-person singular preterite indicative".
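As a rough illustration of what "fusional" means here, the following Python sketch maps each preterite ending of the regular Portuguese verb amar to the whole bundle of features it expresses at once. The table layout and the function name `analyze` are illustrative assumptions rather than any standard linguistic tool, and the forms follow Brazilian spelling (European Portuguese writes the first-person plural as amámos).

```python
# Each suffix is a single unit: it cannot be split into separate
# "person", "number", "tense" and "mood" pieces.
PRETERITE_SUFFIXES = {
    "ei": ("first", "singular"),
    "aste": ("second", "singular"),
    "ou": ("third", "singular"),
    "amos": ("first", "plural"),
    "astes": ("second", "plural"),
    "aram": ("third", "plural"),
}


def analyze(form, stem="am"):
    """Split a preterite form of amar into stem + fused suffix (hypothetical helper)."""
    suffix = form[len(stem):]
    if not form.startswith(stem) or suffix not in PRETERITE_SUFFIXES:
        raise ValueError(f"not a known preterite form of the stem {stem!r}: {form!r}")
    person, number = PRETERITE_SUFFIXES[suffix]
    return {
        "stem": stem,
        "suffix": suffix,
        "person": person,
        "number": number,
        "tense": "preterite",
        "mood": "indicative",
    }


print(analyze("amei"))
```

In an agglutinative language the same information would typically be spread over several separable affixes, one per feature, so a single lookup table of this kind would not be the natural model.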
Romance languages have a primarily subject–verb–object word order,
with varying degrees of flexibility from one language to another.
Constructions are predominantly of the head-first (right-branching)
type. Adjectives, genitives and relative clauses all tend to follow
their head noun, although (except in Romanian) determiners usually
precede it.
In general, nouns, adjectives and determiners inflect only according
to grammatical gender (masculine or feminine) and grammatical number
(singular or plural).
Grammatical case is marked only on pronouns, as
in English; case marking, as in English, is of the
nominative–accusative type (rather than e.g. the
ergative–absolutive marking of Basque or the split ergativity of
Hindi). A significant exception, however, is Romanian, with two-case
marking (nominative/accusative vs. genitive/dative) on nominal
Verbs are inflected according to a complex morphology that marks
person, number (singular or plural), tense, mood (indicative,
subjunctive, imperative), and sometimes aspect or gender. Grammatical
voice (active, passive, middle/reflexive) and some grammatical aspects
(in particular, the perfect aspect) are expressed using periphrastic
constructions, as in the Italian present perfect (passato prossimo) io
ho amato/io sono stato amato "I have loved/I have been loved".
Romance languages are null subject languages (but modern French
is not, as a result of the phonetic decay of verb endings).
Romance languages have two articles (definite and indefinite), and
many have in addition a partitive article (expressing the concept of
"some"). In some languages (notably, French), the use of an article
with a noun is nearly obligatory; it serves to express grammatical
number (no longer marked on most nouns) and to cope with the extreme
homophony of French vocabulary as a result of extensive sound
The phonemic inventory of most
Romance languages is of moderate size
with few unusual phonemes. Phonemic vowel length is uncommon. Some
languages have developed nasal vowels and/or front rounded vowels.
Word accent is of the stress (dynamic) type, rather than making use of
pitch (as in
Ancient Greek and some modern Slavic languages). Stress
occurs more or less predictably on one of the last three syllables.
Changes from Classical Latin
Loss of the case system
The most significant changes between Classical
Latin and Proto-Romance
(and hence all the modern Romance languages) relate to the reduction
or loss of the
Latin case system, and the corresponding syntactic
changes that were triggered.
The case system was drastically reduced from the vigorous six-case
system of Latin. Although four cases can be reconstructed for
Proto-Romance nouns (nominative, accusative, combined genitive/dative,
and vocative), the vocative is marginal and present only in Romanian
(where it may be an outright innovation), and of the remaining cases,
no more than two are present in any one language. Romanian is the only
modern Romance language with case marking on nouns, with a two-way
opposition between nominative/accusative and genitive/dative. Some of
the Gallo-Romance languages (in particular, Old French, Old
Sursilvan and Old Friulian, and in traces Old Catalan and
Old Venetian) had an opposition between nominative and general
oblique, and in
Ibero-Romance languages, such as Spanish and
Portuguese, as well as in Italian (see under Case), a couple of
examples are found which preserve the old nominative. As in English,
case is preserved better on pronouns.
Concomitant with the loss of cases, freedom of word order was greatly
reduced. Latin had a generally verb-final (SOV) but overall
quite free word order, with a significant amount of word scrambling
and mixing of left-branching and right-branching constructions. The
Romance languages eliminated word scrambling and nearly all
left-branching constructions, with most languages developing a rigid
SVO, right-branching syntax. (Old French, however, had a freer word
order due to the two-case system still present, as well as a
predominantly verb-second word order developed under the influence of
the Germanic languages.) Some freedom, however, is allowed in the
placement of adjectives relative to their head noun. In addition, some
languages (e.g. Spanish, Romanian) have an "accusative preposition"
(Romanian pe, Spanish "personal a") along with clitic doubling, which
allows for some freedom in ordering the arguments of a verb.
Romance languages developed grammatical articles where Latin had
none. Articles are often introduced around the time a robust case
system falls apart in order to disambiguate the remaining case markers
(which are usually too ambiguous by themselves) and to serve as
parsing clues that signal the presence of a noun (a function that used
to be served by the case endings themselves).
This was the pattern followed by the Romance languages: In the Romance
languages that still preserved a functioning nominal case system
(e.g., Romanian and Old French), only the combination of article and
case ending serves to uniquely identify number and case (compare the
similar situation in modern German). All
Romance languages have a
definite article (originally developed from ipse "self" but replaced
in nearly all languages by ille "that (over there)") and an indefinite
article (developed from ūnus "one"). Many also have a partitive
article (dē "of" + definite article).
Latin had a large number of syntactic constructions expressed through
infinitives, participles, and similar nominal constructs. Examples are
the ablative absolute, the accusative-plus-infinitive construction
used for reported speech, gerundive constructions, and the common use
of reduced relative clauses expressed through participles. All of
these are replaced in the
Romance languages by subordinate clauses
expressed with finite verbs, making the
Romance languages much more
"verbal" and less "nominal" than Latin. Under the influence of the
Balkan sprachbund, Romanian has progressed the furthest, largely
eliminating the infinitive. (It is being revived, however, due to the
increasing influence of other Romance languages.)
Loss of phonemic vowel length, and change into a free-stressed
language. Latin had an automatically determined stress on
the second or third syllable from the end, conditioned by vowel
length; once vowel length was neutralized, stress was no longer
predictable so long as it remained where it was (which it mostly did).
Development of a series of palatal consonants as a result of
palatalization.
Loss of most traces of the neuter gender.
Development of a series of analytic perfect tenses, comparable to
English "I have done, I had done, I will have done".
Loss of the
Latin synthetic passive voice, replaced by an analytic
construction comparable to English "it is/was done".
Loss of deponent verbs, replaced by active-voice verbs.
Replacement of the
Latin future tense with a new tense formed
(usually) by a periphrasis of infinitive + present tense of habēre
"have", which usually contracts into a new synthetic tense. A
corresponding conditional tense is formed in the same way but using
one of the past-tense forms of habēre.
Numerous lexical changes. A number of words were borrowed from the
Germanic languages and Celtic languages. Many basic nouns and verbs,
especially those that were short or had irregular morphology, were
replaced by longer derived forms with regular morphology. Throughout
the medieval period, words were borrowed from Classical
Latin in their
original form (learned words) or in something approaching the original
form (semi-learned words), often replacing the popular forms of the
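The periphrastic origin of the future and conditional listed above can be sketched for regular Spanish verbs, whose modern endings still resemble reduced forms of haber (he, has, ha, hemos, habéis, han) and of its imperfect (había, habías, ...). This is a minimal sketch under stated assumptions: the function names are hypothetical, and verbs with irregular future stems (e.g. tener, tendré) are deliberately not handled.

```python
PERSONS = ["1sg", "2sg", "3sg", "1pl", "2pl", "3pl"]

# Endings descended from the present tense of haber.
FUTURE_ENDINGS = ["é", "ás", "á", "emos", "éis", "án"]

# Endings descended from the imperfect of haber (había, habías, ...).
CONDITIONAL_ENDINGS = ["ía", "ías", "ía", "íamos", "íais", "ían"]


def future(infinitive):
    """Future of a regular Spanish verb: infinitive + contracted present of haber."""
    return {p: infinitive + e for p, e in zip(PERSONS, FUTURE_ENDINGS)}


def conditional(infinitive):
    """Conditional of a regular Spanish verb: infinitive + contracted imperfect of haber."""
    return {p: infinitive + e for p, e in zip(PERSONS, CONDITIONAL_ENDINGS)}


print(future("cantar")["1sg"])      # cantaré
print(conditional("amar")["1pl"])   # amaríamos
```

Because the auxiliary has fused onto the infinitive, the result is once again a synthetic tense, even though it began as a two-word construction.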
Every language has a different set of vowels from every other. Common
characteristics are as follows:
Most languages have at least five monophthongs /a e i o u/. The parent
language of most of the Italo-Western Romance languages (which
includes the vast majority) actually had a seven-vowel system /a ɛ e
i ɔ o u/, which is kept in most Italo-Western languages. In some
languages, like Spanish and Romanian, the phonemic status and
difference between open-mid and close-mid vowels was lost. French has
probably the largest inventory of monophthongs, with conservative
varieties having 12 oral vowels /a ɑ ɛ e i ɔ o u œ ø y ə/
and 4 nasal vowels /ɑ̃ ɛ̃ ɔ̃ œ̃/.
European Portuguese also has
a large inventory, with 9 oral monophthongs /a ɐ ɛ e i ɔ o u ɨ/, 5
nasal monophthongs /ɐ̃ ẽ ĩ õ ũ/, and a large number of oral and
nasal diphthongs (see below). (The phonemic status of /ɐ ɨ/ is
somewhat doubtful, however, and neither phoneme exists in Brazilian
Portuguese.)
Some languages have a large inventory of falling diphthongs. These may
or may not be considered as phonemic units (rather than sequences of
vowel+glide or vowel+vowel), depending on their behavior. As an
example, French, Spanish and Italian have occasional instances of
putative falling diphthongs formed from a vowel plus a non-syllabic
/i/ or /u/ (e.g. Spanish veinte [ˈbejn̪te] "twenty", deuda
[ˈdewða] "debt"; French paille [pɑj] "straw", caoutchouc
[kawˈtʃu] "rubber"; Italian lui [ˈluj] "he", potei [poˈtej] "I
could"), but these are normally analyzed as sequences of vowel and
glide. The diphthongs in Romanian, Portuguese,
Catalan and Occitan, however, have various properties suggesting that
they are better analyzed as unit phonemes. Portuguese, for example,
has the diphthongs /aj ɐj ɛj ej ɔj oj uj aw ɛw ew iw (ow)/, where
/ow/ (and to a lesser extent /ej/) appear only in some dialects. All
except /aw ɛw/ appear frequently in verb or noun inflections.
(Portuguese also has nasal diphthongs; see below.)
Among the major Romance languages, Portuguese and French have nasal
vowel phonemes, stemming from nasalization before a nasal consonant
followed by loss of the consonant (this occurred especially when the
nasal consonant was not directly followed by a vowel). Originally,
vowels in both languages were nasalized before all nasal consonants,
but have subsequently become denasalized before nasal consonants that
still remain (except in Brazilian Portuguese, where the pre-nasal
vowels in words such as cama "bed", menos "less" remain highly
nasalized). In Portuguese, nasal vowels are sometimes analyzed as
phonemic sequences of oral vowels plus an underlying nasal consonant,
but such an analysis is difficult in French because of the existence
of minimal pairs such as bon /bɔ̃/ "good (masc.)", bonne /bɔn/
"good (fem.)". In both languages, there are fewer nasal than oral
vowels. Nasalization triggered vowel lowering in French, producing the
4 nasal vowels /ɑ̃ ɛ̃ ɔ̃ œ̃/ (although most speakers in France
nowadays pronounce /œ̃/ as /ɛ̃/).
Vowel raising was triggered in
Portuguese, however, producing the 5 nasal vowels /ɐ̃ ẽ ĩ õ ũ/.
Vowel contraction and other changes also resulted in the Portuguese
nasal diphthongs /ɐ̃w̃ õw̃ ɐ̃j̃ ẽj̃ õj̃ ũj̃/ (of which
/ũj̃/ occurs in only two words, muito /mũj̃tu/ "much, many, very",
and mui /mũj̃/ "very"; and /ẽj̃ õw̃/ are actually
final-syllable allophones of /ẽ õ/).
Most languages have fewer vowels in unstressed syllables than stressed
syllables. This again reflects the Italo-Western Romance parent
language, which had a seven-vowel system in stressed syllables (as
described above) but only /a e i o u/ (with no low-mid vowels) in
unstressed syllables. Some languages have seen further reductions:
e.g. Standard Catalan has only [ə i u] in unstressed syllables. In
French, on the other hand, any vowel may take prosodic stress.
Most languages have even fewer vowels in word-final unstressed
syllables than elsewhere. For example,
Old Italian allowed only /a e i
o/, while the early stages of most
Western Romance languages allowed
only /a e o/. The
Gallo-Romance languages went even farther, deleting
all final vowels except /a/. Of these languages, French has carried
things to the extreme by deleting all vowels after the accented
syllable and uniformly accenting the final syllable (except for a
more-or-less non-phonemic final unstressed [ə] that occasionally
appears). Modern Spanish now allows final unstressed /i u/, and modern
Italian allows final unstressed /u/, but they tend to occur largely in
borrowed or onomatopoeic words, e.g. guru "guru", taxi "taxi", Spanish
tribu "tribe" and espíritu "spirit" (loanwords from Classical Latin),
Italian babau "bogeyman" (onomatopoeic, cf. English "boo!"). The
apparent Spanish exception casi "almost" originates from Latin quasi
"as if" < quam sī, and was probably influenced by si "if".
Phonemic vowel length is uncommon.
Vulgar Latin lost the phonemic
vowel length of Classical
Latin and replaced it with a non-phonemic
length system where stressed vowels in open syllables were long, and
all other vowels were short. Standard Italian still maintains this
system, and it was rephonemicized in the Gallo-Romance languages
(including the Rhaeto-Romance languages) as a result of the deletion
of many final vowels. Some northern Italian languages (e.g. Friulian)
still maintain this secondary phonemic length, but in most languages
the new long vowels were either diphthongized or shortened again, in
the process eliminating phonemic length. French is again the odd man
out: Although it followed a normal
Gallo-Romance path by
diphthongizing five of the seven long vowels and shortening the
remaining two, it phonemicized a third vowel length system around 1300
AD in syllables that had been closed with an /s/ (still marked with a
circumflex accent), and now is phonemicizing a fourth system as a
result of lengthening before final voiced fricatives.
In modern spoken and literary Romanian, Slavic influences are evident
in phonetics and morphology. Phonetic Slavicisms include the iotation
of the initial e in words such as el, ea, este pronounced [jel], [ja],
[jeste] (compare Spanish: el, ella, estamos, without the Slavic
iotation).
Romance languages have similar sets of consonants. The following
is a combined table of the consonants of the five major Romance
languages (French, Spanish, Italian, Portuguese, Romanian).
bold: Appears in all 5 languages.
italic: Appears in 3–4 languages.
(parentheses): Appears in 2 languages.
((double parentheses)): Appears in only 1 language.
Spanish has no phonemic voiced fricatives (however, [β ð ɣ] occur
as allophones of /b d ɡ/ after a vowel and after certain consonants).
The equivalent of /v/ merged with /b/, and all the rest became
voiceless. It also lost /ʃ/, which became /x/ or /h/ in some other
The western languages (French, Spanish, Portuguese) all used to have
the affricates /ts/, /dz/, /tʃ/, /dʒ/. By the fourteenth century or
so, these all turned into fricatives except for Spanish and dialectal
Portuguese /tʃ/. (Spanish /ts/ ended up becoming /θ/, at least in
Northern and Central Spain; elsewhere, it merged with /s/, as in the
other languages.) Romanian /dz/ likewise became /z/.
French, and most varieties of Spanish, have lost /ʎ/ (which merged
with /j/). Romanian merged both /ʎ/ and /ɲ/ into /j/.
Romanian was influenced by Slavic phonology, mostly the palatalization
of consonants in the plural form (for example pom-pomi and lup-lupi,
pronounced [pomʲ] and [lupʲ]) and changing of /l/ to /r/, for
Latin schola/scola > Slav. школа, școla > modern
Romanian școală [ˈʃko̯alə] "school".
Most instances of most of the sounds below that occur (or used to
occur, as described above) in all of the languages are cognate.
Although all of the languages have or used to have /tʃ/, almost none
of these sounds are cognate between pairs of languages. The only real
exception is many /tʃ/ between Italian and Romanian, stemming from
Latin C- before E or I. Italian also has /tʃ/ from
Vulgar Latin -CY-,
and from -TY- following a consonant (elsewhere /ts/). Former French
/tʃ/ is from
Latin C- before A, either word-initial or following a
consonant; Spanish /tʃ/ is from
Latin -CT-, or from PL, CL following
a consonant; former Portuguese /tʃ/ is from
Latin PL, CL, FL, either
word-initial or following a consonant.
Italian and former Romanian /dz/ (from some instances of Vulgar Latin
-DY-) are not cognate with former western /dz/ (from lenition of
Word stress was rigorously predictable in Classical
Latin except in a
very few exceptional cases, either on the penultimate syllable (second
from last) or antepenultimate syllable (third from last), according to
the syllable weight of the penultimate syllable. Stress in the Romance
languages mostly remains on the same syllable as in Latin, but various
sound changes have made it no longer so predictable. Minimal pairs
distinguished only by stress exist in some languages, e.g. Italian
Papa [ˈpa.pa] "Pope" vs. papà [pa.ˈpa] "daddy", or Spanish límite
[ˈli.mi.te] "[a] limit", present subjunctive limite [li.ˈmi.te]
"[that] [I/he] limit" and preterite limité [li.mi.ˈte] "[I]
limited".
Erosion of unstressed syllables following the stress has caused most
Spanish and Portuguese words to have either penultimate or ultimate
stress, e.g. Latin trēdecim "thirteen" > Spanish trece, Portuguese
treze;
Latin amāre "to love" > Spanish/Portuguese amar. Most words
with antepenultimate stress are learned borrowings from Latin, e.g.
Spanish/Portuguese fábrica "factory" (the corresponding inherited
word is Spanish fragua, Portuguese frágua "forge"). This process has
gone even farther in French, with deletion of all post-stressed
vowels, leading to consistent, predictable stress on the last
syllable, e.g.
Latin Stephanum "Stephen" >
Old French Estievne >
French Étienne /e.ˈtjɛn/;
Latin juvenis "young" > Old French
juevne > French jeune /ʒœn/. This applies even to borrowings:
Latin fabrica > French borrowing fabrique /fa.ˈbʀik/ (the
inherited word in this case being monosyllabic forge < Pre-French
Other than French (with consistent final stress), the position of the
stressed syllable generally falls on one of the last three syllables.
Exceptions may be caused by clitics or (in Italian) certain verb
endings, e.g. Italian telefonano [teˈlɛ.fo.na.no] "they telephone";
Spanish entregándomelo [en.tɾe.ˈɣan.do.me.lo] "delivering it to
me"; Italian mettiamocene [meˈtːjaː.mo.tʃe.ne] "let's put some of
it in there"; Portuguese dávamo-vo-lo [ˈda.vɐ.mu.vu.lu] "we were
giving it to you". Stress on verbs is almost completely predictable in
Spanish and Portuguese, but less so in Italian.
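The Classical Latin stress rule described at the start of this section (penultimate syllable if it is heavy, otherwise antepenultimate) can be written as a short Python sketch. It assumes the caller has already divided the word into syllables and marked each as heavy or light; the function name `stressed_syllable` and the input format are hypothetical, and the rare exceptional words mentioned above are not covered.

```python
def stressed_syllable(syllables):
    """Return the 0-based index of the stressed syllable of a Latin word.

    `syllables` is a list of (text, is_heavy) pairs, where a syllable is
    heavy if it has a long vowel or diphthong, or is closed by a consonant.
    """
    n = len(syllables)
    if n == 0:
        raise ValueError("a word needs at least one syllable")
    if n <= 2:
        # Words of one or two syllables are stressed on the first syllable.
        return 0
    _, penult_is_heavy = syllables[-2]
    # Heavy penult carries the stress; otherwise it falls one syllable earlier.
    return n - 2 if penult_is_heavy else n - 3


amare = [("a", False), ("mā", True), ("re", False)]
tredecim = [("trē", True), ("de", False), ("cim", True)]

print(amare[stressed_syllable(amare)][0])        # mā
print(tredecim[stressed_syllable(tredecim)][0])  # trē
```

Once vowel length stopped being distinctive, the `is_heavy` input could no longer be computed from the sound of the word alone, which is one reason stress became unpredictable in the daughter languages.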
Nouns, adjectives, and pronouns can be marked for gender, number and
case. Adjectives and pronouns must agree in all features with the noun
they are bound to.
Romance languages inherited from
Latin two grammatical numbers,
singular and plural; the only trace of a dual number comes from Latin
ambō > Spanish and Portuguese ambos, Old Romanian îmbi,
Old French ambe, Italian ambedue, entrambi.
Romance languages have two grammatical genders, masculine and
feminine. The gender of animate nouns is generally natural (i.e. nouns
referring to men are generally masculine, and vice versa), but for
nonanimate nouns it is arbitrary.
Although Latin had a third gender (neuter), there is little trace of
this in most languages. The biggest exception is Romanian, where there
is a productive class of "neuter" nouns, which include the descendants
of Latin neuter nouns and which behave like masculines in the
singular and feminines in the plural, both in the endings used and in
the agreement of adjectives and pronouns (e.g. un deget "one finger"
vs. două degete "two fingers", cf.
Latin digitus, pl. digiti).
Such nouns arose because of the identity of the
Latin neuter singular
-um with the masculine singular, and the identity of the Latin neuter
plural -a with the feminine singular. A similar class exists in
Italian, although it is no longer productive (e.g. il dito "the
finger" vs. le dita "the fingers", l'uovo "the egg" vs. le uova "the
eggs"). A similar phenomenon may be observed in Albanian (which is
heavily Romance-influenced), and the category remains highly
productive with a number of new words loaned or coined in the neuter
((një) hotel one hotel(m) vs. (tri) hotele three hotels (f)). (A few
isolated nouns in
Latin had different genders in the singular and
plural, but this was an unrelated phenomenon; this is similarly the
case with a few French nouns, such as amour, délice, orgue.)
Spanish also has vestiges of the neuter in the demonstrative
adjectives: esto, eso, aquello, the pronoun ello (meaning "it") and
the article lo (used to intensify adjectives). Portuguese also has
neuter demonstrative adjectives: "isto", "isso", "aquilo" (meaning
"this [near me]", "this/that [near you]", "that [far from the both of
us]").
Remnants of the neuter, interpretable now as "a sub-class of the
non-feminine gender" (Haase 2000:233), are vigorous in
Italy in an
area running roughly from Ancona to Matera and just north of Rome to
Naples. Oppositions with masculine typically have been recategorized,
so that neuter signifies the referent in general, while masculine
indicates a more specific instance, with the distinction marked by the
definite article. In Southeast Umbrian, for example, neuter lo pane is
'the bread', while masculine lu pane refers to an individual piece or
loaf of bread. Similarly, neuter lo vinu is wine in general, while
masculine lu vinu is a specific sort of wine, with the consequence
that mass lo vinu has no plural counterpart, but lu vinu can take a
sortal plural form li vini, referring to different types of wine.
Phonological forms of articles vary by locale.
Latin had an extensive case system, where all nouns were declined in
six cases (nominative, vocative, accusative, dative, genitive, and
ablative) and two numbers. Many adjectives were additionally
declined in three genders, leading to a possible 6 × 2 × 3 = 36
endings per adjective (although this was rarely the case). In
practice, some category combinations had identical endings to other
combinations, but a basic adjective like bonus "good" still had 14
distinct endings.
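The arithmetic above can be checked with a small Python sketch that enumerates all 6 × 2 × 3 slots of bonus and counts how many different forms fill them. The endings are the standard first- and second-declension ones; the variable names are illustrative, and the count assumes that vowel length is distinctive (so bona and bonā count as two forms).

```python
from itertools import product

CASES = ["nom", "voc", "acc", "gen", "dat", "abl"]
NUMBERS = ["sg", "pl"]
GENDERS = ["m", "f", "n"]

# Endings keyed by (gender, number), listed in the order of CASES.
ENDINGS = {
    ("m", "sg"): ["us", "e", "um", "ī", "ō", "ō"],
    ("m", "pl"): ["ī", "ī", "ōs", "ōrum", "īs", "īs"],
    ("f", "sg"): ["a", "a", "am", "ae", "ae", "ā"],
    ("f", "pl"): ["ae", "ae", "ās", "ārum", "īs", "īs"],
    ("n", "sg"): ["um", "um", "um", "ī", "ō", "ō"],
    ("n", "pl"): ["a", "a", "a", "ōrum", "īs", "īs"],
}

# Every combination of gender, number and case is one paradigm slot.
slots = list(product(GENDERS, NUMBERS, CASES))

# Collect the distinct surface forms that fill those slots.
distinct_forms = {
    "bon" + ENDINGS[(gender, number)][CASES.index(case)]
    for gender, number, case in slots
}

print(len(slots))           # 36
print(len(distinct_forms))  # 14
```

The gap between 36 slots and 14 forms is the syncretism the text refers to: many category combinations share an ending.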
Spanish pronoun inflections
suyo; de él
suyo; de ella
suyo; de ellos
suyo; de ellas
In all Romance languages, this system was drastically reduced. In most
modern Romance languages, in fact, case is no longer marked at all on
nouns, adjectives and determiners, and most forms are derived from the
Latin accusative case. Much like English, however, case has survived
somewhat better on pronouns.
Most pronouns have distinct nominative, accusative, genitive and
possessive forms (cf. English "I, me, mine, my"). Many also have a
separate dative form, a disjunctive form used after prepositions, and
(in some languages) a special form used with the preposition con
"with" (a conservative feature inherited from
Latin forms such as
mēcum, tēcum, nōbīscum).
Spanish inflectional classes
The system of inflectional classes is also drastically reduced. The
basic system is most clearly indicated in Spanish, where there are
only three classes, corresponding to the first, second and third
declensions in Latin: plural in -as (feminine), plural in -os
(masculine), plural in -es (either masculine or feminine). The
singular endings exactly track the plural, except the singular -e is
dropped after certain consonants.
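Seen from the singular, the three classes reduce to a simple rule of thumb: nouns ending in a vowel add -s, and nouns ending in a consonant (where the singular -e was dropped) add -es. The Python sketch below is a simplified illustration of that rule, not a complete description of Spanish plural formation: the function name is hypothetical, and it ignores nouns ending in a stressed vowel, nouns in unstressed -s that do not change, and the written-accent adjustments seen in pairs like nación, naciones.

```python
VOWELS = "aeiou"


def pluralize(noun):
    """Regular plural of a Spanish noun (simplified, hypothetical helper)."""
    if noun[-1] in VOWELS:
        # casa -> casas, libro -> libros, madre -> madres
        return noun + "s"
    if noun.endswith("z"):
        # Purely orthographic change: z is written c before e.
        return noun[:-1] + "ces"
    # The dropped singular -e reappears before the plural -s.
    return noun + "es"


for word in ["casa", "libro", "madre", "flor", "luz"]:
    print(word, "->", pluralize(word))
```

Under these assumptions, casa and libro represent the -as and -os classes, while madre, flor and luz all belong to the -es class, with and without the singular -e.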
The same system underlies many other modern Romance languages, such
as Portuguese, French and Catalan. In these languages, however,
further sound changes have resulted in various irregularities. In
Portuguese, for example, loss of /l/ and /n/ between vowels (with
nasalization in the latter case) produces various irregular plurals
(nação – nações "nation(s)"; hotel – hotéis "hotel(s)").
In French and Catalan, loss of /o/ and /e/ in most unstressed final
syllables has caused the -os and -es classes to merge. In French,
merger of remaining /e/ with final /a/ into [ə], and its subsequent
loss, has completely obscured the original Romance system, and loss of
final /s/ has caused most nouns to have identical pronunciation in
singular and plural, although they are still marked differently in
spelling (e.g. femme – femmes "woman – women", both pronounced
/fam/).
Romanian noun inflections
Noun inflection has survived in Romanian somewhat better than
elsewhere. Determiners are still marked for two cases
(nominative/accusative and genitive/dative) in both singular and
plural, and feminine singular nouns have separate endings for the two
cases. In addition, there is a separate vocative case, enriched with
native development and Slavic borrowings, and
the combination of noun with a following clitic definite article
produces a separate set of "definite" inflections for nouns.
The inflectional classes of
Latin have also survived more in Romanian
than elsewhere, e.g. om – oameni "man – men" (
Latin homo –
homines); corp – corpuri "body – bodies" (
Latin corpus –
corpora). (Many other exceptional forms, however, are due to later
sound changes or analogy, e.g. casă – case "house(s)" vs. lună –
luni "moon(s)"; frate – fraţi "brother(s)" vs. carte – cărţi
"book(s)" vs. vale – văi "valley(s)".)
In Italian, the situation is somewhere in between Spanish and
Romanian. There are no case endings and relatively few classes, as in
Spanish, but noun endings are generally formed with vowels instead of
/s/, as in Romanian: amico – amici "friend(s) (masc.)", amica –
amiche "friend(s) (fem.)"; cane – cani "dog(s)". The masculine
plural amici is thought to reflect the
Latin nominative plural -ī
rather than accusative plural -ōs (Spanish -os); however, the other
plurals are thought to stem from special developments of Latin -ās and -ēs.
Evolution of case in various Romance languages (Latin bonus "good")
A different type of noun inflection survived into the medieval period
in a number of western
Romance languages (Old French, Old Occitan, and
the older forms of a number of
Rhaeto-Romance languages). This
inflection distinguished nominative from oblique, grouping the
accusative case with the oblique, rather than with the nominative as
in Romanian.
The oblique case in these languages generally inherits from the Latin
accusative; as a result, masculine nouns have distinct endings in the
two cases while most feminine nouns do not.
A number of different inflectional classes are still represented at
this stage. For example, the difference in the nominative case between
masculine li voisins "the neighbor" and li pere "the father", and
feminine la riens "the thing" vs. la fame "the woman", faithfully
reflects the corresponding
Latin inflectional differences (vicīnus
vs. pater, fēmina vs. rēs).
A number of synchronically quite irregular differences between
nominative and oblique reflect direct inheritances of Latin
third-declension nouns with two different stems (one for the
nominative singular, one for all other forms), most of which had
a stress shift between nominative and the other forms: li ber – le
baron "baron" (barō – barōnem); la suer – la seror "sister"
(soror – sorōrem); li prestre – le prevoire "priest" (presbyter
– presbyterem); li sire – le seigneur "lord" (senior –
seniōrem); li enfes – l'enfant "child" (infāns – infantem).
A few of these multi-stem nouns derive from
Latin forms without stress
shift, e.g. li om – le ome "man" (homō – hominem). All of these
multi-stem nouns refer to people; other nouns with stress shift in
Latin (e.g. amor – amōrem "love") have not survived. Some of the
same nouns with multiple stems in
Old French or Old
Occitan have come
down in Italian in the nominative rather than the accusative (e.g.
uomo "man" < homō, moglie "wife" < mulier), suggesting that a
similar system existed in pre-literary Italian.
The modern situation in
Sursilvan (one of the Rhaeto-Romance
languages) is unique in that the original nominative/oblique
distinction has been reinterpreted as a predicative/attributive
distinction:
il hotel ej vɛɲiws natsionalizaws "the hotel has been nationalized"
il hotel natsionalizaw "the nationalized hotel"
As described above, case marking on pronouns is much more extensive
than for nouns. Determiners (e.g. words such as "a", "the", "this")
are also marked for case in Romanian.
Romance languages have the following sets of pronouns and
determiners:
Personal pronouns, in three persons and two genders.
A reflexive pronoun, used when the object is the same as the subject.
This approximately corresponds to English "-self", but separate forms
exist only in the third person, with no number marking.
Definite and indefinite articles, and in some languages, a partitive
article that expresses the concept of "some".
A two-way or three-way distinction among demonstratives. Many
languages have a three-way distinction of distance (near me, near you,
near him) which, though not paralleled in current English, used to be
present as "this/that/yon".
Relative pronouns and interrogatives, with the same forms used for
both (similar to English "who" and "which").
Various indefinite pronouns and determiners (e.g. Spanish algún
"some", alguien "someone", algo "something"; ningún "no", nadie "no
one"; todo "every"; cada "each"; mucho "much/many/a lot", poco
"few/little"; otro "other/another"; etc.).
Unlike in English, a separate neuter personal pronoun ("it") generally
does not exist, but the third-person singular and plural both
distinguish masculine from feminine. Also, as described above, case is
marked on pronouns even though it is not usually on nouns, similar to
English. As in English, there are forms for nominative case (subject
pronouns), oblique case (object pronouns), and genitive case
(possessive pronouns); in addition, third-person pronouns distinguish
accusative and dative. There is also an additional set of possessive
determiners, distinct from the genitive case of the personal pronoun;
this corresponds to the English difference between "my, your" and
"mine, yours".
Development from Latin
Romance languages do not retain the
Latin third-person personal
pronouns, but have innovated a separate set of third-person pronouns
by borrowing the demonstrative ille ("that (over there)"), and
creating a separate reinforced demonstrative by attaching a variant of
ecce "behold!" (or "here is ...") to the pronoun.
Similarly, in place of the genitive of the
Latin pronouns, most
Romance languages adopted the reflexive possessive, which then serves
indifferently as both reflexive and non-reflexive possessive. Note
that the reflexive, and hence the third-person possessive, is unmarked
for the gender of the person being referred to. Hence, although
gendered possessive forms do exist (e.g. Portuguese seu (masc.) vs.
sua (fem.)), these refer to the gender of the object possessed, not
that of the possessor.
The gender of the possessor needs to be made clear by a collocation
such as French la voiture à lui/elle, Portuguese o carro dele/dela,
literally "the car of him/her". (In spoken Brazilian Portuguese, these
collocations are the usual way of expressing the third-person
possessive, since the former possessive seu carro now has the meaning
"your car".)
The same demonstrative ille was borrowed to create the definite
article (see below), which explains the similarity in form between
personal pronoun and definite article. When the two are different, it
is usually because of differing degrees of phonetic reduction.
Generally, the personal pronoun is unreduced (beyond normal sound
change), while the article has suffered various amounts of reduction,
e.g. Spanish ella "she" < illa vs. la "the (fem.)" < -la < illa.
Object pronouns in
Latin were normal words, but in the Romance
languages they have become clitic forms, which must stand adjacent to
a verb and merge phonologically with it. Originally, object pronouns
could come either before or after the verb; sound change would often
produce different forms in these two cases, with numerous additional
complications and contracted forms when multiple clitic pronouns
occurred together.
Catalan still largely maintains this system with a highly complex
clitic pronoun system. Most languages, however, have simplified this
system by undoing some of the clitic mergers and requiring clitics to
stand in a particular position relative to the verb (usually after
imperatives, before other finite forms, and either before or after
non-finite forms depending on the language).
When a pronoun cannot serve as a clitic, a separate disjunctive form
is used. These result from dative object pronouns pronounced with
stress (which causes them to develop differently from the equivalent
unstressed pronouns), or from subject pronouns.
Most Romance languages are null subject languages. The subject
pronouns are used only for emphasis and take the stress, and as a
result are not clitics. In French, however (as in Friulian and in some
Gallo-Italian languages of northern Italy), verbal agreement marking
has degraded to the point that subject pronouns have become mandatory,
and have turned into clitics. These forms cannot be stressed, so for
emphasis the disjunctive pronouns must be used in combination with the
clitic subject forms. Friulian and the
Gallo-Italian languages have
actually gone further than this and merged the subject pronouns onto
the verb as a new type of verb agreement marking, which must be
present even when there is a subject noun phrase. (Some non-standard
varieties of French treat disjunctive pronouns as arguments and clitic
pronouns as agreement markers.)
In medieval times, most
Romance languages developed a distinction
between familiar and polite second-person pronouns (a so-called T-V
distinction), similar to the former English distinction between
familiar "thou" and polite "you". This distinction was determined by
the relationship between the speakers. As in English, this
generally developed by appropriating the plural second-person pronoun
to serve in addition as a polite singular. French is still at this
stage, with familiar singular tu vs. formal or plural vous. In cases
like this, the pronoun requires plural agreement in all cases whenever
a single affix marks both person and number (as in verb agreement
endings and object and possessive pronouns), but singular agreement
elsewhere where appropriate (e.g. vous-même "yourself" vs.
vous-mêmes "yourselves").
Many languages, however, innovated further in developing an even more
polite pronoun, generally composed of a noun phrase (e.g. Portuguese
vossa mercê "your mercy", progressively reduced to vossemecê,
vosmecê and finally você) and taking third-person singular
agreement. A plural equivalent was created at the same time or soon
after (Portuguese vossas mercês, reduced to vocês), taking
third-person plural agreement. Spanish innovated similarly, with
usted(es) from earlier vuestra(s) merced(es).
In Portuguese and Spanish (as in other languages with similar forms),
the "extra-polite" forms in time came to be the normal polite forms,
and the former polite (or plural) second-person vos was demoted to a
familiar form, either becoming a familiar plural (as in European
Spanish) or a familiar singular (as in many varieties of Latin
American Spanish). In the latter case, it either competes with the
original familiar singular tu (as in Guatemala), displaces it entirely
(as in Argentina), or is itself displaced (as in Mexico, except in
Chiapas). In American Spanish, the gap created by the loss of familiar
plural vos was filled by originally polite ustedes, with the result
that there is no familiar/polite distinction in the plural, just as in
the original tu/vos system.
A similar path was followed by Italian and Romanian. Romanian uses
dumneavoastră "your lordship", while in Italian the former polite phrase
sua eccellenza "your excellency" has simply been supplanted by the
corresponding pronoun Ella or Lei (literally "she", but capitalized
when meaning "you"). As in European Spanish, the original
second-person plural voi serves as familiar plural. (In Italy, during
fascist times leading up to World War II, voi was resurrected as a
polite singular, and discarded again afterwards, although it remains
in some southern dialects.)
Portuguese innovated again in developing a new extra-polite pronoun o
senhor "the sir", which in turn downgraded você. Hence, modern
European Portuguese has a three-way distinction between "familiar" tu,
"equalizing" você and "polite" o senhor. (The original second-person
plural vós was discarded centuries ago in speech, and is used today
only in translations of the Bible, where tu and vós serve as
universal singular and plural pronouns, respectively.)
Brazilian Portuguese, however, has diverged from this system, and most
dialects simply use você (and plural vocês) as a general-purpose
second-person pronoun, combined with te (from tu) as the clitic object
pronoun. The form o senhor (and feminine a senhora) is sometimes used
in speech, but only in situations where an English speaker would say
"sir" or "ma'am". The result is that second-person verb forms have
disappeared, and the whole pronoun system has been radically
realigned. However that is the case only in the spoken language of
central and northern Brazil, with the northeastern and southern areas
of the country still largely preserving the second-person verb form
and the "tu" and "você" distinction.
Latin had no articles as such. The closest definite article was the
non-specific demonstrative is, ea, id meaning approximately
"this/that/the". The closest indefinite articles were the indefinite
determiners aliquī, aliqua, aliquod "some (non-specific)" and certus
"a certain".
Romance languages have both indefinite and definite articles, but none
of the above words form the basis for either of these. Usually the
definite article is derived from the
Latin demonstrative ille
("that"), but some languages (e.g. Sardinian, and some dialects spoken
around the Pyrenees) have forms from ipse (emphatic, as in "I
myself"). The indefinite article everywhere is derived from the number
ūnus ("one").
Some languages, e.g. French and Italian, have a partitive article that
approximately translates as "some". This is used either with mass
nouns or with plural nouns—both cases where the indefinite article
cannot occur. A partitive article is used (and in French, required)
whenever a bare noun refers to a specific (but unspecified or unknown)
quantity of the noun, but not when a bare noun refers to a class in
general. For example, the partitive would be used in both of the
following sentences:
I want milk.
Men arrived today.
But neither of these:
Milk is good for you.
I hate men.
The sentence "Men arrived today", however, (presumably) means "some
specific men arrived today" rather than "men, as a general class,
arrived today" (which would mean that there were no men before today).
On the other hand, "I hate men" does mean "I hate men, as a general
class" rather than "I hate some specific men".
As in many other cases, French has developed the farthest from Latin
in its use of articles. In French, nearly all nouns, singular and
plural, must be accompanied by an article (either indefinite,
definite, or partitive) or demonstrative pronoun.
Due to pervasive sound changes in French, most nouns are pronounced
identically in the singular and plural, and there is often heavy
homophony between nouns and identically pronounced words of other
classes. For example, all of the following are pronounced /sɛ̃/:
sain "healthy"; saint "saint, holy"; sein "breast"; ceins "(you) put
on, gird"; ceint "(he) puts on, girds"; ceint "put on, girded"; and
the equivalent noun and adjective plural forms sains, saints, seins,
ceints. The article helps identify the noun forms saint or sein, and
distinguish singular from plural; likewise, the mandatory subject of
verbs helps identify the verb ceint. In more conservative Romance
languages, neither articles nor subject pronouns are necessary, since
all of the above words are pronounced differently. In Italian, for
example, the equivalents are sano, santo, seno, cingi, cinge, cinto,
sani, santi, seni, cinti, where all vowels and consonants are
pronounced as written, and ⟨s⟩ /s/ and ⟨c⟩ /t͡ʃ/ are clearly
distinct from each other.
Latin, at least originally, had a three-way distinction among
demonstrative pronouns distinguished by distal value: hic 'this', iste
'that (near you)', ille 'that (over there)', similar to the
distinction that used to exist in English as "this" vs. "that" vs.
"yon(der)". In urban
Latin of Rome, iste came to have a specifically
derogatory meaning, but this innovation apparently did not reach the
provinces and is not reflected in the modern Romance languages. A
number of these languages still have such a three-way distinction,
although hic has been lost and the other pronouns have shifted
somewhat in meaning. For example, Spanish has este "this" vs. ese
"that (near you)" vs. aquel (fem. aquella) "that (over yonder)". The
Spanish pronouns derive, respectively, from
Latin iste, ipse, accu-ille,
where accu- is an emphatic prefix derived from eccum "behold (it!)"
(still vigorous in
Italy as Ecco! 'Behold!'), possibly with influence
from atque "and".
Reinforced demonstratives such as accu-ille arose as ille came to be
used as an article as well as a demonstrative. Such forms were often
created even when not strictly needed to distinguish otherwise
ambiguous forms. Italian, for example, has both questo "this"
(eccu-istum) and quello "that" (eccu-illum), in addition to dialectal
codesto "that (near you)" (*eccu-tē-istum). French generally prefers
forms derived from bare ecce "behold", as in the pronoun ce "this
one/that one" (earlier ço, from ecce-hoc; cf. Italian ciò 'that')
and the determiner ce/cet "this/that" (earlier cest, from ecce-istum).
Reinforced forms are likewise common in locative adverbs (words such
as English here and there), based on related
Latin forms such as hic
"this" vs. hīc "here", hāc "this way", and ille "that" vs. illīc
"there", illāc "that way". Here again French prefers bare ecce while
Spanish and Italian prefer eccum (French ici "here" vs. Spanish aquí,
Italian qui). In western languages such as Spanish, Portuguese and
Catalan, doublets and triplets arose such as Portuguese aqui, acá,
cá "(to) here" (accu-hīc, accu-hāc, eccu-hāc). From these, a
prefix a- was extracted, from which forms like aí "there (near you)"
(a-(i)bi) and ali "there (over yonder)" (a-(i)llīc) were created;
compare Catalan neuter pronouns açò (acce-hoc) "this", això
(a-(i)psum-hoc) "that (near you)", allò (a-(i)llum-hoc) "that
(over yonder)".
Subsequent changes often reduced the number of demonstrative
distinctions. Standard Italian, for example, has only a two-way
distinction "this" vs. "that", as in English, with second-person and
third-person demonstratives combined. In Catalan, however, a former
three-way distinction aquest, aqueix, aquell has been reduced
differently, with first-person and second-person demonstratives
combined. Hence aquest means either "this" or "that (near you)"; on
the phone, aquest is used to refer both to speaker and addressee.
Old French had a similar distinction to Italian (cist/cest vs.
cil/cel), both of which could function as either adjectives or
pronouns. Modern French, however, has no distinction between "this"
and "that": ce/cet, cette < cest, ceste is only an adjective, and
celui, celle < cel lui, celle is only a pronoun, and both forms
indifferently mean either "this" or "that". (The distinction between
"this" and "that" can be made, if necessary, by adding the suffixes
-ci "here" or -là "there", e.g. cette femme-ci "this woman" vs. cette
femme-là "that woman", but this is rarely done except when
specifically necessary to distinguish two entities from each other.)
See also: Romance verbs
Latin and Romance tenses
Imperfect subjunctive /
eres ("you are")
future of "to be" in Old French
Simple preterite (literary except in Valencian)
Simple past (literary)
Preterite (Tuscan Standard Italian); Literary Remote Past (Regional Standard Italian in North); Preterite/Perfect (Regional Standard Italian in South)
Simple past (literary except in the Oltenian dialect)
In Old Sardinian; only traces in modern language
Imperfect subjunctive (-ra form)
in Old Occitan
in very early Old French (Sequence of Saint Eulalia)
(very much in use)
possible traces of
in Old Occitan
possible traces of
in Old Italian
Conditional in Old Romanian (until 17th cent.)
(split apart from
in 18th-century Romanian)
Preterite vs. present perfect
(present perfect exists, but has different meaning)
both (but usually an analytic preterite vado+infinitive is used)
present perfect only
present perfect only
both (Tuscan Standard Italian); present perfect only (Regional Standard Italian in North); preference for preterite (Regional Standard Italian in South)
present perfect only
present perfect only
Verbs have many conjugations, including in most languages:
A present tense, a preterite, an imperfect, a pluperfect, a future
tense and a future perfect in the indicative mood, for statements of
fact.
Present and preterite subjunctive tenses, for hypothetical or
uncertain conditions. Several languages (for example, Italian,
Portuguese and Spanish) have also imperfect and pluperfect
subjunctives, although it is not unusual to have just one subjunctive
equivalent for preterit and imperfect (e.g. no unique subjunctive
equivalent in Italian of the so-called passato remoto). Portuguese and
Spanish also have future and future perfect subjunctives, which have
no equivalent in Latin.
An imperative mood, for direct commands.
Three non-finite forms: infinitive, gerund, and past participle.
Distinct active and passive voices, as well as an impersonal passive
voice.
Note that, although these categories are largely inherited from
Classical Latin, many of the forms are either newly constructed or
inherited from different categories (e.g. the Romance imperfect
subjunctive most commonly is derived from the Latin pluperfect
subjunctive, while the Romance pluperfect subjunctive is derived from
a new present perfect tense with the auxiliary verb placed in the
imperfect subjunctive).
Several tenses and aspects, especially of the indicative mood, have
been preserved with little change in most languages, as shown in the
following table for the
Latin verb dīcere (to say), and its descendants.
l'à détt / dgé
dìsser2, l'ha dit
a zice, zicere4
he was saying
1 The spelling is conservative. Note the pronunciations: dire /diʁ/,
dit /di/, disait /dizɛ/, dise /diz/, dis /di/.
2 Until the eighteenth century.
3 With the disused variant dize.
5 In modern times, scheva.
6 Sicilian now uses imperfect subjunctive dicissi in place of present
subjunctive.
The main tense and mood distinctions that were made in classical Latin
are generally still present in the modern Romance languages, though
many are now expressed through compound rather than simple verbs. The
passive voice, which was mostly synthetic in classical Latin, has been
completely replaced with compound forms.
Owing to sound changes which made it homophonous with the preterite,
the Latin future indicative tense was dropped, and replaced with a
periphrasis of the form infinitive + present tense of habēre (to
have). Eventually, this structure was reanalysed as a new future
tense.
In a similar process, an entirely new conditional form was created.
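The infinitive + habēre periphrasis described above can be illustrated with Spanish, where the modern future endings are reduced present-tense forms of haber attached to the infinitive. The Python sketch below is a minimal illustration under stated assumptions: it covers regular verbs only (irregular stems such as tendr- for tener are not handled), and the names used are hypothetical.

```python
# Present tense of Spanish haber, and the future endings descended from it.
HABER_PRESENT = ["he", "has", "ha", "hemos", "habéis", "han"]
FUTURE_ENDINGS = ["é", "ás", "á", "emos", "éis", "án"]


def future(infinitive: str) -> list[str]:
    """Build the six future forms of a regular verb: infinitive + ending."""
    return [infinitive + ending for ending in FUTURE_ENDINGS]


# Show each auxiliary form next to the synthetic future it corresponds to.
for aux, form in zip(HABER_PRESENT, future("cantar")):
    print(f"cantar + {aux} -> {form}")
```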
While the synthetic passive voice of classical
Latin was abandoned in
favour of periphrastic constructions, most of the active voice
remained in use. However, several tenses have changed meaning,
especially subjunctives. For example:
The Latin pluperfect indicative became a conditional in Sicilian, and
an imperfect subjunctive in Spanish.
The Latin pluperfect subjunctive developed into an imperfect
subjunctive in all languages except Romansh, where it became a
conditional, and Romanian, where it became a pluperfect indicative.
The Latin preterite subjunctive, together with the future perfect
indicative, became a future subjunctive in Old Spanish, Portuguese,
and Galician.
The Latin imperfect subjunctive became a personal infinitive in
Portuguese and Galician.
Several Romance languages have two verbs "to be". One is derived from
Vulgar Latin *essere <
Latin esse "to be" with an admixture of
forms derived from sedēre "to sit", and is used mostly for essential
attributes; the other is derived from stāre "to stand", and mostly
used for temporary states. This development is most notable in
Spanish, Portuguese and Catalan. In French, Italian and Romanian, the
derivative of stāre largely preserved an earlier meaning of "to
stand/to stay", although in modern Italian, stare is used in a few
constructions where English would use "to be", as in sto bene "I am
well". In Old French, the derivatives of *essere and stāre were estre
and ester, respectively. In modern French, estre persists as être "to
be" while ester has been lost as a separate verb; but the former
imperfect of ester is used as the modern imperfect of être (e.g. il
était "he was"), replacing the irregular forms derived from Latin
(e.g. ere(t), iere(t) < erat). In Italian, the two verbs share the
same past participle, stato. Sedēre persists most notably in the
future of *essere (e.g. Spanish/Portuguese/French/etc. ser-, Italian
sar-), although in
Old French the future is a direct derivation from
Latin, e.g. (i)ert "he will be" < erit. See
Romance copula for more information.
For a more detailed illustration of how the verbs have changed with
respect to classical Latin, see Romance verbs.
During the Renaissance, Italian, Portuguese, Spanish and a few other
Romance languages developed a progressive aspect which did not exist
in Latin. In French, progressive constructions remain very limited,
the imperfect generally being preferred, as in Latin.
Romance languages now have a verbal construction analogous to the
present perfect of English. In some, it has taken the place of the old
preterite (at least in the vernacular); in others, the two coexist
with somewhat different meanings (cf. English I did vs. I have done).
A few examples:
preterite only: Galician, Asturian, Sicilian, Leonese, Portuguese,
some dialects of Spanish;
preterite and present perfect: Catalan, Occitan, standard Spanish;
present perfect predominant, preterite now literary: French, Romanian,
several dialects of Italian, some dialects of Spanish;
present perfect only: Romansh
Note that in Catalan, the synthetic preterite is predominantly a
literary tense, except in Valencian; but an analytic preterite (formed
using an auxiliary vadō, which in other languages signals the future)
persists in speech, with the same meaning. In Portuguese, a
morphological present perfect does exist but has a different meaning
(closer to "I have been doing").
The following are common features of the
Romance languages (inherited
from Vulgar Latin) that are different from Classical Latin:
Adjectives generally follow the noun they modify.
The normal clause structure is SVO, rather than SOV, and is much less
flexible than in Latin.
Latin constructions involving nominalized verbal forms (e.g. the
use of accusative plus infinitive in indirect discourse and the use of
the ablative absolute) were dropped in favor of constructions with
subordinate clauses. Exceptions can be found in Italian, for example,
Latin tempore permittente > Italian tempo permettendo; L. hoc facto
> I. ciò fatto.
Romance languages have borrowed heavily, though mostly from other
Romance languages. However, some, such as Spanish, Portuguese,
Romanian, and French, have borrowed heavily from other language
groups. Vulgar Latin borrowed first from indigenous languages of the
Roman Empire, and during the Germanic folk movements, from Germanic
languages, especially Gothic; for
Eastern Romance languages, during the
Bulgarian Empires, from Slavic languages, especially Bulgarian.
Notable examples are *blancus "white", replacing native albus (but
Romansh alv, Dalmatian jualb, Romanian alb); *guerra "war", replacing
native bellum; and the words for the cardinal directions, where
cognates of English "north", "south", "east" and "west" replaced the
native words septentriō, merīdiēs (also "noon; midday nap"; cf.
Romanian meriză), oriens, and occidens. (See History of French –
The Franks.) Some Celtic words were incorporated into the core
vocabulary, partly for words with no
Latin equivalent (betulla
"birch", camisia "shirt", cerevisia "beer"), but in some cases
replacing Latin vocabulary (gladius "sword", replacing ensis;
cambiāre "to exchange", replacing mūtāre except in Romanian and
Portuguese; carrus "cart", replacing currus; pettia "piece", largely
displacing pars (later resurrected) and eliminating frustum). Many
Greek loans also entered the lexicon, e.g. spatha "sword" (Greek:
σπάθη spáthē, replacing gladius which shifted to
"iris", cf. French épée, Spanish espada, Italian spada and Romanian
spată); cara "face" (Greek: κάρα kára, partly replacing
faciēs); colpe "blow" (Greek: κόλαφος kólaphos,
replacing ictus, cf. Spanish golpe, French coup); cata "each" (Greek:
κατά katá, replacing quisque); common suffixes
*-ijāre/-izāre (Greek: -ίζειν -izein, French -oyer/-iser,
Spanish -ear/-izar, Italian -eggiare/-izzare, etc.), -ista (Greek:
-ιστής -istēs).
Many basic nouns and verbs, especially those that were short or had
irregular morphology, were replaced by longer derived forms with
regular morphology. Nouns, and sometimes adjectives, were often
replaced by diminutives, e.g. auris "ear" > auricula (orig. "outer
ear") > oricla (Sardinian origra, Italian orecchia/o, Portuguese
orelha, etc.); avis "bird" > avicellus (orig. "chick, nestling")
> aucellu (
Friulan ucel, French oiseau, etc.);
caput "head" > capitium (Portuguese cabeça, Spanish cabeza, French
chevet "headboard"; but reflexes of caput were retained also,
sometimes without change of meaning, as in Italian capo "head",
alongside testa); vetus "old" > vetulus > veclus (Dalmatian
vieklo, Italian vecchio, Portuguese velho, etc.). Sometimes
augmentative constructions were used instead: piscis "fish" > Old
French peis > peisson (orig. "big fish") > French poisson. Verbs
were often replaced by frequentative constructions: canere "to sing"
> cantāre; iacere "to throw" > iactāre > *iectāre (Italian
gettare, Portuguese jeitar, Spanish echar, etc.); iuvāre >
adiūtāre (Italian aiutare, Spanish ayudar, French aider, etc.,
meaning "help", alongside e.g. iuvāre > Italian giovare "to be of
use"); vēnārī "hunt" (Romanian "vâna", Aromanian "avin, avinari")
> replaced by *captiāre "to hunt", frequentative of capere "to
seize" (Italian cacciare, Portuguese caçar, Romansh catschar, French
chasser).
Many Latin words became archaic or poetic and were replaced
by more colloquial terms: equus "horse" > caballus (orig. "nag")
(but equa "mare" remains, cf. Spanish yegua, Portuguese égua,
Sardinian ebba, Romanian iapă); domus "house" > casa (orig.
"hut"); ignis "fire" > focus (orig. "hearth"); strāta "street"
> rūga (orig. "furrow") or callis (orig. "footpath") (but strāta
is continued in Italian strada, Romanian stradă and secondarily in
e.g. Spanish/Portuguese estrada "causeway, paved road"). In some
cases, terms from common occupations became generalized: invenīre "to
find" replaced by
Ibero-Romance afflāre (orig. "to sniff out", in
hunting, cf. Spanish hallar, Portuguese achar, Romanian afla "to find
out"); advenīre "to arrive" gave way to
Ibero-Romance plicāre (orig.
"to fold (sails; tents)", cf. Spanish llegar, Portuguese chegar;
Romanian pleca), elsewhere arripāre (orig. "to harbor at a
riverbank", cf. Italian arrivare, French arriver) (advenīre is
continued with the meaning "to achieve, manage to do" as in Middle
French aveindre, or "to happen" in Italian avvenire). The same thing
sometimes happened to religious terms, due to the pervasive influence
of Christianity: loquī "to speak" succumbed to parabolāre (orig. "to
tell parables", cf.
Occitan parlar, French parler, Italian parlare)
or fabulārī (orig. "to tell stories", cf. Spanish hablar, Portuguese
falar), based on Jesus' way of speaking in parables.
Many prepositions were used as verbal particles to make new roots and
verb stems, e.g. Italian estrarre, Aromanian astragu, astradziri "to
extract" from Latin ex- "out of" and trahere "to pull" (Italian trarre
"draw, pull"), or to augment already existing words, e.g. French
coudre, Italian cucire, Portuguese coser "to sew", from cōnsuere "to
sew up", from suere "to sew", with total loss of the bare stem. Many
prepositions commonly became compounded, e.g. de ex > French
dès "as of", ab ante > Italian avanti "forward". Some words
derived from phrases, e.g. Portuguese agora, Spanish ahora "now" <
hāc hōrā "at this hour"; French avec "with" (prep.) < Old French
avuec (adv.) < apud hoc ("near that"); Spanish tamaño, Portuguese
tamanho "size" < tam magnum "so big"; Italian codesto "this, that"
(near you) <
Old Italian cotevesto < eccum tibi istum approx.
"here's that thing of yours"; Portuguese você "you" < vosmecê
< vossemecê <
Galician-Portuguese vossa mercee "your mercy".
A number of common Latin words that have disappeared in many or most
Romance languages have survived either in the periphery or in remote
areas (especially Sardinia and Romania), or as secondary terms,
sometimes differing in meaning. For example, Latin caseum "cheese"
survives in the peripheral areas (Portuguese queijo, Spanish queso, Romansh
caschiel, Sardinian càsu, Romanian caş), but in the central areas
has been replaced by formāticum, originally "moulded (cheese)"
(French fromage, Occitan/Catalan formatge, Italian formaggio, with,
however, cacio also available); similarly (com)edere "to eat (up)",
which survives as Spanish/Portuguese comer but elsewhere is replaced
by mandūcāre, originally "to chew" (French manger, Italian mangiare,
Catalan menjar, but Spanish/Portuguese noun manjar "food" or
"uplifting meal"). In some cases, one language happens to preserve a
word displaced elsewhere, e.g. Italian ogni "each, every" < omnes,
displaced elsewhere by tōtum, originally "whole" or by a reflex of
Greek κατά (e.g. Italian ognuno, Catalan tothom "everyone";
Italian ogni giorno, Spanish cada día "every day");
Friulan vaî "to
cry" < flere "to weep";
Vegliote otijemna "fishing pole" <
antenna "yardarm"; Aromanian "sprunã" (warm ashes) < pruna
(burning coal). Sardinian even preserves some words that were already
archaic in Classical Latin, e.g. àchina "grape" < acinam, also
found in Sicilian ràcina.
During the Middle Ages, scores of words were borrowed directly from
Latin (so-called Latinisms), either in their original form
(learned loans) or in a somewhat nativized form (semi-learned loans).
These resulted in many doublets (pairs of inherited and learned
words), such as the following:
Latin fabrica "craft, manufacture" > Romanian făură "blacksmith (archaic)" (inherited)
Latin advōcātus "advocate (noun)" > French avoué "solicitor (attorney)" (inherited), avocat "barrister (attorney)" (learned)
Latin polīre "to polish" > Portuguese puir "to wear thin" (inherited), polir "to polish" (learned)
Sometimes triplets arise:
Latin articulus "joint" > Portuguese
artículo "joint, knuckle" (learned), artigo "article" (semi-learned),
artelho "ankle" (inherited; archaic and dialectal). In many cases, the
learned word simply displaced the original popular word: e.g. Spanish
crudo "crude, raw" (Old Spanish cruo); French légume "vegetable" (Old
French leüm); Portuguese flor "flower".
The learned loan always looks more like the original than the
inherited word does, because regular sound change has been bypassed;
and likewise, the learned word usually has a meaning closer to that of
the original. In French, however, the stress of the learned loan may
be on the "wrong" syllable, whereas the stress of the inherited word
always corresponds to the
Latin stress: e.g.
Latin vipera vs. French
vipère, learned loan, and guivre/vouivre, inherited.
Borrowing from Classical
Latin has produced a large number of suffix
doublets. Examples from Spanish (learned form first): -ción vs. -zon;
-cia vs. -za; -ificar vs. -iguar; -izar vs. -ear; -mento vs. -miento;
-tud (< nominative -tūdō) vs. -dumbre (< accusative -tūdine);
-ículo vs. -ejo; etc. Similar examples can be found in all the other
Romance languages.
This borrowing also introduced large numbers of classical prefixes in
their original form (dis-, ex-, post-, trans-) and reinforced many
others (re-, popular Spanish/Portuguese des- < dis-, popular French
dé- < dis-, popular Italian s- < ex-). Many Greek prefixes and
suffixes (hellenisms) also found their way into the lexicon: tele-,
poli-/poly-, meta-, pseudo-, -scope/scopo, -logie/logia/logía, etc.
See also: Vulgar Latin
Significant sound changes affected the consonants of the Romance
languages.
There was a tendency to eliminate final consonants in Vulgar Latin,
either by dropping them (apocope) or adding a vowel after them
(epenthesis).
Many final consonants were rare, occurring only in certain
prepositions (e.g. ad "towards", apud "at, near (a person)"),
conjunctions (sed "but"), demonstratives (e.g. illud "that (over
there)", hoc "this"), and nominative singular noun forms, especially
of neuter nouns (e.g. lac "milk", mel "honey", cor "heart"). Many of
these prepositions and conjunctions were replaced by others, while the
nouns were regularized into forms based on their oblique stems that
avoided the final consonants (e.g. *lacte, *mele, *core).
Final -m was dropped in Vulgar Latin. Even in Classical Latin, final
-am, -em, -um (inflectional suffixes of the accusative case) were
often elided in poetic meter, suggesting the m was weakly pronounced,
probably marking the nasalisation of the vowel before it. This nasal
vowel lost its nasalization in the
Romance languages except in
monosyllables, where it became /n/ e.g. Spanish quien < quem
"whom", French rien "anything" < rem "thing"; note especially
French and Catalan mon < meum "my (m.sg.)" pronounced as one
syllable (/meu̯m/ > */meu̯n/, /mun/) but Spanish mío and
Portuguese and Catalan meu < meum pronounced as two (/ˈme.um/).
As a result, only the following final consonants occurred in Vulgar
Latin:
Final -t in third-person singular verb forms, and -nt (later reduced
in many languages to -n) in third-person plural verb forms.
Final -s (including -x) in a large number of morphological endings
(verb endings -ās/-ēs/-īs/-is, -mus, -tis; nominative singular
-us/-is; plural -ās/-ōs/-ēs) and certain other words (trēs
"three", sex "six", crās "tomorrow", etc.).
Final -n in some monosyllables (from earlier -m).
Final -r, -d in some prepositions (e.g. ad, per), which were clitics
that attached phonologically to the following word.
Very occasionally, final -c, e.g.
Occitan oc "yes" < hoc, Old
French avuec "with" < apud hoc (although these instances were
possibly protected by a final epenthetic vowel at one point).
Final -t was eventually dropped in many languages, although this often
occurred several centuries after the
Vulgar Latin period. For example,
the reflex of -t was dropped in
Old French and
Old Spanish only around
1100. In Old French, this occurred only when a vowel still preceded
the t (generally /ə/ <
Latin a). Hence amat "he loves" > Old
French aime but venit "he comes" >
Old French vient: the /t/ was
never dropped and survives into Modern French in liaison, e.g.
vient-il? "is he coming?" /vjɛ̃ti(l)/ (the corresponding /t/ in
aime-t-il? is analogical, not inherited).
Old French also kept the
third-person plural ending -nt intact.
In Italo-Romance and the
Eastern Romance languages, eventually all
final consonants were either dropped or protected by an epenthetic
vowel, except in clitic forms (e.g. prepositions con, per). Modern
Standard Italian still has almost no consonant-final words, although
Romanian has resurfaced them through later loss of final /u/ and /i/.
For example, amās "you love" > ame > Italian ami; amant "they
love" > *aman > Italian amano. On the evidence of "sloppily written"
Lombardic-language documents, however, the loss of final /s/ in
Italy did not occur until the 7th or 8th century, after the Vulgar
Latin period, and the presence of many former final consonants is
betrayed by the syntactic gemination (raddoppiamento sintattico) that
they trigger. It is also thought that after a long vowel /s/ became
/j/ rather than simply disappearing: nōs > noi "we", se(d)ēs >
sei "you are", crās > crai "tomorrow" (southern Italian). In
unstressed syllables, the resulting diphthongs were simplified: canēs
> /ˈkanej/ > cani "dogs"; amīcās > /aˈmikaj/ > amiche
/aˈmike/ "(female) friends", where nominative amīcae should produce
**amice rather than amiche (note masculine amīcī > amici not
**amichi).
Western Romance languages eventually regained a large number
of final consonants through the general loss of final /e/ and /o/,
e.g. Catalan llet "milk" < lactem, foc "fire" < focum, peix
"fish" < piscem. In French, most of these secondary final
consonants (as well as primary ones) were lost before around 1700, but
tertiary final consonants later arose through the loss of /ə/ <
-a. Hence masculine frīgidum "cold" >
Old French freit /frwεt/
> froid /fʁwa/, feminine frigidam >
Old French freide
/frwεdə/ > froide /fʁwad/.
For a table of examples of palatalized n and l in the Romance
languages, see Palatalization (sound change) § Mouillé.
Palatalization was one of the most important processes affecting
consonants in Vulgar Latin. This eventually resulted in a whole series
of "palatal" and postalveolar consonants in most Romance languages,
e.g. Italian /ʃ/, /ʒ/, /tʃ/, /dʒ/, /ts/, /dz/, /ɲ/, /ʎ/.
The following historical stages occurred:
before /j/ (from e, i in hiatus)
all remaining, except labial consonants
/ttʃʲ~ttsʲ/ < /kj/, /jj~ddʒʲ/ < /ɡj/, /ɲɲ/, /ʎʎ/,
all except Sardinian
all except Sardinian and Dalmatian
before /a/, /au/
Gallo-Romance (e.g. French, northern Occitan);
Note how the environments become progressively less "palatal", and the
languages affected become progressively fewer.
The outcomes of palatalization depended on the historical stage, the
consonants involved, and the languages involved. The primary division
is between the
Western Romance languages, with /ts/ resulting from
palatalization of /k/, and the remaining languages (Italo-Dalmatian
and Eastern Romance), with /tʃ/ resulting. It is often suggested that
/tʃ/ was the original result in all languages, with /tʃ/ > /ts/ a
later innovation in the
Western Romance languages. Evidence of this is
the fact that Italian has both /ttʃ/ and /tts/ as outcomes of
palatalization in different environments, while
Western Romance has
only /(t)ts/. Even more suggestive is the fact that the Mozarabic
language in al-Andalus (modern southern Spain) had /tʃ/ as the
outcome despite being in the "Western Romance" area and geographically
disconnected from the remaining /tʃ/ areas; this suggests that
Mozarabic was an outlying "relic" area where the change /tʃ/ >
/ts/ failed to reach. (Northern French dialects, such as Norman and
Picard, also had /tʃ/, but this may be a secondary development, i.e.
due to a later sound change /ts/ > /tʃ/.) Note that /ts, dz, dʒ/
eventually became /s, z, ʒ/ in most
Western Romance languages. Thus
Latin caelum (sky, heaven), pronounced [ˈkai̯lu(m)] with an initial
[k], became Italian cielo [ˈtʃɛlo], Romanian cer [tʃer], Spanish
cielo [ˈθjelo]/[ˈsjelo], French ciel [sjɛl], Catalan cel
[ˈsɛɫ], and Portuguese céu [ˈsɛw].
The outcome of palatalized /d/ and /ɡ/ is less clear:
Original /j/ has the same outcome as palatalized /ɡ/ everywhere.
Romanian fairly consistently has /z/ < /dz/ from palatalized /d/,
but /dʒ/ from palatalized /ɡ/.
Italian inconsistently has /ddz~ddʒ/ from palatalized /d/, and /ddʒ/
from palatalized /ɡ/.
Most other languages have the same results for palatalized /d/ and
/ɡ/: consistent /dʒ/ initially, but either /j/ or /dʒ/ medially
(depending on language and exact context). But Spanish has /j/
(phonetically [ɟ͡ʝ]) initially except before /o/, /u/; nearby
Gascon is similar.
This suggests that palatalized /d/ > /dʲ/ > either /j/ or /dz/
depending on location, while palatalized /ɡ/ > /j/; after this,
/j/ > /(d)dʒ/ in most areas, but Spanish and Gascon (originating
from isolated districts behind the western Pyrenees) were relic areas
unaffected by this change.
In French, the outcomes of /k, ɡ/ palatalized by /e, i, j/ and by /a,
au/ were different: centum "hundred" > cent /sɑ̃/ but cantum
"song" > chant /ʃɑ̃/. French also underwent palatalization of
labials before /j/:
Vulgar Latin /pj, bj~vj, mj/ >
Old French /tʃ,
dʒ, ndʒ/ (sēpia "cuttlefish" > seiche, rubeus "red" > rouge,
sīmia "monkey" > singe).
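The two French palatalizations of /k/ described above can be read as a lookup on the following vowel. The sketch below is illustrative only: the function name is hypothetical, the input is assumed to be a macron-free Latin spelling, and only word-initial c in inherited (not learned) words is covered.

```python
def french_reflex_of_initial_c(latin: str) -> str:
    """Return the modern French reflex of word-initial Latin c (/k/).

    Hypothetical helper: it ignores learned loans and every other
    sound change that affected the word.
    """
    if not latin.startswith("c"):
        raise ValueError("expected a word beginning with c")
    rest = latin[1:]
    # Front vowels (ae and oe had become front monophthongs):
    # the early, pan-Romance palatalization, French /s/.
    if rest.startswith(("ae", "oe", "e", "i")):
        return "s"
    # /a/ and /au/: the later Gallo-Romance palatalization, French /ʃ/.
    if rest.startswith("a"):
        return "ʃ"
    # Before other sounds /k/ was not palatalized.
    return "k"


for word in ("centum", "cantum", "causa", "caelum"):
    print(word, "->", french_reflex_of_initial_c(word))
```

The order of the checks matters: ae must be tested before plain a, since caelum patterns with the front vowels (French ciel) rather than with cantum (French chant).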
The original outcomes of palatalization must have continued to be
phonetically palatalized even after they had developed into
alveolar/postalveolar/etc. consonants. This is clear from French,
where all originally palatalized consonants triggered the development
of a following glide /j/ in certain circumstances (most visible in the
endings -āre, -ātum/ātam). In some cases this /j/ came from a
consonant palatalized by an adjoining consonant after the late loss of
a separating vowel. For example, mansiōnātam > /masʲoˈnata/
> /masʲˈnada/ > /masʲˈnʲæðə/ > early Old French
maisnieḍe /maisˈniɛðə/ "household". Similarly, mediētātem >
/mejeˈtate/ > /mejˈtade/ > /mejˈtæðe/ > early Old French
meitieḍ /mejˈtʲɛθ/ > modern French moitié /mwaˈtje/ "half".
In both cases, phonetic palatalization must have remained in primitive
Old French at least through the time when unstressed intertonic vowels
were lost (c. 8th century?), well after the fragmentation of the
Romance languages.
The effect of palatalization is indicated in the writing systems of
almost all Romance languages, where the letters ⟨c, g⟩ have the "hard"
pronunciation [k, ɡ] in most situations, but a "soft" pronunciation
(e.g. French/Portuguese [s, ʒ], Italian/Romanian [tʃ, dʒ]) before
⟨e, i, y⟩. (This orthographic trait has passed into Modern English
through Norman French-speaking scribes writing Middle English; this
replaced the earlier system of Old English, which had developed its
own hard-soft distinction with the soft ⟨c, g⟩ representing [tʃ,
j~dʒ].) This has the effect of keeping the modern spelling similar to
Latin spelling, but complicates the relationship between
sound and letter. In particular, the hard sounds must be written
differently before ⟨e, i, y⟩ (e.g. Italian ⟨ch, gh⟩,
Portuguese ⟨qu, gu⟩), and likewise for the soft sounds when not
before these letters (e.g. Italian ⟨ci, gi⟩, Portuguese ⟨ç,
j⟩). Furthermore, in Spanish, Catalan,
Occitan and Brazilian
Portuguese, the use of digraphs containing ⟨u⟩ to signal the hard
pronunciation before ⟨e, i, y⟩ means that a different spelling is
also needed to signal the sounds /kw, ɡw/ before these vowels
(Spanish ⟨cu, gü⟩, Catalan,
Occitan and Brazilian Portuguese
⟨qü, gü⟩). This produces a number of orthographic
alternations in verbs whose pronunciation is entirely regular. The
following are examples of corresponding first-person plural indicative
and subjunctive in a number of regular Portuguese verbs: marcamos,
marquemos "we mark"; caçamos, cacemos "we hunt"; chegamos, cheguemos
"we arrive"; averiguamos, averigüemos "we verify"; adequamos,
adeqüemos "we adapt"; oferecemos, ofereçamos "we offer"; dirigimos,
dirijamos "we drive"; erguemos, ergamos "we raise"; delinquimos,
delincamos "we commit a crime". In the case of Italian, the convention
of digraphs <ch> and <gh> to represent /k/ and /g/ before
written <e, i> results in similar orthographic alternations,
such as dimentico 'I forget', dimentichi 'you forget', baco 'worm',
bachi 'worms' with [k] or pago 'I pay', paghi 'you pay' and lago
'lake', laghi 'lakes' with [g]. The use in Italian of <ci> and
<gi> to represent /tʃ/ or /dʒ/ before vowels written
<a,o,u> neatly distinguishes dico 'I say' with /k/ from dici
'you say' with /tʃ/ or ghiro 'dormouse' /g/ and giro 'turn,
revolution' /dʒ/, but with orthographic <ci> and <gi>
also representing the sequence of /tʃ/ or /dʒ/ and the actual vowel
/i/ (/ditʃi/ dici, /dʒiro/ giro), and no generally observed
convention of indicating stress position, the status of i when
followed by another vowel in spelling can be unrecognizable. For
example, the written forms offer no indication that <cia> in
camicia 'shirt' represents a single unstressed syllable /tʃa/ with no
/i/ at any level (/kaˈmitʃa/ → [kaˈmiːtʃa] ~ [kaˈmiːʃa]),
but that underlying the same spelling <cia> in farmacia
'pharmacy' is a bisyllabic sequence of /tʃ/ and stressed /i/
(/farmaˈtʃia/ → [farmaˈtʃiːa] ~ [farmaˈʃiːa]).
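The Italian ⟨ch, gh⟩ convention described above amounts to a small spelling rule. The following is a minimal sketch, assuming a stem whose final consonant stays hard /k/ or /ɡ/ throughout the paradigm (as in the examples cited above); the function name is hypothetical and the rule says nothing about words whose consonant softens.

```python
FRONT_VOWEL_LETTERS = ("e", "i")


def attach_ending(stem: str, ending: str) -> str:
    """Join an Italian stem in hard c/g to an ending, keeping /k, g/.

    Assumption: the stem-final consonant is hard in every form, so an
    <h> is inserted whenever the ending begins with <e> or <i>.
    """
    if stem.endswith(("c", "g")) and ending.startswith(FRONT_VOWEL_LETTERS):
        return stem + "h" + ending
    return stem + ending


for stem, ending in [("dimentic", "o"), ("dimentic", "i"),
                     ("pag", "o"), ("pag", "i"), ("lag", "hi"[1:])]:
    print(attach_ending(stem, ending))
```

The pronunciation of the stem never changes here; only the spelling alternates, which is the point made in the text about orthographic alternations in regular forms.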
Stop consonants shifted by lenition in Vulgar Latin.
The voiced labial consonants /b/ and /w/ (represented by ⟨b⟩ and
⟨v⟩, respectively) both developed a fricative [β] as an
intervocalic allophone. This is clear from the orthography; in
medieval times, the spelling of a consonantal ⟨v⟩ is often used
for what had been a ⟨b⟩ in Classical Latin, or the two spellings
were used interchangeably. In many
Romance languages (Italian, French,
Portuguese, Romanian, etc.), this fricative later developed into a
/v/; but in others (Spanish, Galician, some Catalan and Occitan
dialects, etc.) reflexes of /b/ and /w/ simply merged into a single
phoneme.
Several other consonants were "softened" in intervocalic position in
Western Romance (Spanish, Portuguese, French, Northern Italian), but
normally not phonemically in the rest of
Italy (except some cases of
"elegant" or Ecclesiastical words), nor apparently at all in Romanian.
The dividing line between the two sets of dialects is called the La
Spezia–Rimini Line and is one of the most important isoglosses of
the Romance dialects. The changes (instances of diachronic lenition)
are as follows:
Single voiceless plosives became voiced: -p-, -t-, -c- > -b-, -d-,
-g-. Subsequently, in some languages they were further weakened,
either becoming fricatives or approximants, [β̞], [ð̞], [ɣ˕] (as
in Spanish) or disappearing entirely (as /t/ and /k/, but not /p/, in
French). The following example shows progressive weakening of original
/t/: vītam > Italian vita [ˈvita], Portuguese vida [ˈvidɐ]
(European Portuguese [ˈviðɐ]), Spanish vida [ˈbiða] (Southern
Peninsular Spanish [ˈbia]), and French vie [vi]. Some have speculated
that these sound changes may be due in part to the influence of
Continental Celtic languages.
The voiced plosives /d/ and /ɡ/ tended to disappear.
The plain sibilant -s- [s] was also voiced to [z] between vowels,
although in many languages its spelling has not changed. (In Spanish,
intervocalic [z] was later devoiced back to [s]; [z] is only found as
an allophone of /s/ before voiced consonants in Modern Spanish.)
The double plosives became single: -pp-, -tt-, -cc-, -bb-, -dd-, -gg-
> -p-, -t-, -c-, -b-, -d-, -g- in most languages. In French
spelling, double consonants are merely etymological, except for -ll-
after -i (pronounced [ij]), in most cases.
The double sibilant -ss- [sː] also became phonetically single [s],
although in many languages its spelling has not changed.
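The Western Romance changes listed above (voicing of single intervocalic voiceless plosives and of -s-, simplification of double consonants) can be sketched as one rewriting pass. This is a rough illustration under several assumptions: words are given in a simplified macron-free Latin spelling with the five vowel letters a e i o u, "intervocalic" is taken literally (vowel on both sides), the tendency of /d/ and /ɡ/ to disappear is not modelled, and the function name is hypothetical.

```python
import re

# Single voiceless consonants and their voiced counterparts.
VOICED = {"p": "b", "t": "d", "c": "g", "s": "z"}

# Between vowels, match either a double consonant or a single voiceless one.
# Doing both in one pass keeps former geminates from being voiced afterwards.
INTERVOCALIC = re.compile(
    r"(?<=[aeiou])(?:([ptcbdgs])\1|([ptcs]))(?=[aeiou])"
)


def lenite(word: str) -> str:
    """Apply the simplified Western Romance lenition rules to a word."""
    def replace(match: re.Match) -> str:
        if match.group(1):               # double consonant: simplify
            return match.group(1)
        return VOICED[match.group(2)]    # single voiceless: voice
    return INTERVOCALIC.sub(replace, word)


for form in ("vita", "sicca", "rosa"):
    print(form, "->", lenite(form))
```

Handling both rules in a single substitution reflects the relative chronology implied by the outcomes: a simplified -tt- or -cc- stays voiceless rather than feeding the voicing rule.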
Consonant length is no longer phonemically distinctive in most Romance
languages. However some languages of
Italy (Italian, Sardinian,
Sicilian, and numerous other varieties of central and southern Italy)
do have long consonants like /ɡɡ/, /dd/, /bb/, /kk/, /tt/, /pp/,
/ll/, /mm/, /nn/, /ss/, /rr/, etc., where the doubling indicates
either actual length or, in the case of plosives and affricates, a
short hold before the consonant is released, in many cases with
distinctive lexical value: e.g. note /ˈnɔ.te/ (notes) vs. notte
/ˈnɔt.te/ (night), cade /ˈka.de/ (s/he, it falls) vs. cadde
/ˈkad.de/ (s/he, it fell), caro /ˈka.ro/ (dear, expensive) vs. carro
/ˈkar.ro/ (cart). They may even occur at the beginning of words in
Romanesco, Neapolitan, Sicilian and other southern varieties, and are
occasionally indicated in writing, e.g. Sicilian cchiù (more), and
ccà (here). In general, the consonants /b/, /ts/, and /dz/ are long
at the start of a word, while the archiphoneme R is realised as a
trill /r/ in the same position. In much of
central and southern Italy, the affricates /t͡ʃ/ and /d͡ʒ/ weaken
synchronically to fricative [ʃ] and [ʒ] between vowels, while their
geminate congeners do not, e.g. cacio /ˈka.t͡ʃo/ → [ˈkaːʃo]
(cheese) vs. caccio /ˈkat.t͡ʃo/ → [ˈkat.t͡ʃo] (I chase).
A few languages have regained secondary geminate consonants. The
double consonants of
Piedmontese exist only after stressed /ə/,
written ë, and are not etymological: vëdde (Latin vidēre, to see),
sëcca (Latin sicca, dry, feminine of sech). In standard Catalan and
Occitan, there exists a geminate sound /lː/ written ŀl (Catalan) or
ll (Occitan), but it is usually pronounced as a simple sound in
colloquial (and even some formal) speech in both languages.
In Western Romance, an epenthetic or prosthetic vowel was inserted at
the beginning of any word that began with /s/ and another consonant:
spatha "sword" > Spanish/Portuguese espada, Catalan espasa, Old
French espeḍe > modern épée; Stephanum "Stephen" > Spanish
Esteban, Catalan Esteve, Portuguese Estêvão,
Old French Estievne
> modern Étienne; status "state" > Spanish/Portuguese estado,
Old French estat > modern état; spiritus "spirit"
> Spanish espíritu, Portuguese espírito, Catalan esperit, French
esprit. Epenthetic /e/ in
Western Romance languages was also probably
influenced by Continental Celtic languages. While
Western Romance words undergo word-initial epenthesis (prothesis),
cognates in Italian do not: spatha > spada, Stephanum > Stefano,
status > stato, spiritus > spirito. In Italian, syllabification
rules were preserved instead by vowel-final articles, thus feminine
spada as la spada, but instead of rendering the masculine *il
spaghetto, lo spaghetto came to be the norm. Though receding at
present, Italian once had an epenthetic /i/ if a consonant preceded
such clusters, so that 'in Switzerland' was in [i]Svizzera. Some
speakers still use the prothetic [i] productively, and it is
fossilized in a few set phrases as per iscritto 'in writing' (although
in this case its survival may be due partly to the influence of the
separate word iscritto < Latin īnscrīptus).
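The Western Romance prothesis described above is a one-line rule. The sketch below is a minimal illustration, assuming lowercase input and the five vowel letters a e i o u; the function name is hypothetical, and the output is only the intermediate form with the added vowel, not the finished modern word.

```python
VOWEL_LETTERS = set("aeiou")


def add_prothetic_e(word: str) -> str:
    """Prefix e- to words beginning with s + consonant (Western Romance)."""
    if len(word) >= 2 and word[0] == "s" and word[1] not in VOWEL_LETTERS:
        return "e" + word
    return word


for latin in ("spatha", "status", "spiritus", "sal"):
    print(latin, "->", add_prothetic_e(latin))
```

Italian cognates correspond to leaving the word unchanged, with the cluster instead accommodated by a vowel-final article such as lo.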
Loss of vowel length, reorientation
Table: Evolution of the stressed vowels in early Romance (surviving
diphthong entries: /oj/ > /eː/; /aj/ > [ɛː]; /aw/ > /oː/; one entry
is qualified as "a few words")
One profound change that affected
Vulgar Latin was the reorganisation
of its vowel system. Classical
Latin had five short vowels, ă, ĕ,
ĭ, ŏ, ŭ, and five long vowels, ā, ē, ī, ō, ū, each of which
was an individual phoneme (see the table for their
likely pronunciation in IPA), and four diphthongs, ae, oe, au and eu
(five according to some authors, including ui). There were also long
and short versions of y, representing the rounded vowel /y(ː)/ in
Greek borrowings, which however probably came to be pronounced /i(ː)/
even before Romance vowel changes started.
There is evidence that in the imperial period all the short vowels
except a differed by quality as well as by length from their long
counterparts. So, for example ē was pronounced close-mid /eː/
while ĕ was pronounced open-mid /ɛ/, and ī was pronounced close
/iː/ while ĭ was pronounced near-close /ɪ/.
In the Proto-Romance period, phonemic length distinctions were
lost. Vowels came to be automatically pronounced long in stressed,
open syllables (i.e. when followed by only one consonant), and
pronounced short everywhere else. This situation is still maintained
in modern Italian: cade [ˈkaːde] "he falls" vs. cadde [ˈkadde] "he
fell".
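The allophonic length rule just described can be stated as a small predicate. This is a sketch under simplifying assumptions: the word is a plain letter string, the position of the stressed vowel is supplied by the caller, and "open syllable" is approximated exactly as in the text (one consonant followed by a vowel), ignoring hiatus and consonant clusters that do not close the syllable. The function name is hypothetical.

```python
VOWEL_LETTERS = set("aeiou")


def stressed_vowel_is_long(word: str, stress_index: int) -> bool:
    """True if the stressed vowel is followed by exactly one consonant
    and then another vowel (a stressed open syllable)."""
    tail = word[stress_index + 1:]
    consonants = 0
    for letter in tail:
        if letter in VOWEL_LETTERS:
            # Reached the next vowel: long only after a single consonant.
            return consonants == 1
        consonants += 1
    # No following vowel: not treated as long in this sketch.
    return False


print(stressed_vowel_is_long("cade", 1))
print(stressed_vowel_is_long("cadde", 1))
```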
The Proto-Romance loss of phonemic length originally produced a system
with nine different quality distinctions in monophthongs, where only
original /ă ā/ had merged. Soon, however, many of these vowels
merged:
The simplest outcome was in Sardinian, where the former long and
short vowels in
Latin simply coalesced, e.g. /ĕ ē/ > /e/, /ĭ ī/
> /i/. This produced a simple five-vowel system /a e i o u/.
In most areas, however (technically, the Italo-Western languages), the
near-close vowels /ɪ ʊ/ lowered and merged into the high-mid vowels
/e o/. As a result, Latin pira "pear" and vēra "true" came to rhyme
(e.g. Italian and Spanish pera, vera, and Old French poire, voire).
Similarly, Latin nucem (from nux "nut") and vōcem (from vōx "voice")
became Italian noce, voce, Portuguese noz, voz, and French noix, voix.
This produced a seven-vowel system /a ɛ e i ɔ o u/, still maintained
in conservative languages such as Italian and Portuguese, and lightly
transformed in Spanish (where /ɛ/ > /je/, /ɔ/ > /we/).
In the Eastern Romance languages (particularly Romanian), the front
vowels /ĕ ē ĭ ī/ evolved as in the majority of languages, but the
back vowels /ŏ ō ŭ ū/ evolved as in Sardinian. This produced an
unbalanced six-vowel system: /a ɛ e i o u/. In modern Romanian, this
system has been significantly transformed, with /ɛ/ > /je/ and
with new vowels /ə ɨ/ evolving, leading to a balanced seven-vowel
system with central as well as front and back vowels: /a e i ə ɨ o
u/.
Sicilian is sometimes described as having its own distinct vowel
system. In fact, Sicilian passed through the same developments as the
main bulk of Italo-Western languages. Subsequently, however, high-mid
vowels (but not low-mid vowels) were raised in all syllables, stressed
and unstressed; i.e. /e o/ > /i u/. The result is a five-vowel system
/a ɛ i ɔ u/.
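The four stressed-vowel outcomes described above can be compared side by side as mappings from the ten Classical Latin vowels. The sketch below simply encodes the mergers stated in the text; the variable names are illustrative, and later language-specific changes (diphthongization, the new Romanian central vowels) are left out.

```python
LATIN_VOWELS = ["ī", "ĭ", "ē", "ĕ", "ā", "ă", "ŏ", "ō", "ŭ", "ū"]
FRONT_VOWELS = ("ī", "ĭ", "ē", "ĕ")

# Sardinian: each long/short pair simply coalesces.
SARDINIAN = {"ī": "i", "ĭ": "i", "ē": "e", "ĕ": "e", "ā": "a",
             "ă": "a", "ŏ": "o", "ō": "o", "ŭ": "u", "ū": "u"}

# Italo-Western: near-close ĭ, ŭ lower and merge with ē, ō.
ITALO_WESTERN = {"ī": "i", "ĭ": "e", "ē": "e", "ĕ": "ɛ", "ā": "a",
                 "ă": "a", "ŏ": "ɔ", "ō": "o", "ŭ": "o", "ū": "u"}

# Eastern: front vowels as in Italo-Western, the rest as in Sardinian.
EASTERN = {v: (ITALO_WESTERN[v] if v in FRONT_VOWELS else SARDINIAN[v])
           for v in LATIN_VOWELS}

# Sicilian: Italo-Western, then close-mid /e o/ raised to /i u/.
SICILIAN = {v: {"e": "i", "o": "u"}.get(out, out)
            for v, out in ITALO_WESTERN.items()}

for name, system in [("Sardinian", SARDINIAN),
                     ("Italo-Western", ITALO_WESTERN),
                     ("Eastern", EASTERN), ("Sicilian", SICILIAN)]:
    print(name, len(set(system.values())), sorted(set(system.values())))
```

Counting the distinct outcomes reproduces the inventory sizes given in the text: five vowels for Sardinian, seven for Italo-Western, six for Eastern Romance and five for Sicilian.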
The Proto-Romance allophonic vowel-length system was rephonemicized in
the Gallo-Romance languages as a result of the loss of many final
vowels. Some northern Italian languages (e.g. Friulan) still maintain
this secondary phonemic length, but most languages dropped it by
either diphthongizing or shortening the new long vowels.
French phonemicized a third vowel length system around AD 1300 as a
result of the sound change /VsC/ > /VhC/ > /VːC/ (where V is
any vowel and C any consonant). This vowel length was eventually lost
by around AD 1700, but the former long vowels are still marked with a
circumflex. A fourth vowel length system, still non-phonemic, has now
arisen: All nasal vowels as well as the oral vowels /ɑ o ø/ (which
mostly derive from former long vowels) are pronounced long in all
stressed closed syllables, and all vowels are pronounced long in
syllables closed by the voiced fricatives /v z ʒ ʁ vʁ/. This system
in turn has been phonemicized in some non-standard dialects (e.g.
Haitian Creole), as a result of the loss of final /ʁ/.
The Latin diphthongs ae and oe, pronounced /ai/ and /oi/ in earlier
Latin, were monophthongized early on.
ae became /ɛː/ by the 1st century AD at the latest. Although this
sound was still distinct from all existing vowels, the neutralization
of Latin vowel length eventually caused its merger with /ɛ/ <
short e: e.g. caelum "sky" > French ciel, Spanish/Italian cielo,
Portuguese céu /sɛw/, with the same vowel as in mele "honey" >
French/Spanish miel, Italian miele, Portuguese mel /mɛl/. Some words
show an early merger of ae with /eː/, as in praeda "booty" >
*prēda /preːda/ > French proie (vs. expected **priée), Italian
preda (not **prieda) "prey"; or faenum "hay" > *fēnum [feːnũ]
> Spanish heno, French foin (but Italian fieno /fjɛno/).
oe generally merged with /eː/: poenam "punishment" > Romance
*/pena/ > Spanish/Italian pena, French peine; foedus "ugly" >
Romance */fedo/ > Spanish feo, Portuguese feio. There are
relatively few such outcomes, since oe was rare in Classical Latin
(most original instances had become Classical ū, as in Old Latin
oinos "one" > Classical ūnus) and so oe was mostly limited to
Greek loanwords, which were typically learned (high-register) terms.
au merged with ō /oː/ in the popular speech of Rome already by the
1st century BC. A number of authors remarked on this explicitly, e.g.
Cicero's taunt that the populist politician Publius Clodius Pulcher
had changed his name from Claudius to ingratiate himself with the
masses. This change never penetrated far from Rome, however, and the
pronunciation /au/ was maintained for centuries in the vast majority
of Latin-speaking areas, although it eventually developed into some
variety of o in many languages. For example, Italian and French have
/ɔ/ as the usual reflex, but this post-dates diphthongization of /ɔ/
and the French-specific palatalization /ka/ > /tʃa/ (hence causa
> French chose, Italian cosa /kɔza/ not **cuosa). Spanish has /o/,
but Portuguese spelling maintains ⟨ou⟩, which has developed to /o/
(and still remains as /ou/ in some dialects, and /oi/ in others).
Occitan, Romanian, southern Italian languages, and many other minority
Romance languages still have /au/. A few common words, however, show
an early merger with ō /oː/, evidently reflecting a generalization
of the popular Roman pronunciation: e.g. French queue, Italian coda,
Occitan co(d)a, Romanian coadă (all meaning "tail") must all
derive from cōda rather than Classical cauda (but notice Portuguese
cauda). Similarly, Portuguese orelha, French oreille, Romanian
ureche, and Sardinian olícra, orícla "ear" must derive from
ōric(u)la rather than Classical auris (
Occitan aurelha was probably
influenced by the unrelated ausir < audīre "to hear"), and the
form oricla is in fact reflected in the Appendix Probi.
Metaphony (Romance languages)
An early process that operated in all
Romance languages to varying
degrees was metaphony (vowel mutation), conceptually similar to the
umlaut process so characteristic of the Germanic languages. Depending
on the language, certain stressed vowels were raised (or sometimes
diphthongized) either by a final /i/ or /u/ or by a directly following
palatal consonant.
Metaphony is most extensive in the Italo-Romance languages, and
applies to nearly all languages in Italy; however, it is absent from
Tuscan, and hence from standard Italian. In many languages affected by
metaphony, a distinction exists between final /u/ (from most cases of
Latin -um) and final /o/ (from
Latin -ō, -ud and some cases of -um,
esp. masculine "mass" nouns), and only the former triggers metaphony.
In Servigliano in the Marche of Italy, stressed /ɛ e ɔ o/ are raised
to /e i o u/ before final /i/ or /u/: /ˈmetto/ "I put" vs.
/ˈmitti/ "you put" (< *metti < *mettes < Latin mittis);
/moˈdɛsta/ "modest (fem.)" vs. /moˈdestu/ "modest (masc.)";
/ˈkwesto/ "this (neut.)" (<
Latin eccum istud) vs. /ˈkwistu/
"this (masc.)" (<
Latin eccum istum).
Calvallo in Basilicata, southern Italy, is similar, but the low-mid
vowels /ɛ ɔ/ are diphthongized to /je wo/ rather than raised:
/ˈmette/ "he puts" vs. /ˈmitti/ "you put", but /ˈpɛnʒo/ "I think"
vs. /ˈpjenʒi/ "you think".
Metaphony also occurs in most northern Italian dialects, but only by
(usually lost) final *i; apparently, final *u was lowered to *o
(usually lost) before metaphony could take effect.
Some of the
Astur-Leonese languages in northern
Spain have the same
distinction between final /o/ and /u/ as in the Central-Southern
Italian languages, with /u/ triggering metaphony. The plural
of masculine nouns in these dialects ends in -os, which does not
trigger metaphony, unlike in the singular (vs. Italian plural -i,
which does trigger metaphony).
Sardinian has allophonic raising of mid vowels /ɛ ɔ/ to [e o] before
final /i/ or /u/. This has been phonemicized in the Campidanese
dialect as a result of the raising of final /e o/ to /i u/.
Raising of /ɔ/ to /o/ occurs sporadically in Portuguese in the
masculine singular, e.g. porco /ˈporku/ "pig" vs. porcos /ˈpɔrkus/
"pigs". It is thought that
Galician-Portuguese at one point had
singular /u/ vs. plural /os/, exactly as in modern Astur-Leonese.
In all of the
Western Romance languages, final /i/ (primarily
occurring in the first-person singular of the preterite) raised
mid-high /e o/ to /i u/, e.g. Portuguese fiz "I did" (< *fidzi <
Latin fēcī) vs. fez "he did" (< *fedze < Latin fēcit).
Old Spanish similarly had fize "I did" vs. fezo "he did" (-o
by analogy with amó "he loved"), but subsequently generalized
stressed /i/, producing modern hice "I did" vs. hizo "he did". The
same thing happened prehistorically in Old French, yielding fis "I
did", fist "he did" (< *feist < Latin fēcit).
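The Servigliano raising pattern described earlier in this section (stressed /ɛ e ɔ o/ > /e i o u/ before final /i/ or /u/) can be sketched as a function over broad IPA strings. This is an illustration only: the function name is hypothetical, the input is assumed to carry the stress mark ˈ before the stressed syllable, and the unraised input forms used for the masculine examples are assumed pre-metaphony shapes rather than attested forms.

```python
RAISED = {"ɛ": "e", "e": "i", "ɔ": "o", "o": "u"}
VOWEL_SYMBOLS = "aɛeiɔou"


def servigliano_metaphony(ipa: str) -> str:
    """Raise the stressed mid vowel when the word ends in /i/ or /u/."""
    if ipa[-1] not in "iu":
        return ipa                      # final /o a e/: no metaphony
    stress = ipa.index("ˈ")             # raises ValueError if unmarked
    for pos in range(stress + 1, len(ipa)):
        if ipa[pos] in VOWEL_SYMBOLS:   # first vowel of stressed syllable
            vowel = RAISED.get(ipa[pos], ipa[pos])
            return ipa[:pos] + vowel + ipa[pos + 1:]
    return ipa


for form in ("ˈmetto", "ˈmetti", "moˈdɛsta", "moˈdɛstu", "ˈkwesto", "ˈkwestu"):
    print(form, "->", servigliano_metaphony(form))
```

Only final /u/ (not final /o/) triggers the change, which is why the neuter and masculine demonstratives come out differently.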
A number of languages diphthongized some of the free vowels,
especially the open-mid vowels /ɛ ɔ/:
Spanish consistently diphthongized all open-mid vowels /ɛ ɔ/ >
/je we/ except for before certain palatal consonants (which raised the
vowels to close-mid before diphthongization took place).
Romanian similarly diphthongized /ɛ/ to /je/ (the corresponding vowel
/ɔ/ did not develop from Proto-Romance).
Italian diphthongized /ɛ/ > /jɛ/ and /ɔ/ > /wɔ/ in open
syllables (in the situations where vowels were lengthened in
Proto-Romance), the most salient exception being /ˈbɛne/ bene
'well', perhaps due to the high frequency of apocopated ben (e.g. ben
difficile 'quite difficult', ben fatto 'well made', ben due 'a good
French similarly diphthongized /ɛ ɔ/ in open syllables (when
lengthened), along with /a e o/: /aː ɛː eː ɔː oː/ > /aɛ iɛ
ei uɔ ou/ > middle OF /e je ɔi we eu/ > modern /e je wa œ ~
ø œ ~ ø/.
French also diphthongized /ɛ ɔ/ before palatalized consonants,
especially /j/. Further development was as follows: /ɛj/ > /iej/
> /i/; /ɔj/ > /uoj/ > early OF /uj/ > modern /ɥi/.
Catalan diphthongized /ɛ ɔ/ before /j/ from palatalized consonants,
just like French, with similar results: /ɛj/ > /i/, /ɔj/ >
These diphthongizations had the effect of reducing or eliminating the
distinctions between open-mid and close-mid vowels in many languages.
In Spanish and Romanian, all open-mid vowels were diphthongized, and
the distinction disappeared entirely. Portuguese is the most
conservative in this respect, keeping the seven-vowel system more or
less unchanged (but with changes in particular circumstances, e.g. due
to metaphony). Other than before palatalized consonants, Catalan keeps
/ɔ o/ intact, but /ɛ e/ split in a complex fashion into /ɛ e ə/
and then coalesced again in the standard dialect (Eastern Catalan) in
such a way that most original /ɛ e/ have reversed their quality to
become /e ɛ/.
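As a rough illustration of the Spanish outcome described above, the following sketch applies /ɛ ɔ/ > /je we/ to the stressed vowel of a schematic transcription. The function name, the use of ˈ to mark the stressed syllable, and the sample forms are assumptions made for this example; the exception before palatal consonants is ignored.

```python
# Seven-vowel stressed system and the Spanish diphthongization of open-mid vowels.
SEVEN_VOWELS = ["i", "e", "ɛ", "a", "ɔ", "o", "u"]
DIPHTHONGS = {"ɛ": "je", "ɔ": "we"}
VOWELS = set(SEVEN_VOWELS)


def diphthongize_stressed(word: str) -> str:
    """Apply /ɛ ɔ/ > /je we/ to the first vowel after the stress mark ˈ.

    Hypothetical helper: vowels outside the stressed syllable are left alone.
    """
    stress = word.find("ˈ")
    if stress == -1:
        return word  # no stress mark, nothing to do
    for i in range(stress + 1, len(word)):
        if word[i] in VOWELS:
            return word[:i] + DIPHTHONGS.get(word[i], word[i]) + word[i + 1:]
    return word


# Stressed monophthongs that survive once every open-mid vowel has diphthongized.
remaining_monophthongs = [v for v in SEVEN_VOWELS if v not in DIPHTHONGS]

print(diphthongize_stressed("ˈpɛtra"))
print(diphthongize_stressed("ˈpɔrta"))
print(remaining_monophthongs)
```

Because both open-mid vowels leave the set of stressed monophthongs, only five remain, which is the sense in which the open-mid/close-mid distinction "disappeared entirely".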
In French and Italian, the distinction between open-mid and close-mid
vowels occurred only in closed syllables. Standard Italian more or
less maintains this. In French, /e/ and /ɛ/ merged by the twelfth
century or so, and the distinction between /ɔ/ and /o/ was eliminated
without merging by the sound changes /u/ > /y/, /o/ > /u/.
Generally this led to a situation where both [e,o] and [ɛ,ɔ] occur
allophonically, with the close-mid vowels in open syllables and the
open-mid vowels in closed syllables. This is still the situation in
modern Spanish, for example. In French, however, both
[e/ɛ] and [o/ɔ] were partly rephonemicized: Both /e/ and /ɛ/ occur
in open syllables as a result of /aj/ > /ɛ/, and both /o/ and /ɔ/
occur in closed syllables as a result of /al/ > /au/ > /o/.
Old French also had numerous falling diphthongs resulting from
diphthongization before palatal consonants or from a fronted /j/
originally following palatal consonants in
Proto-Romance or later:
e.g. pācem /patsʲe/ "peace" > PWR */padzʲe/ (lenition) > OF
paiz /pajts/; *punctum "point" >
Gallo-Romance */ponʲto/ >
*/pojɲto/ (fronting) > OF point /põjnt/. During the Old French
period, preconsonantal /l/ [ɫ] vocalized to /w/, producing many new
falling diphthongs: e.g. dulcem "sweet" > PWR */doltsʲe/ > OF
dolz /duɫts/ > douz /duts/; fallet "fails, is deficient" > OF
falt > faut "is needed"; bellus "beautiful" > OF bels [bɛɫs]
> beaus [bɛaws]. By the end of the
Middle French period, all
falling diphthongs either monophthongized or switched to rising
diphthongs: proto-OF /aj ɛj jɛj ej jej wɔj oj uj al ɛl el il ɔl
ol ul/ > early OF /aj ɛj i ej yj oj yj aw ɛaw ew i ɔw ow y/ >
modern spelling ⟨ai ei i oi ui oi ui au eau eu i ou ou u⟩ >
mod. French /ɛ ɛ i wa ɥi wa ɥi o o ø i u u y/.
In both French and Portuguese, nasal vowels eventually developed from
sequences of a vowel followed by a nasal consonant (/m/ or /n/).
Originally, all vowels in both languages were nasalized before any
nasal consonants, and nasal consonants not immediately followed by a
vowel were eventually dropped. In French, nasal vowels before
remaining nasal consonants were subsequently denasalized, but not
before causing the vowels to lower somewhat, e.g. dōnat "he gives"
> OF dune /dunə/ > donne /dɔn/, fēminam > femme /fam/.
Other vowels remained nasalized, and were dramatically lowered:
fīnem "end" > fin /fɛ̃/ (often pronounced [fæ̃]); linguam
"tongue" > langue /lɑ̃ɡ/; ūnum "one" > un /œ̃/, /ɛ̃/.
In Portuguese, /n/ between vowels was dropped, and the resulting
hiatus eliminated through vowel contraction of various sorts, often
producing diphthongs: manum, *manōs > PWR *manu, ˈmanos "hand(s)"
> mão, mãos /mɐ̃w̃, mɐ̃w̃s/; canem, canēs "dog(s)" >
PWR *kane, ˈkanes > *can, ˈcanes > cão, cães /kɐ̃w̃,
kɐ̃j̃s/; ratiōnem, ratiōnēs "reason(s)" > PWR *raˈdʲzʲone,
raˈdʲzʲones > *raˈdzon, raˈdzones > razão, razões
/χaˈzɐ̃w̃, χaˈzõj̃s/ (Brazil), /ʁaˈzɐ̃ũ, ʁɐˈzõj̃ʃ/
(Portugal). Sometimes the nasalization was eliminated: lūna "moon" >
Galician-Portuguese lũa > lua; vēna "vein" >
Galician-Portuguese vẽa > veia. Nasal vowels that remained
actually tend to be raised (rather than lowered, as in French): fīnem
"end" > fim /fĩ/; centum "hundred" > PWR tʲsʲɛnto > cento
/ˈsẽtu/; pontem "bridge" > PWR pɔnte > ponte /ˈpõtʃi/
(Brazil), /ˈpõtɨ/ (Portugal). In Portugal, vowels before a nasal
consonant have become denasalized, but in Brazil they remain heavily
nasalized.
Characteristic of the
Gallo-Romance languages and Rhaeto-Romance
languages are the front rounded vowels /y ø œ/. All of these
languages show an unconditional change /u/ > /y/, e.g. lūnam >
French lune /lyn/,
Occitan /ˈlyno/. Many of the languages in
Italy show the further change /y/ > /i/. Also very
common is some variation of the French development /ɔː oː/
(lengthened in open syllables) > /we ew/ > /œ œ/, with mid
back vowels diphthongizing in some circumstances and then
re-monophthongizing into mid-front rounded vowels. (French has both
/ø/ and /œ/, with /ø/ developing from /œ/ in certain
circumstances.)
[Table: Evolution of unstressed vowels in early Italo-Western Romance.
Surviving cells: "∅; /e/ (prop)" and "∅; /ə/ (prop)". One column uses
the traditional academic transcription of Romance studies.]
There was more variability in the result of the unstressed vowels.
Originally in Proto-Romance, the same nine vowels developed in
unstressed as stressed syllables, and in Sardinian, they coalesced
into the same five vowels in the same way.
In Italo-Western Romance, however, vowels in unstressed syllables were
significantly different from stressed vowels, with yet a third outcome
for final unstressed syllables. In non-final unstressed syllables, the
seven-vowel system of stressed syllables developed, but then the
low-mid vowels /ɛ ɔ/ merged into the high-mid vowels /e o/. This
system is still preserved, largely or completely, in all of the
Romance languages (e.g. Italian, Spanish, Portuguese,
In final unstressed syllables, results were somewhat complex. One of
the more difficult issues is the development of final short -u, which
appears to have been raised to /u/ rather than lowered to /o/, as
happened in all other syllables. However, it is possible that in
reality, final /u/ comes from long *-ū < -um, where original final
-m caused vowel lengthening as well as nasalization. Evidence of this
comes from Rhaeto-Romance, in particular Sursilvan, which preserves
reflexes of both final -us and -um, and where the latter, but not the
former, triggers metaphony. This suggests the development -us >
/ʊs/ > /os/, but -um > /ũː/ > /u/.
[Table: Examples of evolution of final unstressed vowels, ordered from
least- to most-changed languages. Surviving final-vowel inventories:
a, e, i, o, u; a, e, i, o, u; a, e, i, o; a, e/-, o. Some columns use
the IPA symbols /ɔ, ɛ/ to indicate open-mid vowels.]
The original five-vowel system in final unstressed syllables was
preserved as-is in some of the more conservative central Italian
languages, but in most languages there was further coalescence:
In Tuscan (including standard Italian), final /u/ merged into /o/.
In the Western Romance languages, final /i/ eventually merged into /e/
(although final /i/ triggered metaphony before that, e.g. Spanish
hice, Portuguese fiz "I did" < *fize < Latin fēcī).
Conservative languages like Spanish largely maintain that system, but
drop final /e/ after certain single consonants, e.g. /r/, /l/, /n/,
/d/, /z/ (< palatalized c).
In the Gallo-Romance languages (part of Western Romance), final /o/
and /e/ were dropped entirely unless that produced an impossible final
cluster (e.g. /tr/), in which case a "prop vowel" /e/ was added. This
left only two final vowels: /a/ and prop vowel /e/. Catalan preserves
Loss of final stressless vowels in Venetian shows a pattern
intermediate between Central Italian and the Gallo-Italic branch, and
the environments for vowel deletion vary considerably depending on the
dialect. In the table above, final /e/ is uniformly absent in mar,
absent in some dialects in part(e) /part(e)/ and set(e) /sɛt(e)/, but
retained in mare (<
Latin mātrem) as a relic of the earlier
In Old French (one of the
Gallo-Romance languages), these
two remaining vowels merged into /ə/.
Various later changes happened in individual languages, e.g.:
In French, most final consonants were dropped, and then final /ə/ was
also dropped. The /ə/ is still preserved in spelling as a final
silent -e, whose main purpose is to signal that the previous consonant
is pronounced, e.g. port "port" /pɔʁ/ vs. porte "door" /pɔʁt/.
These changes also eliminated the difference between singular and
plural in most words: ports "ports" (still /pɔʁ/), portes "doors"
(still /pɔʁt/). Final consonants reappear in liaison contexts (in
close connection with a following vowel-initial word), e.g. nous [nu]
"we" vs. nous avons [nu.za.ˈvɔ̃] "we have", il fait [il.fɛ] "he
does" vs. fait-il ? [fɛ.til] "does he?".
In Portuguese, final unstressed /o/ and /u/ were apparently preserved
intact for a while, since final unstressed /u/, but not /o/ or /os/,
triggered metaphony (see above). Final-syllable unstressed /o/ was
raised in preliterary times to /u/, but always still written ⟨o⟩.
At some point (perhaps in late Galician-Portuguese), final-syllable
unstressed /e/ was raised to /i/ (but still written ⟨e⟩); this
remains in Brazilian Portuguese, but has developed to /ɨ/ in European
Portuguese.
In Catalan, final unstressed /as/ > /es/. In many dialects,
unstressed /o/ and /u/ merge into /u/ as in Portuguese, and unstressed
/a/ and /e/ merge into /ə/. However, some dialects preserve the
original five-vowel system, most notably standard Valencian.
The so-called intertonic vowels are word-internal unstressed vowels,
i.e. not in the initial, final, or tonic (i.e. stressed) syllable,
hence intertonic. Intertonic vowels were the most subject to loss or
modification. Already in
Vulgar Latin intertonic vowels between a
single consonant and a following /r/ or /l/ tended to drop: vétulum
"old" > veclum > Dalmatian vieklo, Sicilian vecchiu, Portuguese
velho. But many languages ultimately dropped almost all intertonic
vowels.
Generally, those languages south and east of the La Spezia–Rimini
Line (Romanian and Central-Southern Italian) maintained intertonic
vowels, while those to the north and west (Western Romance) dropped
all except /a/. Standard Italian generally maintained intertonic
vowels, but typically raised unstressed /e/ > /i/. Examples:
septimā́nam "week" > Italian settimana, Romanian săptămână
vs. Spanish/Portuguese semana, French semaine, Occitan/Catalan
quattuórdecim "fourteen" > Italian quattordici, Venetian
Piedmontese quatòrdes, vs. Spanish catorce,
metipsissimus > medipsimus /medíssimos/ ~ /medéssimos/
"self" > Italian medésimo vs. Venetian medemo, Lombard medemm,
Old Spanish meísmo, meesmo (> modern mismo), Galician-Portuguese
meesmo (> modern mesmo),
Old French meḍisme (> later meïsme
> MF mesme > modern même)
bonitā́tem "goodness" > Italian bonità ~ bontà, Romanian
bunătate but Spanish bondad, Portuguese bondade, French bonté
collocā́re "to position, arrange" > Italian coricare vs. Spanish
colgar "to hang", Romanian culca "to lie down", French coucher "to lay
sth on its side; put s.o. to bed"
commūnicā́re "to take communion" > Romanian cumineca vs.
Portuguese comungar, Spanish comulgar,
Old French comungier
carricā́re "to load (onto a wagon, cart)" > Portuguese/Catalan
carregar vs. Spanish/
Occitan cargar "to load", French charger, Lombard
cargà/caregà, Venetian carigar/cargar(e) "to load"
fábricam "forge" > /*fawrɡa/ > Spanish fragua, Portuguese
frágua, Occitan/Catalan farga, French forge
disjējūnā́re "to break a fast" > *disjūnā́re > Old French
disner "to have lunch" > French dîner "to dine" (but *disjū́nat
Old French desjune "he has lunch" > French (il) déjeune "he
adjūtā́re "to help" > Italian aiutare, Romanian ajuta but French
aider, Lombard aidà/aiuttà (Spanish ayudar, Portuguese ajudar based
on stressed forms, e.g. ayuda/ajuda "he helps"; cf.
Old French aidier
"to help" vs. aiue "he helps")
Portuguese is more conservative in maintaining some intertonic vowels
other than /a/: e.g. *offerḗscere "to offer" > Portuguese
oferecer vs. Spanish ofrecer, French offrir (< *offerīre). French,
on the other hand, drops even intertonic /a/ after the stress:
Stéphanum "Stephen" > Spanish Esteban but
Old French Estievne >
French Étienne. Many cases of /a/ before the stress also ultimately
dropped in French: sacraméntum "sacrament" >
Old French sairement
> French serment "oath".
The Romance languages for the most part have kept the writing system
of Latin, adapting it to their evolution. One exception was Romanian
before the nineteenth century, where, after the Roman retreat,
literacy was reintroduced through the Romanian Cyrillic alphabet, a
Slavic influence. A Cyrillic alphabet was also used for Romanian
(Moldovan) in the USSR. The non-Christian populations of Spain also
used the scripts of their religions (Arabic and Hebrew) to write
Romance languages such as Ladino and Mozarabic in aljamiado.
[Table: Spelling of results of palatalization and related sounds. Only
the row labels survive:
/k/, not + ⟨e, i, y⟩;
palatalized /k/ (/tʃ/~/s/~/θ/), + ⟨e, i, y⟩;
palatalized /k/ (/tʃ/~/s/~/θ/), not + ⟨e, i, y⟩;
/kw/, not + ⟨e, i, y⟩;
/k/ + ⟨e, i⟩ (inherited);
/kw/ + ⟨e, i⟩ (learned);
/g/, not + ⟨e, i, y⟩;
palatalized /k, g/ (/dʒ/~/ʒ/~/x/), + ⟨e, i, y⟩;
palatalized /k, g/ (/dʒ/~/ʒ/~/x/), not + ⟨e, i, y⟩;
/gw/, not + ⟨e, i⟩;
/g/ + ⟨e, i⟩ (inherited);
/gw/ + ⟨e, i⟩ (learned).]
The Romance languages are written with the classical
Latin alphabet of
23 letters – A, B, C, D, E, F, G, H, I, K, L, M, N, O, P, Q, R,
S, T, V, X, Y, Z – subsequently modified and augmented in
various ways. In particular, the single
Latin letter V split into V
(consonant) and U (vowel), and the letter I split into I and J. The
Latin letter K and the new letter W, which came to be widely used in
Germanic languages, are seldom used in most Romance languages –
mostly for unassimilated foreign names and words. Indeed, in Italian
prose kilometro is properly chilometro. Catalan eschews importation of
"foreign" letters more than most languages. Thus Wikipedia is
Viquipèdia in Catalan but Wikipedia in Spanish.
While most of the 23 basic
Latin letters have maintained their
phonetic value, for some of them it has diverged considerably; and the
new letters added since the
Middle Ages have been put to different
uses in different scripts. Some letters, notably H and Q, have been
variously combined in digraphs or trigraphs (see below) to represent
phonetic phenomena that could not be recorded with the basic Latin
alphabet, or to get around previously established spelling
conventions. Most languages added auxiliary marks (diacritics) to some
letters, for these and other purposes.
The spelling rules of most
Romance languages are fairly simple, and
consistent within any language. Since the spelling systems are based
on phonemic structures rather than phonetics, however, the actual
pronunciation of what is represented in standard orthography can be
subject to considerable regional variation, as well as to allophonic
differentiation by position in the word or utterance. Among the
letters representing the most conspicuous phonological variations,
between Romance languages or with respect to Latin, are the following:
B, V: Merged in Spanish and most dialects of Catalan, where both
letters represent a single phoneme pronounced as either [b] or [β]
depending on position, with no differentiation between B and V.
C: Generally a "hard" [k], but "soft" (fricative or affricate) before
e, i, or y.
G: Generally a "hard" [ɡ], but "soft" (fricative or affricate) before
e, i, or y. In some languages, like Spanish, the hard g, phonemically
/g/, is pronounced as a fricative [ɣ] after vowels. In Romansh, the
soft g is a voiced palatal plosive [ɟ] or a voiced alveolo-palatal
affricate [dʑ].
H: Silent in most languages; used to form various digraphs. But
represents [h] in Romanian, Walloon and Gascon Occitan.
J: Represents the fricative [ʒ] in most languages, or the palatal
approximant [j] in Romansh and in several of the languages of Italy.
Italian does not use this letter in native words.
Q: As in Latin, its phonetic value is that of a hard c, i.e. [k], and
in native words it is always followed by a (sometimes silent) u.
Romanian does not use this letter in native words.
S: Generally voiceless [s], but voiced [z] between vowels in some
languages. In Spanish, Romanian, Galician and several varieties of
Italian, however, it is always pronounced voiceless between vowels. If
the phoneme /s/ is represented by the letter S, predictable
assimilations are normally not shown (e.g. Italian /slitta/ 'sled',
spelled slitta but pronounced [zlitːa], never with [s]). Also at the
end of syllables it may represent special allophonic pronunciations.
In Romansh, it also stands for a voiceless or voiced fricative, [ʃ]
or [ʒ], before certain consonants.
W: No Romance language uses this letter in native words, with the
exception of Walloon.
X: Its pronunciation is rather variable, both between and within
languages. In the Middle Ages, the languages of Iberia used this
letter to denote the voiceless postalveolar fricative [ʃ], which is
still the case in Modern Catalan and Portuguese. With the Renaissance
the classical pronunciation [ks] – or similar consonant
clusters, such as [ɡz], [ɡs], or [kθ] – was frequently
reintroduced in latinisms and hellenisms. In Venetian it represents
[z], and in Ligurian the voiced postalveolar fricative [ʒ]. Italian
does not use this letter in native words.
Y: This letter is not used in most languages, with the prominent
exceptions of French and Spanish, where it represents [j] before
vowels (or various similar fricatives such as the palatal fricative
[ʝ], in Spanish), and the vowel [i] or semivowel [j] elsewhere.
Z: In most languages it represents the sound [z]. However, in Italian
it denotes the affricates [dz] and [ts], which are two separate
phonemes but rarely contrast; among the few examples of minimal pairs
are razza "ray" with [ddz] and razza "race" with [tts] (note that both
are phonetically long between vowels). In Romansh it denotes the
voiceless affricate [ts], and in Galician and Spanish it denotes either
the voiceless dental fricative [θ] or [s].
Otherwise, letters that are not combined as digraphs generally
represent the same phonemes as suggested by the International Phonetic
Alphabet (IPA), whose design was, in fact, greatly influenced by
Romance spelling systems.
Digraphs and trigraphs
Because the Romance languages have more sounds than can be accommodated
in the Latin alphabet, they all resort to the use of digraphs and
trigraphs – combinations of two or three letters with a single
phonemic value. The concept (but not the actual combinations) is
derived from Classical Latin, which used, for example, TH, PH, and CH
when transliterating the Greek letters "θ", "ϕ" (later "φ"), and
"χ". These were once aspirated sounds in Greek before changing to
corresponding fricatives, and the H represented what sounded to the
Romans like an /ʰ/ following /t/, /p/, and /k/ respectively. Some of
the digraphs used in modern scripts are:
CI: used in Italian,
Romance languages in Italy, Corsican and Romanian
to represent /tʃ/ before A, O, or U.
CH: used in Italian,
Romance languages in Italy, Corsican, Romanian,
Romansh and Sardinian to represent /k/ before E or I (including yod
/j/); /tʃ/ in Occitan, Spanish, Astur-leonese and Galician; [c] or
[tɕ] in Romansh before A, O or U; and /ʃ/ in most other languages.
In Catalan it is used in some old spelling conventions for /k/.
DD: used in Sicilian and Sardinian to represent the voiced retroflex
plosive /ɖ/. In recent history more accurately transcribed as DDH.
DJ: used in Walloon and Catalan for /dʒ/.
GI: used in Italian,
Romance languages in Italy, Corsican and Romanian
to represent /dʒ/ before A, O, or U, and in Romansh to represent
[ɟi] or /dʑi/ or (before A, E, O, and U) [ɟ] or /dʑ/
GH: used in Italian,
Romance languages in Italy, Corsican, Romanian,
Romansh and Sardinian to represent /ɡ/ before E or I (including yod
/j/), and in Galician for the voiceless pharyngeal fricative /ħ/ (not
GL: used in Romansh before consonants and I and at the end of words
GLI: used in Italian and Corsican for /ʎʎ/ and Romansh for /ʎ/.
GN: used in French, some
Romance languages in Italy, Corsican and
Romansh for /ɲ/, as in champignon; in Italian to represent /ɲɲ/, as
in "ogni" or "lo gnocco".
GU: used before E or I to represent /ɡ/ or /ɣ/ in all Romance
languages except Italian,
Romance languages in Italy, Corsican,
Romansh, and Romanian, which use GH instead.
IG: used at the end of word in Catalan for /tʃ/, as in maig, safareig
IX: used between vowels or at the end of word in Catalan for /ʃ/, as
in caixa or calaix.
LH: used in Portuguese and
LL: used in Spanish, Catalan, Galician, Astur-leonese, Norman and
Dgèrnésiais, originally for /ʎ/ which has merged in some cases with
/j/. Represents /l/ in French unless it follows I (i) when it
represents /j/ (or /ʎ/ in some dialects). As in Italian, it is used
in Occitan for a long /ll/.
L·L: used in Catalan for a geminate consonant /ɫɫ/.
NH: used in Portuguese and
Occitan for /ɲ/, used in official Galician
for /ŋ/ .
N-: used in
Piedmontese and Ligurian for /ŋ/ between two vowels.
NN: used in Leonese for /ɲ/, in Italian for geminate /nn/.
NY: used in Catalan for /ɲ/.
QU: represents /kw/ in Italian,
Romance languages in Italy, and
Romansh; /k/ in French, Astur-leonese (normally before e or i); /k/
(before e or i) or /kw/ (normally before a or o) in Occitan, Catalan
and Portuguese; /k/ in Spanish (always before e or i).
RR: used between vowels in several languages (Occitan, Catalan,
Spanish...) to denote a trilled /r/ or a guttural R, instead of the
SC: used before E or I in Italian,
Romance languages in
European Portuguese as /ʃˈs/ and in French, Brazilian
Portuguese, Catalan and
Latin American Spanish as /s/ in words of
certain etymology (notice this would represent /θ/ in standard
SCH: used in Romansh for [ʃ] or [ʒ], in Italian for /sk/ before
"E" or "I", including yod /j/.
SCI: used in Italian,
Romance languages in Italy, and Corsican to
represent /ʃ/ or /ʃʃ/ before A, O, or U.
SH: used in Aranese
Occitan for /ʃ/.
SS: used in French, Portuguese, Piedmontese, Romansh, Occitan, and
Catalan for /s/ between vowels, in Italian,
Romance languages of
Italy, and Corsican for long /ss/.
TS: used in Catalan for /ts/.
TG: used in Romansh for [c] or [tɕ]. In Catalan it is used for /dʒ/
before E and I, as in metge or fetge.
TH: used in
Jèrriais for /θ/; used in Aranese for either /t/ or
TJ: used between vowels and before A, O or U, in Catalan for /dʒ/, as
in sotjar or mitjó.
TSCH: used in Romansh for [tʃ].
TX: used at the beginning or at the end of word or between vowels in
Catalan for /tʃ/, as in txec, esquitx or atxa.
TZ: used in Catalan for /dz/.
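The Italian conventions listed above for C and G ("soft" before E or I, "hard" elsewhere) and for the digraphs CH, GH, CI and GI can be sketched as a small set of rewrite rules. This is only an illustration: the function name is hypothetical, the input is assumed to be a lowercase spelling, and other digraphs (GL, GN, SC), gemination and words with a stressed, fully pronounced i such as farmacia are deliberately ignored.

```python
HARD = {"c": "k", "g": "\u0261"}  # \u0261 is the IPA symbol ɡ
SOFT = {"c": "tʃ", "g": "dʒ"}


def italian_c_and_g(word: str) -> str:
    """Rewrite c/g (with ch, gh, ci, gi) as phoneme symbols; other letters pass through."""
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        nxt = word[i + 1] if i + 1 < len(word) else ""
        nxt2 = word[i + 2] if i + 2 < len(word) else ""
        if ch in HARD:
            if nxt == "h" and nxt2 in ("e", "i"):
                # CH/GH before E or I: hard sound, the H itself is silent.
                out.append(HARD[ch])
                i += 2
            elif nxt == "i" and nxt2 in ("a", "o", "u"):
                # CI/GI before A, O, U: soft sound, the I is only a spelling marker.
                out.append(SOFT[ch])
                i += 2
            elif nxt in ("e", "i"):
                out.append(SOFT[ch])  # plain C/G before E or I: soft
                i += 1
            else:
                out.append(HARD[ch])  # everywhere else: hard
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


for sample in ["casa", "cena", "chi", "ciao", "gatto", "ghetto", "giorno"]:
    print(sample, "->", italian_c_and_g(sample))
```

The rule order matters: the digraph cases must be checked before the plain "before E or I" case, otherwise chi would be read with a soft consonant.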
While the digraphs CH, PH, RH and TH were at one time used in many
words of Greek origin, most languages have now replaced them with
C/QU, F, R and T. Only French has kept these etymological spellings,
which now represent /k/ or /ʃ/, /f/, /ʀ/ and /t/, respectively.
Gemination, in the languages where it occurs, is usually indicated by
doubling the consonant, except when it does not contrast phonemically
with the corresponding short consonant, in which case gemination is
not indicated. In Jèrriais, long consonants are marked with an
apostrophe: S'S is a long /zz/, SS'S is a long /ss/, and T'T is a long
/tt/. Phonemic contrast of geminates vs. single consonants is
widespread in Italian, and normally indicated in the traditional
orthography: fatto /fatto/ 'done' vs. fato /fato/ 'fate, destiny';
cadde /kadde/ 's/he, it fell' vs. cade /kade/ 's/he, it falls'. The
double consonants in French orthography, however, are merely
etymological. In Catalan, the gemination of the l is marked by a punt
volat = flying point – l·l.
Romance languages also introduced various marks (diacritics) that may
be attached to some letters, for various purposes. In some cases,
diacritics are used as an alternative to digraphs and trigraphs;
namely to represent a larger number of sounds than would be possible
with the basic alphabet, or to distinguish between sounds that were
previously written the same.
Diacritics are also used to mark word
stress, to indicate exceptional pronunciation of letters in certain
words, and to distinguish words with same pronunciation (homophones).
Depending on the language, some letter-diacritic combinations may be
considered distinct letters, e.g. for the purposes of lexical sorting.
This is the case, for example, of Romanian ș ([ʃ]) and Spanish ñ
([ɲ]).
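To make the point about lexical sorting concrete, the sketch below orders a few Spanish words with ñ treated as a separate letter between n and o, and compares the result with a plain code-point sort. The alphabet string, the key function and the word list are assumptions made for this example; real software would normally rely on a full collation library rather than a hand-written key.

```python
import unicodedata

# Spanish alphabet with ñ as its own letter between n and o.
SPANISH_ALPHABET = "abcdefghijklmnñopqrstuvwxyz"
RANK = {letter: index for index, letter in enumerate(SPANISH_ALPHABET)}


def spanish_sort_key(word: str) -> list[int]:
    """Rank each letter by the Spanish alphabet; accents other than the tilde of ñ are ignored."""
    key = []
    for ch in unicodedata.normalize("NFC", word.lower()):
        if ch != "ñ":
            # Strip diacritics such as the acute accent: "ú" sorts like "u".
            ch = unicodedata.normalize("NFD", ch)[0]
        key.append(RANK.get(ch, len(RANK)))  # unknown symbols sort last
    return key


words = ["ola", "ñu", "nube", "nota"]
by_code_point = sorted(words)
by_spanish_alphabet = sorted(words, key=spanish_sort_key)

print(by_code_point)
print(by_spanish_alphabet)
```

A plain sort places ñu after ola because ñ has a higher code point than o, whereas the custom key places it between the n-words and the o-words.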
The following are the most common uses of diacritics in Romance
languages:
Vowel quality: the system of marking close-mid vowels with an acute
accent, é, and open-mid vowels with a grave accent, è, is widely
used (e.g. Catalan, French, Italian). Portuguese, however, uses the
circumflex (ê) for the former, and the acute (é), for the latter.
Some Romance languages use an umlaut (diaeresis mark) in the
case of ä, ö, ü to indicate fronted vowel variants, as in German.
Centralized vowels (/ɐ/, /ə/) are indicated variously (â in
Portuguese, ă/î in Romanian, ë in Piedmontese, etc.). In French,
Occitan and Romanian, these accents are used whenever necessary to
distinguish the appropriate vowel quality, but in the other languages,
they are used only when it is necessary to mark unpredictable stress,
or in some cases to distinguish homophones.
Vowel length: French uses a circumflex to indicate what had been a
long vowel (although nowadays this rather indicates a difference in
vowel quality, if it has any effect at all on pronunciation). This
same usage is found in some minority languages.
Nasality: Portuguese marks nasal vowels with a tilde (ã) when they
occur before other written vowels and in some other instances.
Palatalization: some historical palatalizations are indicated with the
cedilla (ç) in French, Catalan,
Occitan and Portuguese. In Spanish
and several other world languages influenced by it, the grapheme ñ
represents a palatal nasal consonant.
Separate pronunciation: when a vowel and another letter that would
normally be combined into a digraph with a single sound are
exceptionally pronounced apart, this is often indicated with a
diaeresis mark on the vowel. This is particularly common in the case
of gü /gw/ before e or i, because plain gu in this case would be
pronounced /g/. This usage occurs in Spanish, French, Catalan and
Occitan, and occurred before the 2009 spelling reform in Brazilian
Portuguese. French also uses the diaeresis on the second of two
adjacent vowels to indicate that both are pronounced separately, as in
Noël "Christmas" and haïr "to hate".
Stress: the stressed vowel in a polysyllabic word may be indicated
with an accent, when it cannot be predicted by rule. In Italian,
Portuguese and Catalan, the choice of accent (acute, grave or
circumflex) may depend on vowel quality. When no quality needs to be
indicated, an acute accent is normally used (ú), but Italian and
Romansh use a grave accent (ù). Portuguese puts a diacritic on all
stressed monosyllables that end in a e o as es os, to distinguish them
from unstressed function words: chá "tea", más "bad (fem. pl.)", sé
"seat (of government)", dê "give! (imperative)", mês "month", só
"only", nós "we" (cf. mas "but", se "if/oneself", de "of", nos "us").
Word-final stressed vowels in polysyllables are marked by the grave
accent in Italian, thus università "university/universities", virtù
"virtue/virtues", resulting in occasional minimal or near-minimal
pairs such as parlo "I speak" ≠ parlò "s/he spoke", capi "heads,
bosses" ≠ capì "s/he understood", gravita "it, s/he gravitates"
≠ gravità "gravity, seriousness".
Homophones: words (especially monosyllables) that are pronounced
exactly or nearly the same way and are spelled identically, but have
different meanings, can be differentiated by a diacritic. Typically,
if one of the pair is stressed and the other isn't, the stressed word
gets the diacritic, using the appropriate diacritic for notating
stressed syllables (see above). Portuguese does this consistently as
part of notating stress in certain monosyllables, whether or not there
is an unstressed homophone (see examples above). Spanish also has many
pairs of identically pronounced words distinguished by an acute accent
on the stressed word: si "if" vs. sí "yes", mas "but" vs. más
"more", mi "my" vs. mí "me", se "oneself" vs. sé "I know", te "you
(object)" vs. té "tea", que/quien/cuando/como "that/who/when/how" vs.
qué/quién/cuándo/cómo "what?/who?/when?/how?", etc. A similar
strategy is common for monosyllables in writing Italian, but not
necessarily determined by stress: stressed dà "it, s/he gives" vs.
unstressed da "by, from", but also tè "tea" and te "you", both
capable of bearing phrasal stress. Catalan has some pairs where both
words are stressed, and one is distinguished by a vowel-quality
diacritic, e.g. os "bone" vs. ós "bear". When no vowel-quality needs
distinguishing, French and Catalan use a grave accent: French ou "or"
vs. où "where", French la "the" vs. là "there", Catalan ma "my" vs.
mà "hand".
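The "separate pronunciation" convention described in the list above can be illustrated for Spanish, where plain gu before e or i stands for /g/ while gü signals /gw/. The sketch below is a simplification under stated assumptions: the function name is hypothetical, only these two sequences are rewritten (the rest of the word is left in ordinary spelling), and the fricative realization [ɣ] is not modelled.

```python
import re
import unicodedata

HARD_G = "\u0261"            # IPA symbol ɡ
U_DIAERESIS = "\u00fc"       # ü
FRONT_VOWELS = "ei\u00e9\u00ed"  # e, i, é, í


def read_gu_sequences(word: str) -> str:
    """Rewrite gu/gü before e or i; everything else stays in ordinary spelling."""
    word = unicodedata.normalize("NFC", word.lower())
    # Diaeresis: the u is pronounced, giving /gw/.
    word = re.sub(f"g{U_DIAERESIS}(?=[{FRONT_VOWELS}])", HARD_G + "w", word)
    # Plain gu: the u is silent and only keeps the g hard.
    word = re.sub(f"gu(?=[{FRONT_VOWELS}])", HARD_G, word)
    return word


for sample in ["guerra", "pingüino", "vergüenza", "agua"]:
    print(sample, "->", read_gu_sequences(sample))
```

Words such as agua are left untouched because the u before a is always pronounced and needs no diaeresis.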
Upper and lower case
Most languages are written with a mixture of two distinct but
phonetically identical variants or "cases" of the alphabet: majuscule
("uppercase" or "capital letters"), derived from Roman stone-carved
letter shapes, and minuscule ("lowercase"), derived from Carolingian
writing and Medieval quill pen handwriting which were later adapted by
printers in the fifteenth and sixteenth centuries.
In particular, all
Romance languages capitalize (use uppercase for the
first letter of) the following words: the first word of each complete
sentence, most words in names of people, places, and organizations,
and most words in titles of books. The
Romance languages do not follow
the German practice of capitalizing all nouns including common ones.
Unlike in English, the names of months, days of the week, and
derivatives of proper nouns are usually not capitalized: thus, in
Italian one capitalizes Francia ("France") and Francesco ("Francis"),
but not francese ("French") or francescano ("Franciscan"). However,
each language has some exceptions to this general rule.
The tables below provide a vocabulary comparison that illustrates a
number of examples of sound shifts that have occurred between Latin
and Romance languages, along with a selection of minority languages.
Words are given in their conventional spellings. In addition, for
French the actual pronunciation is given, due to the dramatic
differences between spelling and pronunciation. (French spelling
approximately reflects the pronunciation of Old French, c. 1200 AD.)
[Vocabulary comparison tables. Surviving fragments: Latin domina,
femina, mulier, mulierem (Old Occitan mólher, nom.); oculum > *oclum
(Western Piedmontese euj, Eastern Piedmontese eugg); auriculam >
*oriclam.]
^ Hammarström, Harald; Forkel, Robert; Haspelmath, Martin, eds.
Glottolog 3.0. Jena, Germany: Max Planck Institute
for the Science of Human History.
^ M. Paul Lewis, "Summary by language size", Ethnologue: Languages of
the World, Sixteenth Edition.
Nationalencyklopedin "Världens 100 största språk 2007" The
World's 100 Largest Languages in 2007/2010
^ David Dalby, 1999/2000, The Linguasphere register of the world's
languages and speech communities. Observatoire Linguistique,
Linguasphere Press. Volume 2, pp. 390–410 (zone 51). Oxford.
^ a b Zhang, Huiying (2015). "From
Latin to the Romance languages: A
normal evolution to what extent?" (PDF). Quarterly Journal of Chinese
Studies. 3 (4): 105–111.
^ Ilari, Rodolfo (2002). Lingüística Românica. Ática. p. 50.
^ From the French substantive clos 'closed', itself from
in T. F. Hoad, The Concise Oxford Dictionary of English language,
1993, ISBN 0-19-283098-8, p. 80b.
^ From French verb diner, itself from Late
Latin disjūnāre 'break
one's fast' in HOAD, p. 125b.
^ Rochette, p. 550
^ Stefan Zimmer, "Indo-European," in Celtic Culture: A Historical
Encyclopedia (ABC-Clio, 2006), p. 961
^ Curchin, Leonard A. (1995). "Literacy in the Roman Provinces:
Qualitative and Quantitative Data from Central Spain". The American
Journal of Philology. 116 (3): 461–476 (464). doi:10.2307/295333.
^ Herman, Jozsef (1 November 2010). Vulgar Latin. Penn State Press.
ISBN 0-271-04177-3, pp. 108–115.
^ a b c d e Price, Glanville (1984). The French language: past and
present. London: Grant and Cutler Ltd.
^ "Na" is a contraction of "em" (in) + "a" (the), the form "em a" is
never used, it is always replaced by "na". The same happens with other
prepositions: "de" (of) + o/a/os/as (singular and plural forms for
"the" in masculine and feminine) = do, da, dos, das; etc.
^ Verb; literally means "to put in mouth"
^ Ilona Czamańska, "Slavs in the Middle Ages and Modern Era", Res
Historica, 41, Lublin, 2016
^ See Portuguese in Africa.
^ See Portuguese in Asia and Oceania.
^ See List of countries where Portuguese is an official language.
^ I.S. Nistor, "Istoria românilor din Transnistria" (The history of
Romanians from Transnistria), București, 1995
^ Djuvara Neagu, "La Diaspora aroumaine aux XVIIIe et XIXe siècles".
In: Les Aroumains, Paris: Publications Langues'O, 1989 (Cahiers du
Centre d'étude des civilisations d'Europe centrale et du Sud-Est; 8),
pp. 95–125.
^ 1993 Statistical Abstract of Israel reports 250,000 speakers of
Romanian in Israel, while the 1995 census puts the total figure of the
Israeli population at 5,548,523.
^ "Reports of about 300,000 Jews who left the country after WW2".
Eurojewcong.org. Retrieved 2010-11-06.
^ "Encarta Dictionary". Microsoft Encarta 2006. Archived from the
original on 2009-10-28. Retrieved 2009-11-16.
^ "Ethnologue". SIL Haley.
^ "Romance languages". Encyclopædia Britannica. Retrieved 2 December
^ Sardos etiam, qui non Latii sunt sed Latiis associandi videntur,
eiciamus, quoniam soli sine proprio vulgari esse videntur, gramaticam
tanquam simie homines imitantes: nam domus nova et dominus meus
locuntur. ["As for the Sardinians, who are not Italian but may be
associated with Italians for our purposes, out they must go, because
they alone seem to lack a vernacular of their own, instead imitating
gramatica as apes do humans: for they say domus nova [new house] and
dominus meus [my master]." (English translation provided by Dante
Online, De Vulgari Eloquentia, I-xi)] It is unclear whether this
indicates that Sardinian still had a two-case system at the time;
modern Sardinian lacks grammatical case.
^ "NEO-ROMANTICISM IN LANGUAGE PLANNING (Edo BERNASCONI)". Archived
from the original on 2015-02-04.
^ "NEO-ROMANTICISM IN LANGUAGE PLANNING (Edo BERNASCONI)". Archived
from the original on 2015-07-10.
^ Peano, Giuseppe (1903). De Latino Sine Flexione. Lingua Auxiliare
Internationale, Revista de Mathematica (Revue de Mathématiques),
Tomo VIII, pp. 74–83. Fratres Bocca Editores: Torino.
^ "Eall fhoil de Bhreathanach". Archived from the original on June 10,
^ Henrik Theiling (2007-10-28). "Þrjótrunn: A North Romance
Language: History". Kunstsprachen.de. Retrieved 2010-11-06.
^ "Relay 10/R – Jelbazech". Steen.free.fr. 2004-08-28. Retrieved
^ a b Cordin, Patrizia (2011). "From verbal prefixes to
direction/result markers in Romance". Linguistica. 51 (1).
^ /ə/ can occur only in unstressed syllables, and it tends to be
rounded [ɵ̞]; it is replaced by [ø] when stressed.
^ /ɐ/ developed as the allophone of /a/ before nasals and under low
stress, and the two are still nearly in complementary distribution. A
few minimal pairs like falamos /fɐˈlɐmuʃ/ "we speak" vs. falámos
/fɐˈlamuʃ/ "we spoke" seem to clearly indicate that /ɐ/ must be a
phoneme, but other analyses are possible. /ɨ/, which developed from
earlier /e/ in unstressed syllables, is even more doubtful.
^ Haase, Martin. 2000. "Reorganization of a gender system: The Central
Italian neuters". in Gender in Grammar and Cognition, ed. by Barbara
Unterbeck et al., pp. 221–236. Berlin: Mouton De Gruyter
^ a b c d Harris, Martin; Vincent, Nigel (1988). The Romance
Languages. London: Routledge.
^ Kibler, William W. (1984). An introduction to Old French. New York:
Modern Language Association of America.
^ Henri Wittmann. "Le français de Paris dans le français des
Amériques" (PDF). Proceedings of the International Congress of
Linguists 16.0416 (Paris, 20–25 juillet 1997). Oxford: Pergamon (CD
edition).
^ Nanbakhsh, Golnaz (2012). "Moving beyond T/V pronouns of power and
solidarity in interaction: Persian agreement mismatch construction".
Linguistica. 52 (1).
^ ipse originally meant "self", as in ego ipse or egomet ipse "I
myself". ipse later shifted to mean "the" (still reflected in
Sardinian and in the Catalan spoken in the Balearic Islands), and
still later came to be a demonstrative pronoun. From -met ipse the
emphatic (superlative) form metipsimum was created, later evolving
into medisimum and eventually Spanish mismo, French même, Italian
medesimo, which replaced both Latin ipse "self" and idem "same". The
alternative form metipse eventually produced Catalan mateix,
Galician-Portuguese medês. The more frequent Italian equivalent,
however, is stesso, derived from the combination istum-ipsum.
^ a b Accademia della Crusca On the use of the passato remoto (in
Italian) Archived June 7, 2006, at the Wayback Machine.
^ Sometimes used as a past conditional; also used in an apodosis
(then-clause) when the protasis (if-clause) is in the imperfect
subjunctive. Frede Jensen, Syntaxe de l'ancien occitan (Tübingen:
Niemeyer, 1994), 244–5; Povl Skårup, Morphologie synchronique de
l'ancien français (Copenhagen: Stougaard Jensen, 1994), 121–2.
^ Cf. auret "she had" < Latin habuerat, voldrent "they wanted" <
Latin voluerant. Not clearly distinct in meaning from the first
(normal) preterite, cf. the parallel lines por o fut presentede "for
this reason she was presented" (fut = first preterite, from Latin
fuit) vs. por o's furet morte "for these reasons she was killed"
(furet = second preterite, from Latin fuerat) in the same poem.
^ Paden, William D. 1998. An Introduction to Old Occitan. Modern
Language Association of America. ISBN 0-87352-293-1.
^ σπάθη. Liddell, Henry George; Scott, Robert; A Greek–English
Lexicon at the Perseus Project.
^ Harper, Douglas. "spatha". Online Etymology Dictionary.
^ κάρα in Liddell and Scott.
^ κόλαφος in Liddell and Scott.
^ Harper, Douglas. "coup". Online Etymology Dictionary.
^ κατά in Liddell and Scott.
^ Harper, Douglas. "-ize". Online Etymology Dictionary.
^ Harper, Douglas. "-ist". Online Etymology Dictionary.
^ Wolf Dietrich, "Griechisch und Romanisch", Lexikon der
romanistischen Linguistik, vol 7: Kontakt, Migration und
Kunstsprachen: Kontrastivität, Klassifikation und Typologie, eds.
Günter Holtus, Michael Metzeltin & Christian Schmitt (Tübingen:
Max Niemeyer, 1998), 121–34:123–4.
^ Originally formal, now equalizing or informal.
^ Likewise Spanish usted < vuestra merced, Catalan vostè <
^ Note that the current Portuguese spelling (Portuguese Language
Orthographic Agreement of 1990) abolished the use of the diaeresis for
^ Pope (1934).
^ Allen (2003) states: "There appears to have been no great difference
in quality between long and short a, but in the case of the close and
mid vowels (i and u, e and o) the long appear to have been appreciably
closer than the short." He then goes on to the historical development,
quotations from various authors (from around the second century AD),
as well as evidence from older inscriptions where "e" stands for
normally short i, and "i" for long e, etc.
^ Technically, Sardinian is one of the Southern Romance languages. The
same vowel outcome occurred in a small strip running across southern
Italy (the Lausberg Zone), and is thought to have occurred in the
Romance languages of northern Africa.
^ Palmer (1954).
^ cauda would produce French **choue, Italian */kɔda/, Occitan
**cauda, Romanian **caudă.
^ Kaze, Jeffery W. (1991). "Metaphony and Two Models for the Vowel
Systems". Phonology. 8 (1): 163–170. doi:10.1017/s0952675700001329.
JSTOR 4420029.
^ Calabrese, Andrea. "Metaphony" (PDF). Archived from the original
(PDF) on 2013-09-21. Retrieved 2012-05-15.
^ Álvaro Arias. El morfema de ‘neutro de materia’ en asturiano.
Santiago de Compostela, Universidade de Santiago de Compostela, 1999,
I Premio «Dámaso Alonso» de Investigación Filológica.
^ a b Penny, Ralph (1994). "Continuity and Innovation in Romance:
Metaphony and Mass-Noun Reference in Spain and Italy". The Modern
Language Review. 89 (2): 273–281. doi:10.2307/3735232.
^ Álvaro Arias. "La armonización vocálica en fonología funcional
(de lo sintagmático en fonología a propósito de dos casos de
metafonía hispánica)", Moenia 11 (2006): 111–139.
^ Note that the outcome of -am -em -om would be the same regardless of
whether lengthening occurred, and that -im was already rare in
Classical Latin, and appears to have barely survived in Proto-Romance.
The only likely survival is in "-teen" numerals such as trēdecim
"thirteen" > Italian tredici. This favors the vowel-lengthening
hypothesis -im > /ĩː/ > /i/; but notice unexpected decem >
Italian dieci (rather than expected *diece). It is possible that dieci
comes from *decim, which analogically replaced decem based on the
-decim ending; but it is also possible that the final /i/ in dieci
represents an irregular development of some other sort and that the
process of analogy worked in the other direction.
Latin forms are attested; metipsissimus is the superlative of
the formative -metipse, found for example in egometipse "myself in
^ Ralph Penny, A History of the Spanish Language, 2nd edn. (Cambridge:
Cambridge UP, 2002), 144.
^ Espinosa, Aurelio M. (1911). "Metipsimus in Spanish and French".
PMLA. 26 (2): 356–378. doi:10.2307/456649. JSTOR 456649.
^ Formerly ⟨qü⟩ in Brazilian Portuguese
^ Formerly ⟨gü⟩ in Brazilian Portuguese
^ "Ditzionàriu in línia". Retrieved 2013-09-14.
^ "Sicilian–English Dictionary". Italian.about.com. 2010-06-15.
^ "Dictionary Sicilian – Italian". Utenti.lycos.it. Retrieved
^ "Indo-European Languages". Retrieved 2013-09-18.
^ "Grand Dissionari Piemontèis / Grande Dizionario Piemontese".
^ "Dictionary English–Friulian Friulian–English".
Sangiorgioinsieme.it. Archived from the original on 2011-07-22.
^ Beaumont (2008-12-16). "Occitan–English Dictionary". Freelang.net.
^ "English Aragonese Dictionary Online". Glosbe. Retrieved
^ "English Asturian Dictionary Online". Glosbe. Retrieved
^ Developed from *pluviūtam.
^ Initial h- due to contamination of Germanic *hauh "high". Although
no longer pronounced, it reveals its former presence by inhibiting
elision of a preceding schwa, e.g. le haut "the high" vs. l'eau "the
^ a b c d
Latin mē, not ego. Note that this parallels
the state of affairs in Celtic, where the cognate of ego is not
attested anywhere, and the use of the accusative form cognate to mē
has been extended to cover the nominative, as well.
^ a b Developed from an assimilated form *nossum rather than from
Frederick Browning Agard. A Course in Romance Linguistics. Vol. 1: A
Synchronic View, Vol. 2: A Diachronic View. Georgetown University
Harris, Martin; Vincent, Nigel (1988). The Romance Languages. London:
Routledge. Reprint 2003.
Posner, Rebecca (1996). The Romance Languages. Cambridge: Cambridge
Gerhard Ernst et al., eds. Romanische Sprachgeschichte: Ein
internationales Handbuch zur Geschichte der romanischen Sprachen. 3
vols. Berlin: Mouton de Gruyter, 2003 (vol. 1), 2006 (vol. 2).
Alkire, Ti; Rosen, Carol (2010). Romance Languages: A Historical
Introduction. Cambridge: Cambridge University Press.
Martin Maiden, John Charles Smith & Adam Ledgeway, eds., The
Cambridge History of the Romance Languages. Vol. 1: Structures, Vol.
2: Contexts. Cambridge: Cambridge UP, 2011 (vol. 1) & 2013 (vol.
Martin Maiden & Adam Ledgeway, eds. The Oxford Guide to the
Romance Languages. Oxford: Oxford University Press, 2016.
Lindenbauer, Petrea; Metzeltin, Michael; Thir, Margit (1995). Die
romanischen Sprachen. Eine einführende Übersicht. Wilhelmsfeld: G.
Metzeltin, Michael (2004). Las lenguas románicas estándar. Historia
de su formación y de su uso. Uviéu: Academia de la Llingua
Boyd-Bowman, Peter (1980). From Latin to Romance in Sound Charts.
Washington, D.C.: Georgetown University Press.
Cravens, Thomas D. Comparative Historical Dialectology: Italo-Romance
Ibero-Romance Sound Change. Amsterdam: John Benjamins, 2002.
Sónia Frota & Pilar Prieto, eds. Intonation in Romance. Oxford:
Oxford UP, 2015.
Christoph Gabriel & Conxita Lleó, eds. Intonational Phrasing in
Romance and Germanic: Cross-Linguistic and Bilingual studies.
Amsterdam: John Benjamins, 2011.
Philippe Martin. The Structure of Spoken Language: Intonation in
Romance. Cambridge: Cambridge UP, 2016.
Prosthesis in Romance. Oxford: Oxford UP, 2010.
Holtus, Günter; Metzeltin, Michael; Schmitt, Christian (1988).
Lexikon der Romanistischen Linguistik. (LRL, 12 volumes). Tübingen:
Price, Glanville (1971). The French language: present and past. Edward
Kibler, William W. (1984). An introduction to Old French. New York:
Modern Language Association of America.
Lodge, R. Anthony (1993). French: From Dialect to Standard. London:
Williams, Edwin B. (1968). From Latin to Portuguese, Historical
Phonology and Morphology of the Portuguese Language (2nd ed.).
University of Pennsylvania.
Wetzels, W. Leo; Menuzzi, Sergio; Costa, João (2016). The Handbook of
Portuguese Linguistics. Oxford: Wiley Blackwell.
Penny, Ralph (2002). A History of the Spanish Language (2nd ed.).
Cambridge: Cambridge University Press.
Lapesa, Rafael (1981). Historia de la Lengua Española. Madrid:
Pharies, David (2007). A Brief History of the Spanish Language.
Chicago: University of Chicago Press.
Zamora Vicente, Alonso (1967). Dialectología Española (2nd ed.).
Madrid: Editorial Gredos.
Devoto, Giacomo; Giacomelli, Gabriella (2002). I Dialetti delle
Regioni d'Italia (3rd ed.). Milano: RCS Libri (Tascabili
Devoto, Giacomo (1999). Il Linguaggio d'Italia. Milano: RCS Libri
(Biblioteca Universale Rizzoli).
Maiden, Martin (1995). A Linguistic History of Italian. London:
John Haiman & Paola Benincà, eds., The
London: Routledge, 1992.
Michael de Vaan, Etymological Dictionary of Latin and the other Italic
Languages, Brill, 2008, 826 pp. (part available freely online)
Lexikon der Romanistischen Linguistik (LRL), edd. Holtus / Metzeltin /
Schmitt
Michael Metzeltin, Las lenguas románicas estándar. Historia de su
formación y de su uso, Oviedo, 2004
Orbis Latinus, site on Romance languages
Hugh Wilkinson's papers on Romance Languages
Spanish is a Romance language, but what does that have to do with the
type of romance between lovers?, dictionary.com
Comparative Grammar of the Romance Languages
Comparison of the computer terms in Romance languages