The origin of speech refers to the general problem of the origin of language in the context of the physiological development of the human speech organs, such as the tongue, lips, and vocal organs used to produce phonological units in all spoken languages. It has been studied through many fields, including evolution, anatomy, and the history of linguistics. Although the origin of speech is related to the more general problem of the origin of language, the evolution of distinctively human speech capacities has become a distinct and in many ways separate area of scientific research. The topic is a separate one because language is not necessarily spoken: it can equally be written or signed. Speech is in this sense optional, although it is the default modality for language.


Background

There are many theories that offer a framework for how speech originated in humans. Several of them build on ideas about how humans evolved over time. Monkeys, apes and humans, like many other animals, have evolved specialized mechanisms for producing ''sound'' for purposes of social communication. On the other hand, no monkey or ape uses its ''tongue'' for such purposes. The human species' unprecedented use of the tongue, lips and other moveable parts seems to place speech in a quite separate category, making its evolutionary emergence an intriguing theoretical challenge in the eyes of many scholars.

Nevertheless, recent insights in human evolution – more specifically, human Pleistocene littoral evolution – help explain how human speech evolved: different biological pre-adaptations to spoken language originate in humanity's waterside past, such as a larger brain (thanks to DHA and other brain-specific nutrients in seafoods), voluntary breathing (breath-hold diving for shellfish, etc.), and suction feeding of soft, slippery seafoods. Suction feeding explains why humans, as opposed to other hominoids, evolved hyoidal descent (the tongue-bone descended in the throat), closed tooth-rows (with incisiform canine teeth) and a globular tongue fitting perfectly in a vaulted and smooth palate (without transverse ridges as in apes): all this allowed the pronunciation of consonants. Other, probably older, pre-adaptations to human speech are territorial songs, gibbon-like duetting and vocal learning. Vocal learning, the ability to imitate sounds – as in many birds and bats and a number of Cetacea and Pinnipedia – is arguably required for relocating offspring or parents (amid the foliage or in the sea).

Indeed, independent lines of evidence (comparative, fossil, archeological, paleo-environmental, isotopic, nutritional, and physiological) show that early-Pleistocene "archaic" Homo spread intercontinentally along the Indian Ocean shores (they even reached overseas islands such as Flores), where they regularly dived for littoral foods such as shell- and crayfish, which are extremely rich in brain-specific nutrients, explaining Homo's brain enlargement. Shallow diving for seafoods requires voluntary airway control, a prerequisite for spoken language. Seafood such as shellfish generally does not require biting and chewing, but rather stone tool use and suction feeding. This finer control of the oral apparatus was arguably another biological pre-adaptation to human speech, especially for the production of consonants.


Modality-independence

The term ''modality'' means the chosen representational format for encoding and transmitting information. A striking feature of language is that it is ''modality-independent.'' Should a child be prevented by impairment from hearing or producing sound, its innate capacity to master a language may equally find expression in signing. Sign languages of the deaf are independently invented and have all the major properties of spoken language except for the modality of transmission. From this it appears that the language centres of the human brain must have evolved to function optimally, irrespective of the selected modality.

Animal communication systems routinely combine visible with audible properties and effects, but none is modality-independent. For example, no vocally-impaired whale, dolphin, or songbird could express its song repertoire equally in visual display. Indeed, in the case of animal communication, message and modality cannot be disentangled. Whatever message is being conveyed stems from the intrinsic properties of the signal.

Modality independence should not be confused with the ordinary phenomenon of multimodality. Monkeys and apes rely on a repertoire of species-specific "gesture-calls" – emotionally-expressive vocalisations inseparable from the visual displays which accompany them. Humans also have species-specific gesture-calls – laughs, cries, sobs, etc. – together with involuntary gestures accompanying speech. Many animal displays are polymodal in that each appears designed to exploit multiple channels simultaneously. The human linguistic property of modality independence is conceptually distinct from polymodality. It allows the speaker to encode the informational content of a message in a single channel whilst switching between channels as necessary.

Modern city-dwellers switch effortlessly between the spoken word and writing in its various forms – handwriting, typing, email, etc. Whichever modality is chosen, it can reliably transmit the full message content without external assistance of any kind. When talking on the telephone, for example, any accompanying facial or manual gestures, however natural to the speaker, are not strictly necessary. When typing or manually signing, conversely, there is no need to add sounds. In many Australian Aboriginal cultures, a section of the population – perhaps women observing a ritual taboo – traditionally restrict themselves for extended periods to a silent (manually-signed) version of their language. Then, when released from the taboo, these same individuals resume narrating stories by the fireside or in the dark, switching to pure sound without sacrifice of informational content.


Evolution of the speech organs

Speaking is the default modality for language in all cultures. Humans' first recourse is to encode thoughts in sound – a method which depends on sophisticated capacities for controlling the lips, tongue and other components of the vocal apparatus. The speech organs evolved in the first instance not for speech but for more basic bodily functions such as feeding and breathing. Nonhuman primates have broadly similar organs, but with different neural controls. Non-human apes use their highly flexible, maneuverable tongues for eating but not for vocalizing. When an ape is not eating, fine motor control over its tongue is deactivated. ''Either'' it is performing gymnastics with its tongue ''or'' it is vocalising; it cannot perform both activities simultaneously. Since this applies to mammals in general, ''Homo sapiens'' is exceptional in harnessing mechanisms designed for respiration and ingestion for the radically different requirements of articulate speech.


Tongue

The word "language" derives from the Latin ''lingua,'' "tongue". Phoneticians agree that the tongue is the most important speech articulator, followed by the lips. A natural language can be viewed as a particular way of using the tongue to express thought. The human tongue has an unusual shape. In most mammals, it is a long, flat structure contained largely within the mouth. It is attached at the rear to the hyoid bone, situated below the oral level in the pharynx. In humans, the tongue has an almost circular sagittal (midline) contour, much of it lying vertically down an extended pharynx, where it is attached to a hyoid bone in a lowered position. Partly as a result of this, the horizontal (inside-the-mouth) and vertical (down-the-throat) tubes forming the supralaryngeal vocal tract (SVT) are almost equal in length (whereas in other species, the vertical section is shorter). As we move our jaws up and down, the tongue can vary the cross-sectional area of each tube independently by about 10:1, altering formant frequencies accordingly.

That the tubes are joined at a right angle permits the pronunciation of vowels that nonhuman primates cannot produce. Even when not performed particularly accurately, in humans the articulatory gymnastics needed to distinguish these vowels yield consistent, distinctive acoustic results, illustrating the quantal nature of human speech sounds. It may not be coincidental that these are the most common vowels in the world's languages (Ladefoged, P. and Maddieson, I. 1996. ''The Sounds of the World's Languages.'' Oxford: Blackwell).

Human tongues are a lot shorter and thinner than those of other mammals and are composed of a large number of muscles, which helps shape a variety of sounds within the oral cavity. The diversity of sound production is also increased by the human ability to open and close the airway, allowing varying amounts of air to exit through the nose. The fine motor movements associated with the tongue and the airway make humans more capable of producing a wide range of intricate shapes in order to produce sounds at different rates and intensities.


Lips

In humans, the lips are important for the production of stops and fricatives, in addition to vowels. Nothing, however, suggests that the lips evolved for those reasons. During primate evolution, a shift from nocturnal to diurnal activity in tarsiers, monkeys and apes (the haplorhines) brought with it an increased reliance on vision at the expense of olfaction. As a result, the snout became reduced and the rhinarium or "wet nose" was lost. The muscles of the face and lips consequently became less constrained, enabling their co-option to serve purposes of facial expression. The lips also became thicker, and the oral cavity hidden behind them became smaller. Hence, according to Ann MacLarnon, "the evolution of mobile, muscular lips, so important to human speech, was the exaptive result of the evolution of diurnality and visual communication in the common ancestor of haplorhines". It is unclear whether human lips have undergone a more recent adaptation to the specific requirements of speech.


Respiratory control

Compared with nonhuman primates, humans have significantly enhanced control of breathing, enabling exhalations to be extended and inhalations shortened as we speak. Whilst we are speaking, intercostal and interior abdominal muscles are recruited to expand the thorax and draw air into the lungs, and subsequently to control the release of air as the lungs deflate. The muscles concerned are markedly more innervated in humans than in nonhuman primates. Evidence from fossil hominins suggests that the necessary enlargement of the vertebral canal, and therefore of spinal cord dimensions, may not have occurred in ''Australopithecus'' or ''Homo erectus'' but was present in the Neanderthals and early modern humans.


Larynx

The larynx or voice box is an organ in the neck housing the vocal folds, which are responsible for phonation. In humans, the larynx is ''descended'': it is positioned lower than in other primates. This is because the evolution of humans to an upright position shifted the head directly above the spinal cord, forcing everything else downward. The repositioning of the larynx resulted in a longer cavity called the pharynx, which is responsible for increasing the range and clarity of the sound being produced. Other primates have almost no pharynx; therefore, their vocal power is significantly lower.

Humans are not unique in having a lowered larynx: goats, dogs, pigs and tamarins lower the larynx temporarily, to emit loud calls. Several deer species have a permanently lowered larynx, which may be lowered still further by males during their roaring displays. Lions, jaguars, cheetahs and domestic cats also do this. However, laryngeal descent in nonhumans (according to Philip Lieberman) is not accompanied by descent of the hyoid; hence the tongue remains horizontal in the oral cavity, preventing it from acting as a pharyngeal articulator.

Despite all this, scholars remain divided as to how "special" the human vocal tract really is. It has been shown that the larynx does descend to some extent during development in chimpanzees, followed by hyoidal descent. As against this, Philip Lieberman points out that only humans have evolved permanent and substantial laryngeal descent in association with hyoidal descent, resulting in a curved tongue and two-tube vocal tract with 1:1 proportions. Uniquely in the human case, simple contact between the epiglottis and velum is no longer possible, disrupting the normal mammalian separation of the respiratory and digestive tracts during swallowing. Since this entails substantial costs – increasing the risk of choking whilst swallowing food – we are forced to ask what benefits might have outweighed those costs. Some claim the clear benefit must have been speech, but others contest this. One objection is that humans are in fact not seriously at risk of choking on food: medical statistics indicate that accidents of this kind are extremely rare. Another objection is that in the view of most scholars, speech as we know it emerged relatively late in human evolution, roughly contemporaneously with the emergence of ''Homo sapiens.'' A development as complex as the reconfiguration of the human vocal tract would have required much more time, implying an early date of origin. This discrepancy in timescales undermines the idea that human vocal flexibility was initially driven by selection pressures for speech. At least one orangutan has demonstrated the ability to control the voice box.


The size exaggeration hypothesis

To lower the larynx is to increase the length of the vocal tract, in turn lowering formant frequencies so that the voice sounds "deeper" – giving an impression of greater size. John Ohala argued that the function of the lowered larynx in humans, especially males, is probably to enhance threat displays rather than speech itself. Ohala pointed out that if the lowered larynx were an adaptation for speech, we would expect adult human males to be better adapted in this respect than adult females, whose larynx is considerably less low. In fact, females invariably outperform males in verbal tests, falsifying the speech-adaptation line of reasoning. William Tecumseh Fitch likewise argues that size exaggeration was the original selective advantage of laryngeal lowering in our species. Although, according to Fitch, the initial lowering of the larynx in humans had nothing to do with speech, the increased range of possible formant patterns was subsequently co-opted for speech. Size exaggeration remains the sole function of the extreme laryngeal descent observed in male deer. Consistent with the size exaggeration hypothesis, a second descent of the larynx occurs at puberty in humans, although only in males. In response to the objection that the larynx is descended in human females, Fitch suggests that mothers vocalising to protect their infants would also have benefited from this ability.
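
The link between vocal tract length and formant frequencies can be illustrated with a rough calculation. The sketch below is not taken from the sources discussed here: it assumes the textbook idealisation of the vocal tract as a uniform tube closed at the glottis and open at the lips, whose resonances fall at odd multiples of the quarter-wavelength frequency, F_n = (2n - 1)c / (4L). The speed of sound (350 m/s) and the two tract lengths are illustrative assumptions, and the function name `tube_formants` is hypothetical.

```python
# Assumed approximate speed of sound in warm, humid air (illustrative value).
SPEED_OF_SOUND_M_S = 350.0


def tube_formants(length_m, count=3, speed=SPEED_OF_SOUND_M_S):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    This is a simplified stand-in for a real vocal tract:
    F_n = (2n - 1) * c / (4 * L), for n = 1, 2, 3, ...
    """
    if length_m <= 0:
        raise ValueError("tube length must be positive")
    return [(2 * n - 1) * speed / (4.0 * length_m) for n in range(1, count + 1)]


# Hypothetical tract lengths: a shorter tract versus one lengthened by a lower larynx.
shorter_tract = tube_formants(0.15)
longer_tract = tube_formants(0.18)

for n, (f_short, f_long) in enumerate(zip(shorter_tract, longer_tract), start=1):
    print(f"F{n}: {f_short:.0f} Hz (0.15 m) -> {f_long:.0f} Hz (0.18 m)")
```

Under these assumptions every formant scales inversely with tube length, so a longer tract lowers all resonances together, which is the acoustic cue to body size that the hypothesis relies on. Real vocal tracts are not uniform tubes, so the numbers indicate only the direction and rough scale of the effect.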


Neanderthal speech

Most specialists credit the Neanderthals with speech abilities not radically different from those of modern ''Homo sapiens''. An indirect line of argument is that their toolmaking and hunting tactics would have been difficult to learn or execute without some kind of speech. A recent extraction of DNA from Neanderthal bones indicates that Neanderthals had the same version of the FOXP2 gene as modern humans. This gene, mistakenly described as the "grammar gene", plays a role in controlling the orofacial movements which (in modern humans) are involved in speech.

During the 1970s, it was widely believed that the Neanderthals lacked modern speech capacities. It was claimed that they possessed a hyoid bone so high up in the vocal tract as to preclude the possibility of producing certain vowel sounds. The hyoid bone is present in many mammals. It allows a wide range of tongue, pharyngeal and laryngeal movements by bracing these structures alongside each other in order to produce variation. It is now realised that its lowered position is not unique to ''Homo sapiens'', whilst its relevance to vocal flexibility may have been overstated: although men have a lower larynx, they do not produce a wider range of sounds than women or two-year-old babies. There is no evidence that the larynx position of the Neanderthals impeded the range of vowel sounds they could produce.

The discovery of a modern-looking hyoid bone of a Neanderthal man in the Kebara Cave in Israel led its discoverers to argue that the Neanderthals had a descended larynx, and thus human-like speech capabilities. However, other researchers have claimed that the morphology of the hyoid is not indicative of the larynx's position. It is necessary to take into consideration the skull base, the mandible, the cervical vertebrae and a cranial reference plane. The morphology of the outer