Kenneth Noble Stevens (March 23, 1924 – August 19, 2013) was the Clarence J. LeBel Professor of Electrical Engineering and Computer Science, and Professor of Health Sciences and Technology at the Research Laboratory of Electronics at MIT. Stevens was head of the Speech Communication Group in MIT's Research Laboratory of Electronics (RLE), and was one of the world's leading scientists in acoustic phonetics. He was awarded the National Medal of Science by President Bill Clinton in 1999, and the IEEE James L. Flanagan Speech and Audio Processing Award in 2004. He died in 2013 from complications of Alzheimer's disease.

Education

Early education

Ken Stevens was born in Toronto on March 23, 1924. His older brother, Pete, was born in England; Ken was born four years later, shortly after the family emigrated to Canada. His childhood ambition was to become a doctor, because he admired an uncle who was a doctor. He attended high school at a school attached to the Department of Education at the University of Toronto.

Stevens attended college in the School of Engineering at the University of Toronto on a full scholarship. He lived at home throughout his undergraduate years. Though Stevens himself could not fight in World War II because of his visual impairment, his brother was away for the entire war; his parents tuned in nightly to the BBC for updates. Stevens majored in engineering physics, covering topics from the design of motorized machines through to basic physics, which was taught by the physics department. During summers he worked in the defense industry, including one summer at a company that was developing radar. He received both his S.B. and S.M. degrees in 1945.

Stevens had been a teacher since his undergraduate years, when he taught sections of home economics that involved some aspect of physics. After receiving his master's degree, he stayed at the University of Toronto as an instructor, teaching courses to young men returning from the war, including his own older brother. He was a fellow of the Ontario Foundation from 1945 to 1946, then worked as an instructor at the University of Toronto until 1948. During his master's research Stevens became interested in control theory and took courses in the applied mathematics department, where one of his professors recommended that he apply to MIT for doctoral studies.

Doctoral studies

Shortly after Stevens was admitted to MIT, a new professor named Leo Beranek noticed that Stevens had taken acoustics. Beranek contacted Stevens in Toronto to ask if he would be a teaching assistant for Beranek's new acoustics course, and Stevens agreed. Shortly after that, Beranek contacted Stevens again to offer him a research position on a new speech project, which Stevens also accepted. The Radiation Laboratory at MIT (Building 20) was converted, after the war, into the Research Laboratory of Electronics (RLE); among other labs, RLE hosted Beranek's new Acoustics Lab. In November 1949, the office next to Stevens' was given to a visiting doctoral student from Sweden named Gunnar Fant, with whom he formed a friendship and collaboration that would last more than half a century. Stevens focused on the study of vowels during his doctoral research; in 1950 he published a short paper arguing that the autocorrelation could be used to discriminate vowels, while his 1952 doctoral thesis reported perceptual results for vowels synthesized using a set of electronic resonators. Fant convinced Stevens that a transmission-line model of the vocal tract was more flexible than a resonator model, and the two published this work together in 1953. Stevens credited Fant with the association between the Linguistics Department and the Research Laboratory of Electronics at MIT.

Roman Jakobson, at Harvard, had an office at MIT by 1957, while Morris Halle joined the MIT Linguistics Department and moved to RLE in 1951. Stevens' collaborations with Halle began with acoustics, but grew to focus on the way in which acoustics and articulation organize the sound systems of language. Stevens defended his doctoral thesis in 1952; his doctoral committee included his adviser Leo Beranek, as well as J. C. R. Licklider and Walter A. Rosenblith. After receiving his doctorate, Stevens went to work at Bolt, Beranek and Newman (now BBN Technologies) in Harvard Square. In the early 1950s, Beranek decided to retire from the MIT faculty in order to work full-time at BBN. He knew that Stevens loved to teach, so he encouraged Stevens to apply for a position on the MIT faculty. Stevens did so, and joined the faculty in 1954.

Research, teaching and service

Scientific contributions

Stevens is best known for his contributions to the fields of phonology, speech perception, and speech production. Stevens' best-known book, Acoustic Phonetics, is organized according to the distinctive features of Stevens' phonological system.

Contributions to phonology

Stevens is perhaps best known for his proposal of a theory that answers the question: Why are the sounds of the world's languages (their phonemes or segments) so similar to one another? On first learning a foreign language, one is struck by the remarkable differences that can exist between one language's sound system and that of any other. Stevens turned the student's perception on its head: rather than asking why languages are different, he asked, if the sound system of each language is completely arbitrary, why are languages so similar? His answer is the quantal theory of speech. Quantal theory is supported by a theory of language change, developed in collaboration with Samuel Jay Keyser, which postulates the existence of redundant or enhancement features.

Stevens' methodology in the investigation of speech sounds is organized into three steps. The first step is to use physics (mainly tube models) to model the shape of the articulators (e.g. the shapes of the front and back cavities, rounding or non-rounding of the lips). From the articulatory tube models, the resonant frequencies can be calculated; these are the formant frequencies. The second step is mainly experimental: speech data are collected and analyzed for comparison with the theoretical calculations. Tokens of interest are usually recorded in isolation and/or embedded in a controlled carrier phrase, and are usually spoken by several female and male native speakers of the language. The key to data collection is to control for as many factors as possible, so that the acoustic evidence of interest can be investigated with a minimum of artifacts. The last step is to compare the measured results with the theoretical predictions and to account for the differences that occur. Differences can sometimes be explained by the fact that tube models are usually simplified and do not account for losses due to the softness of the vocal tract walls (though resistors can be added to the theoretical model). The subglottal system might also affect the vocal tract when the glottal opening is large (see research on the effects of subglottal resonances on speech). Theoretical models give general predictions about what one can expect to find in real speech, and evidence from real speech can in turn help refine the original model and give better insight into the production of speech sounds.

Quantal theory aims to elegantly describe (using physics) and organize all the acoustic features of all possible sounds into a matrix (see chapter five of Acoustic Phonetics). The ultimate constraint on all speech sounds is the physical articulatory system itself, which supports the claim that there can only be a finite set of sounds among languages. The set of speech sounds is finite because, while the movement of the articulators is continuous, only certain configurations tend to be articulatorily and/or acoustically stable. These configurations give rise to fixed formant frequencies, which form sounds that are relatively universal across languages (i.e. vowels and consonants). Each sound can thus be described by a handful of defining features (usually binary). For example, lip rounding (either on or off) is a feature. Tongue height (either high or low) is another feature. In addition to these defining features, which serve as the essential description of the sounds, there are also enhancing features, which help to make the sounds more recognizable. For each of these features, one can apply Stevens' methodology: first use a tube model to model the articulators and predict the resonant frequencies, then collect data to examine the acoustic properties of that feature, and finally reconcile the data with the theoretical model and summarize the acoustic properties of that feature.

For an introduction to speech science, one can first read the book "The Speech Chain" by P. Denes and E. Pinson, which gives a broad overview of the production and transmission of speech. It introduces spectrograms and formant frequencies, which are the main acoustic description of sound segments.
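The idea of describing sounds by a small matrix of binary defining features can be sketched in a few lines of Python. This is only an illustration: the three vowels, the two feature names (`high`, `round`), and their values are textbook-style assumptions chosen to match the lip-rounding and tongue-height examples above, not Stevens' full feature system.

```python
# Illustrative binary feature matrix. The vowel set and the feature
# values are assumptions for this sketch, not Stevens' full system.
FEATURES = ("high", "round")

VOWELS = {
    "i": {"high": True, "round": False},
    "u": {"high": True, "round": True},
    "a": {"high": False, "round": False},
}


def feature_row(vowel):
    """Return one row of the feature matrix as a tuple of booleans."""
    return tuple(VOWELS[vowel][name] for name in FEATURES)


def distinguishing_features(first, second):
    """List the features whose values differ between two vowels."""
    return [name for name in FEATURES
            if VOWELS[first][name] != VOWELS[second][name]]


if __name__ == "__main__":
    for vowel in VOWELS:
        print(vowel, feature_row(vowel))
    print(distinguishing_features("i", "u"))
```

With these assumed values, /i/ and /u/ differ only in rounding, while /u/ and /a/ differ in both features.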

The glottis

As the vocal folds vibrate, puffs of air are pushed through the glottis and filtered by the vocal tract, producing sound. This sound source is modeled as a current source in a circuit model of speech production. Changes in the vocal tract change the sound that is produced. The vocal folds of females tend to vibrate at a higher frequency than those of males, giving female voices a higher pitch than male voices. Research (Hanson, H.M. 1997) has shown a difference in how females and males vibrate their vocal folds: the glottis tends to be more spread in females, which gives female voices a breathier quality than male voices.

The subglottal system

The subglottal system is the part of the airway below the glottis. It includes the trachea, the bronchi, and the lungs. It is essentially a fixed system, so its acoustic properties do not change for a given speaker. Research results have shown that during the open phase of the glottal cycle (when the glottis is open), acoustic coupling to the subglottal system is introduced, manifesting as pole/zero pairs in the frequency domain. These pole/zero pairs are hypothesized to act as prohibited or unstable regions in the spectrum, serving as natural boundaries for vowel features such as +front or +back. For adult males, the resonant frequencies of the subglottal system have been measured (using invasive methods) to be 600, 1550, and 2200 Hz (Acoustic Phonetics, p. 197; Ishizaka et al.; Cranen & Boves). The subglottal resonant frequencies of females are slightly higher because of the smaller dimensions of the female subglottal system. One non-invasive way of measuring these resonances is to place an accelerometer above the sternal notch (Henke) and record the acceleration of the skin during phonation. The recorded vibration captures the resonant frequencies of the subglottal system.

The vocal tract

The vocal tract is the passageway above the glottis, extending all the way to the opening of the lips. A two-tube model is usually used to model the vocal tract, with one tube capturing the dimensions (cross-sectional area and length) of the back cavity and the other modeling the front cavity. The resonant frequencies calculated from the tube model are the formant frequencies. To produce the schwa vowel /ə/, the vocal tract is relatively open all the way from the glottis to the mouth, so the tube model can be thought of as a relatively uniform open tube, which makes the resonant frequencies (or formants) evenly spaced. Radiation at the mouth lowers these resonant frequencies by about five percent (Acoustic Phonetics, p. 139). Female vocal tracts (average of 14.1 cm) are on average shorter than male vocal tracts (average of 17.7 cm), giving them higher formant frequencies. Since the vocal tract walls are soft, energy is lost in the vocal tract, which increases the bandwidth of the formants.
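The evenly spaced formants of a uniform tube can be checked with a short calculation. The sketch below assumes the standard quarter-wavelength formula for a tube closed at the glottis and open at the lips, F_n = (2n − 1)c / 4L, and a speed of sound of 35,400 cm/s; both the formula and that constant are assumptions brought in for illustration, while the two tube lengths are the averages quoted above. The optional `radiation_lowering` parameter is a hypothetical knob for the roughly five percent lowering due to radiation at the mouth.

```python
# Assumed speed of sound in warm, moist air, in cm/s (not from the text).
SPEED_OF_SOUND_CM_PER_S = 35400.0


def uniform_tube_formants(length_cm, count=3, radiation_lowering=0.0):
    """Resonances (Hz) of a uniform tube closed at one end, open at the other.

    Uses the quarter-wavelength formula F_n = (2n - 1) * c / (4 * L).
    radiation_lowering is a fractional reduction (e.g. 0.05 for 5 %).
    """
    formants = []
    for n in range(1, count + 1):
        frequency = (2 * n - 1) * SPEED_OF_SOUND_CM_PER_S / (4.0 * length_cm)
        formants.append(frequency * (1.0 - radiation_lowering))
    return formants


male = uniform_tube_formants(17.7)    # average male length quoted above
female = uniform_tube_formants(14.1)  # average female length quoted above

if __name__ == "__main__":
    print([round(f) for f in male])
    print([round(f) for f in female])
    print([round(f) for f in uniform_tube_formants(17.7, radiation_lowering=0.05)])
```

Under these assumptions the 17.7 cm tube resonates near 500, 1500, and 2500 Hz, and the shorter 14.1 cm tube gives proportionally higher values, consistent with the statement that shorter vocal tracts have higher formants.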

The nasal cavity

When the velopharyngeal port opens during the production of certain sounds, such as /n/ and /m/, acoustic coupling to the nasal cavity is introduced, which gives the output a nasal quality.

Contributions to speech perception

The quantal theory suggests that the phonological inventory of a language is defined primarily by the acoustic characteristics of each segment, with boundaries specified by the acoustic-articulatory mapping. The implication is that phonological segments must have some type of acoustic invariance. Blumstein and Stevens demonstrated what appeared to be an invariant relationship between the acoustic spectrum and the perceived sound: by adding energy to the burst spectrum of "pa" at a particular frequency, it is possible to turn it into "ta" or "ka" respectively, depending on the frequency. Presence of the extra energy causes perception of the lingual consonant; its absence causes perception of the labial. Stevens' recent work has re-structured the theory of acoustic invariance into a shallow hierarchical perceptual model, the model of acoustic landmarks and distinctive features.

Contributions to speech production

While on sabbatical at KTH in Sweden in 1962, Stevens volunteered as a participant in cineradiography experiments being conducted by Sven Öhman. Stevens' cineradiographic films are among the most widely distributed; copies exist on laserdisc, and some are available online. After returning to MIT, Stevens agreed to supervise the research of a dentistry student named Joseph S. Perkell. Perkell's knowledge of oral anatomy permitted him to trace Stevens' X-ray films onto paper, and to publish the results. Other contributions to the study of speech production include a model that predicts the spectral shape of turbulent speech excitation (depending on the dimensions of the turbulent jet), and work on the vocal fold configurations that lead to different modes of phonation.

In principle, the spectral properties (formants, formant bandwidths, and other glottal characteristics) of all possible phonemes in all languages can be modeled and predicted using physics-based resonator models. Basic tube resonators can be used to give a general prediction of the formants of vowels. The basic model is refined by adding resistors and/or capacitors to represent energy losses due to the vocal tract walls. Acoustic coupling to the subglottal system can also be modeled by adding tubes to the model of the original vocal tract, introducing pole/zero pairs in the spectrum that represent the effects of subglottal coupling. (The locations of these pole/zero pairs are the resonant frequencies of the subglottal system.) Glottal characteristics such as vocal pitch (F0), open quotient (H1-H2), and degree of breathiness (H1-A3) can also be modeled and measured from the spectrum (Hanson & Stevens).
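As a rough illustration of measuring a glottal characteristic from the spectrum, the sketch below computes H1-H2, the difference in decibels between the amplitudes of the first and second harmonics. It assumes the fundamental frequency is already known and uses a synthetic two-harmonic signal whose amplitudes (1.0 and 0.5), sample rate, and duration are made up for the example. The helper names are hypothetical, and the sketch does not attempt the corrections for formant influence that real measurements may need.

```python
import math


def harmonic_amplitude(samples, sample_rate, frequency):
    """Estimate the amplitude of one sinusoidal component.

    Projects the signal onto a cosine and a sine at the given frequency;
    this is exact when the window holds a whole number of periods.
    """
    count = len(samples)
    step = 2.0 * math.pi * frequency / sample_rate
    real = sum(value * math.cos(step * i) for i, value in enumerate(samples))
    imag = sum(value * math.sin(step * i) for i, value in enumerate(samples))
    return 2.0 * math.hypot(real, imag) / count


def h1_minus_h2(samples, sample_rate, f0):
    """Return H1-H2 in dB: first-harmonic level minus second-harmonic level."""
    h1 = harmonic_amplitude(samples, sample_rate, f0)
    h2 = harmonic_amplitude(samples, sample_rate, 2.0 * f0)
    return 20.0 * math.log10(h1 / h2)


# Synthetic test signal (assumed values): 0.1 s at 8 kHz, F0 = 100 Hz,
# second harmonic at half the amplitude of the first.
sample_rate = 8000
f0 = 100.0
samples = [
    math.sin(2.0 * math.pi * f0 * i / sample_rate)
    + 0.5 * math.sin(2.0 * math.pi * 2.0 * f0 * i / sample_rate)
    for i in range(800)
]
difference_db = h1_minus_h2(samples, sample_rate, f0)

if __name__ == "__main__":
    print(round(difference_db, 2))
```

For this synthetic signal the second harmonic has half the amplitude of the first, so the result is about 6 dB; recorded speech would need an F0 estimate and a windowed spectrum instead of an exact-period window.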

Stevens as a mentor

Stevens joined MIT as an assistant professor in 1954. He became an associate professor in 1957 and a full professor in 1963, and was appointed the Clarence J. LeBel Professor in 1977. One of his long-time collaborators, Dennis Klatt (who wrote DECtalk while working in Stevens' lab), said that "As a leader, Ken is known for his devotion to students and his miraculous ability to run a busy laboratory while appearing to manage by a principle of benevolent anarchy." The first doctoral thesis Stevens signed at MIT was that of his fellow student, James L. Flanagan, in 1955. Flanagan started graduate school at MIT in the same year as Stevens, but without a prior master's degree; he earned his M.S. in 1950 under Beranek's supervision, then finished his doctoral thesis under Stevens' supervision in 1955. Stevens estimated in 2001 that he had supervised approximately forty Ph.D. candidates. On the occasion of his receipt of the Gold Medal of the Acoustical Society of America, in 1995, colleagues wrote of Stevens' Speech Group that "during its existence of almost four decades" it "has been outstanding in the support that it has provided to women researchers, many of whom have gone on to populate the upper echelons of research labs throughout the world." Stevens' laboratory has been referred to by colleagues as a "national treasure".

Professional service

Stevens was active in the Acoustical Society of America (ASA) from his time as a graduate student. He was a member of the executive council from 1963 to 1966, Vice President from 1971 to 1972, and President of the Society from 1976 to 1977. He was a Fellow of the ASA. In 1983 he received its Silver Medal in Speech Communication, and in 1995 he received the Gold Medal from the society. Stevens was also active in the IEEE, where he held the rank of IEEE Life Fellow. In 2004, Ken Stevens and Gunnar Fant were the joint first winners of the IEEE James L. Flanagan Speech and Audio Processing Award. Stevens was a Fellow of the American Academy of Arts and Sciences, a member of the National Academy of Engineering, a member of the National Academy of Sciences, and a 1999 recipient of the United States National Medal of Science.

External links

Stevens biography at RLE/MIT