A vocoder (a portmanteau of voice and encoder) is a category of speech coding that analyzes and synthesizes the human voice signal for audio data compression, multiplexing, voice encryption or voice transformation.
The vocoder was invented in 1938 by Homer Dudley at Bell Labs as a means of synthesizing human speech. This work was developed into the channel vocoder, which was used as a voice codec for telecommunications, coding speech to conserve bandwidth in transmission.
By encrypting the control signals, voice transmission can be secured against interception. Its primary use in this fashion is for secure radio communication. The advantage of this method of encryption is that none of the original signal is sent, only the envelopes of the bandpass filters. The receiving unit needs to be set up in the same filter configuration to re-synthesize a version of the original signal spectrum.
The vocoder has also been used extensively as an electronic musical instrument. The decoder portion of the vocoder, called a voder, can be used independently for speech synthesis.
Theory
The human voice consists of sounds generated by the opening and closing of the glottis by the vocal cords, which produces a periodic waveform with many harmonics. This basic sound is then filtered by the nose and throat (a complicated resonant piping system) to produce differences in harmonic content (formants) in a controlled way, creating the wide variety of sounds used in speech. There is another set of sounds, known as the unvoiced and plosive sounds, which are created or modified by the mouth in different fashions.
The vocoder examines speech by measuring how its spectral characteristics change over time. This results in a series of signals representing these frequencies at any particular time as the user speaks. In simple terms, the signal is split into a number of frequency bands (the larger this number, the more accurate the analysis) and the level of signal present at each frequency band gives the instantaneous representation of the spectral energy content. To recreate speech, the vocoder simply reverses the process, processing a broadband noise source by passing it through a stage that filters the frequency content based on the originally recorded series of numbers.
Specifically, in the encoder, the input is passed through a multiband filter, then the output of each band is measured using an envelope follower, and the signals from the envelope followers are transmitted to the decoder. The decoder applies these as control signals to corresponding amplifiers of the output filter channels.
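The measurement stage of the encoder can be illustrated with a minimal sketch. It assumes a digital envelope follower built from full-wave rectification followed by a one-pole low-pass filter; the function name `envelope_follower`, the 50 Hz smoothing cutoff and the 8 kHz sample rate are illustrative choices, not values taken from any particular vocoder.

```python
import math

def envelope_follower(samples, sample_rate, cutoff_hz=50.0):
    """Full-wave rectify the input, then smooth it with a one-pole low-pass."""
    # Smoothing coefficient for the one-pole filter (assumed design choice).
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    level = 0.0
    envelope = []
    for x in samples:
        level += alpha * (abs(x) - level)  # move toward the rectified input
        envelope.append(level)
    return envelope

sample_rate = 8000
# Stand-in for one band's output: a 1 kHz tone whose amplitude
# steps from 0.2 to 1.0 halfway through.
tone = [(0.2 if n < 4000 else 1.0) * math.sin(2 * math.pi * 1000 * n / sample_rate)
        for n in range(8000)]

env = envelope_follower(tone, sample_rate)
quiet, loud = env[3999], env[7999]
print(f"quiet level {quiet:.3f}, loud level {loud:.3f}")
```

The slowly varying values in `env` play the role of the control signal that would be sent to the decoder for this band; the tone itself is not transmitted.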
Information about the instantaneous frequency of the original voice signal (as distinct from its spectral characteristic) is discarded; it was not important to preserve this for the vocoder's original use as an encryption aid. It is this dehumanizing aspect of the vocoding process that has made it useful in creating special voice effects in popular music and audio entertainment.
Instead of a point-by-point recreation of the waveform, the vocoder process sends only the parameters of the vocal model over the communication link. Since the parameters change slowly compared to the original speech waveform, the bandwidth required to transmit speech can be reduced. This allows more speech channels to utilize a given communication channel, such as a radio channel or a submarine cable.
Analog vocoders typically analyze an incoming signal by splitting the signal into multiple tuned frequency bands or ranges. To reconstruct the signal, a carrier signal is sent through a series of these tuned bandpass filters. In the example of a typical robot voice the carrier is noise or a sawtooth waveform. There are usually between 8 and 20 bands.
The amplitude of the modulator for each of the individual analysis bands generates a voltage that is used to control amplifiers for each of the corresponding carrier bands. The result is that frequency components of the modulating signal are mapped onto the carrier signal as discrete amplitude changes in each of the frequency bands.
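The mapping of modulator band levels onto carrier bands can be sketched in a few lines. This is a digital approximation under stated assumptions, not a model of any specific analog unit: the band-pass filters are second-order biquads with a hypothetical Q of 4, only four bands are used to keep the example short, the modulator is a plain 500 Hz tone standing in for a voice, and the carrier is white noise.

```python
import math
import random

def bandpass(samples, center_hz, sample_rate, q=4.0):
    """Second-order band-pass filter (biquad) with unity gain at center_hz."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def envelope(samples, sample_rate, cutoff_hz=50.0):
    """Rectify and smooth: the control signal for one band."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    level, out = 0.0, []
    for x in samples:
        level += alpha * (abs(x) - level)
        out.append(level)
    return out

def channel_vocoder(modulator, carrier, sample_rate, centers_hz):
    """Scale each carrier band by the envelope of the matching modulator band."""
    output = [0.0] * len(modulator)
    for center in centers_hz:
        control = envelope(bandpass(modulator, center, sample_rate), sample_rate)
        carrier_band = bandpass(carrier, center, sample_rate)
        for n, (gain, c) in enumerate(zip(control, carrier_band)):
            output[n] += gain * c
    return output

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

fs = 8000
centers = [250, 500, 1000, 2000]          # illustrative band centers in Hz
rng = random.Random(0)
voice = [math.sin(2 * math.pi * 500 * n / fs) for n in range(4000)]  # stand-in modulator
noise = [rng.uniform(-1.0, 1.0) for _ in range(4000)]                 # noise carrier
robot = channel_vocoder(voice, noise, fs, centers)
```

Because the stand-in modulator has nearly all of its energy near 500 Hz, the output is noise concentrated around 500 Hz: the spectral shape of the modulator has been imposed on the carrier, while the modulator waveform itself never reaches the output.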
Often there is an unvoiced band or sibilance channel. This is for frequencies that are outside the analysis bands for typical speech but are still important in speech. Examples are words that start with the letters ''s'', ''f'', ''ch'' or any other sibilant sound. Using this band produces recognizable speech, although somewhat mechanical sounding. Vocoders often include a second system for generating unvoiced sounds, using a noise generator instead of the fundamental frequency. This is mixed with the carrier output to increase clarity.
In the channel vocoder algorithm, only the amplitude component of the analytic signal is considered and the phase component is simply ignored, which tends to result in an unclear voice; for methods of rectifying this, see phase vocoder.
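The two components of an analytic signal can be made concrete with a small sketch. It assumes a naive DFT-based construction (zeroing the negative-frequency bins), which is only practical for short signals; the function name `analytic_signal` and the test tone are illustrative.

```python
import cmath
import math

def analytic_signal(x):
    """Build the analytic signal by removing negative-frequency DFT bins."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue                 # keep DC and Nyquist bins unchanged
        elif k < (n + 1) // 2:
            spectrum[k] *= 2         # double the positive frequencies
        else:
            spectrum[k] = 0          # discard the negative frequencies
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

# A cosine of amplitude 0.5 with 4 cycles in 64 samples.
signal = [0.5 * math.cos(2 * math.pi * 4 * t / 64) for t in range(64)]
z = analytic_signal(signal)

amplitude = [abs(v) for v in z]          # the part a channel vocoder keeps
phase = [cmath.phase(v) for v in z]      # the part a channel vocoder discards
```

For this tone the amplitude is constant at 0.5 while the phase advances steadily; a channel vocoder transmits only a smoothed version of the former.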
History
The development of a vocoder was started in 1928 by Bell Labs engineer Homer Dudley, who was granted patents for it on March 21, 1939 (filed October 30, 1935), and November 16, 1937.
To demonstrate the speech synthesis ability of its decoder section, the voder (voice operating demonstrator) was introduced to the public at the AT&T building at the 1939–1940 New York World's Fair. The voder consisted of an electronic oscillator as a sound source of pitched tone and a noise generator for hiss, 10-band resonator filters with variable-gain amplifiers as a vocal tract, and manual controllers including a set of pressure-sensitive keys for filter control and a foot pedal for pitch control of tone. The filters controlled by the keys convert the tone and the hiss into vowels, consonants, and inflections. This was a complex machine to operate, but a skilled operator could produce recognizable speech.
Dudley's vocoder was used in the SIGSALY system, which was built by Bell Labs engineers in 1943. SIGSALY was used for encrypted voice communications during World War II. The KO-6 voice coder was released in 1949 in limited quantities; it was a close approximation to the SIGSALY at 1200 bit/s. In 1953, the KY-9 THESEUS 1650 bit/s voice coder used solid-state logic to reduce the weight relative to SIGSALY, and in 1961 the HY-2 voice coder, a 16-channel 2400 bit/s system, was the last implementation of a channel vocoder in a secure speech system.
Later work in this field has since used digital speech coding. The most widely used speech coding technique is linear predictive coding (LPC). Another speech coding technique, adaptive differential pulse-code modulation (ADPCM), was developed by P. Cummiskey, Nikil S. Jayant and James L. Flanagan at Bell Labs in 1973.
Applications
* Terminal equipment for systems based on digital mobile radio (DMR)
* Digital voice scrambling and encryption
* Cochlear implants: noise and tone vocoding is used to simulate the effects of cochlear implants
* Musical and other artistic effects
Modern implementations
Even with the need to record several frequencies, and additional unvoiced sounds, the compression of vocoder systems is impressive. Standard speech-recording systems capture frequencies from about 500 Hz to 3,400 Hz, where most of the frequencies used in speech lie, typically using a sampling rate of 8 kHz (slightly greater than the Nyquist rate). The sampling resolution is typically 12 or more bits per sample (16 is standard), for a final data rate in the range of 96–128 kbit/s, but a good vocoder can provide a reasonably good simulation of voice with as little as 2.4 kbit/s of data.
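The figures in this paragraph follow from simple arithmetic, worked through in the short sketch below. All numbers come from the text above; only the variable names are invented for the example.

```python
sample_rate_hz = 8000
highest_speech_hz = 3400

# The Nyquist rate is twice the highest frequency to be captured.
nyquist_rate_hz = 2 * highest_speech_hz   # 6800 Hz, below the 8 kHz sampling rate

vocoder_rate = 2400                        # bit/s, the low-rate vocoder figure
pcm_rates = [sample_rate_hz * bits for bits in (12, 16)]    # bit/s
ratios = [rate / vocoder_rate for rate in pcm_rates]

for bits, rate, ratio in zip((12, 16), pcm_rates, ratios):
    print(f"{bits}-bit samples: {rate / 1000:.0f} kbit/s, "
          f"about {ratio:.0f}x the vocoder rate")
```

At 12 and 16 bits per sample the uncompressed stream is 96 and 128 kbit/s, so a 2.4 kbit/s vocoder corresponds to a reduction of roughly 40 to 53 times.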
''Toll quality'' voice coders, such as ITU G.729, are used in many telephone networks. G.729 in particular has a final data rate of 8 kbit/s with superb voice quality. G.723 achieves slightly worse quality at data rates of 5.3 kbit/s and 6.4 kbit/s. Many voice vocoder systems use lower data rates, but below 5 kbit/s voice quality begins to drop rapidly.
Several vocoder systems are used in NSA encryption systems:
* LPC-10, FIPS Pub 137, 2400 bit/s, which uses linear predictive coding
* Code-excited linear prediction (CELP), 2400 and 4800 bit/s, Federal Standard 1016, used in STU-III
* Continuously variable slope delta modulation (CVSD), 16 kbit/s, used in wide band encryptors such as the KY-57
* Mixed-excitation linear prediction (MELP), MIL STD 3005, 2400 bit/s, used in the Future Narrowband Digital Terminal (FNBDT), NSA's 21st century secure telephone
* Adaptive differential pulse-code modulation (ADPCM), former ITU-T G.721, 32 kbit/s, used in the STE secure telephone
(ADPCM is not a proper vocoder but rather a waveform codec. ITU has gathered G.721 along with some other ADPCM codecs into G.726.)
Vocoders are also currently used in psychophysics, linguistics, computational neuroscience and cochlear implant research.
Modern vocoders that are used in communication equipment and in voice storage devices today are based on the following algorithms:
* Algebraic code-excited linear prediction (ACELP 4.7 kbit/s – 24 kbit/s)
* Mixed-excitation linear prediction (MELPe 2400, 1200 and 600 bit/s)
* Multi-band excitation (AMBE 2000 bit/s – 9600 bit/s)
* Sinusoidal-Pulsed Representation (SPR 600 bit/s – 4800 bit/s)
* Robust Advanced Low-complexity Waveform Interpolation (RALCWI 2050 bit/s, 2400 bit/s and 2750 bit/s)
* Tri-Wave Excited Linear Prediction (TWELP 600 bit/s – 9600 bit/s)
* Noise Robust Vocoder (NRV 300 bit/s and 800 bit/s)
Linear prediction-based
Since the late 1970s, most non-musical vocoders have been implemented using linear prediction, whereby the target signal's spectral envelope (formant) is estimated by an all-pole IIR filter. In linear prediction coding, the all-pole filter replaces the bandpass filter bank of its predecessor: its inverse is applied at the encoder to ''whiten'' the signal (i.e., flatten the spectrum), and the filter itself is applied at the decoder to re-apply the spectral shape of the target speech signal.
One advantage of this type of filtering is that the location of the linear predictor's spectral peaks is entirely determined by the target signal, and can be as precise as allowed by the time period to be filtered. This is in contrast with vocoders realized using fixed-width filter banks, where spectral peaks can generally only be determined to be within the scope of a given frequency band. LP filtering also has disadvantages in that signals with a large number of constituent frequencies may exceed the number of frequencies that can be represented by the linear prediction filter. This restriction is the primary reason that LP coding is almost always used in tandem with other methods in high-compression voice coders.
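The whiten-then-reshape idea can be tried on a toy signal. The sketch below assumes the autocorrelation method with the Levinson-Durbin recursion, one common way to fit the predictor; the source does not say which estimation method a given coder uses. The second-order test signal and all function names are hypothetical.

```python
import random

def autocorrelation(x, max_lag):
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return the coefficients of A(z) = 1 + a1 z^-1 + ... and the residual energy."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                       # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err

def whiten(x, a):
    """Encoder side: apply the inverse (all-zero) filter A(z)."""
    return [sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
            for n in range(len(x))]

def shape(residual, a):
    """Decoder side: apply the all-pole filter 1/A(z)."""
    y = []
    for n, e in enumerate(residual):
        y.append(e - sum(a[j] * y[n - j] for j in range(1, len(a)) if n - j >= 0))
    return y

# Toy "speech": white noise coloured by a known two-pole resonance.
rng = random.Random(1)
x = []
for n in range(4000):
    prev1 = x[-1] if n >= 1 else 0.0
    prev2 = x[-2] if n >= 2 else 0.0
    x.append(1.3 * prev1 - 0.8 * prev2 + rng.gauss(0.0, 1.0))

a, err = levinson_durbin(autocorrelation(x, 2), 2)
residual = whiten(x, a)       # flattened spectrum, much lower energy than x
rebuilt = shape(residual, a)  # spectral shape re-applied at the decoder
```

The fitted coefficients land close to the resonance used to generate the signal, and passing the unquantized residual back through the all-pole filter recovers the input. A real coder would replace the residual with a compact excitation model rather than send it in full.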
Waveform-interpolative
The waveform-interpolative (WI) vocoder was developed at AT&T Bell Laboratories around 1995 by W.B. Kleijn, and subsequently a low-complexity version was developed by AT&T for the DoD secure vocoder competition. Notable enhancements to the WI coder were made at the University of California, Santa Barbara. AT&T holds the core patents related to WI, and other institutes hold additional patents.
Artistic effects
Uses in music
For musical applications, a source of musical sounds is used as the carrier, instead of extracting the fundamental frequency. For instance, one could use the sound of a synthesizer as the input to the filter bank, a technique that became popular in the 1970s.
History
Werner Meyer-Eppler, a German scientist with a special interest in electronic voice synthesis, published a thesis in 1948 on electronic music and speech synthesis from the viewpoint of sound synthesis. Later he was instrumental in the founding of the Studio for Electronic Music of WDR in Cologne, in 1951.
One of the first attempts to use a vocoder in creating music was the "Siemens Synthesizer" at the Siemens Studio for Electronic Music, developed between 1956 and 1959.
Details of the Siemens Electronic Music Studio are exhibited at the Deutsches Museum.
In 1968, Robert Moog developed one of the first solid-state musical vocoders for the electronic music studio of the University at Buffalo.
In 1968, Bruce Haack built a prototype vocoder, named ''Farad'' after Michael Faraday. It was first featured on "The Electronic Record For Children" released in 1969 and then on his rock album ''The Electric Lucifer'' released in 1970.
In 1970, Wendy Carlos and Robert Moog built another musical vocoder, a ten-band device inspired by the vocoder designs of Homer Dudley. It was originally called a spectrum encoder-decoder and later referred to simply as a vocoder. The carrier signal came from a Moog modular synthesizer, and the modulator from a microphone input. The output of the ten-band vocoder was fairly intelligible but relied on specially articulated speech. Some vocoders use a high-pass filter to let some sibilance through from the microphone; this ruins the device for its original speech-coding application, but it makes the talking synthesizer effect much more intelligible.
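The sibilance path mentioned above can be sketched as a high-pass filter on the microphone signal whose output is mixed into the vocoder output. The first-order filter, the 4 kHz cutoff and the mix level below are assumptions for illustration; the source does not say which filter design or settings actual instruments use.

```python
import math

def high_pass(signal, cutoff_hz, sample_rate):
    """First-order high-pass filter (a discretised RC circuit)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out, prev_x, prev_y = [], 0.0, 0.0
    for x in signal:
        prev_y = alpha * (prev_y + x - prev_x)
        prev_x = x
        out.append(prev_y)
    return out

def add_sibilance(vocoded, microphone, cutoff_hz, sample_rate, mix=0.3):
    """Mix the high-passed microphone signal into the vocoder output."""
    hiss = high_pass(microphone, cutoff_hz, sample_rate)
    return [v + mix * h for v, h in zip(vocoded, hiss)]

# Hypothetical check with a silent vocoder output: a low hum from the
# microphone is largely blocked, while a high, hiss-like tone gets through.
sample_rate = 16000
n = 400
silence = [0.0] * n
hum = [math.sin(2.0 * math.pi * 100.0 * i / sample_rate) for i in range(n)]
hiss = [math.sin(2.0 * math.pi * 7000.0 * i / sample_rate) for i in range(n)]

hum_out = add_sibilance(silence, hum, 4000.0, sample_rate, mix=1.0)
hiss_out = add_sibilance(silence, hiss, 4000.0, sample_rate, mix=1.0)
print("peak of passed hum :", max(abs(v) for v in hum_out[n // 2:]))
print("peak of passed hiss:", max(abs(v) for v in hiss_out[n // 2:]))
```

Only the high-frequency part of the unprocessed voice reaches the output, which is why the trick helps intelligibility while defeating the bandwidth reduction that speech coding relies on.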
In 1972, Isao Tomita's first electronic music album ''Electric Samurai: Switched on Rock'' was an early attempt at applying speech synthesis techniques to electronic rock and pop music. The album featured electronic renditions of contemporary rock and pop songs, while utilizing synthesized voices in place of human voices. In 1974, he utilized synthesized voices in his popular classical music album ''Snowflakes are Dancing'', which became a worldwide success and helped to popularize electronic music.
In 1973, the British band Emerson, Lake and Palmer used a vocoder on their album ''Brain Salad Surgery'', for the song "Karn Evil 9: 3rd Impression".
The 1975 song "The Raven" from the album ''Tales of Mystery and Imagination'' by The Alan Parsons Project features Alan Parsons performing vocals through an EMI vocoder. According to the album's liner notes, "The Raven" was the first rock song to feature a digital vocoder.
Pink Floyd also used a vocoder on three of their albums, first on their 1977 ''Animals'' for the songs "Sheep" and "Pigs (Three Different Ones)", then on ''A Momentary Lapse of Reason'' on "A New Machine Part 1" and "A New Machine Part 2" (1987), and finally on 1994's ''The Division Bell'', on "Keep Talking".
The Electric Light Orchestra was among the first to use the vocoder in a commercial context, with their 1977 album ''Out of the Blue''. The band used it extensively on the album, including on the hits "Sweet Talkin' Woman" and "Mr. Blue Sky". On following albums, the band made sporadic use of it, notably on their hits "The Diary of Horace Wimp" and "Confusion" from their 1979 album ''Discovery'', the tracks "Prologue", "Yours Truly, 2095", and "Epilogue" on their 1981 album ''Time'', and "Calling America" from their 1986 album ''Balance of Power''.
In the late 1970s, French duo Space Art used a vocoder during the recording of their second album, ''Trip in the Centre Head''.
Phil Collins used a vocoder to provide a vocal effect for his 1981 international hit single "In the Air Tonight".
Vocoders have appeared on pop recordings from time to time, most often simply as a special effect rather than a featured aspect of the work. However, many experimental electronic artists of the new-age music genre have used the vocoder in a more comprehensive manner in specific works, such as Jean-Michel Jarre (on ''Zoolook'', 1984) and Mike Oldfield (on ''QE2'', 1980 and ''Five Miles Out'', 1982).
The vocoder module and its use by Oldfield can be clearly seen on his ''Live At Montreux 1981'' DVD (track "Sheba").
There are also some artists who have made vocoders an essential part of their music, overall or during an extended phase. Examples include the German synthpop group Kraftwerk, the Japanese new wave group Polysics, Stevie Wonder ("Send One Your Love", "A Seed's a Star") and jazz/fusion keyboardist Herbie Hancock during his late 1970s period. In 1982 Neil Young used a Sennheiser Vocoder VSM201 on six of the nine tracks on ''Trans''.
The chorus and bridge of Michael Jackson's "P.Y.T. (Pretty Young Thing)" feature a vocoder ("Pretty young thing/You make me sing"), courtesy of session musician Michael Boddicker.
Coldplay have used a vocoder in some of their songs. For example, in "Major Minus" and "Hurts Like Heaven", both from the album ''Mylo Xyloto'' (2011), Chris Martin's vocals are mostly vocoder-processed. "Midnight", from ''Ghost Stories'' (2014), also features Martin singing through a vocoder. The hidden track "X Marks the Spot" from ''A Head Full of Dreams'' was also recorded through a vocoder.
Noisecore band Atari Teenage Riot have used vocoders in a variety of their songs and live performances, such as ''Live at the Brixton Academy'' (2002), alongside other digital audio technology both old and new.
The Red Hot Chili Peppers song "By the Way" uses a vocoder effect on Anthony Kiedis' vocals.
Among the most consistent users of the vocoder in emulating the human voice are Daft Punk, who have used this instrument from their first album ''Homework'' (1997) to their latest work ''Random Access Memories'' (2013) and consider the convergence of technological and human voice "the identity of their musical project". For instance, the lyrics of "Around the World" (1997) are entirely vocoder-processed, "Get Lucky" (2013) features a mix of natural and processed human voices, and "Instant Crush" (2013) features Julian Casablancas singing into a vocoder.
Producer Zedd, American country singer Maren Morris and American musical duo Grey made a song titled "The Middle" which featured a vocoder and reached the top ten of the charts in 2018.
Voice effects in other arts
Robot voices became a recurring element in popular music during the 20th century. Apart from vocoders, several other methods of producing variations on this effect include the Sonovox, talk box, Auto-Tune (whose effect is also known as the ''T-Pain effect''), linear prediction vocoders, speech synthesis, ring modulation and the comb filter.
An early example of computer-based speech and song synthesis was produced by John Larry Kelly, Jr. and Louis Gerstman at Bell Labs, using an IBM 704 computer. Their demo song "Daisy Bell", with musical accompaniment by Max Mathews, impressed Arthur C. Clarke, who later used it in the climactic scene of the screenplay for his novel ''2001: A Space Odyssey''.
Vocoders are used in television production, filmmaking and games, usually for robots or talking computers. The robot voices of the Cylons in ''Battlestar Galactica'' were created with an EMS Vocoder 2000.
The
1980 version of the ''
Doctor Who
''Doctor Who'' is a British science fiction television series broadcast by the BBC since 1963. The series depicts the adventures of a Time Lord called the Doctor, an extraterrestrial being who appears to be human. The Doctor explores the u ...
'' theme, as arranged and recorded by
Peter Howell, has a section of the main melody generated by a Roland SVC-350 vocoder. A similar
Roland VP-330
The Roland VP-330 is a paraphonic ten band vocoder and string machine manufactured by Roland Corporation from 1979 to 1980. While there are several string machines and vocoders, a single device combining the two is rare, despite the advantage ...
vocoder was used to create the voice of
Soundwave, a character from the
Transformers
''Transformers'' is a media franchise produced by American toy company Hasbro and Japanese toy company Takara Tomy. It primarily follows the Autobots and the Decepticons, two alien robot factions at war that can transform into other forms, suc ...
series.
See also
* Audio timescale-pitch modification
* Auto-Tune
* Homer Dudley
* List of vocoders
* Phase vocoder
* Silent speech interface
* Talk box
* Werner Meyer-Eppler
External links
* Description, photographs, and diagram for the vocoder at 120years.net
* Description of a modern Vocoder.
* Object of Interest: The Vocoder, The New Yorker Magazine mini documentary