Audio signal processing is a subfield of signal processing that is concerned with the electronic manipulation of audio signals. Audio signals are electronic representations of sound waves—longitudinal waves which travel through air, consisting of compressions and rarefactions. The energy contained in audio signals or sound power level is typically measured in decibels. As audio signals may be represented in either digital or analog format, processing may occur in either domain. Analog processors operate directly on the electrical signal, while digital processors operate mathematically on its digital representation.
History
The motivation for audio signal processing began at the beginning of the 20th century with inventions like the telephone, phonograph, and radio that allowed for the transmission and storage of audio signals. Audio processing was necessary for early radio broadcasting, as there were many problems with studio-to-transmitter links. The theory of signal processing and its application to audio was largely developed at Bell Labs in the mid 20th century.
Claude Shannon and Harry Nyquist's early work on communication theory, sampling theory and pulse-code modulation (PCM) laid the foundations for the field. In 1957, Max Mathews became the first person to synthesize audio from a computer, giving birth to computer music.
Major developments in digital audio coding and audio data compression include differential pulse-code modulation (DPCM) by C. Chapin Cutler at Bell Labs in 1950, linear predictive coding (LPC) by Fumitada Itakura (Nagoya University) and Shuzo Saito (Nippon Telegraph and Telephone) in 1966, adaptive DPCM (ADPCM) by P. Cummiskey, Nikil S. Jayant and James L. Flanagan at Bell Labs in 1973, discrete cosine transform (DCT) coding by Nasir Ahmed, T. Natarajan and K. R. Rao in 1974, and modified discrete cosine transform (MDCT) coding by J. P. Princen, A. W. Johnson and A. B. Bradley at the University of Surrey in 1987. LPC is the basis for perceptual coding and is widely used in speech coding, while MDCT coding is widely used in modern audio coding formats such as MP3 and Advanced Audio Coding (AAC).
Types
Analog
An analog audio signal is a continuous signal represented by an electrical voltage or current that is ''analogous'' to the sound waves in the air. Analog signal processing then involves physically altering the continuous signal by changing the voltage or current or charge via electrical circuits.

Historically, before the advent of widespread digital technology, analog was the only method by which to manipulate a signal. Since that time, as computers and software have become more capable and affordable, digital signal processing has become the method of choice. However, in music applications, analog technology is often still desirable as it often produces nonlinear responses that are difficult to replicate with digital filters.
Digital
A digital representation expresses the audio waveform as a sequence of symbols, usually binary numbers. This permits signal processing using digital circuits such as digital signal processors, microprocessors and general-purpose computers. Most modern audio systems use a digital approach as the techniques of digital signal processing are much more powerful and efficient than analog domain signal processing.
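As a rough illustration of operating mathematically on such a digital representation, the sketch below quantizes a short waveform into signed 16-bit PCM samples and applies a gain change entirely in the digital domain. It is a minimal example written for this article, not code from any particular audio library; the sample rate and tone frequency are assumed values.

<syntaxhighlight lang="python">
import math

SAMPLE_RATE = 8000          # samples per second (illustrative value)
FULL_SCALE = 32767          # largest positive value of a signed 16-bit sample

def to_pcm16(values):
    """Quantize floating-point samples in [-1.0, 1.0] to signed 16-bit integers."""
    return [max(-FULL_SCALE - 1, min(FULL_SCALE, round(v * FULL_SCALE))) for v in values]

def apply_gain(pcm, gain_db):
    """Scale PCM samples by a gain given in decibels, clipping at full scale."""
    factor = 10 ** (gain_db / 20.0)
    return [max(-FULL_SCALE - 1, min(FULL_SCALE, round(s * factor))) for s in pcm]

# A 440 Hz tone, 10 ms long, standing in for a captured audio signal.
waveform = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
            for n in range(SAMPLE_RATE // 100)]
pcm = to_pcm16(waveform)
quieter = apply_gain(pcm, -6.0)   # roughly halves the amplitude
</syntaxhighlight>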
Applications
Processing methods and application areas include storage, data compression, music information retrieval, speech processing, localization, acoustic detection, transmission, noise cancellation, acoustic fingerprinting, sound recognition, synthesis, and enhancement (e.g. equalization, filtering, level compression, echo and reverb removal or addition, etc.).
Audio broadcasting
Audio signal processing is used when broadcasting audio signals in order to enhance their fidelity or optimize for bandwidth or latency. In this domain, the most important audio processing takes place just before the transmitter. The audio processor here must prevent or minimize overmodulation, compensate for non-linear transmitters (a potential issue with medium wave and shortwave broadcasting), and adjust overall loudness to the desired level.
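A broadcast audio processor is normally a dedicated hardware or DSP chain; as a simplified sketch only, the functions below cap sample peaks at a chosen modulation ceiling and nudge the signal toward a target loudness. The threshold and target values are illustrative assumptions, and real processors use look-ahead limiting and multiband compression rather than plain clipping.

<syntaxhighlight lang="python">
def hard_limit(samples, ceiling=0.89):
    """Clamp normalized samples so peaks never exceed the modulation ceiling."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]

def normalize_loudness(samples, target_rms=0.1):
    """Scale the signal toward a target RMS level, then limit the peaks."""
    rms = (sum(s * s for s in samples) / len(samples)) ** 0.5
    if rms == 0:
        return list(samples)
    return hard_limit([s * (target_rms / rms) for s in samples])
</syntaxhighlight>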
Active noise control
Active noise control is a technique designed to reduce unwanted sound. By creating a signal that is identical to the unwanted noise but with the opposite polarity, the two signals cancel out due to destructive interference.
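In the idealized case the cancelling signal is simply the noise with inverted polarity, and summing the two yields silence. The toy example below shows this numerically with an assumed sinusoidal noise waveform; it ignores the real-time acoustics, microphones and adaptive filtering that a practical system needs.

<syntaxhighlight lang="python">
import math

# An unwanted noise waveform (here a 100 Hz tone at an 8 kHz sample rate)
# and the anti-noise signal: the same waveform with inverted polarity.
noise = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(160)]
anti_noise = [-s for s in noise]

# Where the two waves superpose, they cancel; in a real system, imperfect
# phase and amplitude matching leave a small residual instead of silence.
residual = [n + a for n, a in zip(noise, anti_noise)]
print(max(abs(r) for r in residual))   # 0.0 in this idealized case
</syntaxhighlight>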
Audio synthesis
Audio synthesis is the electronic generation of audio signals. A musical instrument that accomplishes this is called a synthesizer. Synthesizers can either imitate sounds or generate new ones. Audio synthesis is also used to generate human speech using speech synthesis.
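As a minimal example of electronic sound generation, the sketch below builds a note by summing a fundamental frequency with a few weighted harmonics, in the spirit of additive synthesis. The sample rate, note frequency and harmonic amplitudes are illustrative choices, not values taken from any specific synthesizer.

<syntaxhighlight lang="python">
import math

SAMPLE_RATE = 44100   # CD-quality sample rate, used here for illustration

def synthesize(freq_hz, duration_s, harmonics=(1.0, 0.5, 0.25)):
    """Generate a tone by summing weighted harmonics of a fundamental."""
    n_samples = int(SAMPLE_RATE * duration_s)
    scale = sum(harmonics)
    return [sum(amp * math.sin(2 * math.pi * freq_hz * (k + 1) * n / SAMPLE_RATE)
                for k, amp in enumerate(harmonics)) / scale
            for n in range(n_samples)]

note_a4 = synthesize(440.0, 0.5)   # half a second of A4 with three harmonics
</syntaxhighlight>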
Audio effects
Audio effects alter the sound of a musical instrument or other audio source. Common effects include distortion, often used with electric guitar in electric blues and rock music; dynamic effects such as volume pedals and compressors, which affect loudness; filters such as wah-wah pedals and graphic equalizers, which modify frequency ranges; modulation effects, such as chorus, flangers and phasers; pitch effects such as pitch shifters; and time effects, such as reverb and delay, which create echoing sounds and emulate the sound of different spaces.
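To make one of these effects concrete, the sketch below implements a basic delay (echo) by mixing each sample with an attenuated copy of the signal from a fixed number of samples earlier. The parameter values are illustrative, and real effect units typically add feedback, filtering and modulation on top of this single delayed tap.

<syntaxhighlight lang="python">
def delay_effect(samples, sample_rate=44100, delay_s=0.3, mix=0.5):
    """Mix the dry signal with an attenuated copy delayed by delay_s seconds."""
    delay_n = int(sample_rate * delay_s)
    out = []
    for n, dry in enumerate(samples):
        wet = samples[n - delay_n] if n >= delay_n else 0.0
        out.append(dry + mix * wet)
    return out
</syntaxhighlight>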
Musicians, audio engineers and record producers use effects units during live performances or in the studio, typically with electric guitar, bass guitar, electronic keyboard or electric piano. While effects are most frequently used with electric or electronic instruments, they can be used with any audio source, such as acoustic instruments, drums, and vocals.
Computer audition
Computer audition (CA) or machine listening is the general field of study of algorithms and systems for audio interpretation by machines. Since the notion of what it means for a machine to "hear" is very broad and somewhat vague, computer audition attempts to bring together several disciplines that originally dealt with specific problems or had a concrete application in mind. The engineer Paris Smaragdis, interviewed in ''Technology Review'', describes these systems as "software that uses sound to locate people moving through rooms, monitor machinery for impending breakdowns, or activate traffic cameras to record accidents."<ref>Paris Smaragdis taught computers how to play more life-like music</ref>

Inspired by models of human audition, CA deals with questions of representation, transduction, grouping, use of musical knowledge and general sound semantics for the purpose of performing intelligent operations on audio and music signals by the computer. Technically this requires a combination of methods from the fields of signal processing, auditory modelling, music perception and cognition, pattern recognition, and machine learning, as well as more traditional methods of artificial intelligence for musical knowledge representation.
See also
* Sound card
* Sound effect
References
{{DEFAULTSORT:Audio Signal Processing}}
[[Category:Audio electronics]]
[[Category:Signal processing]]