Audio signal processing is a subfield of signal processing that is concerned with the electronic manipulation of audio signals. Audio signals are electronic representations of sound waves—longitudinal waves which travel through air, consisting of compressions and rarefactions. The energy contained in audio signals or sound power level is typically measured in decibels. As audio signals may be represented in either digital or analog format, processing may occur in either domain. Analog processors operate directly on the electrical signal, while digital processors operate mathematically on its digital representation.
History
The motivation for audio signal processing began at the beginning of the 20th century with inventions like the telephone, phonograph, and radio that allowed for the transmission and storage of audio signals. Audio processing was necessary for early radio broadcasting, as there were many problems with studio-to-transmitter links. The theory of signal processing and its application to audio was largely developed at Bell Labs in the mid 20th century.
Claude Shannon and Harry Nyquist's early work on communication theory, sampling theory and pulse-code modulation (PCM) laid the foundations for the field. In 1957, Max Mathews became the first person to synthesize audio from a computer, giving birth to computer music.
Major developments in digital audio coding and audio data compression include differential pulse-code modulation (DPCM) by C. Chapin Cutler at Bell Labs in 1950, linear predictive coding (LPC) by Fumitada Itakura (Nagoya University) and Shuzo Saito (Nippon Telegraph and Telephone) in 1966, adaptive DPCM (ADPCM) by P. Cummiskey, Nikil S. Jayant and James L. Flanagan at Bell Labs in 1973, discrete cosine transform (DCT) coding by Nasir Ahmed, T. Natarajan and K. R. Rao in 1974, and modified discrete cosine transform (MDCT) coding by J. P. Princen, A. W. Johnson and A. B. Bradley at the University of Surrey in 1987. LPC is the basis for perceptual coding and is widely used in speech coding, while MDCT coding is widely used in modern audio coding formats such as MP3 and Advanced Audio Coding (AAC).
Types
Analog
An analog audio signal is a continuous signal represented by an electrical voltage or current that is ''analogous'' to the sound waves in the air. Analog signal processing then involves physically altering the continuous signal by changing the voltage or current or charge via electrical circuits.

Historically, before the advent of widespread digital technology, analog was the only method by which to manipulate a signal. Since that time, as computers and software have become more capable and affordable, digital signal processing has become the method of choice. However, in music applications, analog technology is often still desirable as it often produces nonlinear responses that are difficult to replicate with digital filters.
Digital
A digital representation expresses the audio waveform as a sequence of symbols, usually binary numbers. This permits signal processing using digital circuits such as digital signal processors, microprocessors and general-purpose computers. Most modern audio systems use a digital approach as the techniques of digital signal processing are much more powerful and efficient than analog domain signal processing.
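As a rough illustration of operating mathematically on such a digital representation, the sketch below quantizes a short waveform into signed 16-bit PCM samples and applies a gain change entirely in the digital domain. It is a minimal example written for this article, not code from any particular audio library; the sample rate and tone frequency are assumed values.

<syntaxhighlight lang="python">
import math

SAMPLE_RATE = 8000          # samples per second (illustrative value)
FULL_SCALE = 32767          # largest positive value of a signed 16-bit sample

def to_pcm16(values):
    """Quantize floating-point samples in [-1.0, 1.0] to signed 16-bit integers."""
    return [max(-FULL_SCALE - 1, min(FULL_SCALE, round(v * FULL_SCALE))) for v in values]

def apply_gain(pcm, gain_db):
    """Scale PCM samples by a gain given in decibels, clipping at full scale."""
    factor = 10 ** (gain_db / 20.0)
    return [max(-FULL_SCALE - 1, min(FULL_SCALE, round(s * factor))) for s in pcm]

# A 440 Hz tone, 10 ms long, standing in for a captured audio signal.
waveform = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
            for n in range(SAMPLE_RATE // 100)]
pcm = to_pcm16(waveform)
quieter = apply_gain(pcm, -6.0)   # roughly halves the amplitude
</syntaxhighlight>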
Applications
Processing methods and application areas include storage, data compression, music information retrieval, speech processing, localization, acoustic detection, transmission, noise cancellation, acoustic fingerprinting, sound recognition, synthesis, and enhancement (e.g. equalization, filtering, level compression, echo and reverb removal or addition, etc.).
Audio broadcasting
Audio signal processing is used when broadcasting audio signals in order to enhance their fidelity or optimize for bandwidth or latency. In this domain, the most important audio processing takes place just before the transmitter. The audio processor here must prevent or minimize overmodulation, compensate for non-linear transmitters (a potential issue with medium wave and shortwave broadcasting), and adjust overall loudness to the desired level.
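A broadcast audio processor is normally a dedicated hardware or DSP chain; as a simplified sketch only, the functions below cap sample peaks at a chosen modulation ceiling and nudge the signal toward a target loudness. The threshold and target values are illustrative assumptions, and real processors use look-ahead limiting and multiband compression rather than plain clipping.

<syntaxhighlight lang="python">
def hard_limit(samples, ceiling=0.89):
    """Clamp normalized samples so peaks never exceed the modulation ceiling."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]

def normalize_loudness(samples, target_rms=0.1):
    """Scale the signal toward a target RMS level, then limit the peaks."""
    rms = (sum(s * s for s in samples) / len(samples)) ** 0.5
    if rms == 0:
        return list(samples)
    return hard_limit([s * (target_rms / rms) for s in samples])
</syntaxhighlight>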
Active noise control
Active noise control is a technique designed to reduce unwanted sound. By creating a signal that is identical to the unwanted noise but with the opposite polarity, the two signals cancel out due to destructive interference.
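In the idealized case the cancelling signal is simply the noise with inverted polarity, and summing the two yields silence. The toy example below shows this numerically with an assumed sinusoidal noise waveform; it ignores the real-time acoustics, microphones and adaptive filtering that a practical system needs.

<syntaxhighlight lang="python">
import math

# An unwanted noise waveform (here a 100 Hz tone at an 8 kHz sample rate)
# and the anti-noise signal: the same waveform with inverted polarity.
noise = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(160)]
anti_noise = [-s for s in noise]

# Where the two waves superpose, they cancel; in a real system, imperfect
# phase and amplitude matching leave a small residual instead of silence.
residual = [n + a for n, a in zip(noise, anti_noise)]
print(max(abs(r) for r in residual))   # 0.0 in this idealized case
</syntaxhighlight>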
Audio synthesis
Audio synthesis is the electronic generation of audio signals. A musical instrument that accomplishes this is called a synthesizer. Synthesizers can either imitate sounds or generate new ones. Audio synthesis is also used to generate human speech using speech synthesis.
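As a minimal example of electronic sound generation, the sketch below builds a note by summing a fundamental frequency with a few weighted harmonics, in the spirit of additive synthesis. The sample rate, note frequency and harmonic amplitudes are illustrative choices, not values taken from any specific synthesizer.

<syntaxhighlight lang="python">
import math

SAMPLE_RATE = 44100   # CD-quality sample rate, used here for illustration

def synthesize(freq_hz, duration_s, harmonics=(1.0, 0.5, 0.25)):
    """Generate a tone by summing weighted harmonics of a fundamental."""
    n_samples = int(SAMPLE_RATE * duration_s)
    scale = sum(harmonics)
    return [sum(amp * math.sin(2 * math.pi * freq_hz * (k + 1) * n / SAMPLE_RATE)
                for k, amp in enumerate(harmonics)) / scale
            for n in range(n_samples)]

note_a4 = synthesize(440.0, 0.5)   # half a second of A4 with three harmonics
</syntaxhighlight>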
Audio effects
Audio effects alter the sound of a musical instrument or other audio source. Common effects include distortion, often used with electric guitar in electric blues and rock music; dynamic effects such as volume pedals and compressors, which affect loudness; filters such as wah-wah pedals and graphic equalizers, which modify frequency ranges; modulation effects, such as chorus, flangers and phasers; pitch effects such as pitch shifters; and time effects, such as reverb and delay, which create echoing sounds and emulate the sound of different spaces.
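To make one of these effects concrete, the sketch below implements a basic delay (echo) by mixing each sample with an attenuated copy of the signal from a fixed number of samples earlier. The parameter values are illustrative, and real effect units typically add feedback, filtering and modulation on top of this single delayed tap.

<syntaxhighlight lang="python">
def delay_effect(samples, sample_rate=44100, delay_s=0.3, mix=0.5):
    """Mix the dry signal with an attenuated copy delayed by delay_s seconds."""
    delay_n = int(sample_rate * delay_s)
    out = []
    for n, dry in enumerate(samples):
        wet = samples[n - delay_n] if n >= delay_n else 0.0
        out.append(dry + mix * wet)
    return out
</syntaxhighlight>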
Musicians, audio engineers and record producers use effects units during live performances or in the studio, typically with electric guitar, bass guitar, electronic keyboard or electric piano. While effects are most frequently used with electric or electronic instruments, they can be used with any audio source, such as acoustic instruments, drums, and vocals.
Computer audition
Computer audition (CA) or machine listening is the general field of study of algorithms and systems for audio interpretation by machines. Since the notion of what it means for a machine to "hear" is very broad and somewhat vague, computer audition attempts to bring together several disciplines that originally dealt with specific problems or had a concrete application in mind. The engineer Paris Smaragdis, interviewed in ''Technology Review'', describes these systems as "software that uses sound to locate people moving through rooms, monitor machinery for impending breakdowns, or activate traffic cameras to record accidents."<ref>Paris Smaragdis taught computers how to play more life-like music</ref>

Inspired by models of human audition, CA deals with questions of representation, transduction, grouping, use of musical knowledge and general sound semantics for the purpose of performing intelligent operations on audio and music signals by the computer. Technically this requires a combination of methods from the fields of signal processing, auditory modelling, music perception and cognition, pattern recognition, and machine learning, as well as more traditional methods of artificial intelligence for musical knowledge representation.
See also
* Sound card
* Sound effect
References
{{DEFAULTSORT:Audio Signal Processing}}
[[Category:Audio electronics]]
[[Category:Signal processing]]