Echo suppression and echo cancellation are methods used in telephony to improve voice quality by preventing echo from being created or removing it after it is already present. In addition to improving subjective audio quality, echo suppression increases the capacity achieved through silence suppression by preventing echo from traveling across a telecommunications network. Echo suppressors were developed in the 1950s in response to the first use of satellites for telecommunications.

Echo suppression and cancellation methods are commonly called acoustic echo suppression (AES) and acoustic echo cancellation (AEC), and more rarely line echo cancellation (LEC). In some cases, these terms are more precise, as there are various types and causes of echo with unique characteristics, including acoustic echo (sounds from a loudspeaker being reflected and recorded by a microphone, which can vary substantially over time) and line echo (electrical impulses caused by, e.g., coupling between the sending and receiving wires, impedance mismatches, or electrical reflections, which varies much less than acoustic echo). In practice, however, the same techniques are used to treat all types of echo, so an acoustic echo canceller can cancel line echo as well as acoustic echo. ''AEC'' in particular is commonly used to refer to echo cancellers in general, regardless of whether they were intended for acoustic echo, line echo, or both.

Although echo suppressors and echo cancellers have similar goals (preventing a speaking individual from hearing an echo of their own voice), the methods they use are different:

* Echo suppressors work by detecting a voice signal going in one direction on a circuit, and then muting or attenuating the signal in the other direction. Usually, the echo suppressor at the far end of the circuit does this muting when it detects voice coming from the near end of the circuit. This muting prevents the speaker from hearing their own voice returning from the far end.
* Echo cancellation involves first recognizing the originally transmitted signal that re-appears, with some delay, in the transmitted or received signal. Once the echo is recognized, it can be removed by subtracting it from the transmitted or received signal. This technique is generally implemented digitally using a digital signal processor or software, although it can be implemented in analog circuits as well.

ITU standards G.168 and P.340 describe requirements and tests for echo cancellers in digital and PSTN applications, respectively.


History

In telephony, echo is the reflected copy of one's voice heard some time later. If the delay is fairly significant (more than a few hundred milliseconds), it is considered annoying. If the delay is very small (tens of milliseconds or less), the phenomenon is called sidetone. If the delay is slightly longer, around 50 milliseconds, humans cannot hear the echo as a distinct sound, but instead hear a chorus effect.

In the earlier days of telecommunications, echo suppression was used to reduce the objectionable nature of echoes to human users. One person speaks while the other listens, and they speak back and forth. An echo suppressor attempts to determine which is the primary direction and allows that channel to go forward. In the reverse channel, it places attenuation to block or ''suppress'' any signal on the assumption that the signal is echo. Although the suppressor effectively deals with echo, this approach leads to several problems which may be frustrating for both parties to a call.

* Double-talk: It is fairly normal in conversation for both parties to speak at the same time, at least briefly. Because each echo suppressor will then detect voice energy coming from the far end of the circuit, the effect would ordinarily be for loss to be inserted in both directions at once, effectively blocking both parties. To prevent this, echo suppressors can be set to detect voice activity from the near-end speaker and to insert no loss (or a smaller loss) when both the near-end speaker and far-end speaker are talking. This, of course, temporarily defeats the primary effect of having an echo suppressor at all.
* Clipping: Since the echo suppressor is alternately inserting and removing loss, there is frequently a small delay when a new speaker begins talking that results in clipping the first syllable from that speaker's speech.
* Dead-set: If the far-end party on a call is in a noisy environment, the near-end speaker will hear that background noise while the far-end speaker is talking, but the echo suppressor will suppress this background noise when the near-end speaker starts talking. The sudden absence of background noise gives the near-end user the impression that the line has gone dead.

In response to this, Bell Labs developed echo canceller theory in the early 1960s, which then resulted in laboratory echo cancellers in the late 1960s and commercial echo cancellers in the 1980s. An echo canceller works by generating an estimate of the echo from the talker's signal and subtracting that estimate from the return path. This technique requires an adaptive filter to generate a signal accurate enough to effectively cancel the echo, where the echo can differ from the original due to various kinds of degradation along the way.

Since their invention at AT&T Bell Labs, echo cancellation algorithms have been improved and honed. Like all echo cancelling processes, these first algorithms were designed to anticipate the signal which would inevitably re-enter the transmission path, and cancel it out.

Rapid advances in digital signal processing allowed echo cancellers to be made smaller and more cost-effective. In the 1990s, echo cancellers were implemented within voice switches for the first time (in the Northern Telecom DMS-250) rather than as standalone devices. The integration of echo cancellation directly into the switch meant that echo cancellers could be reliably turned on or off on a call-by-call basis, removing the need for separate trunk groups for voice and data calls. Today's telephony technology often employs echo cancellers in small or handheld communications devices via a software voice engine, which provides cancellation of either acoustic echo or the residual echo introduced by a far-end PSTN gateway system; such systems typically cancel echo reflections with up to 64 milliseconds delay.
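The echo suppressor behaviour described earlier in this section, including the reduced loss during double-talk, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration rather than something taken from a standard or a real product: the function names, the frame-based processing, the -40 dB activity threshold, the 6 dB level margin used to guess that the near-end party is talking, and the 30 dB and 6 dB loss values.

```python
import math


def frame_level_db(frame):
    """Mean power of one frame in dB relative to a full-scale value of 1.0."""
    power = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(power + 1e-12)  # small floor avoids log10(0)


def suppress_return_path(far_frame, near_frame,
                         threshold_db=-40.0, margin_db=6.0,
                         loss_db=30.0, doubletalk_loss_db=6.0):
    """Attenuate the near-end (return) frame the way a simple suppressor might.

    All thresholds and loss values are illustrative assumptions.
    """
    far_db = frame_level_db(far_frame)
    near_db = frame_level_db(near_frame)

    if far_db <= threshold_db:
        # Far end is silent: nothing can echo, so pass the return path through.
        applied_loss_db = 0.0
    elif near_db > far_db - margin_db:
        # Return path is too loud to be echo alone: treat it as double-talk
        # and insert only a small loss so the near-end talker is not blocked.
        applied_loss_db = doubletalk_loss_db
    else:
        # Only the far end is talking: assume the return signal is echo.
        applied_loss_db = loss_db

    gain = 10.0 ** (-applied_loss_db / 20.0)
    return [s * gain for s in near_frame]
```

Because the decision is made per frame from signal levels alone, this sketch also reproduces the weaknesses listed above: the first frame of a new talker can be attenuated before the decision flips (clipping), and near-end background noise disappears whenever full loss is inserted.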


Operation

The echo cancellation process works as follows:

# A far-end signal is delivered to the system.
# The far-end signal is reproduced.
# The far-end signal is filtered and delayed to resemble the near-end signal.
# The filtered far-end signal is subtracted from the near-end signal.
# The resultant signal represents sounds present in the room excluding any direct or reverberated sound from the far-end signal.

The primary challenge for an echo canceller is determining the nature of the filtering to be applied to the far-end signal such that it resembles the resultant near-end signal. The filter is essentially a model of the speaker, the microphone and the room's acoustical attributes.

Echo cancellers must be adaptive because the characteristics of the near-end's speaker and microphone are generally not known in advance. The acoustical attributes of the near-end's room are also not generally known in advance, and may change (e.g., if the microphone is moved relative to the speaker, or if individuals walk around the room causing changes in the acoustic reflections). By using the far-end signal as the stimulus, modern systems use an adaptive filter and can ''converge'' from providing no cancellation to 55 dB of cancellation in around 200 ms.

Until recently, echo cancellation only needed to apply to the voice bandwidth of telephone circuits. PSTN calls transmit frequencies between 300 Hz and 3 kHz, the range required for human speech intelligibility. Videoconferencing is one area where full-bandwidth audio is used. In this case, specialized products are employed to perform echo cancellation.

Because echo suppression has known limitations, in an ideal situation, echo cancellation alone will be used. However, this is insufficient in many applications, notably software phones on networks with long delay and meager throughput. Here, echo cancellation and suppression can work in conjunction to achieve acceptable performance.
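The filter-and-subtract steps listed in this section can be sketched with a small adaptive filter. The text does not say which adaptation rule is used, so this example assumes the normalized least mean squares (NLMS) update, a common choice; the function names, the tap count, the step size `mu` and the simulated echo path are all hypothetical and chosen only for illustration.

```python
def simulate_echo(far, path):
    """Hypothetical echo path: convolve the far-end signal with `path`."""
    return [sum(h * far[n - k] for k, h in enumerate(path) if n - k >= 0)
            for n in range(len(far))]


def nlms_echo_canceller(far, mic, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the far-end echo from the mic signal.

    Returns (residual, weights). NLMS is an assumed choice of adaptation rule.
    """
    w = [0.0] * taps          # current model of speaker, room and microphone
    history = [0.0] * taps    # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far, mic):
        history = [x] + history[:-1]
        # Filter the far-end signal so it resembles the echo at the microphone.
        estimate = sum(wk * xk for wk, xk in zip(w, history))
        # Subtract the estimate from the near-end (microphone) signal.
        e = d - estimate
        # Adapt: nudge the weights to reduce the remaining error.
        norm = sum(xk * xk for xk in history) + eps
        step = mu * e / norm
        w = [wk + step * xk for wk, xk in zip(w, history)]
        residual.append(e)
    return residual, w
```

Feeding the function a far-end signal and a microphone signal produced by `simulate_echo` lets the weights approach the simulated path. This sketch adapts continuously, so near-end speech would disturb the weights; practical cancellers typically slow or freeze adaptation during double-talk, which is not shown here.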


Quantifying echo

Echo is measured as echo return loss (ERL). This is the ratio, expressed in decibels, of the original signal and its echo. High values mean the echo is very weak, while low values mean the echo is very strong. Negative values indicate the echo is stronger than the original signal, which if left unchecked would cause audio feedback.

The performance of an echo canceller is measured in ''echo return loss enhancement'' (ERLE), which is the amount of additional signal loss applied by the echo canceller. Most echo cancellers are able to apply 18 to 35 dB ERLE. The total signal loss of the echo (ACOM) is the sum of the ERL and ERLE.
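As a worked illustration of these three quantities, the following sketch computes ERL, ERLE and ACOM from sample buffers. It assumes the ratios are taken between mean signal powers, and that the echo before cancellation and the residual after cancellation are available separately, which is usually only true in a simulation or test setup; the function names are hypothetical.

```python
import math


def mean_power(signal):
    """Average power of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def level_ratio_db(reference, other):
    """Power ratio reference/other expressed in decibels."""
    return 10.0 * math.log10(mean_power(reference) / mean_power(other))


def echo_metrics(sent, echo, residual):
    """Return (ERL, ERLE, ACOM) in dB.

    sent:     the original signal
    echo:     its echo before cancellation
    residual: what remains of the echo after the canceller
    """
    erl = level_ratio_db(sent, echo)        # loss of the echo path itself
    erle = level_ratio_db(echo, residual)   # extra loss added by the canceller
    acom = erl + erle                       # total loss of the echo
    return erl, erle, acom
```

For example, an echo at one tenth of the original amplitude corresponds to an ERL of 20 dB, and a canceller that reduces that echo by a further factor of 100 in amplitude contributes 40 dB of ERLE, for an ACOM of 60 dB.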


Current uses

Sources of echo are found in everyday surroundings such as:

* Hands-free car phone systems
* A standard telephone or cellphone in speakerphone mode
* Dedicated standalone speakerphones
* Installed conference room systems which use ceiling speakers and microphones on the table
* Physical coupling where vibrations of the loudspeaker transfer to the microphone via the handset casing

In some of these cases, sound from the loudspeaker enters the microphone almost unaltered. The difficulties in canceling echo stem from the alteration of the original sound by the ambient space. These changes can include certain frequencies being absorbed by soft furnishings and reflection of different frequencies at varying strength.

Implementing AEC requires engineering expertise and a fast processor, usually in the form of a digital signal processor (DSP). This processing capability may come at a premium; however, many embedded systems do have a fully functional AEC.

Smart speakers and interactive voice response systems that accept speech for input use AEC while speech prompts are played to prevent the system's own speech recognition from falsely recognizing the echoed prompts and other output.


Modems

Standard telephone lines use the same pair of wires to both send and receive audio, which results in a small amount of the outgoing signal being reflected back. This is useful for people talking on the phone, as it provides a signal to the speaker that their voice is making it through the system. However, this reflected signal causes problems for a modem, which is unable to distinguish between a signal from the remote modem and the echo of its own signal.

For this reason, earlier dial-up modems split the signal frequencies, so that the devices on either end used different tones, allowing each one to ignore any signals in the frequency range it was using for transmission. However, this diminished the amount of bandwidth available to both sides.

Echo cancellation mitigated this problem. During the call setup and negotiation period, both modems send a series of unique tones and then listen for them to return through the phone system. They measure the total delay time, then configure a delay line for that same period. Once the connection is completed, they send their signals into the phone lines as normal, but also into the delay line. When their signal is reflected back, it is mixed with the inverted signal from the delay line, which cancels out the echo. This allowed both modems to use the full spectrum available, doubling the possible speed.

Echo cancellation is also applied by many telcos to the line itself and can cause data corruption rather than improving the signal. Some telephone switches or converters (such as analog terminal adapters) disable echo suppression or echo cancellation when they detect the 2100 or 2225 Hz answer tones associated with such calls, in accordance with ITU-T recommendation G.164 or G.165.

ISDN and DSL modems operating at frequencies above the voice band over standard twisted-pair telephone wires also make use of automated echo cancellation to allow simultaneous bidirectional data communication. The computational complexity in implementing the adaptive filter is much reduced compared to voice echo cancelling because the transmit signal is a digital bit stream. Instead of a multiplication and an addition operation for every tap in the filter, only the addition is required. A RAM lookup-table-based echo cancelling scheme eliminates even the addition operation by simply addressing a memory with a truncated transmit bit stream to obtain the echo estimate. With advances in semiconductor technology, echo cancellation is now commonly implemented with digital signal processor (DSP) techniques.

Some modems use separate incoming and outgoing frequencies or allocate separate time slots for transmitting and receiving to eliminate the need for echo cancellation. Higher frequencies beyond the original design limits of telephone cables suffer significant attenuation distortion due to bridge taps and incomplete impedance matching. Deep, narrow frequency gaps which cannot be remedied by echo cancellation often result. These are detected and mapped out during connection negotiation.
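The remark above that a binary transmit signal needs only an addition per filter tap can be illustrated with a short sketch. It assumes each transmitted bit is sent as a +1 or -1 symbol, that the received signal contains only the modem's own echo (no remote signal or noise), and that a plain LMS update adapts the weights; the function name, tap count and step size are hypothetical.

```python
def cancel_binary_echo(tx_bits, rx, taps=4, mu=0.05):
    """Echo canceller for a +1/-1 transmit stream using add/subtract per tap.

    Returns (residual, weights). The LMS update is an assumed adaptation rule.
    """
    w = [0.0] * taps
    signs = [0] * taps  # recent transmit symbols, newest first; 0 = none yet
    residual = []
    for bit, r in zip(tx_bits, rx):
        signs = [1 if bit else -1] + signs[:-1]

        # Echo estimate: each tap weight is added or subtracted, never multiplied.
        estimate = 0.0
        for wk, s in zip(w, signs):
            if s > 0:
                estimate += wk
            elif s < 0:
                estimate -= wk

        e = r - estimate
        delta = mu * e  # one multiplication per sample, not one per tap

        # Weight update: add or subtract the same scaled error for every tap.
        for k, s in enumerate(signs):
            if s > 0:
                w[k] += delta
            elif s < 0:
                w[k] -= delta

        residual.append(e)
    return residual, w
```

The lookup-table variant mentioned above would go one step further and replace the estimate loop with a single memory read indexed by the most recent transmit bits; that variant is not shown here.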


See also

* Audio feedback
* Least mean squares filter
* Mix-minus
* Signal reflection
* Voice engine

