In signal processing, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy; no information is lost. Lossy compression reduces bits by removing unnecessary or less important information. Typically, a device that performs data compression is referred to as an encoder, and one that performs the reversal of the process (decompression) as a decoder.

The process of reducing the size of a data file is often referred to as data compression. In the context of data transmission, it is called source coding: encoding done at the source of the data before it is stored or transmitted. Source coding should not be confused with channel coding, for error detection and correction, or line coding, the means for mapping data onto a signal.

Compression is useful because it reduces the resources required to store and transmit data. Computational resources are consumed in the compression and decompression processes, and data compression is subject to a space–time complexity trade-off. For instance, a compression scheme for video may require expensive hardware for the video to be decompressed fast enough to be viewed as it is being decompressed, and the option to decompress the video in full before watching it may be inconvenient or require additional storage. The design of data compression schemes involves trade-offs among various factors, including the degree of compression, the amount of distortion introduced (when using lossy data compression), and the computational resources required to compress and decompress the data.
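The encoder/decoder round trip described above can be sketched with Python's standard zlib module; the choice of zlib here is purely illustrative, not something the text prescribes:

```python
import zlib

# Repetitive text has high statistical redundancy, so it compresses well.
original = b"Data compression reduces the resources required to store and transmit data. " * 100

compressed = zlib.compress(original)      # the encoder
restored = zlib.decompress(compressed)    # the decoder reverses the process

assert restored == original               # lossless: no information was lost
assert len(compressed) < len(original)    # fewer bits than the original representation
```

The two assertions mirror the definitions above: a lossless scheme must reconstruct the input exactly, while still using fewer bits than the original representation.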


Lossless

Lossless data compression algorithms usually exploit statistical redundancy to represent data without losing any information, so that the process is reversible. Lossless compression is possible because most real-world data exhibits statistical redundancy. For example, an image may have areas of color that do not change over several pixels; instead of coding "red pixel, red pixel, ..." the data may be encoded as "279 red pixels". This is a basic example of run-length encoding; there are many schemes to reduce file size by eliminating redundancy.

The Lempel–Ziv (LZ) compression methods are among the most popular algorithms for lossless storage. DEFLATE is a variation on LZ optimized for decompression speed and compression ratio, but compression can be slow. In the mid-1980s, following work by Terry Welch, the Lempel–Ziv–Welch (LZW) algorithm rapidly became the method of choice for most general-purpose compression systems. LZW is used in GIF images, programs such as PKZIP, and hardware devices such as modems. LZ methods use a table-based compression model in which table entries are substituted for repeated strings of data. For most LZ methods, this table is generated dynamically from earlier data in the input. The table itself is often Huffman encoded.
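The run-length idea described above can be sketched in a few lines of Python; this is a toy illustration of the principle, not a production codec:

```python
def run_length_encode(pixels):
    """Collapse runs of identical values into (count, value) pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][1] == p:
            runs[-1][0] += 1            # extend the current run
        else:
            runs.append([1, p])         # start a new run
    return [(count, value) for count, value in runs]

def run_length_decode(runs):
    """Expand (count, value) pairs back into the original sequence."""
    return [value for count, value in runs for _ in range(count)]

image_row = ["red"] * 279 + ["blue"]
encoded = run_length_encode(image_row)
# encoded == [(279, 'red'), (1, 'blue')]
assert run_length_decode(encoded) == image_row
```

Note that RLE only wins when runs are long; on data with no repeated neighbors, the (count, value) pairs are larger than the input, which is why practical schemes combine several redundancy-elimination techniques.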
Grammar-based codes can compress highly repetitive input extremely effectively, for instance a biological data collection of the same or closely related species, a huge versioned document collection, or internet archives. The basic task of grammar-based codes is constructing a context-free grammar deriving a single string. Practical grammar compression algorithms include Sequitur and Re-Pair.

The strongest modern lossless compressors use probabilistic models, such as prediction by partial matching (PPM). The Burrows–Wheeler transform can also be viewed as an indirect form of statistical modelling. In a further refinement of the direct use of probabilistic modelling, statistical estimates can be coupled to an algorithm called arithmetic coding. Arithmetic coding is a more modern coding technique that uses the mathematical calculations of a finite-state machine to produce a string of encoded bits from a series of input data symbols. It can achieve superior compression compared to other techniques such as the better-known Huffman algorithm. It uses an internal memory state to avoid the need for a one-to-one mapping of individual input symbols to distinct representations that use an integer number of bits, and it clears out the internal memory only after encoding the entire string of data symbols. Arithmetic coding applies especially well to adaptive data compression tasks where the statistics vary and are context-dependent, as it can easily be coupled with an adaptive model of the probability distribution of the input data. An early example of the use of arithmetic coding was in an optional (but not widely used) feature of the JPEG image coding standard. It has since been applied in various other designs, including H.263, H.264/MPEG-4 AVC, and HEVC for video coding.

Archive software typically has the ability to adjust the "dictionary size", where a larger size demands more random-access memory during compression and decompression but compresses more strongly, especially on repeating patterns in a file's content.
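The dictionary-size effect can be demonstrated with zlib's wbits parameter, which sets the size of the DEFLATE sliding window (its dictionary). This sketch assumes repetitive input whose repeats are farther apart than the small window can reach:

```python
import random
import zlib

# One 4 KB pseudo-random block repeated four times: the repeats are only
# visible to a compressor whose window (dictionary) spans more than 4 KB.
rng = random.Random(0)
block = bytes(rng.getrandbits(8) for _ in range(4096))
data = block * 4

small = zlib.compressobj(wbits=9)    # 512-byte window: repeats fall outside it
large = zlib.compressobj(wbits=15)   # 32 KB window: repeats are found

small_out = small.compress(data) + small.flush()
large_out = large.compress(data) + large.flush()

assert len(large_out) < len(small_out)   # bigger dictionary, stronger compression
```

The larger window costs more memory on both ends, exactly the trade-off the paragraph describes: archivers expose this as a user-tunable "dictionary size".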


Lossy

In the late 1980s, digital images became more common, and standards for lossless image compression emerged. In the early 1990s, lossy compression methods began to be widely used. In these schemes, some loss of information is accepted, as dropping nonessential detail can save storage space; there is a corresponding trade-off between preserving information and reducing size. Lossy data compression schemes are designed by research on how people perceive the data in question. For example, the human eye is more sensitive to subtle variations in luminance than it is to variations in color. JPEG image compression works in part by rounding off nonessential bits of information. A number of popular compression formats exploit these perceptual differences, including psychoacoustics for sound and psychovisuals for images and video.

Most forms of lossy compression are based on transform coding, especially the discrete cosine transform (DCT). It was first proposed in 1972 by Nasir Ahmed, who then developed a working algorithm with T. Natarajan and K. R. Rao in 1973, before introducing it in January 1974. The DCT is the most widely used lossy compression method, and is used in multimedia formats for images (such as JPEG and HEIF), video (such as MPEG, AVC, and HEVC), and audio (such as MP3, AAC, and Vorbis).

Lossy image compression is used in digital cameras to increase storage capacities. Similarly, DVDs, Blu-ray, and streaming video use lossy video coding formats. Lossy compression is extensively used in video.

In lossy audio compression, methods of psychoacoustics are used to remove non-audible (or less audible) components of the audio signal. Compression of human speech is often performed with even more specialized techniques; speech coding is distinguished as a separate discipline from general-purpose audio compression. Speech coding is used in internet telephony, for example, while audio compression is used for CD ripping and is decoded by the audio player. Lossy compression can cause generation loss.
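A minimal, naive O(n²) sketch of the DCT-II that underlies these formats shows the energy-compaction property that makes coarse quantization possible; real codecs use fast two-dimensional variants with normalization, so this is an illustration of the principle only:

```python
import math

def dct_ii(x):
    """Unnormalized DCT-II: X[m] = sum_k x[k] * cos(pi * (k + 0.5) * m / n)."""
    n = len(x)
    return [sum(x[k] * math.cos(math.pi * (k + 0.5) * m / n) for k in range(n))
            for m in range(n)]

# A smooth 8-sample ramp: after the transform, almost all of the energy sits
# in the first few coefficients, so the high-frequency ones can be quantized
# coarsely or dropped with little visible loss.
coeffs = dct_ii([float(i) for i in range(8)])
energy = [c * c for c in coeffs]
assert sum(energy[:2]) > 0.9 * sum(energy)   # energy compaction
```

Dropping detail here means zeroing small high-frequency coefficients before entropy coding, which is where the information loss in DCT-based formats actually happens.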


Theory

The theoretical basis for compression is provided by information theory and, more specifically, algorithmic information theory for lossless compression and rate–distortion theory for lossy compression. These areas of study were essentially created by Claude Shannon, who published fundamental papers on the topic in the late 1940s and early 1950s. Other topics associated with compression include coding theory and statistical inference.


Machine learning

There is a close connection between
machine learning Machine learning (ML) is the study of computer algorithms that can improve automatically through experience and by the use of data. It is seen as a part of artificial intelligence. Machine learning algorithms build a model based on sample data ...

machine learning
and compression. A system that predicts the posterior probabilities of a sequence given its entire history can be used for optimal data compression (by using
arithmetic coding Arithmetic coding is a form of entropy encoding In information theory, an entropy coding (or entropy encoding) is a lossless compression , lossless data compression scheme that is independent of the specific characteristics of the medium. One of ...
on the output distribution). An optimal compressor can be used for prediction (by finding the symbol that compresses best, given the previous history). This equivalence has been used as a justification for using data compression as a benchmark for "general intelligence". An alternative view can show compression algorithms implicitly map strings into implicit feature space vectors, and compression-based similarity measures compute similarity within these feature spaces. For each compressor C(.) we define an associated vector space ℵ, such that C(.) maps an input string x, corresponding to the vector norm , , ~x, , . An exhaustive examination of the feature spaces underlying all compression algorithms is precluded by space; instead, feature vectors chooses to examine three representative lossless compression methods, LZW, LZ77, and PPM. According to
AIXI AIXI is a theoretical mathematical formalism for artificial general intelligence Artificial general intelligence (AGI) is the hypothetical ability of an intelligent agent In artificial intelligence Artificial intelligence (AI) is intel ...
theory, a connection more directly explained in
Hutter Prize The Hutter Prize is a cash prize funded by Marcus Hutter which rewards data compression improvements on a specific 1 GB English text file. Specifically, the prize awards 5000 euros for each one percent improvement (with 500,000 euros total funding) ...
, the best possible compression of x is the smallest possible software that generates x. For example, in that model, a zip file's compressed size includes both the zip file and the unzipping software, since you can't unzip it without both, but there may be an even smaller combined form.
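The compression-based similarity idea can be sketched with the normalized compression distance (NCD), here using zlib as a stand-in for the compressor C(.); the corpus strings below are invented for illustration:

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: small when x and y share structure."""
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

a = b"the quick brown fox jumps over the lazy dog " * 20
b = b"the quick brown fox jumps over a very lazy dog " * 20
c = bytes(range(256)) * 4   # structureless byte ramp

# Similar strings compress well together, so their NCD is lower.
assert ncd(a, b) < ncd(a, c)
```

The measure needs no hand-designed features: whatever regularities the compressor can exploit act as the implicit feature space in which similarity is computed.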


Data differencing

Data compression can be viewed as a special case of data differencing. Data differencing consists of producing a ''difference'' given a ''source'' and a ''target,'' with patching reproducing the ''target'' given a ''source'' and a ''difference.'' Since there is no separate source and target in data compression, one can consider data compression as data differencing with empty source data, the compressed file corresponding to a difference from nothing. This is the same as considering absolute entropy (corresponding to data compression) as a special case of relative entropy (corresponding to data differencing) with no initial data. The term ''differential compression'' is used to emphasize the data differencing connection.
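The source/target/difference relationship can be sketched with zlib's preset-dictionary feature (the zdict parameter), treating the source as the dictionary; with an empty source, the "difference" degenerates into ordinary compression, just as the text describes:

```python
import zlib

def make_diff(source: bytes, target: bytes) -> bytes:
    """Encode the target relative to the source (source as preset dictionary)."""
    comp = zlib.compressobj(zdict=source)
    return comp.compress(target) + comp.flush()

def apply_patch(source: bytes, diff: bytes) -> bytes:
    """Reproduce the target from the source and the difference."""
    decomp = zlib.decompressobj(zdict=source)
    return decomp.decompress(diff) + decomp.flush()

old = b"".join(b"line %03d: the quick brown fox jumps over the lazy dog\n" % i
               for i in range(40))
new = old.replace(b"line 007:", b"line 007 (edited):")

diff = make_diff(old, new)
assert apply_patch(old, diff) == new          # patching reproduces the target
assert len(diff) < len(make_diff(b"", new))   # empty source = plain compression
```

Because the target barely differs from the source, the difference is far smaller than compressing the target from nothing, which is the whole point of differential compression.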


Uses


Image

Entropy coding originated in the 1940s with the introduction of Shannon–Fano coding, the basis for Huffman coding, which was developed in the early 1950s.
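Huffman coding can be sketched compactly with Python's heapq; this is the textbook construction (frequent symbols get shorter prefix-free bit strings), not any particular standard's codec:

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Assign shorter bit strings to more frequent symbols (prefix-free)."""
    # Heap entries: [weight, tiebreaker, symbols-merged-so-far].
    heap = [[freq, i, sym]
            for i, (sym, freq) in enumerate(sorted(Counter(text).items()))]
    heapq.heapify(heap)
    codes = {entry[2]: "" for entry in heap}
    while len(heap) > 1:
        lo = heapq.heappop(heap)      # two least-frequent subtrees...
        hi = heapq.heappop(heap)
        for sym in lo[2]:
            codes[sym] = "0" + codes[sym]   # ...become left branch
        for sym in hi[2]:
            codes[sym] = "1" + codes[sym]   # ...and right branch
        heapq.heappush(heap, [lo[0] + hi[0], lo[1], lo[2] + hi[2]])
    return codes

codes = huffman_codes("abracadabra")
assert min(codes, key=lambda s: len(codes[s])) == "a"   # most frequent symbol
```

Repeatedly merging the two least-frequent subtrees is exactly what guarantees a minimum-redundancy prefix code for the given symbol frequencies.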
Transform coding Transform coding is a type of data compression In signal processing, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is e ...
dates back to the late 1960s, with the introduction of
fast Fourier transform A fast Fourier transform (FFT) is an algorithm In and , an algorithm () is a finite sequence of , computer-implementable instructions, typically to solve a class of problems or to perform a computation. Algorithms are always and are u ...
(FFT) coding in 1968 and the
Hadamard transform 300px, Fast Walsh–Hadamard transform, a faster way to calculate the Walsh spectrum of (1, 0, 1, 0, 0, 1, 1, 0). The Hadamard transform (also known as the Walsh–Hadamard transform, Hadamard–Rademacher–Walsh transform, Walsh transform, o ...
in 1969. An important
image compression Image compression is a type of data compression In signal processing Signal processing is an electrical engineering Electrical engineering is an engineering discipline concerned with the study, design, and application of equipment, ...
technique is the
discrete cosine transform A discrete cosine transform (DCT) expresses a finite sequence of data points In statistics Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data Data (; ) are ...
(DCT), a technique developed in the early 1970s. DCT is the basis for
JPEG JPEG ( ) is a commonly used method of lossy compression In information technology, lossy compression or irreversible compression is the class of data encoding methods that uses inexact approximations and partial data discarding to represe ...

JPEG
, a
lossy compression In information technology, lossy compression or irreversible compression is the class of data compression, data encoding methods that uses inexact approximations and partial data discarding to represent the content. These techniques are used to r ...
format which was introduced by the
Joint Photographic Experts Group The Joint Photographic Experts Group (JPEG) is the joint committee between ISO The International Organization for Standardization (ISO ) is an international standard An international standard is a technical standard A technical standard is an ...
(JPEG) in 1992. JPEG greatly reduces the amount of data required to represent an image at the cost of a relatively small reduction in image quality and has become the most widely used
image file format Image file formats are standardized means of organizing and storing digital image A digital image is an composed of s, also known as ''pixels'', each with ', ' of numeric representation for its or that is an output from its fed as input ...
. Its highly efficient DCT-based compression algorithm was largely responsible for the wide proliferation of
digital image A digital image is an image An image (from la, imago) is an artifact that depicts visual perception Visual perception is the ability to interpret the surrounding environment (biophysical), environment through photopic vision (day ...
s and
digital photo Digital photography uses cameras A camera is an optical Optics is the branch of physics Physics (from grc, φυσική (ἐπιστήμη), physikḗ (epistḗmē), knowledge of nature, from ''phýsis'' 'nature'), , is the ...
s.
Lempel–Ziv–Welch (LZW) is a lossless compression algorithm developed in 1984. It is used in the GIF format, introduced in 1987. DEFLATE, a lossless compression algorithm specified in 1996, is used in the Portable Network Graphics (PNG) format.
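The core idea of LZW can be sketched in a few lines: the coder builds a dictionary of byte strings on the fly and emits one code per longest match already in the dictionary. This is a minimal illustration only; real GIF encoders add variable-width codes, a clear code, and dictionary resets.

```python
# Minimal LZW compressor sketch (illustrative, not the full GIF variant).
def lzw_compress(data: bytes) -> list[int]:
    # Start with a dictionary of all single-byte strings (codes 0-255).
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    out = []
    current = b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate          # keep extending the match
        else:
            out.append(table[current])   # emit code for the longest match
            table[candidate] = next_code # grow the dictionary as we go
            next_code += 1
            current = bytes([byte])
    if current:
        out.append(table[current])
    return out

# 24 input bytes come out as 16 codes; repeats hit the growing dictionary.
codes = lzw_compress(b"TOBEORNOTTOBEORTOBEORNOT")
```

Decompression rebuilds the identical dictionary from the code stream itself, which is why no dictionary needs to be transmitted.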
Wavelet compression, the use of wavelets in image compression, began after the development of DCT coding. The JPEG 2000 standard was introduced in 2000. In contrast to the DCT algorithm used by the original JPEG format, JPEG 2000 instead uses discrete wavelet transform (DWT) algorithms. JPEG 2000 technology, which includes the Motion JPEG 2000 extension, was selected as the video coding standard for digital cinema in 2004.
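The high-pass/low-pass split behind wavelet coding can be sketched with the Haar wavelet, the simplest DWT: pairwise averages give a half-size approximation, pairwise differences give the detail. (JPEG 2000 itself uses the more sophisticated CDF 5/3 and 9/7 wavelets; Haar is chosen here only for clarity.)

```python
# One level of the Haar wavelet transform on a 1-D signal.
def haar_level(signal):
    approx = [(a + b) / 2 for a, b in zip(signal[0::2], signal[1::2])]  # low-pass
    detail = [(a - b) / 2 for a, b in zip(signal[0::2], signal[1::2])]  # high-pass
    return approx, detail

def haar_inverse(approx, detail):
    out = []
    for s, d in zip(approx, detail):
        out += [s + d, s - d]  # perfect reconstruction
    return out

approx, detail = haar_level([9, 7, 3, 5, 6, 10, 2, 6])
# approx = [8.0, 4.0, 8.0, 4.0], detail = [1.0, -1.0, -2.0, -2.0]
```

In an image codec the approximation is transformed again for further levels, and the mostly-small detail coefficients are what quantize and entropy-code well.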


Audio

Audio data compression, not to be confused with dynamic range compression, has the potential to reduce the transmission bandwidth and storage requirements of audio data. Audio compression algorithms are implemented in software as audio codecs. In both lossy and lossless compression, information redundancy is reduced, using methods such as coding, quantization, discrete cosine transform and linear prediction to reduce the amount of information used to represent the uncompressed data. Lossy audio compression algorithms provide higher compression and are used in numerous audio applications including Vorbis and MP3
. These algorithms almost all rely on psychoacoustics to eliminate or reduce the fidelity of less audible sounds, thereby reducing the space required to store or transmit them. The acceptable trade-off between loss of audio quality and transmission or storage size depends upon the application. For example, one 640 MB compact disc (CD) holds approximately one hour of uncompressed high fidelity music, less than 2 hours of music compressed losslessly, or 7 hours of music compressed in the MP3 format at a medium bit rate. A digital sound recorder can typically store around 200 hours of clearly intelligible speech in 640 MB. Lossless audio compression produces a representation of digital data that can be decoded to an exact digital duplicate of the original. Compression ratios are around 50–60% of the original size, which is similar to those for generic lossless data compression. Lossless codecs use curve fitting or linear prediction as a basis for estimating the signal. Parameters describing the estimation and the difference between the estimation and the actual signal are coded separately. A number of lossless audio compression formats exist. See list of lossless codecs for a listing. Some formats are associated with a distinct system, such as
Direct Stream Transfer, used in Super Audio CD, and Meridian Lossless Packing, used in DVD-Audio, Dolby TrueHD, Blu-ray and HD DVD. Some audio file formats feature a combination of a lossy format and a lossless correction; this allows stripping the correction to easily obtain a lossy file. Such formats include MPEG-4 SLS (Scalable to Lossless), WavPack, and OptimFROG DualStream. When audio files are to be processed, either by further compression or for editing, it is desirable to work from an unchanged original (uncompressed or losslessly compressed). Processing of a lossily compressed file for some purpose usually produces a final result inferior to the creation of the same compressed file from an uncompressed original. In addition to sound editing or mixing, lossless audio compression is often used for archival storage, or as master copies.
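The predict-and-code-the-residual structure of lossless codecs can be sketched with the simplest possible predictor, "each sample equals the previous one". Real codecs such as FLAC choose higher-order predictors per block and entropy-code the residual (for example with Rice codes); this first-order version is only an illustration of the principle.

```python
# Lossless predict/residual split: store the first sample verbatim, then
# only the (typically small) error of predicting each sample from the last.
def encode_residual(samples):
    residual = [samples[0]]
    for i in range(1, len(samples)):
        residual.append(samples[i] - samples[i - 1])
    return residual

def decode_residual(residual):
    samples = [residual[0]]
    for r in residual[1:]:
        samples.append(samples[-1] + r)  # inverts exactly: lossless
    return samples

pcm = [100, 104, 107, 109, 110, 110, 108, 105]
res = encode_residual(pcm)          # [100, 4, 3, 2, 1, 0, -2, -3]
assert decode_residual(res) == pcm  # exact round trip
```

The residual values cluster near zero for smooth signals, which is what makes the subsequent entropy coding stage effective.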


Lossy audio compression

Lossy audio compression is used in a wide range of applications. In addition to standalone audio-only applications of file playback in MP3 players or computers, digitally compressed audio streams are used in most video DVDs, digital television, streaming media on the Internet, satellite and cable radio, and increasingly in terrestrial radio broadcasts. Lossy compression typically achieves far greater compression than lossless compression by discarding less-critical data based on psychoacoustic optimizations. Psychoacoustics recognizes that not all data in an audio stream can be perceived by the human auditory system. Most lossy compression reduces redundancy by first identifying perceptually irrelevant sounds, that is, sounds that are very hard to hear. Typical examples include high frequencies or sounds that occur at the same time as louder sounds. Those irrelevant sounds are coded with decreased accuracy or not at all. Due to the nature of lossy algorithms, audio quality suffers a digital generation loss when a file is decompressed and recompressed. This makes lossy compression unsuitable for storing the intermediate results in professional audio engineering applications, such as sound editing and multitrack recording. However, lossy formats such as MP3 are very popular with end-users as the file size is reduced to 5–20% of the original size and a megabyte can store about a minute's worth of music at adequate quality.
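The "megabyte per minute" figure can be checked with a little arithmetic. The text does not name a bit rate, so the common 128 kbit/s MP3 rate is assumed here for illustration:

```python
# At an assumed 128 kbit/s, one minute of MP3 audio occupies:
bitrate = 128_000             # bits per second
seconds = 60
size_bytes = bitrate * seconds / 8
print(size_bytes / 1e6)       # 0.96 -- just under one megabyte

# Compare with uncompressed CD audio: 44.1 kHz, 2 channels, 16 bits/sample.
cd_rate = 44_100 * 2 * 16     # 1,411,200 bits per second
ratio = cd_rate / bitrate     # ~11x smaller at 128 kbit/s
```

The same arithmetic at the CD rate gives roughly 635 MB per hour, consistent with the "one hour per 640 MB disc" figure quoted above for uncompressed audio.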


= Coding methods =

To determine what information in an audio signal is perceptually irrelevant, most lossy compression algorithms use transforms such as the
modified discrete cosine transform (MDCT) to convert time domain sampled waveforms into a transform domain, typically the frequency domain. Once transformed, component frequencies can be prioritized according to how audible they are. Audibility of spectral components is assessed using the absolute threshold of hearing and the principles of simultaneous masking—the phenomenon wherein a signal is masked by another signal separated by frequency—and, in some cases, temporal masking—where a signal is masked by another signal separated by time. Equal-loudness contours may also be used to weigh the perceptual importance of components. Models of the human ear-brain combination incorporating such effects are often called psychoacoustic models. Other types of lossy compressors, such as the linear predictive coding (LPC) used with speech, are source-based coders. LPC uses a model of the human vocal tract to analyze speech sounds and infer the parameters used by the model to produce them moment to moment. These changing parameters are transmitted or stored and used to drive another model in the decoder which reproduces the sound.

Lossy formats are often used for the distribution of streaming audio or interactive communication (such as in cell phone networks). In such applications, the data must be decompressed as the data flows, rather than after the entire data stream has been transmitted. Not all audio codecs can be used for streaming applications. Latency is introduced by the methods used to encode and decode the data. Some codecs will analyze a longer segment, called a ''frame'', of the data to optimize efficiency, and then code it in a manner that requires a larger segment of data at one time to decode. The inherent latency of the coding algorithm can be critical; for example, when there is a two-way transmission of data, such as with a telephone conversation, significant delays may seriously degrade the perceived quality. In contrast to the speed of compression, which is proportional to the number of operations required by the algorithm, here latency refers to the number of samples that must be analyzed before a block of audio is processed. In the minimum case, latency is zero samples (e.g., if the coder/decoder simply reduces the number of bits used to quantize the signal). Time domain algorithms such as LPC also often have low latencies, hence their popularity in speech coding for telephony. In algorithms such as MP3, however, a large number of samples have to be analyzed to implement a psychoacoustic model in the frequency domain, and latency is on the order of 23 ms.
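The transform-then-threshold pipeline can be sketched as follows. This toy version uses an FFT and a single fixed audibility threshold; a real coder uses the MDCT and a full psychoacoustic model with frequency-dependent masking thresholds, so treat every constant below as illustrative only.

```python
import numpy as np

# Toy transform coding: go to the frequency domain, drop components below
# a (fixed, illustrative) audibility threshold, coarsely quantize the rest.
def toy_transform_code(frame, threshold=0.1, step=0.05):
    spectrum = np.fft.rfft(frame) / len(frame)   # normalized spectrum
    audible = np.abs(spectrum) >= threshold      # crude "can you hear it" test
    kept = np.where(audible, spectrum, 0)        # discard inaudible components
    return np.round(kept / step) * step          # uniform quantization

t = np.arange(1024) / 44_100
# A loud 440 Hz tone plus a faint 9 kHz tone in the same frame.
frame = np.sin(2 * np.pi * 440 * t) + 0.01 * np.sin(2 * np.pi * 9_000 * t)
coded = toy_transform_code(frame)
# The strong 440 Hz component survives; the faint 9 kHz tone is dropped.
```

Only the surviving quantized coefficients would then be entropy-coded; the zeroed bins cost almost nothing to transmit.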


= Speech encoding =

Speech encoding is an important category of audio data compression. The perceptual models used to estimate what aspects of speech a human ear can hear are generally somewhat different from those used for music. The range of frequencies needed to convey the sounds of a human voice is normally far narrower than that needed for music, and the sound is normally less complex. As a result, speech can be encoded at high quality using a relatively low bit rate. This is accomplished, in general, by some combination of two approaches:

* Only encoding sounds that could be made by a single human voice.
* Throwing away more of the data in the signal—keeping just enough to reconstruct an "intelligible" voice rather than the full frequency range of human hearing.

The earliest algorithms used in speech encoding (and audio data compression in general) were the A-law algorithm and the μ-law algorithm.
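The μ-law curve itself is simple enough to state directly: with μ = 255 (the North American standard), quiet samples are expanded to use more of the 8-bit quantizer's range than loud ones. A minimal sketch of the continuous companding formula, before the quantization step:

```python
import math

MU = 255  # mu = 255 in the North American 8-bit PCM standard

def mu_law_compress(x: float) -> float:
    # x is a normalized sample in [-1, 1].
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y: float) -> float:
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# A quiet sample is boosted well above its linear level before quantization:
compressed = mu_law_compress(0.01)    # ~0.23, much larger than 0.01
restored = mu_law_expand(compressed)  # ~0.01 again
```

Practical G.711 codecs implement a piecewise-linear approximation of this curve directly on integer samples, but the logarithmic shape is the same.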


History

Early audio research was conducted at Bell Labs. There, in 1950, C. Chapin Cutler filed the patent on differential pulse-code modulation (DPCM). In 1973, Adaptive DPCM (ADPCM) was introduced by P. Cummiskey, Nikil S. Jayant and James L. Flanagan. Perceptual coding was first used for speech coding compression, with linear predictive coding (LPC). Initial concepts for LPC date back to the work of Fumitada Itakura (Nagoya University) and Shuzo Saito (Nippon Telegraph and Telephone) in 1966. During the 1970s, Bishnu S. Atal and Manfred R. Schroeder at Bell Labs developed a form of LPC called adaptive predictive coding (APC), a perceptual coding algorithm that exploited the masking properties of the human ear, followed in the early 1980s with the code-excited linear prediction (CELP) algorithm which achieved a significant compression ratio for its time. Perceptual coding is used by modern audio compression formats such as MP3 and AAC.
The discrete cosine transform (DCT), developed by Nasir Ahmed, T. Natarajan and K. R. Rao in 1974, provided the basis for the modified discrete cosine transform (MDCT) used by modern audio compression formats such as MP3, Dolby Digital, and AAC. MDCT was proposed by J. P. Princen, A. W. Johnson and A. B. Bradley in 1987, following earlier work by Princen and Bradley in 1986. The world's first commercial broadcast automation audio compression system was developed by Oscar Bonello, an engineering professor at the University of Buenos Aires. In 1983, using the psychoacoustic principle of the masking of critical bands first published in 1967, he started developing a practical application based on the recently developed IBM PC computer, and the broadcast automation system was launched in 1987 under the name Audicom. Twenty years later, almost all the radio stations in the world were using similar technology manufactured by a number of companies. A literature compendium for a large variety of audio coding systems was published in the IEEE's ''Journal on Selected Areas in Communications'' (''JSAC''), in February 1988. While there were some papers from before that time, this collection documented an entire variety of finished, working audio coders, nearly all of them using perceptual techniques and some kind of frequency analysis and back-end noiseless coding.


Video

Uncompressed video requires a very high data rate. Although lossless video compression codecs perform at a compression factor of 5 to 12, a typical H.264 lossy compression video has a compression factor between 20 and 200. The two key video compression techniques used in video coding standards are the discrete cosine transform (DCT) and motion compensation (MC). Most video coding standards, such as the H.26x and MPEG formats, typically use motion-compensated DCT video coding (block motion compensation). Most video codecs are used alongside audio compression techniques to store the separate but complementary data streams as one combined package using so-called ''container formats''.


Encoding theory

Video data may be represented as a series of still image frames. Such data usually contains abundant amounts of spatial and temporal redundancy. Video compression algorithms attempt to reduce redundancy and store information more compactly. Most video compression formats and codecs exploit both spatial and temporal redundancy (e.g. through difference coding with motion compensation). Similarities can be encoded by only storing differences between e.g. temporally adjacent frames (inter-frame coding) or spatially adjacent pixels (intra-frame coding).
Inter-frame An inter frame is a frame in a video compression In signal processing, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression i ...
compression (a temporal
delta encoding Delta encoding is a way of storing or transmitting data Data are units of information Information can be thought of as the resolution of uncertainty; it answers the question of "What an entity is" and thus defines both its essence and t ...
) (re)uses data from one or more earlier or later frames in a sequence to describe the current frame.
Intra-frame coding, on the other hand, uses only data from within the current frame, effectively being still-image compression. The intra-frame video coding formats used in camcorders and video editing employ simpler compression that uses only intra-frame prediction. This simplifies video editing software, as it prevents a situation in which a compressed frame refers to data that the editor has deleted.

Usually, video compression additionally employs lossy compression techniques like quantization that reduce aspects of the source data that are (more or less) irrelevant to human visual perception, exploiting perceptual features of human vision. For example, small differences in color are more difficult to perceive than changes in brightness. Compression algorithms can average a color across these similar areas to reduce space, in a manner similar to that used in JPEG image compression. As in all lossy compression, there is a trade-off between video quality and bit rate
, the cost of processing the compression and decompression, and system requirements. Highly compressed video may present visible or distracting artifacts.

Methods other than the prevalent DCT-based transform formats, such as fractal compression, matching pursuit and the use of a discrete wavelet transform (DWT), have been the subject of some research, but are typically not used in practical products (except for the use of wavelet coding as still-image coders without motion compensation). Interest in fractal compression seems to be waning, due to recent theoretical analysis showing a comparative lack of effectiveness of such methods.
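As a flavor of the wavelet-based alternative mentioned above, one level of the Haar transform (the simplest DWT) splits a signal into a coarse approximation and fine detail coefficients, after which the details can be quantized or discarded. A minimal illustrative sketch in Python, not any production wavelet coder:

```python
def haar_dwt_1d(signal):
    """One level of the Haar DWT: pairwise averages (approximation)
    and pairwise half-differences (detail). Requires even length."""
    approx = [(a + b) / 2 for a, b in zip(signal[::2], signal[1::2])]
    detail = [(a - b) / 2 for a, b in zip(signal[::2], signal[1::2])]
    return approx, detail

def haar_idwt_1d(approx, detail):
    """Exact inverse: interleave sums and differences."""
    out = []
    for s, d in zip(approx, detail):
        out += [s + d, s - d]
    return out

row = [9, 7, 3, 5]
a, d = haar_dwt_1d(row)            # a = [8.0, 4.0], d = [1.0, -1.0]
assert haar_idwt_1d(a, d) == row   # lossless reconstruction
```

In an image coder the transform is applied recursively to the approximation band, and the (mostly small) detail coefficients are quantized, which is where the loss is introduced.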


Inter-frame coding

Inter-frame coding works by comparing each frame in the video with the previous one. Individual frames of a video sequence are compared from one frame to the next, and the video compression codec sends only the differences from the reference frame. If the frame contains areas where nothing has moved, the system can simply issue a short command that copies that part of the previous frame into the next one. If sections of the frame move in a simple manner, the compressor can emit a (slightly longer) command that tells the decompressor to shift, rotate, lighten, or darken the copy. This longer command still remains much shorter than the data generated by intra-frame compression. Usually, the encoder will also transmit a residue signal which describes the remaining more subtle differences from the reference imagery. Using entropy coding, these residue signals have a more compact representation than the full signal. In areas of video with more motion, the compression must encode more data to keep up with the larger number of pixels that are changing. Commonly during explosions, flames, flocks of animals, and in some panning shots, the high-frequency detail leads to quality decreases or to increases in the variable bitrate.
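The copy-and-shift commands described above can be sketched as a one-dimensional block-matching search (real codecs match two-dimensional blocks over much larger search ranges; all names here are illustrative):

```python
def encode_block(prev_row, cur_block, start, search=2):
    """Find the shift (within +/- search pixels) whose reference best
    matches the block, then return that shift plus the leftover residue."""
    best_shift, best_sad = 0, float("inf")
    for shift in range(-search, search + 1):
        s = start + shift
        if s < 0 or s + len(cur_block) > len(prev_row):
            continue
        ref = prev_row[s:s + len(cur_block)]
        # sum of absolute differences: a common block-matching cost
        sad = sum(abs(c - r) for c, r in zip(cur_block, ref))
        if sad < best_sad:
            best_shift, best_sad = shift, sad
    ref = prev_row[start + best_shift:start + best_shift + len(cur_block)]
    residue = [c - r for c, r in zip(cur_block, ref)]
    return best_shift, residue

prev_row = [0, 0, 9, 9, 9, 0, 0, 0]
cur_row  = [0, 0, 0, 9, 9, 9, 0, 0]   # the bright patch moved one pixel right
shift, residue = encode_block(prev_row, cur_row[3:6], 3)
assert shift == -1            # "copy from one pixel left in the previous frame"
assert residue == [0, 0, 0]   # a perfect match leaves nothing to transmit
```

When the match is imperfect, the nonzero residue is what gets transformed, quantized, and entropy-coded.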


Hybrid block-based transform formats

Today, nearly all commonly used video compression methods (e.g., those in standards approved by the ITU-T or ISO) share the same basic architecture that dates back to H.261, which was standardized in 1988 by the ITU-T. They mostly rely on the DCT, applied to rectangular blocks of neighboring pixels, and temporal prediction using motion vectors, as well as, nowadays, an in-loop filtering step.

In the prediction stage, various deduplication and difference-coding techniques are applied that help decorrelate data and describe new data based on already transmitted data. Then rectangular blocks of (residue) pixel data are transformed to the frequency domain to ease targeting irrelevant information in quantization and for some spatial redundancy reduction. The discrete cosine transform (DCT) that is widely used in this regard was introduced by N. Ahmed, T. Natarajan and K. R. Rao in 1974. In the main lossy processing stage, that data gets quantized in order to reduce information that is irrelevant to human visual perception. In the last stage, statistical redundancy gets largely eliminated by an entropy coder which often applies some form of arithmetic coding.

In an additional in-loop filtering stage, various filters can be applied to the reconstructed image signal. Because these filters are computed inside the encoding loop, they can help compression: they can be applied to reference material before it gets used in the prediction process, and they can be guided using the original signal. The most popular example is the deblocking filter, which blurs out blocking artifacts from quantization discontinuities at transform block boundaries.
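The transform-then-quantize pipeline can be illustrated with a one-dimensional toy version (real codecs apply a 2-D DCT to 8x8 or larger blocks; the sample values and quantization step here are arbitrary):

```python
import math

def dct_ii(x):
    """Orthonormal 1-D DCT-II, the transform at the heart of these codecs."""
    N = len(x)
    out = []
    for k in range(N):
        c = sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N) for n in range(N))
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        out.append(scale * c)
    return out

def quantize(coeffs, step):
    """The lossy step: round each coefficient to a multiple of `step`."""
    return [round(c / step) for c in coeffs]

# A smooth 8-sample block: after the DCT almost all of the energy sits in
# the first coefficients, so quantization zeroes most of the rest, and the
# surviving integers are what an entropy coder would then compress.
block = [52, 54, 56, 58, 60, 62, 64, 66]
q = quantize(dct_ii(block), step=16)
assert q[0] != 0 and q[2:] == [0] * 6
```

The decoder multiplies the quantized values back by the step and applies the inverse transform, recovering an approximation of the block; the rounding is exactly where the information loss occurs.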


History

In 1967, A. H. Robinson and C. Cherry proposed a run-length encoding bandwidth compression scheme for the transmission of analog television signals.
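The run-length idea behind that scheme can be sketched in modern terms (Python here, purely illustrative; Robinson and Cherry's system operated on analog scan lines):

```python
from itertools import groupby

def rle_encode(values):
    """Collapse runs of repeated values into (value, run_length) pairs."""
    return [(v, len(list(g))) for v, g in groupby(values)]

def rle_decode(pairs):
    """Expand each (value, run_length) pair back into its run."""
    return [v for v, n in pairs for _ in range(n)]

# A scan line with long flat runs compresses well; noisy data would not.
line = [0, 0, 0, 0, 7, 7, 0, 0, 0]
pairs = rle_encode(line)
assert pairs == [(0, 4), (7, 2), (0, 3)]
assert rle_decode(pairs) == line
```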
The discrete cosine transform (DCT), which is fundamental to modern video compression, was introduced by Nasir Ahmed, T. Natarajan and K. R. Rao in 1974. H.261, which debuted in 1988, commercially introduced the prevalent basic architecture of video compression technology. It was the first video coding format based on DCT compression, which would subsequently become the standard for all of the major video coding formats that followed. H.261 was developed by a number of companies, including Hitachi, PictureTel, NTT, BT and Toshiba.

The most popular video coding standards used for codecs have been the MPEG standards. MPEG-1 was developed by the Moving Picture Experts Group (MPEG) in 1991, and it was designed to compress VHS-quality video. It was succeeded in 1994 by MPEG-2/H.262, which was developed by a number of companies, primarily Sony, Thomson and Mitsubishi Electric. MPEG-2 became the standard video format for DVD and SD digital television. In 1999, it was followed by MPEG-4/H.263, which was a major leap forward for video compression technology. It was developed by a number of companies, primarily Mitsubishi Electric, Hitachi and Panasonic.

The most widely used video coding format is H.264/MPEG-4 AVC. It was developed in 2003 by a number of organizations, primarily Panasonic, Godo Kaisha IP Bridge and LG Electronics. AVC commercially introduced the modern context-adaptive binary arithmetic coding (CABAC) and context-adaptive variable-length coding (CAVLC) algorithms. AVC is the main video encoding standard for Blu-ray Discs, and is widely used by video sharing websites and streaming internet services such as YouTube, Netflix, Vimeo, and the iTunes Store, web software such as Adobe Flash Player and Microsoft Silverlight, and various HDTV broadcasts over terrestrial and satellite television.


Genetics

Genetics compression algorithms are the latest generation of lossless algorithms that compress data (typically sequences of nucleotides) using both conventional compression algorithms and genetic algorithms adapted to the specific datatype. In 2012, a team of scientists from Johns Hopkins University published a genetic compression algorithm that does not use a reference genome for compression. HAPZIPPER was tailored for HapMap data and achieves over 20-fold compression (a 95% reduction in file size), providing 2- to 4-fold better compression while being less computationally intensive than the leading general-purpose compression utilities. For this, Chanda, Elhaik, and Bader introduced MAF-based encoding (MAFE), which reduces the heterogeneity of the dataset by sorting SNPs by their minor allele frequency, thus homogenizing the dataset. Other algorithms developed in 2009 and 2013 (DNAZip and GenomeZip) have compression ratios of up to 1200-fold, allowing 6 billion basepair diploid human genomes to be stored in 2.5 megabytes (relative to a reference genome or averaged over many genomes). For a benchmark in genetics/genomics data compressors, see
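The MAF-based reordering at the heart of MAFE can be sketched as follows; this is a toy illustration of the sorting idea only, with invented data, not the published HAPZIPPER pipeline:

```python
def minor_allele_frequency(column):
    """Fraction of samples carrying the rarer of the two alleles at a SNP."""
    ones = sum(column)
    return min(ones, len(column) - ones) / len(column)

def sort_snps_by_maf(snp_columns):
    """Group SNPs with similar minor allele frequency together, which makes
    the matrix more homogeneous and easier for a downstream coder to compress."""
    return sorted(snp_columns, key=minor_allele_frequency)

# Each inner list is one SNP across five samples (alleles coded 0/1).
snps = [[0, 1, 0, 1, 1], [0, 0, 0, 0, 1], [1, 1, 1, 0, 1]]
ordered = sort_snps_by_maf(snps)
assert [minor_allele_frequency(c) for c in ordered] == [0.2, 0.2, 0.4]
```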


Outlook and currently unused potential

It is estimated that the total amount of data stored on the world's storage devices could be further compressed with existing compression algorithms by a remaining average factor of 4.5:1. It is estimated that the combined technological capacity of the world to store information provided 1,300 exabytes of hardware digits in 2007; when the corresponding content is optimally compressed, this represents only 295 exabytes of Shannon information.


See also

* Auditory masking
* HTTP compression
* Kolmogorov complexity
* Magic compression algorithm
* Minimum description length
* Modulo-N code
* Motion coding
* Perceptual audio coder
* Range encoding
* Sub-band coding
* Universal code (data compression)
* Vector quantization


References


External links


* Data Compression Basics (Video)
* Video compression 4:2:2 10-bit and its benefits
* Why does 10-bit save bandwidth (even when content is 8-bit)?
* Which compression technology should be used
* Wiley – Introduction to Compression Theory
* EBU subjective listening tests on low-bitrate audio codecs
* Audio Archiving Guide: Music Formats (Guide for helping a user pick out the right codec)
* hydrogenaudio wiki comparison
* Introduction to Data Compression, by Guy E. Blelloch from Carnegie Mellon University
* HD Greetings – 1080p Uncompressed source material for compression testing and research
* [http://www.soundexpert.info/ Interactive blind listening tests of audio codecs over the internet]
* TestVid – 2,000+ HD and other uncompressed source video clips for compression testing
* Videsignline – Intro to Video Compression
* Data Footprint Reduction Technology
