Digital image processing is the use of a digital computer to process digital images through an algorithm.
As a subcategory or field of
digital signal processing
, digital image processing has many advantages over
analog image processing
. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of
noise and
distortion during processing. Since images are defined over two dimensions (perhaps more), digital image processing may be modeled in the form of
multidimensional systems. The generation and development of digital image processing have been shaped mainly by three factors: first, the development of computers; second, the development of mathematics (especially the creation and improvement of discrete mathematics theory); and third, growing demand for a wide range of applications in environment, agriculture, military, industry, and medical science.
History
Many of the techniques of
digital image
processing, or digital picture processing as it often was called, were developed in the 1960s, at
Bell Laboratories, the
Jet Propulsion Laboratory,
Massachusetts Institute of Technology,
University of Maryland, and a few other research facilities, with application to
satellite imagery
,
wire-photo standards conversion,
medical imaging
,
videophone,
character recognition, and photograph enhancement. The purpose of early image processing was to improve the quality of an image for human viewing. In image processing, the input is a low-quality image, and the output is an image with improved quality. Common image processing tasks include image enhancement, restoration, encoding, and compression. The first successful application was at the American Jet Propulsion Laboratory (JPL). It used image processing techniques such as geometric correction, gradation transformation, and noise removal on the thousands of lunar photos sent back by the space probe Ranger 7 in 1964, taking into account the position of the Sun and the environment of the Moon. The successful computer mapping of the Moon's surface had a huge impact. Later, more complex image processing was performed on the nearly 100,000 photos sent back by the spacecraft, producing topographic maps, color maps, and a panoramic mosaic of the Moon, which achieved extraordinary results and laid a solid foundation for human landing on the Moon.
The cost of processing was fairly high, however, with the computing equipment of that era. That changed in the 1970s, when digital image processing proliferated as cheaper computers and dedicated hardware became available. This led to images being processed in real-time, for some dedicated problems such as
television standards conversion. As
general-purpose computer
s became faster, they started to take over the role of dedicated hardware for all but the most specialized and computer-intensive operations. With the fast computers and signal processors available in the 2000s, digital image processing has become the most common form of image processing, and is generally used because it is not only the most versatile method, but also the cheapest.
Image sensors
The basis for modern
image sensors
is
metal-oxide-semiconductor (MOS) technology,
which originates from the invention of the
MOSFET
(MOS field-effect transistor) by
Mohamed M. Atalla and
Dawon Kahng at
Bell Labs in 1959.
This led to the development of digital
semiconductor image sensors, including the
charge-coupled device (CCD) and later the
CMOS sensor.
The charge-coupled device was invented by
Willard S. Boyle
and
George E. Smith at Bell Labs in 1969. While researching MOS technology, they realized that an electric charge was the analogy of the magnetic bubble and that it could be stored on a tiny
MOS capacitor. As it was fairly straightforward to
fabricate a series of MOS capacitors in a row, they connected a suitable voltage to them so that the charge could be stepped along from one to the next.
The CCD is a semiconductor circuit that was later used in the first
digital video camera
s for
television broadcasting.
The
NMOS active-pixel sensor (APS) was invented by
Olympus
in Japan during the mid-1980s. This was enabled by advances in MOS
semiconductor device fabrication, with
MOSFET scaling
reaching smaller
micron and then sub-micron levels.
The NMOS APS was fabricated by Tsutomu Nakamura's team at Olympus in 1985. The
CMOS
active-pixel sensor (CMOS sensor) was later developed by
Eric Fossum's team at the
NASA Jet Propulsion Laboratory in 1993.
By 2007, sales of CMOS sensors had surpassed CCD sensors.
Image compression
An important development in digital
image compression
technology was the
discrete cosine transform (DCT), a
lossy compression technique first proposed by
Nasir Ahmed in 1972.
DCT compression became the basis for
JPEG
, which was introduced by the
Joint Photographic Experts Group in 1992.
JPEG compresses images down to much smaller file sizes, and has become the most widely used
image file format on the
Internet. Its highly efficient DCT compression algorithm was largely responsible for the wide proliferation of
digital images
and
digital photo
s,
with several billion JPEG images produced every day.
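As a rough illustration of why DCT-based compression works so well on photographs, the sketch below computes a naive, unnormalized DCT-II in plain Python. This is a sketch only: real codecs such as JPEG use fast factorizations on 8×8 blocks, and the function name here is illustrative, not taken from any particular codec.

```python
import math

def dct_ii(x):
    """Unnormalized DCT-II of a 1-D signal (naive O(N^2) form)."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi / N * (n + 0.5) * k)
                for n in range(N))
            for k in range(N)]

# A constant 8-sample block: all the energy lands in the DC coefficient.
# This is why smooth image regions need very few coefficients to encode.
coeffs = dct_ii([5.0] * 8)
```

Lossy compression then quantizes the near-zero high-frequency coefficients away and entropy-codes the rest.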
Digital signal processor (DSP)
Electronic
signal processing was revolutionized by the wide adoption of
MOS technology in the 1970s.
MOS integrated circuit technology was the basis for the first single-chip
microprocessors and
microcontrollers in the early 1970s,
and then the first single-chip
digital signal processor
(DSP) chips in the late 1970s.
DSP chips have since been widely used in digital image processing.
The
discrete cosine transform (DCT)
image compression
algorithm has been widely implemented in DSP chips, with many companies developing DSP chips based on DCT technology. DCTs are widely used for
encoding, decoding,
video coding,
audio coding,
multiplexing
, control signals,
signaling
,
analog-to-digital conversion, formatting
luminance
and color differences, and color formats such as
YUV444
and
YUV411
. DCTs are also used for encoding operations such as
motion estimation,
motion compensation,
inter-frame prediction,
quantization, perceptual weighting,
entropy encoding, variable encoding, and
motion vector
s, and decoding operations such as the inverse operation between different color formats (
YIQ,
YUV and
RGB) for display purposes. DCTs are also commonly used for
high-definition television (HDTV) encoder/decoder chips.
Medical imaging
In 1972, Godfrey Hounsfield, an engineer at the British company EMI, invented the X-ray computed tomography (CT) device for head diagnosis. The CT method is based on projections of a cross-section of the human head, which a computer processes to reconstruct the cross-sectional image, a procedure known as image reconstruction. In 1975, EMI successfully developed a whole-body CT device, which obtained clear tomographic images of various parts of the human body. In 1979, Hounsfield shared the Nobel Prize in Physiology or Medicine for this diagnostic technique.
Digital image processing technology for medical applications was inducted into the
Space Foundation Space Technology Hall of Fame in 1994.
Tasks
Digital image processing allows the use of much more complex algorithms, and hence, can offer both more sophisticated performance at simple tasks, and the implementation of methods which would be impossible by analogue means.
In particular, digital image processing is a concrete application of, and a practical technology based on:
*
Classification
*
Feature extraction
*
Multi-scale signal analysis
*
Pattern recognition
*
Projection
Some techniques which are used in digital image processing include:
*
Anisotropic diffusion
*
Hidden Markov models
*
Image editing
*
Image restoration
*
Independent component analysis
*
Linear filtering
*
Neural networks
*
Partial differential equations
*
Pixelation
*
Point feature matching
*
Principal components analysis
*
Self-organizing maps
*
Wavelets
Digital image transformations
Filtering
Digital filters are used to blur and sharpen digital images. Filtering can be performed by:
*
convolution with specifically designed
kernels (filter array) in the spatial domain
* masking specific frequency regions in the frequency (Fourier) domain
The following examples show both methods:
Image padding in Fourier domain filtering
Images are typically padded before being transformed to the Fourier space; the
highpass filter
ed images below illustrate the consequences of different padding techniques:
Notice that the highpass filter shows extra edges when zero padded compared to the repeated edge padding.
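The trade-off between zero padding and repeated-edge padding can be sketched in plain Python for a single row of pixels (the function names here are illustrative):

```python
def pad_zero(row, width):
    """Zero padding: extend with 0s, which creates an artificial step
    at the border that a highpass filter will report as an edge."""
    return [0] * width + row + [0] * width

def pad_replicate(row, width):
    """Repeated-edge padding: extend with the border value, so no
    spurious edge is introduced at the image boundary."""
    return [row[0]] * width + row + [row[-1]] * width

bright_row = [9, 9, 9, 9]            # a uniformly bright row of pixels
zp = pad_zero(bright_row, 2)         # sharp 9 -> 0 step at each border
rp = pad_replicate(bright_row, 2)    # stays flat at the borders
```

The extra edges in the zero-padded highpass result come exactly from those artificial steps at the border.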
Filtering code examples
MATLAB example for spatial domain highpass filtering.
img = checkerboard(20);               % generate checkerboard test image
% ************************** SPATIAL DOMAIN ***************************
klaplace = [0 -1 0; -1 5 -1; 0 -1 0]; % Laplacian filter kernel
X = conv2(img, klaplace);             % convolve test img with
                                      % 3x3 Laplacian kernel
figure()
imshow(X, [])                         % show Laplacian filtered image
title('Laplacian Edge Detection')
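For readers without MATLAB, a roughly equivalent sketch in plain Python is shown below. The helper conv2_valid is illustrative and computes only the 'valid' region (MATLAB's conv2 returns the 'full' result by default):

```python
def conv2_valid(img, kernel):
    """2-D convolution ('valid' region only), with the kernel flipped
    in both axes as true convolution requires."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(img), len(img[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            acc = 0
            for m in range(kh):
                for n in range(kw):
                    acc += img[i + m][j + n] * kernel[kh - 1 - m][kw - 1 - n]
            row.append(acc)
        out.append(row)
    return out

klaplace = [[ 0, -1,  0],
            [-1,  5, -1],
            [ 0, -1,  0]]                 # same Laplacian sharpening kernel

flat_patch = [[7] * 5 for _ in range(5)]  # a featureless gray patch
sharpened = conv2_valid(flat_patch, klaplace)
# The kernel's entries sum to 1, so a constant region passes through unchanged.
```

Edges and specks, by contrast, are amplified because the center weight 5 outweighs the four -1 neighbors only where intensity varies.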
Affine transformations
Affine transformations enable basic image transformations, including scaling, rotation, translation, mirroring, and shearing, as shown in the following examples:
To apply the affine matrix to an image, the image is converted to a matrix in which each entry corresponds to the pixel intensity at that location. Then each pixel's location can be represented as a vector (x, y), where x and y are the row and column of the pixel in the image matrix. This allows the coordinate to be multiplied by an affine-transformation matrix, which gives the position to which the pixel value will be copied in the output image.
However, to allow transformations that require translation, three-dimensional homogeneous coordinates are needed. The third dimension is usually set to a non-zero constant, usually 1, so that the new coordinate is (x, y, 1). This allows the coordinate vector to be multiplied by a 3 × 3 matrix, enabling translation shifts. So the third dimension, which is the constant 1, allows translation.
Because matrix multiplication is associative, multiple affine transformations can be combined into a single affine transformation by multiplying the matrices of the individual transformations in the order the transformations are performed. This results in a single matrix that, when applied to a point vector (x, y, 1), gives the same result as applying all the individual transformations in sequence. Thus a sequence of affine transformation matrices can be reduced to a single affine transformation matrix.
For example, 2 dimensional coordinates only allow rotation about the origin (0, 0). But 3 dimensional homogeneous coordinates can be used to first translate any point to (0, 0), then perform the rotation, and lastly translate the origin (0, 0) back to the original point (the opposite of the first translation). These 3 affine transformations can be combined into a single matrix, thus allowing rotation around any point in the image.
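The rotation-about-a-point construction above can be sketched in plain Python with 3 × 3 homogeneous matrices (the helper names are illustrative):

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices (composing homogeneous 2-D transforms)."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, x, y):
    """Apply a homogeneous transform matrix to the point (x, y, 1)."""
    v = (x, y, 1)
    return (sum(m[0][k] * v[k] for k in range(3)),
            sum(m[1][k] * v[k] for k in range(3)))

def translate(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]

def rotate(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

# Rotate 90 degrees counterclockwise about the point (2, 2):
# translate the pivot to the origin, rotate, then translate back.
# Associativity lets the three matrices collapse into one.
cx, cy = 2, 2
combined = matmul(translate(cx, cy),
                  matmul(rotate(math.pi / 2), translate(-cx, -cy)))
x_new, y_new = apply(combined, 3, 2)  # (3, 2) is one unit right of the pivot
# -> approximately (2, 3)
```

Applying `combined` once per pixel gives the same result as running the three transformations in sequence.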
Image denoising with morphology
Mathematical morphology is suitable for denoising images. Structuring elements are important in mathematical morphology.
The following examples illustrate structuring elements, with the image denoted I and the structuring element denoted B. Under one common convention, grayscale dilation and erosion are defined pointwise as

Dilation(I, B)(i, j) = max{ I(i + m, j + n) + B(m, n) : (m, n) ∈ B }, written D(I, B) = Dilation(I, B), and

Erosion(I, B)(i, j) = min{ I(i + m, j + n) − B(m, n) : (m, n) ∈ B }, written E(I, B) = Erosion(I, B).
After dilation
After erosion
Opening is simply erosion followed by dilation, while closing is the reverse. In practice, both D(I, B) and E(I, B) can be implemented by convolution.
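For the binary case with a flat structuring element, the dilation, erosion, and opening described above can be sketched in plain Python (helper names are illustrative; erosion requires the element to fit entirely inside the image):

```python
def dilate(img, se):
    """Binary dilation: a pixel becomes 1 if the structuring element,
    centered on it, overlaps any 1 in the image."""
    h, w = len(img), len(img[0])
    k = len(se) // 2
    return [[int(any(img[i + m][j + n]
                     for m in range(-k, k + 1) for n in range(-k, k + 1)
                     if se[m + k][n + k] and 0 <= i + m < h and 0 <= j + n < w))
             for j in range(w)] for i in range(h)]

def erode(img, se):
    """Binary erosion: a pixel stays 1 only if the structuring element,
    centered on it, fits entirely inside the 1-region."""
    h, w = len(img), len(img[0])
    k = len(se) // 2
    return [[int(all(0 <= i + m < h and 0 <= j + n < w and img[i + m][j + n]
                     for m in range(-k, k + 1) for n in range(-k, k + 1)
                     if se[m + k][n + k]))
             for j in range(w)] for i in range(h)]

se = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]   # 3x3 square structuring element
noisy = [[0, 0, 0, 0, 0],
         [0, 1, 0, 0, 0],                # an isolated 1-pixel speck of noise
         [0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0]]
grown = dilate(noisy, se)                # dilation spreads the speck
opened = dilate(erode(noisy, se), se)    # opening removes it entirely
```

Opening removes any foreground blob smaller than the structuring element, which is exactly why it is useful for denoising.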
To apply the denoising method to an image, the image is first converted into grayscale. A denoising mask is a logical matrix with