Pyramid, or pyramid representation, is a type of
multi-scale signal
In signal processing, a signal is a function that conveys information about a phenomenon. Any quantity that can vary over space or time can be used as a signal to share messages between observers. The ''IEEE Transactions on Signal Processing'' ...
representation developed by the
computer vision
Computer vision is an interdisciplinary scientific field that deals with how computers can gain high-level understanding from digital images or videos. From the perspective of engineering, it seeks to understand and automate tasks that the hum ...
,
image processing
An image is a visual representation of something. It can be two-dimensional, three-dimensional, or somehow otherwise feed into the visual system to convey information. An image can be an artifact, such as a photograph or other two-dimensiona ...
and
signal processing
Signal processing is an electrical engineering subfield that focuses on analyzing, modifying and synthesizing ''signals'', such as audio signal processing, sound, image processing, images, and scientific measurements. Signal processing techniq ...
communities, in which a signal or an image is subject to repeated
smoothing
In statistics and image processing, to smooth a data set is to create an approximating function (mathematics), function that attempts to capture important patterns in the data, while leaving out noise or other fine-scale structures/rapid phenomena ...
and
subsampling. Pyramid representation is a predecessor to
scale-space representation and
multiresolution analysis
A multiresolution analysis (MRA) or multiscale approximation (MSA) is the design method of most of the practically relevant discrete wavelet transforms (DWT) and the justification for the algorithm of the fast wavelet transform (FWT). It was introd ...
.
Pyramid generation
There are two main types of pyramids: lowpass and bandpass.
A lowpass pyramid is made by smoothing the image with an appropriate smoothing filter and then subsampling the smoothed image, usually by a factor of 2 along each coordinate direction. The resulting image is then subjected to the same procedure, and the cycle is repeated multiple times. Each cycle of this process results in a smaller image with increased smoothing, but with decreased spatial sampling density (that is, decreased image resolution). If illustrated graphically, the entire multi-scale representation will look like a pyramid, with the original image on the bottom and each cycle's resulting smaller image stacked one atop the other.
A bandpass pyramid is made by forming the difference between images at adjacent levels in the pyramid and performing image interpolation between adjacent levels of resolution, to enable computation of pixelwise differences.
Pyramid generation kernels
A variety of different smoothing
kernels
Kernel may refer to:
Computing
* Kernel (operating system), the central component of most operating systems
* Kernel (image processing), a matrix used for image convolution
* Compute kernel, in GPGPU programming
* Kernel method, in machine learnin ...
have been proposed for generating pyramids.
Among the suggestions that have been given, the ''binomial kernels'' arising from the
binomial coefficient
In mathematics, the binomial coefficients are the positive integers that occur as coefficients in the binomial theorem. Commonly, a binomial coefficient is indexed by a pair of integers and is written \tbinom. It is the coefficient of the t ...
s stand out as a particularly useful and theoretically well-founded class.
[ Thus, given a two-dimensional image, we may apply the (normalized) binomial filter (1/4, 1/2, 1/4) typically twice or more along each spatial dimension and then subsample the image by a factor of two. This operation may then proceed as many times as desired, leading to a compact and efficient multi-scale representation. If motivated by specific requirements, intermediate scale levels may also be generated where the subsampling stage is sometimes left out, leading to an ''oversampled'' or ''hybrid pyramid''.][ With the increasing computational efficiency of CPUs available today, it is in some situations also feasible to use wider supported ]Gaussian filter
In electronics and signal processing mainly in digital signal processing, a Gaussian filter is a filter whose impulse response is a Gaussian function (or an approximation to it, since a true Gaussian response would have infinite impulse response) ...
s as smoothing kernels in the pyramid generation steps.
Gaussian pyramid
In a Gaussian pyramid, subsequent images are weighted down using a Gaussian average (Gaussian blur
In image processing, a Gaussian blur (also known as Gaussian smoothing) is the result of blurring an image by a Gaussian function (named after mathematician and scientist Carl Friedrich Gauss).
It is a widely used effect in graphics software, ...
) and scaled down. Each pixel containing a local average corresponds to a neighborhood pixel on a lower level of the pyramid. This technique is used especially in texture synthesis
Texture synthesis is the process of algorithmically constructing a large digital image from a small digital sample image by taking advantage of its structural content. It is an object of research in computer graphics and is used in many fields, amo ...
.
Laplacian pyramid
A Laplacian pyramid is very similar to a Gaussian pyramid but saves the difference image of the blurred versions between each levels. Only the smallest level is not a difference image to enable reconstruction of the high resolution image using the difference images on higher levels. This technique can be used in image compression
Image compression is a type of data compression applied to digital images, to reduce their cost for storage or transmission. Algorithms may take advantage of visual perception and the statistical properties of image data to provide superior r ...
.
Steerable pyramid
A steerable pyramid, developed by Simoncelli and others, is an implementation of a multi-scale, multi-orientation band-pass filter
A band-pass filter or bandpass filter (BPF) is a device that passes frequencies within a certain range and rejects (attenuates) frequencies outside that range.
Description
In electronics and signal processing, a filter is usually a two-por ...
bank used for applications including image compression
Image compression is a type of data compression applied to digital images, to reduce their cost for storage or transmission. Algorithms may take advantage of visual perception and the statistical properties of image data to provide superior r ...
, texture synthesis
Texture synthesis is the process of algorithmically constructing a large digital image from a small digital sample image by taking advantage of its structural content. It is an object of research in computer graphics and is used in many fields, amo ...
, and object recognition
Object recognition – technology in the field of computer vision for finding and identifying objects in an image or video sequence. Humans recognize a multitude of objects in images with little effort, despite the fact that the image of the ...
. It can be thought of as an orientation selective version of a Laplacian pyramid, in which a bank of steerable filter
In applied mathematics, a steerable filter
is an orientation-selective convolution kernel used for image enhancement and feature extraction that can be expressed via a linear combination of a small set of rotated versions of itself. As an exam ...
s are used at each level of the pyramid instead of a single Laplacian or Gaussian filter
In electronics and signal processing mainly in digital signal processing, a Gaussian filter is a filter whose impulse response is a Gaussian function (or an approximation to it, since a true Gaussian response would have infinite impulse response) ...
.
Applications of pyramids
Alternative representation
In the early days of computer vision, pyramids were used as the main type of multi-scale representation for computing multi-scale image features
Feature may refer to:
Computing
* Feature (CAD), could be a hole, pocket, or notch
* Feature (computer vision), could be an edge, corner or blob
* Feature (software design) is an intentional distinguishing characteristic of a software item ...
from real-world image data. More recent techniques include scale-space representation, which has been popular among some researchers due to its theoretical foundation, the ability to decouple the subsampling stage from the multi-scale representation, the more powerful tools for theoretical analysis as well as the ability to compute a representation at ''any'' desired scale, thus avoiding the algorithmic problems of relating image representations at different resolution. Nevertheless, pyramids are still frequently used for expressing computationally efficient approximations to scale-space representation.[Lindeberg, T. and Bretzner, L]
Real-time scale selection in hybrid multi-scale representations
Proc. Scale-Space'03, Isle of Skye, Scotland, Springer Lecture Notes in Computer Science, volume 2695, pages 148-163, 2003.
Detail manipulation
Levels of a Laplacian pyramid can be added to or removed from the original image to amplify or reduce detail at different scales. However, detail manipulation of this form is known to produce halo artifacts in many cases, leading to the development of alternatives such as the bilateral filter
A bilateral filter is a non-linear, edge-preserving, and noise-reducing smoothing filter for images. It replaces the intensity of each pixel with a weighted average of intensity values from nearby pixels. This weight can be based on a Gaussian d ...
.
Some image compression
Image compression is a type of data compression applied to digital images, to reduce their cost for storage or transmission. Algorithms may take advantage of visual perception and the statistical properties of image data to provide superior r ...
file formats use the Adam7 algorithm
Adam7 is an interlacing algorithm for raster images, best known as the interlacing scheme optionally used in PNG images. An Adam7 interlaced image is broken into seven subimages, which are defined by replicating this 8×8 pattern across the ...
or some other interlacing technique.
These can be seen as a kind of image pyramid.
Because those file format store the "large-scale" features first, and fine-grain details later in the file,
a particular viewer displaying a small "thumbnail" or on a small screen can quickly download just enough of the image to display it in the available pixels—so one file can support many viewer resolutions, rather than having to store or generate a different file for each resolution.
See also
* Mipmap
In computer graphics, mipmaps (also MIP maps) or pyramids are pre-calculated, optimized sequences of images, each of which is a progressively lower resolution representation of the previous. The height and width of each image, or level, in the m ...
* Scale space implementation
In the areas of computer vision, image analysis and signal processing, the notion of scale-space representation is used for processing measurement data at multiple scales, and specifically enhance or suppress image features over different ranges o ...
* Level of detail
* JPEG 2000#Multiple resolution representation
References
{{reflist
External links
Gaussian-Laplacian Pyramid Image Coding
- illustrates methods of Downsampling In digital signal processing, downsampling, compression, and decimation are terms associated with the process of ''resampling'' in a multi-rate digital signal processing system. Both ''downsampling'' and ''decimation'' can be synonymous with ''comp ...
, Upsampling
In digital signal processing, upsampling, expansion, and interpolation are terms associated with the process of resampling in a multi-rate digital signal processing system. ''Upsampling'' can be synonymous with ''expansion'', or it can describe an ...
, and Gaussian
Carl Friedrich Gauss (1777–1855) is the eponym of all of the topics listed below.
There are over 100 topics all named after this German mathematician and scientist, all in the fields of mathematics, physics, and astronomy. The English eponymo ...
convolution
In mathematics (in particular, functional analysis), convolution is a operation (mathematics), mathematical operation on two function (mathematics), functions ( and ) that produces a third function (f*g) that expresses how the shape of one is ...
The Gaussian Pyramid
- provides a brief introduction for the procedure and cites several sources
Laplacian Irregular Graph Pyramid
- Figure 1 on this page illustrates an example of the Gaussian Pyramid
o
eBook Submission
Image processing
Computer vision