In digital image and video processing, a color layout descriptor (CLD) is designed to capture the spatial distribution of color in an image. The feature extraction process consists of two parts: grid-based representative color selection and discrete cosine transform with quantization.

Color is the most basic quality of visual content, so colors can be used to describe and represent an image. The MPEG-7 standard tested several procedures for describing color and selected those that gave the most satisfactory results. The standard proposes different methods to obtain these descriptors, and one tool it defines to describe color is the CLD, which allows describing the color relation between sequences or groups of images.

The CLD captures the spatial layout of the representative colors on a grid superimposed on a region or image. The representation is based on coefficients of the discrete cosine transform (DCT). It is a very compact descriptor and is highly efficient in fast browsing and search applications. It can be applied to still images as well as to video segments.

Definition

The CLD is a very compact and resolution-invariant representation of color for high-speed image retrieval, and it has been designed to efficiently represent the spatial distribution of colors. This feature can be used for a wide variety of similarity-based retrieval, content filtering and visualization. It is especially useful for spatial structure-based retrieval applications. The descriptor is obtained by applying the DCT to a 2-D array of local representative colors in the Y, Cb or Cr components of the color space.

The CLD is mainly used for matching:
* Image-to-image matching
* Video clip-to-video clip matching

The CLD is one of the most precise and fastest color descriptors.

Figure: Extraction process of the CLD

Extraction

The extraction process of this color descriptor consists of four stages:
* Image partitioning
* Representative color selection
* DCT transformation
* Zigzag scanning

The MPEG-7 standard recommends using the YCbCr color space for the CLD.

Figure: Image partitioning in CLD 2

Image partitioning

In the image partitioning stage, the input picture (in RGB color space) is divided into 64 blocks (an 8×8 grid) to guarantee invariance to resolution or scale. The inputs and outputs of this step are summarized in the following table:

Representative color selection

After the image partitioning stage, a single representative color is selected from each block. Any method of selecting the representative color can be applied, but the standard recommends using the average of the pixel colors in a block, since it is simpler and the description accuracy is generally sufficient. The selection results in a tiny image icon of size 8×8. The next figure shows this process. Note that in the image of the figure, the resolution of the original image has been maintained only to facilitate its representation. The inputs and outputs of this stage are summarized in the next table:

Once the tiny image icon is obtained, it is converted from the RGB color space to YCbCr.
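The partitioning and averaging steps can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `representative_colors` and `rgb_to_ycbcr` are hypothetical, the image is assumed to be a list of rows of (r, g, b) tuples at least 8 pixels wide and high, and the conversion uses full-range BT.601 coefficients (as in JPEG), which is an assumption because the text does not say which YCbCr variant applies.

```python
def representative_colors(pixels, grid=8):
    """Average the RGB pixels of each cell in a grid x grid partition.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    Returns a grid x grid list of (r, g, b) float averages (the tiny icon).
    """
    height, width = len(pixels), len(pixels[0])
    icon = []
    for by in range(grid):
        # Integer block boundaries, so sizes need not be multiples of `grid`.
        y0, y1 = by * height // grid, (by + 1) * height // grid
        row = []
        for bx in range(grid):
            x0, x1 = bx * width // grid, (bx + 1) * width // grid
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            n = len(block)
            row.append(tuple(sum(p[c] for p in block) / n for c in range(3)))
        icon.append(row)
    return icon


def rgb_to_ycbcr(rgb):
    """Full-range BT.601 conversion (assumed variant, not stated in the text)."""
    r, g, b = rgb
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (y, cb, cr)
```

Applying `rgb_to_ycbcr` to every entry of the icon yields three 8×8 arrays (Y, Cb and Cr) ready for the transform stage.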

DCT transformation

In this stage, the luminance (Y) and the blue and red chrominance (Cb and Cr) are each transformed by an 8×8 DCT, so three sets of 64 DCT coefficients are obtained. The DCT of a 2-D array A of size M×N is calculated with the formulas below:

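B_{pq}=\alpha_p \alpha_q \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} A_{mn} \cos\frac{\pi(2m+1)p}{2M} \cos\frac{\pi(2n+1)q}{2N},\qquad 0 \le p \le M-1,\; 0 \le q \le N-1

\alpha_p=\begin{cases}\frac{1}{\sqrt{M}},&p=0 \\ \sqrt{\frac{2}{M}},&1\le p\le M-1\end{cases} \qquad \alpha_q=\begin{cases}\frac{1}{\sqrt{N}},&q=0 \\ \sqrt{\frac{2}{N}},&1\le q\le N-1\end{cases}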
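As a rough illustration, the 2-D DCT can be computed directly from its definition. The sketch below assumes the orthonormal DCT-II normalization (a scale factor of sqrt(1/M) for index 0 and sqrt(2/M) otherwise, and likewise for N); the name `dct2` is hypothetical, and the quadruple loop favors clarity over speed, which is acceptable for an 8×8 input.

```python
import math


def dct2(a):
    """Orthonormal 2-D DCT-II of an M x N matrix given as a list of lists."""
    m_size, n_size = len(a), len(a[0])

    def alpha(k, size):
        # Scale factor: sqrt(1/size) for the DC index, sqrt(2/size) otherwise.
        return math.sqrt(1.0 / size) if k == 0 else math.sqrt(2.0 / size)

    out = [[0.0] * n_size for _ in range(m_size)]
    for p in range(m_size):
        for q in range(n_size):
            total = 0.0
            for m in range(m_size):
                for n in range(n_size):
                    total += (a[m][n]
                              * math.cos(math.pi * (2 * m + 1) * p / (2 * m_size))
                              * math.cos(math.pi * (2 * n + 1) * q / (2 * n_size)))
            out[p][q] = alpha(p, m_size) * alpha(q, n_size) * total
    return out
```

For a constant 8×8 input, all the energy ends up in the coefficient at p = q = 0 and the remaining 63 coefficients are numerically zero, which is why low-frequency coefficients carry most of the layout information.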

The inputs and outputs of this stage are summarized in the next table:

Zigzag scanning

A zigzag scan is performed on each of the three sets of 64 DCT coefficients, following the schema presented in the figure. The purpose of the zigzag scan is to group the low-frequency coefficients of the 8×8 matrix. The inputs and outputs of this stage are summarized in the next table:

Finally, these three sets of scanned coefficients form the CLD of the input image.
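A zigzag scan can be sketched by sorting the matrix positions by anti-diagonal and alternating the direction on each diagonal. The traversal below follows the JPEG convention (start at the top-left, then move right); this is an assumption, since the figure with the exact schema is not reproduced here, and the name `zigzag` is hypothetical.

```python
def zigzag(block):
    """Flatten a square matrix in zigzag order (JPEG-style traversal, assumed)."""
    n = len(block)
    order = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Primary key: anti-diagonal index r + c.
        # Secondary key: row on odd diagonals, column on even ones,
        # which alternates the direction of travel.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )
    return [block[r][c] for r, c in order]
```

Because low-frequency coefficients sit near the top-left corner of the DCT matrix, they come first in the scanned list, so a shorter descriptor can be obtained by keeping only a prefix of it.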

Matching

The matching process evaluates whether two elements are equal by comparing them and calculating the distance between them. For color descriptors, the matching process evaluates whether two images are similar. The procedure is as follows:
* Given an image as input, the application attempts to find an image with a similar descriptor in a database of images.

Consider two CLDs:

\{DY, DCb, DCr\} \quad \text{and} \quad \{DY', DCb', DCr'\}

The distance between the two descriptors can be computed as:

D=\sqrt{\sum_i w_{yi}(DY_i-DY'_i)^2} + \sqrt{\sum_i w_{bi}(DCb_i-DCb'_i)^2} + \sqrt{\sum_i w_{ri}(DCr_i-DCr'_i)^2}

The subscript i represents the zigzag-scanning order of the coefficients. The coefficients can also be weighted (w) to adjust the performance of the matching process; the weights give some components of the descriptor more importance than others. From the formula it follows that:
* Two images are the same, as far as the descriptor is concerned, if the distance is 0
* Two images are similar if the distance is near 0

The matching process therefore identifies images with similar color descriptors. Since the complexity of the similarity matching process shown above is low, high-speed image matching can be achieved.
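The distance computation can be sketched as a sum, over the Y, Cb and Cr coefficient lists, of the square root of the weighted squared differences. The name `cld_distance` and the data layout (a tuple of three equal-length lists per descriptor) are assumptions made for illustration; when no weights are supplied, uniform weights of 1 are assumed.

```python
import math


def cld_distance(cld_a, cld_b, weights=None):
    """Weighted distance between two CLDs.

    Each CLD is a tuple (DY, DCb, DCr) of coefficient lists in zigzag order.
    `weights` is an optional (w_y, w_b, w_r) tuple of per-coefficient weight
    lists; uniform weights of 1 are used when it is omitted.
    """
    total = 0.0
    for ch in range(3):
        a, b = cld_a[ch], cld_b[ch]
        w = weights[ch] if weights is not None else [1.0] * len(a)
        # One square-root term per channel, as in the distance formula.
        total += math.sqrt(sum(wi * (x - y) ** 2 for wi, x, y in zip(w, a, b)))
    return total
```

A retrieval system could rank database images by this value against the query descriptor and return those with the smallest distances.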

External links

* MASTER THESIS – Color Based Image Classification and Description (Sergi Laencina Verdaguer)
* Relating visual and semantic image descriptors (J. Stauder and J. Sirot)