Block-matching Algorithm
   HOME

TheInfoList



OR:

A Block Matching Algorithm is a way of locating matching
macroblock The macroblock is a processing unit in image and video compression formats based on linear block transforms, typically the discrete cosine transform (DCT). A macroblock typically consists of 16×16 samples, and is further subdivided into transform ...
s in a sequence of
digital video Digital video is an electronic representation of moving visual images (video) in the form of encoded digital data. This is in contrast to analog video, which represents moving visual images in the form of analog signals. Digital video comprises ...
frames for the purposes of
motion estimation Motion estimation is the process of determining ''motion vectors'' that describe the transformation from one 2D image to another; usually from adjacent frames in a video sequence. It is an ill-posed problem as the motion is in three dimensions b ...
. The underlying supposition behind motion estimation is that the patterns corresponding to objects and background in a frame of video sequence move within the frame to form corresponding objects on the subsequent frame. This can be used to discover temporal redundancy in the video sequence, increasing the effectiveness of inter-frame
video compression In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression ...
by defining the contents of a macroblock by reference to the contents of a known macroblock which is minimally different. A block matching algorithm involves dividing the current
frame A frame is often a structural system that supports other components of a physical construction and/or steel frame that limits the construction's extent. Frame and FRAME may also refer to: Physical objects In building construction *Framing (con ...
of a video into macroblocks and comparing each of the macroblocks with a corresponding block and its adjacent neighbors in a nearby frame of the video (sometimes just the previous one). A
vector Vector most often refers to: *Euclidean vector, a quantity with a magnitude and a direction *Vector (epidemiology), an agent that carries and transmits an infectious pathogen into another living organism Vector may also refer to: Mathematic ...
is created that models the movement of a macroblock from one location to another. This movement, calculated for all the macroblocks comprising a frame, constitutes the motion estimated in a frame. The search area for a good macroblock match is decided by the ‘search parameter’, p, where p is the number of
pixels In digital imaging, a pixel (abbreviated px), pel, or picture element is the smallest addressable element in a raster image, or the smallest point in an all points addressable display device. In most digital display devices, pixels are the sm ...
on all four sides of the corresponding macro-block in the previous frame. The search parameter is a measure of motion. The larger the value of p, larger is the potential motion and the possibility for finding a good match. A full search of all potential blocks however is a computationally expensive task. Typical inputs are a macroblock of size 16 pixels and a search area of p = 7 pixels. Block-matching and 3D filtering makes use of this approach to solve various
image restoration Image restoration is the operation of taking a corrupt/noisy image and estimating the clean, original image. Corruption may come in many forms such as motion blur, noise and camera mis-focus. Image restoration is performed by reversing the process ...
inverse problems An inverse problem in science is the process of calculating from a set of observations the causal factors that produced them: for example, calculating an image in X-ray computed tomography, source reconstruction in acoustics, or calculating the ...
such as
noise reduction Noise reduction is the process of removing noise from a signal. Noise reduction techniques exist for audio and images. Noise reduction algorithms may distort the signal to some degree. Noise rejection is the ability of a circuit to isolate an und ...
and
deblurring Deblurring is the process of removing blurring artifacts from images. Deblurring recovers a sharp image ''S'' from a blurred image ''B'', where ''S'' is convolved with ''K'' (the blur kernel) to generate ''B''. Mathematically, this can be represen ...
in both still images and
digital video Digital video is an electronic representation of moving visual images (video) in the form of encoded digital data. This is in contrast to analog video, which represents moving visual images in the form of analog signals. Digital video comprises ...
.


Motivation

Motion estimation is the process of determining motion vectors that describe the transformation from one 2D image to another; usually from adjacent frames in a video sequence. The motion vectors may relate to the whole image (global motion estimation) or specific parts, such as rectangular blocks, arbitrary shaped patches or even per pixel. The motion vectors may be represented by a translational model or many other models that can approximate the motion of a real video camera, such as rotation and translation in all three dimensions and zoom. Applying the motion vectors to an image to predict the transformation to another image, on account of moving camera or object in the image is called
motion compensation Motion compensation in computing, is an algorithmic technique used to predict a frame in a video, given the previous and/or future frames by accounting for motion of the camera and/or objects in the video. It is employed in the encoding of video d ...
. The combination of motion estimation and motion compensation is a key part of
video compression In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression ...
as used by
MPEG The Moving Picture Experts Group (MPEG) is an alliance of working groups established jointly by International Organization for Standardization, ISO and International Electrotechnical Commission, IEC that sets standards for media coding, includ ...
1, 2 and 4 as well as many other
video codecs The following is a list of compression formats and related codecs. Audio compression formats Non-compression * Linear pulse-code modulation (LPCM, generally only described as PCM) is the format for uncompressed audio in media files and it is al ...
. Motion estimation based video compression helps in saving bits by sending encoded difference images which have inherently less entropy as opposed to sending a fully coded frame. However, the most computationally expensive and resource extensive operation in the entire compression process is motion estimation. Hence, fast and computationally inexpensive algorithms for motion estimation is a need for video compression.


Evaluation Metrics

A metric for matching a macroblock with another block is based on a cost function. The most popular in terms of computational expense is: Mean difference or Mean Absolute Difference (MAD) = \frac\sum_^\sum_^ , C_-R_,
Mean Squared Error In statistics, the mean squared error (MSE) or mean squared deviation (MSD) of an estimator (of a procedure for estimating an unobserved quantity) measures the average of the squares of the errors—that is, the average squared difference between ...
(MSE) = \frac\sum_^\sum_^ (C_-R_)^2 where N is the size of the macro-block, and C_ and R_ are the pixels being compared in current macroblock and reference macroblock, respectively. The motion compensated image that is created using the motion vectors and macroblocks from the reference frame is characterized by
Peak signal-to-noise ratio Peak signal-to-noise ratio (PSNR) is an engineering term for the ratio between the maximum possible power of a signal and the power of corrupting noise that affects the fidelity of its representation. Because many signals have a very wide dynamic ...
(PSNR), \text = 10 \log_\frac


Algorithms

Block Matching algorithms have been researched since mid-1980s. Many algorithms have been developed, but only some of the most basic or commonly used have been described below.


Exhaustive Search

This algorithm calculates the cost function at each possible location in the search window. This leads to the best possible match of the macro-block in the reference frame with a block in another frame. The resulting motion compensated image has highest peak signal-to-noise ratio as compared to any other block matching algorithm. However this is the most computationally extensive block matching algorithm among all. A larger search window requires greater number of computations.


Optimized hierarchical block matching (OHBM)

The optimized hierarchical block matching (OHBM) algorithm speeds up the exhaustive search based on the optimized image pyramids.


Three Step Search

It is one of the earliest fast block matching algorithms. It runs as follows: # Start with search location at center # Set step size S = 4 and search parameter p = 7 # Search 8 locations +/- S pixels around location (0,0) and the location (0,0) # Pick among the 9 locations searched, the one with minimum cost function # Set the new search origin to the above picked location # Set the new step size as S = S/2 # Repeat the search procedure until S = 1 The resulting location for S=1 is the one with minimum cost function and the macro block at this location is the best match. There is a reduction in computation by a factor of 9 in this algorithm. For p=7, while ES evaluates cost for 225 macro-blocks, TSS evaluates only for 25 macro blocks.


Two Dimensional Logarithmic Search

TDLS is closely related to TSS however it is more accurate for estimating motion vectors for a large search window size. The algorithm can be described as follows, # Start with search location at the center # Select an initial step size say, S = 8 # Search for 4 locations at a distance of S from center on the X and Y axes # Find the location of point with least cost function # If a point other than center is the best matching point, ## Select this point as the new center ## Repeat steps 2 to 3 # If the best matching point is at the center, set S = S/2 # If S = 1, all 8 locations around the center at a
distance Distance is a numerical or occasionally qualitative measurement of how far apart objects or points are. In physics or everyday usage, distance may refer to a physical length or an estimation based on other criteria (e.g. "two counties over"). ...
S are searched # Set the motion vector as the point with least cost function


New Three Step Search

TSS uses a uniformly allocated checking pattern and is prone to miss small motions. NTSS is an improvement over TSS as it provides a center biased search scheme and has provisions to stop halfway to reduce the computational cost. It was one of the first widely accepted fast algorithms and frequently used for implementing earlier standards like
MPEG The Moving Picture Experts Group (MPEG) is an alliance of working groups established jointly by International Organization for Standardization, ISO and International Electrotechnical Commission, IEC that sets standards for media coding, includ ...
1 and H.261. The algorithm runs as follows: # Start with search location at center # Search 8 locations +/- S pixels with S = 4 and 8 locations +/- S pixels with S = 1 around location (0,0) # Pick among the 16 locations searched, the one with minimum cost function # If the minimum cost function occurs at origin, stop the search and set motion vector to (0,0) # If the minimum cost function occurs at one of the 8 locations at S = 1, set the new search origin to this location ## Check adjacent weights for this location, depending on location it may check either 3 or 5 points # The one that gives lowest weight is the closest match, set the motion vector to that location # If the lowest weight after the first step was one of the 8 locations at S = 4, the normal TSS procedure follows ## Pick among the 9 locations searched, the one with minimum cost function ## Set the new search origin to the above picked location ## Set the new step size as S = S/2 ## Repeat the search procedure until S = 1 Thus this algorithm checks 17 points for each macro-block and the worst-case scenario involves checking 33 locations, which is still much faster than TSS


Simple and Efficient Search

The idea behind TSS is that the error surface due to motion in every macro block is
unimodal In mathematics, unimodality means possessing a unique mode. More generally, unimodality means there is only a single highest value, somehow defined, of some mathematical object. Unimodal probability distribution In statistics, a unimodal pr ...
. A unimodal surface is a bowl shaped surface such that the weights generated by the cost function increase monotonically from the global minimum. However a unimodal surface cannot have two minimums in opposite directions and hence the 8 point fixed pattern search of TSS can be further modified to incorporate this and save computations. SES is the extension of TSS that incorporates this assumption. SES algorithm improves upon TSS algorithm as each search step in SES is divided into two phases: • First Phase : • Divide the area of search in four quadrants • Start search with three locations, one at center (A) and others (B and C), S=4 locations away from A in orthogonal directions • Find points in search quadrant for second phase using the weight distribution for A, B, C: • If (MAD(A)>=MAD(B) and MAD(A)>=MAD(C)), select points in second phase quadrant IV • If (MAD(A)>=MAD(B) and MAD(A)<=MAD(C)), select points in second phase quadrant I • If (MAD(A)=MAD(C)), select points in second phase quadrant III • Second Phase: • Find the location with lowest weight • Set the new search origin as the point found above • Set the new step size as S = S/2 • Repeat the SES search procedure until S=1 • Select the location with lowest weight as motion vector SES is computationally very efficient as compared to TSS. However the peak signal-to-noise ratio achieved is poor as compared to TSS as the error surfaces are not strictly unimodal in reality.


Four Step Search

Four Step Search is an improvement over TSS in terms of lower computational cost and better peak signal-to-noise ratio. Similar to NTSS, FSS also employs center
bias Bias is a disproportionate weight ''in favor of'' or ''against'' an idea or thing, usually in a way that is closed-minded, prejudicial, or unfair. Biases can be innate or learned. People may develop biases for or against an individual, a group, ...
ed searching and has a halfway stop provision. The algorithm runs as follows: # Start with search location at center # Set step size S = 2, (irrespective of search parameter p) # Search 8 locations +/- S pixels around location (0,0) # Pick among the 9 locations searched, the one with minimum cost function # If the minimum weight is found at center for search window: ## Set the new step size as S = S/2 (that is S = 1) ## Repeat the search procedure from steps 3 to 4 ## Select location with the least weight as motion vector # If the minimum weight is found at one of the 8 locations other than the center: ## Set the new origin to this location ## Fix the step size as S = 2 ## Repeat the search procedure from steps 3 to 4. Depending on location of new origin, search through 5 locations or 3 locations ## Select the location with the least weight ## If the least weight location is at the center of new window go to step 5, else go to step 6


Diamond Search

Diamond Search (DS) algorithm uses a diamond search point pattern and the algorithm runs exactly the same as 4SS. However, there is no limit on the number of steps that the algorithm can take. Two different types of fixed patterns are used for search, * Large Diamond Search Pattern (LDSP) * Small Diamond Search Pattern (SDSP) The algorithm runs as follows: * LDSP: ## Start with search location at center ## Set step size S = 2 ## Search 8 locations pixels (X,Y) such that (, X, +, Y, =S) around location (0,0) using a diamond search point pattern ## Pick among the 9 locations searched, the one with minimum cost function ## If the minimum weight is found at center for search window, go to SDSP step ## If the minimum weight is found at one of the 8 locations other than the center, set the new origin to this location ## Repeat LDSP * SDSP: ## Set the new search origin ## Set the new step size as S = S/2 (that is S = 1) ## Repeat the search procedure to find location with least weight ## Select location with the least weight as motion vector This algorithm finds the global minimum very accurately as the search pattern is neither too big nor too small. Diamond Search algorithm has a peak signal-to-noise ratio close to that of Exhaustive Search with significantly less computational expense.


Adaptive Rood Pattern Search

Adaptive rood pattern search (ARPS) algorithm makes use of the fact that the general motion in a frame is usually
coherent Coherence, coherency, or coherent may refer to the following: Physics * Coherence (physics), an ideal property of waves that enables stationary (i.e. temporally and spatially constant) interference * Coherence (units of measurement), a deri ...
, i.e. if the macro blocks around the current macro block moved in a particular direction then there is a high
probability Probability is the branch of mathematics concerning numerical descriptions of how likely an Event (probability theory), event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and ...
that the current macro block will also have a similar motion vector. This algorithm uses the motion vector of the macro block to its immediate left to predict its own motion vector. Adaptive rood pattern search runs as follows: # Start with search location at the center (origin) # Find the predicted motion vector for the block # Set step size S = max (, X, ,, Y, ), where (X,Y) is the
coordinate In geometry, a coordinate system is a system that uses one or more numbers, or coordinates, to uniquely determine the position of the points or other geometric elements on a manifold such as Euclidean space. The order of the coordinates is sign ...
of predicted motion vector # Search for rood pattern distributed points around the origin at step size S # Set the point with least weight as origin # Search using small diamond search pattern (SDSP) around the new origin # Repeat SDSP search until least weighted point is at the center of SDSP Rood pattern search directly puts the search in an area where there is a high probability of finding a good matching block. The main advantage of ARPS over DS is if the predicted motion vector is (0, 0), it does not waste computational time in doing LDSP, but it directly starts using SDSP. Furthermore, if the predicted motion vector is far away from the center, then again ARPS saves on computations by directly jumping to that vicinity and using SDSP, whereas DS takes its time doing LDSP.


References


External links

1. http://www.mathworks.com/matlabcentral/fileexchange/8761-block-matching-algorithms-for-motion-estimation 2. https://www.ece.cmu.edu/~ee899/project/deepak_mid.htm {{DEFAULTSORT:Block-Matching Algorithm Film and video technology Video compression