HOME

TheInfoList



OR:

The generalized Hough transform (GHT), introduced by Dana H. Ballard in 1981, is the modification of the
Hough transform The Hough transform is a feature extraction technique used in image analysis, computer vision, and digital image processing. The purpose of the technique is to find imperfect instances of objects within a certain class of shapes by a voting pro ...
using the principle of template matching. The Hough transform was initially developed to detect analytically defined shapes (e.g., line,
circle A circle is a shape consisting of all points in a plane that are at a given distance from a given point, the centre. Equivalently, it is the curve traced out by a point that moves in a plane so that its distance from a given point is const ...
,
ellipse In mathematics, an ellipse is a plane curve surrounding two focus (geometry), focal points, such that for all points on the curve, the sum of the two distances to the focal points is a constant. It generalizes a circle, which is the special ty ...
etc.). In these cases, we have knowledge of the shape and aim to find out its location and orientation in the image. This modification enables the Hough transform to be used to detect an arbitrary object described with its model. The problem of finding the object (described with a model) in an image can be solved by finding the model's position in the image. With the generalized Hough transform, the problem of finding the model's position is transformed to a problem of finding the transformation's parameter that maps the model into the image. Given the value of the transformation's parameter, the position of the model in the image can be determined. The original implementation of the GHT used edge information to define a mapping from orientation of an edge point to a reference point of the shape. In the case of a
binary image A binary image is one that consists of pixels that can have one of exactly two colors, usually black and white. Binary images are also called ''bi-level'' or ''two-level'', Pixelart made of two colours is often referred to as ''1-Bit'' or ''1b ...
where pixels can be either black or white, every black pixel of the image can be a black pixel of the desired pattern thus creating a
locus Locus (plural loci) is Latin for "place". It may refer to: Entertainment * Locus (comics), a Marvel Comics mutant villainess, a member of the Mutant Liberation Front * ''Locus'' (magazine), science fiction and fantasy magazine ** ''Locus Award' ...
of reference points in the Hough space. Every pixel of the image votes for its corresponding reference points. The maximum points of the Hough space indicate possible reference points of the pattern in the image. This maximum can be found by scanning the Hough space or by solving a relaxed set of equations, each of them corresponding to a black pixel.


History

Merlin and Farber showed how to use a Hough algorithm when the desired curves could not be described analytically. It was a precursor to Ballard's algorithm that was restricted to
translation Translation is the communication of the Meaning (linguistic), meaning of a #Source and target languages, source-language text by means of an Dynamic and formal equivalence, equivalent #Source and target languages, target-language text. The ...
and did not account for
rotation Rotation, or spin, is the circular movement of an object around a '' central axis''. A two-dimensional rotating object has only one possible central axis and can rotate in either a clockwise or counterclockwise direction. A three-dimensional ...
and scale changes. The Merlin-Farber algorithm is impractical for real image data as in an image with many edge pixels, it finds many false positives due to repetitive pixel arrangements.


Theory of generalized Hough transform

To generalize the Hough algorithm to non-analytic curves, Ballard defines the following parameters for a generalized shape: ''a='' where ''y'' is a reference
origin Origin(s) or The Origin may refer to: Arts, entertainment, and media Comics and manga * Origin (comics), ''Origin'' (comics), a Wolverine comic book mini-series published by Marvel Comics in 2002 * The Origin (Buffy comic), ''The Origin'' (Bu ...
for the shape, ''θ'' is its orientation, and ''s = (sx, sy)'' describes two
orthogonal In mathematics, orthogonality is the generalization of the geometric notion of ''perpendicularity''. By extension, orthogonality is also used to refer to the separation of specific features of a system. The term also has specialized meanings in ...
scale factors. An algorithm can compute the best set of parameters for a given shape from edge pixel data. These parameters do not have equal status. The reference origin location, ''y'', is described in terms of a template table called the R table of possible edge pixel orientations. The computation of the additional parameters ''s'' and ''θ'' is then accomplished by straightforward transformations to this table. The key generalization to arbitrary shapes is the use of directional information. Given any shape and a fixed reference point on it, instead of a parametric curve, the information provided by the boundary pixels is stored in the form of the R-table in the transform stage. For every edge point on the test image, the properties of the point are looked up on the R-table and reference point is retrieved and the appropriate cell in a matrix called the Accumulator matrix is incremented. The cell with maximum 'votes' in the Accumulator matrix can be a possible point of existence of fixed reference of the object in the test image.


Building the R-table

Choose a reference point ''y'' for the shape (typically chosen inside the shape). For each boundary point ''x'', compute ''ɸ(x)'', the
gradient In vector calculus, the gradient of a scalar-valued differentiable function of several variables is the vector field (or vector-valued function) \nabla f whose value at a point p is the "direction and rate of fastest increase". If the gradi ...
direction and ''r = y – x'' as shown in the image. Store ''r'' as a function of ''ɸ''. Notice that each index of ''ɸ'' may have many values of ''r''. One can either store the co-ordinate differences between the fixed reference and the edge point ''((xc – xij), (yc – yij))'' or as the radial distance and the angle between them ''(rij, αij)''. Having done this for each point, the R-table will fully represent the template object. Also, since the generation phase is invertible, we may use it to localise object occurrences at other places in the image.


Object localization

For each edge pixel ''x'' in the image, find the gradient ''ɸ'' and increment all the corresponding points ''x+r'' in the accumulator array ''A'' (initialized to a maximum size of the image) where r is a table entry indexed by ''ɸ'', i.e., ''r(ɸ)''. These entry points give us each possible position for the reference point. Although some bogus points may be calculated, given that the object exists in the image, a maximum will occur at the reference point. Maxima in ''A'' correspond to possible instances of the shape.


Generalization of scale and orientation

For a fixed orientation of shape, the accumulator array was two-dimensional in the reference point co-ordinates. To search for shapes of arbitrary orientation ''θ'' and scale ''s'', these two parameters are added to the shape description. The accumulator array now consists of four dimensions corresponding to the parameters ''(y, s, θ)''. The R-table can also be used to increment this larger dimensional space since different orientations and scales correspond to easily computed transformations of the table. Denote a particular R-table for a shape ''S'' by ''R(ɸ)''. Simple transformations to this table will allow it to detect scaled or rotated instances of the same shape. For example, if the shape is scaled by s and this transformation is denoted by ''Ts''. then ''Ts (ɸ)= sR(ɸ)'' i.e., all the vectors are scaled by ''s''. Also, if the object is rotated by ''θ'' and this transformation is denoted by ''Tθ'', then ''Tθ (ɸ)= Rot'' i.e., all the indices are incremented by – ''θ'' modulo 2π, the appropriate vectors ''r'' are found, and then they are rotated by ''θ''. Another property which will be useful in describing the composition of generalized Hough transforms is the change of reference point. If we want to choose a new reference point ''ỹ'' such that ''y-ỹ = z'' then the modification to the R-table is given by ''R(ɸ)+ z'', i.e. ''z'' is added to each vector in the table.


Alternate way using pairs of edges

A pair of edge pixels can be used to reduce the parameter space. Using the R-table and the properties as described above, each edge pixel defines a surface in the four-dimensional accumulator space of ''a = (y, s, θ)''. Two edge pixels at different orientations describe the same surface rotated by the same amount with respect to ''θ''. If these two surfaces intersect, points where they intersect will correspond to possible parameters ''a'' for the shape. Thus it is theoretically possible to use the two points in image space to reduce the locus in parameter space to a single point. However, the difficulties of finding the intersection points of the two surfaces in parameter space will make this approach unfeasible for most cases.


Composite shapes

If the shape S has a composite structure consisting of subparts ''S1'', ''S2'', .. ''SN'' and the reference points for the shapes ''S'', ''S1'', ''S2'', .. ''SN'' are ''y'', ''y1'', ''y2'', .. ''yn'', respectively, then for a scaling factor ''s'' and orientation ''θ'', the generalized Hough transform ''Rs(ɸ)'' is given by R_ = T_ \left\. The concern with this transform is that the choice of reference can greatly affect the accuracy. To overcome this, Ballard has suggested smoothing the resultant accumulator with a composite smoothing template. The composite smoothing template ''H(y)'' is given as a composite
convolution In mathematics (in particular, functional analysis), convolution is a operation (mathematics), mathematical operation on two function (mathematics), functions ( and ) that produces a third function (f*g) that expresses how the shape of one is ...
of individual smoothing templates of the sub-shapes. H(y) = \sum_^ h_(y-y_). Then the improved Accumulator is given by ''As = A*H'' and the maxima in ''As'' corresponds to possible instances of the shape.


Spatial decomposition

Observing that the global Hough transform can be obtained by the summation of local Hough transforms of disjoint sub-region, Heather and Yang proposed a method which involves the
recursive subdivision This is a glossary of terms relating to computer graphics Computer graphics deals with generating images with the aid of computers. Today, computer graphics is a core technology in digital photography, film, video games, cell phone and comp ...
of the image into sub-images, each with their own parameter space, and organized in a
quadtree A quadtree is a tree data structure in which each internal node has exactly four children. Quadtrees are the two-dimensional analog of octrees and are most often used to partition a two-dimensional space by recursively subdividing it into four q ...
structure. It results in improved efficiency in finding endpoints of line segments and improved robustness and reliability in extracting lines in noisy situations, at a slightly increased cost of memory.


Implementation

The implementation uses the following equations: : x = x_+x' \ \ or \ \ x_ = x-x' : y = y_+y' \ \ or \ \ y_ = y-y' : \cos(\pi-\alpha) = x'/r \ \ \text \ \ x' = r \cos(\pi-\alpha) = -r\cos(\alpha) : \sin(\pi-\alpha) = y'/r \ \ \text \ \ y' = r \sin(\pi-\alpha) = r\sin(\alpha) Combining the above equations we have: : x_ = x + r \cos(\alpha) : y_ = y + r \sin(\alpha) Constructing the R-table :(0) Convert the sample shape image into an edge image using any edge detecting algorithm like
Canny edge detector The Canny edge detector is an edge detection operator that uses a multi-stage algorithm to detect a wide range of edges in images. It was developed by John F. Canny in 1986. Canny also produced a ''computational theory of edge detection'' explain ...
:(1) Pick a reference point (e.g., ''(xc, yc)'') :(2) Draw a line from the reference point to the boundary :(3) Compute ''ɸ'' :(4) Store the reference point ''(xc, yc)'' as a function of ''ɸ'' in ''R(ɸ)'' table. Detection: :(0) Convert the sample shape image into an edge image using any edge detecting algorithm like
Canny edge detector The Canny edge detector is an edge detection operator that uses a multi-stage algorithm to detect a wide range of edges in images. It was developed by John F. Canny in 1986. Canny also produced a ''computational theory of edge detection'' explain ...
. :(1) Initialize the Accumulator table: ''A cmin . . . xcmaxycmin . . . ycmax]'' :(2) For each edge point ''(x, y)'' ::(2.1) Using the gradient angle ''ɸ'', retrieve from the R-table all the ''(α, r)'' values indexed under ''ɸ''. ::(2.2) For each ''(α,r)'', compute the candidate reference points: :::''xc = x + r cos(α)'' :::''yc = y + r sin(α)'' ::(2.3) Increase counters (voting): ::::''++A( xc c'' :(3) Possible locations of the object contour are given by local maxima in ''A cyc]''. ::If ''A cyc] > T'', then the object contour is located at ''(xc, yc)'' General case: Suppose the object has undergone some rotation ''Θ'' and uniform scaling ''s'': :''(x, y) --> (x″, y″)'' :''x″ = (x cos(Θ) – y sin(Θ))s'' :''y″ = (x sin(Θ) + y cos(Θ))s'' :''Replacing x by x″ and y by y″: '' :''xc = x – x″ or xc = x - (x cos(Θ) – y sin(Θ))s'' :''yc = y – y″ or yc = y - (x sin(Θ) + y cos(Θ))s'' :(1) Initialize the Accumulator table: ''A cmin . . . xcmaxycmin . . . ycmax] min . . . qmaxsmin . . . smax]'' :(2) For each edge point ''(x, y)'' ::(2.1) Using its gradient angle ''ɸ'', retrieve all the ''(α, r)'' values from the R-table ::(2.2) For each ''(α, r)'', compute the candidate reference points: ::::''x = r cos(α)'' ::::''y = r sin(α)'' ::::for(''Θ = Θmin; Θ ≤ Θmax; Θ++'') :::::for(''s = smin; s ≤ smax; s++'') ::::::''xc = x - (x cos(Θ) – y sin(Θ))s'' ::::::''yc = y - (x sin(Θ) + y cos(Θ))s'' ::::::''++(A cyc] ˜s])'' :(3) Possible locations of the object contour are given by local maxima in ''A cyc] ˜s]'' :If ''A cyc] ˜s] > T'', then the object contour is located at ''(xc, yc)'', has undergone a rotation ''Θ'', and has been scaled by ''s''.


Advantages and disadvantages


Advantages

* It is robust to partial or slightly deformed shapes (i.e., robust to recognition under occlusion). * It is robust to the presence of additional structures in the image. * It is tolerant to noise. * It can find multiple occurrences of a shape during the same processing pass.


Disadvantages

* It has substantial computational and storage requirements which become acute when object orientation and scale have to be considered.


Related work

Ballard suggested using orientation information of the edge decreasing the cost of the computation. Many efficient GHT techniques have been suggested such as the SC-GHT (Using slope and curvature as local properties). Davis and YamL. Davis and S. Yam, tp://ftp.cs.utexas.edu/pub/techreports/tr80-134.pdf "A generalized hough-like transformation for shape recognition" University of Texas Computer Sciences, TR-134, Feb 1980. also suggested an extension of Merlin's work for orientation and scale invariant matching which complement's Ballard's work but does not include Ballard's utilization of edge-slope information and composite structures


See also

*
Hough transform The Hough transform is a feature extraction technique used in image analysis, computer vision, and digital image processing. The purpose of the technique is to find imperfect instances of objects within a certain class of shapes by a voting pro ...
* Randomized Hough transform *
Radon Transform In mathematics, the Radon transform is the integral transform which takes a function ''f'' defined on the plane to a function ''Rf'' defined on the (two-dimensional) space of lines in the plane, whose value at a particular line is equal to the l ...
* Template matching *
Outline of object recognition Object recognition – technology in the field of computer vision for finding and identifying objects in an image or video sequence. Humans recognize a multitude of objects in images with little effort, despite the fact that the image of the ...


References


External links

*
OpenCV OpenCV (''Open Source Computer Vision Library'') is a library of programming functions mainly aimed at real-time computer vision. Originally developed by Intel, it was later supported by Willow Garage then Itseez (which was later acquired by In ...
implementation of generalized Hough transform http://docs.opencv.org/master/dc/d46/classcv_1_1GeneralizedHoughBallard.html * Tutorial and implementation of generalized Hough transforms http://www.itriacasa.it/generalized-hough-transform/default.html * Practical Generalized Hough transform implementation http://www.irit.fr/~Julien.Pinquier/Docs/Hough_transform.html * FPGA implementation of generalized Hough transforms, IEEE Digital Library http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=5382047&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D5382047 *
MATLAB MATLAB (an abbreviation of "MATrix LABoratory") is a proprietary multi-paradigm programming language and numeric computing environment developed by MathWorks. MATLAB allows matrix manipulations, plotting of functions and data, implementation ...
implementation of generalized Hough transform http://www.mathworks.com/matlabcentral/fileexchange/44166-generalized-hough-transform {{DEFAULTSORT:Generalised Hough Transform Image processing