Generalized structure tensor

In image analysis, the generalized structure tensor (GST) is an extension of the Cartesian structure tensor to curvilinear coordinates. It is mainly used to detect and represent the "direction" parameters of curves, just as the Cartesian structure tensor detects and represents direction in Cartesian coordinates. Curve families generated by pairs of locally orthogonal functions are the best studied. The GST is a widely known method in image and video processing and in computer vision, with applications such as biometric identification by fingerprints and studies of human tissue sections.


GST in 2D and locally orthogonal bases

Let the term image represent a function f(\xi(x,y),\eta(x,y)), where x,y are real variables and \xi, \eta and f are real-valued functions. The GST represents the direction along which the image f can undergo an infinitesimal translation with minimal (total least squares) error, along the "lines" fulfilling the following conditions:

1. The "lines" are ordinary lines in the curvilinear coordinate basis \xi,\eta,
: \cos(\theta) \xi(x,y)+\sin(\theta) \eta(x,y)= \text{constant}
which are curves in Cartesian coordinates, as the equation above shows. The error is measured in the L^2 sense, so the minimality of the error refers to the L2 norm.

2. The functions \xi(x,y), \eta(x,y) constitute a harmonic pair, i.e. they fulfill the Cauchy–Riemann equations,
: \begin{align} & \frac{\partial \xi}{\partial y}=-\frac{\partial \eta}{\partial x}, \\[4pt] & \frac{\partial \xi}{\partial x}=\frac{\partial \eta}{\partial y}. \end{align}

Accordingly, such curvilinear coordinates \xi,\eta are locally orthogonal. Then GST consists in
: GST=(\lambda_{max}-\lambda_{min}) \int w(\xi,\eta)\left[ \begin{array}{c} \frac{\partial f}{\partial \xi} \\ \frac{\partial f}{\partial \eta} \end{array} \right] \left[\frac{\partial f}{\partial \xi}, \frac{\partial f}{\partial \eta}\right] d\xi d\eta +\lambda_{min} I
where 0\le \lambda_{min}\le \lambda_{max} are the errors of (infinitesimal) translation in the best direction (designated by the angle \theta) and the worst direction (designated by \theta+\pi/2). The function w(\xi,\eta) is the window function defining the "outer scale" within which the detection of \theta is carried out; it can be omitted if it is already included in f or if f is the full image (rather than local). The matrix I is the identity matrix.

Using the chain rule, it can be shown that the integration above can be implemented as convolutions in Cartesian coordinates applied to the ordinary structure tensor when the pair \xi,\eta consists of the real and imaginary parts of an analytic function g(z),
: \begin{align} \xi(x,y)&=\Re g(z)\\ \eta(x,y)&=\Im g(z) \end{align}
where z=x+iy. Examples of analytic functions include g(z)=\log z=\log(x+iy), as well as the monomials g(z)=z^n=(x+iy)^n and g(z)=z^{n/2}=(x+iy)^{n/2}, where n is an arbitrary positive or negative integer. The monomials g(z)=z^n are also referred to as harmonic functions in computer vision and image processing.

Thereby, the Cartesian structure tensor is a special case of the GST where \xi=x and \eta=y, i.e. the harmonic function is simply g(z)=z=x+iy. Thus, by choosing a harmonic function g, one can detect all curves that are linear combinations of its real and imaginary parts by convolutions on (rectangular) image grids only, even if \xi,\eta are non-Cartesian. Furthermore, the convolution computations can be done by using complex filters applied to the complex version of the structure tensor. For this reason, GST implementations have frequently used the complex version of the structure tensor rather than the (1,1) tensor.
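The harmonic-pair condition can be checked numerically. The following is a minimal sketch, assuming NumPy and using g(z)=\log z as the example; the helper name `harmonic_pair`, the sampling patch and the finite-difference step are illustrative choices, not part of the GST literature. It evaluates \xi=\Re g and \eta=\Im g on a grid and measures how well the Cauchy–Riemann equations and the local orthogonality of the two gradients hold.

```python
import numpy as np

def harmonic_pair(g, x, y):
    """Return (xi, eta) = (Re g(z), Im g(z)) for z = x + i*y."""
    gz = g(x + 1j * y)
    return gz.real, gz.imag

# Sample a patch away from the origin and from the branch cut of log z.
axis = np.linspace(1.0, 2.0, 201)
step = axis[1] - axis[0]
x, y = np.meshgrid(axis, axis, indexing="xy")  # x varies along axis 1, y along axis 0

xi, eta = harmonic_pair(np.log, x, y)

# np.gradient returns derivatives in axis order, i.e. (d/dy, d/dx) for this layout.
xi_y, xi_x = np.gradient(xi, step)
eta_y, eta_x = np.gradient(eta, step)

# Skip the border, where np.gradient falls back to one-sided differences.
inner = (slice(1, -1), slice(1, -1))

# Cauchy-Riemann: xi_x = eta_y and xi_y = -eta_x.
cr_residual = max(np.abs(xi_x - eta_y)[inner].max(),
                  np.abs(xi_y + eta_x)[inner].max())

# Local orthogonality: the gradients of xi and eta are perpendicular.
orthogonality_residual = np.abs(xi_x * eta_x + xi_y * eta_y)[inner].max()

print(f"Cauchy-Riemann residual: {cr_residual:.2e}")
print(f"gradient dot product:    {orthogonality_residual:.2e}")
```

Both residuals are limited only by the finite-difference error, so they shrink as the grid is refined. Replacing `np.log` with another analytic function, for example `lambda z: z**2`, gives the corresponding check for a monomial.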


Complex version of GST

Just as there is a complex version of the ordinary structure tensor, there is also a complex version of the GST,
: \begin{align} \kappa_{20} =(\lambda_1-\lambda_2)\exp(i2\theta)&=w*(h*f)^2\\ \kappa_{11} =\lambda_1+\lambda_2&=|w|*|h*f|^2 \end{align}
which is identical to its cousin except that w is a complex filter. Recall that in the ordinary structure tensor w is a real filter, usually defined by a sampled and scaled Gaussian that delineates the neighborhood, also known as the outer scale. This simplicity is one reason why GST implementations have predominantly used the complex version above. For curve families \xi,\eta defined by analytic functions g, it can be shown that the neighborhood-defining function is complex valued,
: w=(x \pm iy)^n\exp(-(x^2+y^2)/(2\sigma^2))\propto(D_x \pm iD_y)^n\exp(-(x^2+y^2)/(2\sigma^2)),
a so-called symmetry derivative of a Gaussian. Thus, the orientation-wise variation of the pattern to be looked for is directly incorporated into the neighborhood-defining function, and the detection occurs in the space of the (ordinary) structure tensor.
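The proportionality between the polynomial form and the derivative form of the symmetry derivative can be verified directly. The following is a short worked sketch for non-negative integer n and the "+" sign, assuming the unnormalized Gaussian G and the shorthand D_x, D_y for partial derivatives; the constant of proportionality it yields is a consequence of these assumptions.

```latex
% Assumptions: G(x,y) = \exp(-(x^2+y^2)/(2\sigma^2)),
%              D_x = \partial/\partial x, D_y = \partial/\partial y, n \ge 0.
\begin{align}
(D_x + iD_y)\,G &= -\frac{x+iy}{\sigma^2}\,G, \\
(D_x + iD_y)\,(x+iy)^n &= n\,(x+iy)^{n-1}\,(1 + i^2) = 0, \\
(D_x + iD_y)^n\,G &= \left(-\frac{1}{\sigma^2}\right)^{n} (x+iy)^n\,G .
\end{align}
```

The third line follows by induction from the first two and the product rule: each application of D_x + iD_y leaves the factor (x+iy)^n untouched and multiplies the Gaussian by -(x+iy)/\sigma^2. The "-" sign case is obtained by complex conjugation.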


Basic concept for its use in image processing and computer vision

Efficient detection of \theta in images is possible by image processing for a pair \xi, \eta. Complex convolutions (or the corresponding matrix operations) and point-wise non-linear mappings are the basic computational elements of GST implementations. A total least squares error estimate of 2\theta is then obtained along with the two errors, \lambda_{max} and \lambda_{min}. In analogy with the Cartesian structure tensor, the estimated angle is in double-angle representation, i.e. 2\theta is delivered by the computations and can be used as a shape feature, whereas \lambda_{max}-\lambda_{min} alone or in combination with \lambda_{max}+\lambda_{min} can be used as a quality (confidence, certainty) measure for the angle estimate.

Logarithmic spirals, including circles, can for instance be detected by (complex) convolutions and non-linear mappings. The spirals can be in gray (valued) images or in a binary image, i.e. the locations of edge elements of the concerned patterns, such as contours of circles or spirals, need not be known or marked otherwise.

The generalized structure tensor can be used as an alternative to the Hough transform in image processing and computer vision to detect patterns whose local orientations can be modelled, for example junction points. The main differences comprise:
* Negative as well as complex voting is allowed;
* With one template, multiple patterns belonging to the same family can be detected;
* Image binarization is not required.
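The computational elements named above (complex convolutions and a point-wise squaring) can be sketched in a few lines. This is a minimal illustration assuming NumPy, not a reference implementation: the function names `symmetry_filter`, `conv2_same` and `complex_gst` are hypothetical, h is assumed to be a first-order symmetry derivative filter at a small inner scale, and the choice of order 2 with the minus sign for the \log z family (circles, spirals, stars) was worked out for this sketch rather than taken from the text. It computes \kappa_{20}=w*(h*f)^2 and \kappa_{11}=|w|*|h*f|^2 and applies them to a striped image (order 0, the Cartesian case) and to concentric circles.

```python
import numpy as np

def symmetry_filter(order, sigma, sign=+1):
    """Sampled (x + sign*i*y)^order * exp(-(x^2+y^2)/(2 sigma^2)), order >= 0."""
    radius = int(np.ceil(4 * sigma)) + order
    ax = np.arange(-radius, radius + 1)
    x, y = np.meshgrid(ax, ax, indexing="xy")
    return (x + sign * 1j * y) ** order * np.exp(-(x**2 + y**2) / (2 * sigma**2))

def conv2_same(image, kernel):
    """Zero-padded linear 2D convolution via FFT, cropped to the image size."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    shape = (ih + kh - 1, iw + kw - 1)
    full = np.fft.ifft2(np.fft.fft2(image, shape) * np.fft.fft2(kernel, shape))
    top, left = (kh - 1) // 2, (kw - 1) // 2
    return full[top:top + ih, left:left + iw]

def complex_gst(image, order, sign=-1, sigma_inner=1.0, sigma_outer=6.0):
    """Return (kappa20, kappa11) for the pattern family selected by order and sign."""
    h = symmetry_filter(1, sigma_inner)            # assumed derivative filter
    w = symmetry_filter(order, sigma_outer, sign)  # complex outer-scale window
    grad = conv2_same(image, h)                    # proportional to f_x + i*f_y
    kappa20 = conv2_same(grad**2, w)
    kappa11 = conv2_same(np.abs(grad)**2, np.abs(w)).real
    return kappa20, kappa11

size, c = 161, 80
yy, xx = np.mgrid[0:size, 0:size]   # rows are y, columns are x

# Cartesian special case (order 0): stripes with normal direction phi.
phi = 0.6
stripes = np.cos(0.5 * (np.cos(phi) * xx + np.sin(phi) * yy))
k20_lin, _ = complex_gst(stripes, order=0)
theta_lin = 0.5 * np.angle(k20_lin[c, c])   # halve the double-angle estimate

# log z family (order 2, minus sign): concentric circles centred at (c, c).
circles = np.cos(0.6 * np.hypot(xx - c, yy - c))
k20, k11 = complex_gst(circles, order=2, sign=-1)
certainty = np.abs(k20) / (k11 + 1e-12)
peak = np.unravel_index(np.argmax(np.abs(k20)), k20.shape)

print("estimated stripe direction:", theta_lin)
print("strongest circle response at (row, col):", peak)
print("certainty at the centre:", certainty[c, c])
```

Under these assumptions the argument of \kappa_{20} at the pattern centre encodes 2\theta, so a value near 0 indicates circles and a value near \pi indicates a star pattern, while the ratio |\kappa_{20}|/\kappa_{11} serves as the certainty. No binarization or edge marking is applied to the input at any point.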


Physical and mathematical interpretation

The curvilinear coordinates of the GST can explain physical processes applied to images. A well-known pair of processes consists of rotation and zooming. These are related to the coordinate transformation \xi=\log(\sqrt{x^2+y^2}) and \eta=\tan^{-1}(x,y). If an image f consists of iso-curves that can be explained by \xi alone, i.e. its iso-curves are circles, f(\xi,\eta)=g(\xi), where g is any real-valued differentiable function defined on 1D, the image is invariant to rotations (around the origin).

The zooming (including unzooming) operation is modeled similarly. If the image has iso-curves that look like a "star" or bicycle spokes, i.e. f(\xi,\eta)=g(\eta) for some differentiable 1D function g, then the image f is invariant to scaling (w.r.t. the origin). In combination,
: f(\xi,\eta)=g( \cos(\theta) \log(\sqrt{x^2+y^2})+\sin(\theta) \tan^{-1}(x,y))
is invariant to a certain amount of rotation combined with scaling, where the amount is specified by the parameter \theta.

Analogously, the Cartesian structure tensor is a representation of a translation too. Here the physical process consists of an ordinary translation of a certain amount along x combined with a translation along y,
: \cos(\theta) x+\sin(\theta) y= \text{constant}
where the amount is specified by the parameter \theta. Evidently, \theta here represents the direction of the line. Generally, the estimated \theta represents the direction (in \xi,\eta coordinates) along which infinitesimal translations leave the image invariant, in practice least variant. With every curvilinear coordinate basis pair there is thus a pair of infinitesimal translators, a linear combination of which is a differential operator. The latter are related to Lie algebra.
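For the rotation and zooming pair, the infinitesimal translators can be written out explicitly. The following is a worked sketch, assuming \xi=\log\sqrt{x^2+y^2}, \eta equal to the polar angle of (x,y), and g a differentiable 1D function; it shows the two translators in Cartesian form and the linear combination that leaves the combined pattern unchanged.

```latex
% Assumptions: x = e^{\xi}\cos\eta, y = e^{\xi}\sin\eta; g is a differentiable 1D function.
\begin{align}
\frac{\partial}{\partial \xi} &= x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y}, &
\frac{\partial}{\partial \eta} &= -y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y}, \\
\left(-\sin\theta\,\frac{\partial}{\partial \xi} + \cos\theta\,\frac{\partial}{\partial \eta}\right)
g(\cos\theta\,\xi + \sin\theta\,\eta)
&= (-\sin\theta\cos\theta + \cos\theta\sin\theta)\,g' = 0 .
\end{align}
```

Here \partial/\partial\xi generates scaling about the origin and \partial/\partial\eta generates rotation about the origin, so the vanishing combination is a differential operator describing a rotation coupled with a scaling in proportions fixed by \theta.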


Miscellaneous

"Image" in the context of the GST can mean both an ordinary image and an image neighborhood thereof (local image), depending on context. For example, a photograph is an image as is any neighborhood of it.


See also

* Structure tensor
* Hough transform
* Tensor
* Gaussian
* Corner detection
* Edge detection
* Affine shape adaptation
* Directional derivative
* Differential operator
* Lie algebra

