The image segmentation problem is concerned with partitioning an image into multiple regions according to some homogeneity criterion. This article is primarily concerned with graph theoretic approaches to image segmentation applying graph partitioning via minimum cut or maximum cut. Segmentation-based object categorization can be viewed as a specific case of spectral clustering applied to image segmentation.


Applications of image segmentation

* Image compression
** Segment the image into homogeneous components, and use the most suitable compression algorithm for each component to improve compression.
* Medical diagnosis
** Automatic segmentation of MRI images for identification of cancerous regions.
* Mapping and measurement
** Automatic analysis of remote sensing data from satellites to identify and measure regions of interest.
* Transportation
** Partitioning a transportation network makes it possible to identify regions characterized by homogeneous traffic states.


Segmentation using normalized cuts


Graph theoretic formulation

The set of points in an arbitrary feature space can be represented as a weighted undirected complete graph G = (V, E), where the nodes of the graph are the points in the feature space. The weight w_{ij} of an edge (i, j) \in E is a function of the similarity between the nodes i and j. In this context, we can formulate the image segmentation problem as a graph partitioning problem that asks for a partition V_1, \cdots, V_k of the vertex set V, where, according to some measure, the vertices in any set V_i have high similarity, and the vertices in two different sets V_i, V_j have low similarity.
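
As a concrete illustration, here is a minimal sketch of such a graph construction for a grayscale image, assuming a Gaussian similarity in intensity and position. The kernel widths sigma_i, sigma_x and the neighbourhood radius are illustrative choices, and restricting edges to spatially nearby pixels makes W sparse rather than complete:

```python
import numpy as np
from scipy.sparse import coo_matrix

def affinity_graph(img, sigma_i=0.1, sigma_x=4.0, radius=2):
    """Sparse affinity matrix W for a grayscale image (values in [0, 1]).

    w_ij = exp(-(I_i - I_j)^2 / sigma_i^2) * exp(-||p_i - p_j||^2 / sigma_x^2)
    for pixels within `radius` of each other, and 0 otherwise (keeps W sparse).
    """
    h, w = img.shape
    n = h * w
    rows, cols, vals = [], [], []
    for y in range(h):
        for x in range(w):
            i = y * w + x
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if (dy, dx) != (0, 0) and 0 <= yy < h and 0 <= xx < w:
                        j = yy * w + xx
                        sim = (np.exp(-(img[y, x] - img[yy, xx]) ** 2 / sigma_i ** 2)
                               * np.exp(-(dy * dy + dx * dx) / sigma_x ** 2))
                        rows.append(i)
                        cols.append(j)
                        vals.append(sim)
    return coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsr()
```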


Normalized cuts

Let ''G'' = (''V'', ''E'', ''w'') be a weighted graph. Let A and B be two subsets of vertices. Let:
: w(A, B) = \sum \limits_{i \in A, j \in B} w_{ij}
: \operatorname{ncut}(A, B) = \frac{w(A, B)}{w(A, V)} + \frac{w(A, B)}{w(B, V)}
: \operatorname{nassoc}(A, B) = \frac{w(A, A)}{w(A, V)} + \frac{w(B, B)}{w(B, V)}
In the normalized cuts approach, for any cut (S, \overline{S}) in G, \operatorname{ncut}(S, \overline{S}) measures the similarity between different parts, and \operatorname{nassoc}(S, \overline{S}) measures the total similarity of vertices in the same part. Since \operatorname{ncut}(S, \overline{S}) = 2 - \operatorname{nassoc}(S, \overline{S}), a cut (S^{*}, \overline{S}^{*}) that minimizes \operatorname{ncut}(S, \overline{S}) also maximizes \operatorname{nassoc}(S, \overline{S}). Computing a cut (S^{*}, \overline{S}^{*}) that minimizes \operatorname{ncut}(S, \overline{S}) is an NP-hard problem. However, we can find in polynomial time a cut (S, \overline{S}) of small normalized weight \operatorname{ncut}(S, \overline{S}) using spectral techniques.
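
For concreteness, a small sketch that evaluates w(A, B), ncut and nassoc directly from a dense weight matrix for a toy four-node graph (the weights are arbitrary; note that ncut + nassoc = 2, as stated above):

```python
import numpy as np

def w_sum(W, A, B):
    """w(A, B): total weight of edges going from vertices in A to vertices in B."""
    return W[np.ix_(list(A), list(B))].sum()

def ncut(W, S, S_bar):
    V = range(W.shape[0])
    cut = w_sum(W, S, S_bar)
    return cut / w_sum(W, S, V) + cut / w_sum(W, S_bar, V)

def nassoc(W, S, S_bar):
    V = range(W.shape[0])
    return w_sum(W, S, S) / w_sum(W, S, V) + w_sum(W, S_bar, S_bar) / w_sum(W, S_bar, V)

# Toy graph: two tightly coupled pairs {0, 1} and {2, 3} joined by weak edges.
W = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 0.0, 0.1],
              [0.1, 0.0, 0.0, 1.0],
              [0.0, 0.1, 1.0, 0.0]])
S, S_bar = [0, 1], [2, 3]
print(ncut(W, S, S_bar) + nassoc(W, S, S_bar))  # prints 2 (up to rounding)
```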


The ncut algorithm

Let:
: d(i) = \sum \limits_j w_{ij}
Also, let ''D'' be an n \times n diagonal matrix with d on the diagonal, and let W be an n \times n symmetric matrix with w_{ij} = w_{ji}. After some algebraic manipulations, we get:
: \min \limits_{(S, \overline{S})} \operatorname{ncut}(S, \overline{S}) = \min \limits_y \frac{y^T (D - W) y}{y^T D y}
subject to the constraints:
* y_i \in \{1, -b\}, for some constant -b
* y^T D 1 = 0
Minimizing \frac{y^T (D - W) y}{y^T D y} subject to the constraints above is NP-hard. To make the problem tractable, we relax the constraints on y, and allow it to take real values. The relaxed problem can be solved by solving the generalized eigenvalue problem (D - W)y = \lambda D y for the second smallest generalized eigenvalue.

The partitioning algorithm:
# Given a set of features, set up a weighted graph G = (V, E), compute the weight of each edge, and summarize the information in D and W.
# Solve (D - W)y = \lambda D y for the eigenvectors with the smallest eigenvalues.
# Use the eigenvector with the second smallest eigenvalue to bipartition the graph (e.g. by grouping according to sign).
# Decide if the current partition should be subdivided.
# Recursively partition the segmented parts, if necessary.
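
A minimal sketch of one bipartition step, assuming a small dense weight matrix so that scipy.linalg.eigh can solve the generalized eigenvalue problem directly (for real images W is sparse and an iterative solver would be used instead; see the next section):

```python
import numpy as np
from scipy.linalg import eigh

def ncut_bipartition(W):
    """One bipartition step of the ncut algorithm on a dense weight matrix W."""
    d = W.sum(axis=1)
    D = np.diag(d)
    # Generalized symmetric eigenproblem (D - W) y = lambda D y;
    # eigh returns eigenvalues in ascending order.
    _, vecs = eigh(D - W, D)
    y = vecs[:, 1]          # eigenvector for the second smallest eigenvalue
    return y >= 0           # split the vertices by sign

W = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 0.0, 0.1],
              [0.1, 0.0, 0.0, 1.0],
              [0.0, 0.1, 1.0, 0.0]])
print(ncut_bipartition(W))  # separates {0, 1} from {2, 3} (up to sign)
```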


Computational complexity

Solving a standard eigenvalue problem for all eigenvectors (using the QR algorithm, for instance) takes O(n^3) time. This is impractical for image segmentation applications where n is the number of pixels in the image. Since only one eigenvector, corresponding to the second smallest generalized eigenvalue, is used by the ncut algorithm, efficiency can be dramatically improved if the eigenvalue problem is solved in a matrix-free fashion, i.e., without explicitly manipulating or even computing the matrix W, as, e.g., in the Lanczos algorithm. Matrix-free methods require only a function that performs a matrix-vector product for a given vector, on every iteration. For image segmentation, the matrix W is typically sparse, with a number of nonzero entries O(n), so such a matrix-vector product takes O(n) time. For high-resolution images, the second eigenvalue is often ill-conditioned, leading to slow convergence of iterative eigenvalue solvers, such as the Lanczos algorithm. Preconditioning is a key technique for accelerating the convergence, e.g., in the matrix-free LOBPCG method. Computing the eigenvector using an optimally preconditioned matrix-free method takes O(n) time, which is the optimal complexity, since the eigenvector has n components.
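
The matrix-free idea can be sketched with SciPy's lobpcg, assuming the product y -> (D - W)y is available as a function. The random sparse graph below stands in for an image affinity matrix, and no preconditioner is supplied; a practical solver would pass one (e.g. algebraic multigrid) through the M argument:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, lobpcg

# Sparse symmetric stand-in for an image affinity matrix W (~5 nonzeros per row).
n = 10_000
A = sp.random(n, n, density=5.0 / n, random_state=0)
W = A + A.T
d = np.asarray(W.sum(axis=1)).ravel() + 1e-6   # degrees; small shift guards isolated nodes

def laplacian_matvec(x):
    # Apply y = (D - W) x without ever forming D - W explicitly.
    return d * x - W @ x

L = LinearOperator((n, n), matvec=laplacian_matvec, dtype=float)

# A few smallest eigenpairs of (D - W) y = lambda D y; without a preconditioner
# convergence may be slow, so tol/maxiter are loose here.
rng = np.random.default_rng(0)
X = rng.standard_normal((n, 3))
vals, vecs = lobpcg(L, X, B=sp.diags(d), largest=False, tol=1e-4, maxiter=200)
fiedler = vecs[:, 1]   # the second smallest generalized eigenvector drives the cut
```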


Software implementations

scikit-learn uses LOBPCG from SciPy with algebraic multigrid preconditioning to solve the eigenvalue problem for the graph Laplacian and perform image segmentation via spectral graph partitioning, as first proposed and subsequently tested in the literature.
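
A small example in the spirit of the scikit-learn segmentation tutorials (the toy image, the kernel width, and the number of clusters are illustrative assumptions; eigen_solver="amg" selects the multigrid-preconditioned LOBPCG path but requires pyamg, so the ARPACK solver is used here):

```python
import numpy as np
from sklearn.feature_extraction.image import img_to_graph
from sklearn.cluster import spectral_clustering

# Toy image: two bright squares on a noisy dark background.
rng = np.random.RandomState(0)
img = np.zeros((50, 50))
img[10:25, 10:25] = 1.0
img[30:45, 30:45] = 1.0
img += 0.05 * rng.randn(*img.shape)

# Graph of pixel-to-pixel connections; turn intensity gradients into affinities.
graph = img_to_graph(img)
graph.data = np.exp(-graph.data ** 2 / (2 * 0.1 ** 2))

# Spectral partitioning of the graph Laplacian into three segments
# (two squares + background).
labels = spectral_clustering(graph, n_clusters=3, eigen_solver="arpack",
                             random_state=0)
segmentation = labels.reshape(img.shape)
```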


OBJ CUT

OBJ CUT is an efficient method that automatically segments an object. The OBJ CUT method is a generic method, and therefore it is applicable to any object category model. Given an image D containing an instance of a known object category, e.g. cows, the OBJ CUT algorithm computes a segmentation of the object, that is, it infers a set of labels ''m''. Let m be a set of binary labels, and let \Theta be a shape parameter (\Theta is a shape prior on the labels from a layered pictorial structure (LPS) model). An energy function E(m, \Theta) is defined as follows.
: E(m, \Theta) = \sum \limits_x \phi_x(D, m_x) + \phi_x(m_x, \Theta) + \sum \limits_{(x, y)} \Psi_{xy}(m_x, m_y) + \phi(D, m_x, m_y) (1)
The term \phi_x(D, m_x) + \phi_x(m_x, \Theta) is called a unary term, and the term \Psi_{xy}(m_x, m_y) + \phi(D, m_x, m_y) is called a pairwise term. A unary term consists of the likelihood \phi_x(D, m_x) based on color, and the unary potential \phi_x(m_x, \Theta) based on the distance from \Theta. A pairwise term consists of a prior \Psi_{xy}(m_x, m_y) and a contrast term \phi(D, m_x, m_y). The best labeling m^{*} minimizes \sum \limits_i w_i E(m, \Theta_i), where w_i is the weight of the parameter \Theta_i.
: m^{*} = \arg \min \limits_m \sum \limits_i w_i E(m, \Theta_i) (2)
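
As a toy illustration of how equation (1) is evaluated for a given labeling, the sketch below uses placeholder potentials for a 1-D strip of four pixels. The color likelihoods, shape potentials, prior, and contrast values are invented for illustration; in OBJ CUT they come from the image and the LPS samples \Theta_i:

```python
def objcut_energy(m, unary_color, unary_shape, prior, contrast, neighbours):
    """Evaluate equation (1): unary terms over pixels x plus
    pairwise terms over neighbouring pairs (x, y)."""
    unary = sum(unary_color[x][m[x]] + unary_shape[x][m[x]] for x in range(len(m)))
    pairwise = sum(prior[m[x]][m[y]] + contrast[(x, y)] * (m[x] != m[y])
                   for (x, y) in neighbours)
    return unary + pairwise

# Four pixels, binary labels (0 = background, 1 = object); all numbers invented.
m = [0, 1, 1, 0]
unary_color = [[0.1, 0.9], [0.8, 0.2], [0.7, 0.3], [0.2, 0.8]]   # phi_x(D, m_x)
unary_shape = [[0.0, 0.5], [0.5, 0.0], [0.5, 0.0], [0.0, 0.5]]   # phi_x(m_x, Theta)
prior = [[0.0, 0.4], [0.4, 0.0]]                                  # Psi_xy(m_x, m_y)
contrast = {(0, 1): 0.1, (1, 2): 0.3, (2, 3): 0.1}                # phi(D, m_x, m_y)
print(objcut_energy(m, unary_color, unary_shape, prior, contrast, list(contrast)))
```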


Algorithm

# Given an image D, an object category is chosen, e.g. cows or horses.
# The corresponding LPS model is matched to D to obtain the samples \Theta_1, \cdots, \Theta_s.
# The objective function given by equation (2) is determined by computing E(m, \Theta_i) and using w_i = g(\Theta_i, Z).
# The objective function is minimized using a single MINCUT operation to obtain the segmentation m.
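
The last step hinges on the fact that a binary pairwise energy of this form can be minimized exactly by a single s-t min-cut. Below is a minimal stand-in for that step, assuming a Potts-style pairwise term and the usual graph construction (unary costs on terminal edges, a penalty lam on neighbour edges), solved with networkx's min-cut; the actual OBJ CUT potentials and max-flow implementation differ:

```python
import networkx as nx

def potts_mincut(unary0, unary1, edges, lam):
    """Minimize E(m) = sum_x theta_x(m_x) + lam * sum_{(x,y)} [m_x != m_y]
    over binary labelings m by a single s-t min-cut."""
    G = nx.DiGraph()
    s, t = "s", "t"
    for x, (u0, u1) in enumerate(zip(unary0, unary1)):
        # Cutting s->x assigns label 1 (cost theta_x(1)); cutting x->t assigns label 0.
        G.add_edge(s, x, capacity=u1)
        G.add_edge(x, t, capacity=u0)
    for x, y in edges:
        # Potts term: pay lam whenever neighbours x and y take different labels.
        G.add_edge(x, y, capacity=lam)
        G.add_edge(y, x, capacity=lam)
    cut_value, (source_side, _) = nx.minimum_cut(G, s, t)
    labels = [0 if x in source_side else 1 for x in range(len(unary0))]
    return labels, cut_value

# Four pixels with invented unary costs; pixels 1 and 2 prefer the object label.
labels, energy = potts_mincut(unary0=[0.1, 0.8, 0.7, 0.2],
                              unary1=[0.9, 0.2, 0.3, 0.8],
                              edges=[(0, 1), (1, 2), (2, 3)], lam=0.2)
print(labels, energy)   # e.g. [0, 1, 1, 0] with the corresponding minimum energy
```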


Other approaches

* Jigsaw approach
* Image parsing
* Interleaved segmentation
* LOCUS
* LayoutCRF: J. M. Winn, J. Shotton, "The Layout Consistent Random Field for Recognizing and Segmenting Partially Occluded Objects", CVPR (1) 2006: 37–44
* Minimum spanning tree-based segmentation


References

{{reflist|32em}}