As applied in the field of
computer vision
Computer vision tasks include methods for image sensor, acquiring, Image processing, processing, Image analysis, analyzing, and understanding digital images, and extraction of high-dimensional data from the real world in order to produce numerical ...
,
graph cut optimization
Graph cut optimization is a combinatorial optimization method applicable to a family of functions of discrete variables, named after the concept of cut in the theory of flow networks. Thanks to the max-flow min-cut theorem, determining the minim ...
can be employed to
efficiently solve a wide variety of low-level computer vision problems (''early vision''), such as
image smoothing, the stereo
correspondence problem,
image segmentation
In digital image processing and computer vision, image segmentation is the process of partitioning a digital image into multiple image segments, also known as image regions or image objects (Set (mathematics), sets of pixels). The goal of segmen ...
,
object co-segmentation, and many other computer vision problems that can be formulated in terms of
energy minimization.
Many of these energy minimization problems can be approximated by solving a
maximum flow problem
In Optimization (mathematics), optimization theory, maximum flow problems involve finding a feasible flow through a flow network that obtains the maximum possible flow rate.
The maximum flow problem can be seen as a special case of more complex ...
in a
graph
Graph may refer to:
Mathematics
*Graph (discrete mathematics), a structure made of vertices and edges
**Graph theory, the study of such graphs and their properties
*Graph (topology), a topological space resembling a graph in the sense of discret ...
(and thus, by the
max-flow min-cut theorem
In computer science and optimization theory, the max-flow min-cut theorem states that in a flow network, the maximum amount of flow passing from the ''source'' to the ''sink'' is equal to the total weight of the edges in a minimum cut, i.e., the ...
, define a minimal
cut
Cut or CUT may refer to:
Common uses
* The act of cutting, the separation of an object into two through acutely directed force
** A type of wound
** Cut (archaeology), a hole dug in the past
** Cut (clothing), the style or shape of a garment
** ...
of the graph).
Under most formulations of such problems in computer vision, the minimum energy solution corresponds to the
maximum a posteriori estimate of a solution.
Although many computer vision algorithms involve cutting a graph (e.g., normalized cuts), the term "graph cuts" is applied specifically to those models which employ a max-flow/min-cut optimization (other graph cutting algorithms may be considered as
graph partition
In mathematics, a graph partition is the reduction of a Graph (discrete mathematics), graph to a smaller graph by partition of a set, partitioning its set of nodes into mutually exclusive groups. Edges of the original graph that cross between the g ...
ing algorithms).
"Binary" problems (such as
denoising
Noise reduction is the process of removing noise from a signal. Noise reduction techniques exist for audio and images. Noise reduction algorithms may distort the signal to some degree. Noise rejection is the ability of a circuit to isolate an u ...
a
binary image
A binary image is a digital image that consists of pixels that can have one of exactly two colors, usually black and white. Each pixel is stored as a single bit — i.e. either a 0 or 1.
A binary image can be stored in memory as a bitmap: a p ...
) can be solved exactly using this approach; problems where pixels can be labeled with more than two different labels (such as stereo correspondence, or denoising of a
grayscale
In digital photography, computer-generated imagery, and colorimetry, a greyscale (more common in Commonwealth English) or grayscale (more common in American English) image is one in which the value of each pixel is a single sample (signal), s ...
image) cannot be solved exactly, but solutions produced are usually near the global optimum.
History
The foundational theory of
graph cuts was first applied in
computer vision
Computer vision tasks include methods for image sensor, acquiring, Image processing, processing, Image analysis, analyzing, and understanding digital images, and extraction of high-dimensional data from the real world in order to produce numerical ...
in the seminal paper by Greig, Porteous and Seheult
[D.M. Greig, B.T. Porteous and A.H. Seheult (1989), ]
Exact maximum a posteriori estimation for binary images
', Journal of the Royal Statistical Society, Series B, 51, 271–279. of
Durham University
Durham University (legally the University of Durham) is a collegiate university, collegiate public university, public research university in Durham, England, founded by an Act of Parliament (UK), Act of Parliament in 1832 and incorporated by r ...
. Allan Seheult and Bruce Porteous were members of Durham's lauded statistics group of the time, led by
Julian Besag
Julian Ernst Besag Fellow of the Royal Society, FRS (26 March 1945 – 6 August 2010) was a British statistician known chiefly for his work in spatial statistics (including its applications to epidemiology, image analysis and agricultural science ...
and
Peter Green, with the optimisation expert
Margaret Greig notable as the first ever female member of staff of the Durham Mathematical Sciences Department.
In the
Bayesian statistical context of smoothing noisy (or corrupted) images, they showed how the
maximum a posteriori estimate of a
binary image
A binary image is a digital image that consists of pixels that can have one of exactly two colors, usually black and white. Each pixel is stored as a single bit — i.e. either a 0 or 1.
A binary image can be stored in memory as a bitmap: a p ...
can be obtained ''exactly'' by maximizing the
flow through an associated image network, involving the introduction of a ''source'' and ''sink''. The problem was therefore shown to be efficiently solvable. Prior to this result, ''approximate'' techniques such as
simulated annealing
Simulated annealing (SA) is a probabilistic technique for approximating the global optimum of a given function. Specifically, it is a metaheuristic to approximate global optimization in a large search space for an optimization problem. ...
(as proposed by the
Geman brothers), or
iterated conditional modes (a type of
greedy algorithm
A greedy algorithm is any algorithm that follows the problem-solving heuristic of making the locally optimal choice at each stage. In many problems, a greedy strategy does not produce an optimal solution, but a greedy heuristic can yield locally ...
suggested by
Julian Besag
Julian Ernst Besag Fellow of the Royal Society, FRS (26 March 1945 – 6 August 2010) was a British statistician known chiefly for his work in spatial statistics (including its applications to epidemiology, image analysis and agricultural science ...
) were used to solve such image smoothing problems.
Although the general
-colour problem is NP hard for
the approach of Greig, Porteous and Seheult
has turned out
[Y. Boykov, O. Veksler and R. Zabih (2001),]
Fast approximate energy minimisation via graph cuts
, ''IEEE Transactions on Pattern Analysis and Machine Intelligence'', 29, 1222–1239. to have wide applicability in general computer vision problems. For general problems, Greig, Porteous and Seheult's approach is often applied iteratively to sequences of related binary problems, usually yielding near optimal solutions.
In 2011, C. Couprie ''et al''. proposed a general image segmentation framework, called the "Power Watershed", that minimized a real-valued
indicator function
In mathematics, an indicator function or a characteristic function of a subset of a set is a function that maps elements of the subset to one, and all other elements to zero. That is, if is a subset of some set , then the indicator functio ...
from
,1over a graph, constrained by user seeds (or unary terms) set to 0 or 1, in which the minimization of the indicator function over the graph is optimized with respect to an exponent
. When
, the Power Watershed is optimized by graph cuts, when
the Power Watershed is optimized by shortest paths,
is optimized by the
random walker algorithm and
is optimized by the
watershed algorithm. In this way, the Power Watershed may be viewed as a generalization of graph cuts that provides a straightforward connection with other energy optimization segmentation/clustering algorithms.
Binary segmentation of images
Notation
* Image:
* Output: Segmentation (also called opacity)
(soft segmentation). For hard segmentation
*
Energy function:
where C is the color parameter and λ is the coherence parameter.
*
* Optimization: The segmentation can be estimated as a global minimum over S:
Existing methods
* Standard Graph cuts: optimize energy function over the segmentation (unknown S value).
* Iterated Graph cuts:
# First step optimizes over the color parameters using K-means.
# Second step performs the usual graph cuts algorithm.
:These 2 steps are repeated recursively until convergence.
* Dynamic graph cuts:
Allows to re-run the algorithm much faster after modifying the problem (e.g. after new seeds have been added by a user).
Energy function
:
where the energy
is composed of two different models (
and
):
Likelihood / Color model / Regional term
— unary term describing the likelihood of each color.
* This term can be modeled using different local (e.g. ) or global (e.g. histograms, GMMs, Adaboost likelihood) approaches that are described below.
= Histogram
=
* We use intensities of pixels marked as seeds to get histograms for object (foreground) and background intensity distributions: P(I, O) and P(I, B).
* Then, we use these histograms to set the regional penalties as negative log-likelihoods.
= GMM (Gaussian mixture model)
=
* We usually use two distributions: one for background modelling and another for foreground pixels.
* Use a Gaussian mixture model (with 5–8 components) to model those 2 distributions.
* Goal: Try to pull apart those two distributions.
= Texon
=
* A (or ) is a set of pixels that has certain characteristics and is repeated in an image.
* Steps:
# Determine a good natural scale for the texture elements.
# Compute non-parametric statistics of the model-interior , either on intensity or on Gabor filter responses.
* Examples:
*
Deformable-model based Textured Object Segmentation*
Contour and Texture Analysis for Image Segmentation
Prior / Coherence model / Boundary term
— binary term describing the coherence between neighborhood pixels.
* In practice, pixels are defined as neighbors if they are adjacent either horizontally, vertically or diagonally (4 way connectivity or 8 way connectivity for 2D images).
* Costs can be based on local intensity gradient, Laplacian zero-crossing, gradient direction, color mixture model,...
* Different energy functions have been defined:
** Standard
Markov random field
In the domain of physics and probability, a Markov random field (MRF), Markov network or undirected graphical model is a set of random variables having a Markov property described by an undirected graph
In discrete mathematics, particularly ...
: Associate a penalty to disagreeing pixels by evaluating the difference between their segmentation label (crude measure of the length of the boundaries). See Boykov and Kolmogorov ICCV 2003
**
Conditional random field
Conditional random fields (CRFs) are a class of statistical modeling methods often applied in pattern recognition and machine learning and used for structured prediction. Whereas a classifier predicts a label for a single sample without consi ...
: If the color is very different, it might be a good place to put a boundary. See Lafferty et al. 2001; Kumar and Hebert 2003
Criticism
Graph cuts methods have become popular alternatives to the level set-based approaches for optimizing the location of a contour (see for an extensive comparison). However, graph cut approaches have been criticized in the literature for several issues:
* Metrication artifacts: When an image is represented by a 4-connected lattice, graph cuts methods can exhibit unwanted "blockiness" artifacts. Various methods have been proposed for addressing this issue, such as using additional edges or by formulating the max-flow problem in continuous space.
* Shrinking bias: Since graph cuts finds a minimum cut, the algorithm can be biased toward producing a small contour. For example, the algorithm is not well-suited for segmentation of thin objects like blood vessels (see for a proposed fix).
* Multiple labels: Graph cuts is only able to find a global optimum for binary labeling (i.e., two labels) problems, such as foreground/background image segmentation. Extensions have been proposed that can find approximate solutions for multilabel graph cuts problems.
* Memory: the memory usage of graph cuts increases quickly as the image size increases. As an illustration, the Boykov-Kolmogorov max-flow algorithm v2.2 allocates
bytes (
and
are respectively the number of nodes and edges in the graph). Nevertheless, some amount of work has been recently done in this direction for reducing the graphs before the maximum-flow computation.
Algorithm
* Minimization is done using a standard minimum cut algorithm.
* Due to the
max-flow min-cut theorem
In computer science and optimization theory, the max-flow min-cut theorem states that in a flow network, the maximum amount of flow passing from the ''source'' to the ''sink'' is equal to the total weight of the edges in a minimum cut, i.e., the ...
we can solve energy minimization by maximizing the flow over the network. The
max-flow problem consists of a directed graph with edges labeled with capacities, and there are two distinct nodes: the source and the sink. Intuitively, it is easy to see that the maximum flow is determined by the bottleneck.
Implementation (exact)
The Boykov-Kolmogorov algorithm is an efficient way to compute the max-flow for computer vision-related graphs.
Implementation (approximation)
The Sim Cut algorithm approximates the minimum graph cut. The algorithm implements a solution by simulation of an electrical network. This is the approach suggested by
Cederbaum's maximum flow theorem.
[I.T. Frisch, "On Electrical analogs for flow networks," Proceedings of IEEE, 57:2, pp. 209-210, 1969] Acceleration of the algorithm is possible through
parallel computing
Parallel computing is a type of computing, computation in which many calculations or Process (computing), processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. ...
.
Software
http://pub.ist.ac.at/~vnk/software.html— An implementation of the maxflow algorithm described in "An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Computer Vision" by Vladimir Kolmogorov
http://vision.csd.uwo.ca/code/— some graph cut libraries and MATLAB wrappers
http://gridcut.com/— fast multi-core max-flow/min-cut solver optimized for grid-like graphs
http://virtualscalpel.com/— An implementation of the Sim Cut; an algorithm for computing an approximate solution of the minimum s-t cut in a massively parallel manner.
References
{{DEFAULTSORT:Graph Cuts In Computer Vision
Bayesian statistics
Computer vision
Computational problems in graph theory
Image segmentation