Optical flow or optic flow is the pattern of apparent
motion
of objects, surfaces, and edges in a visual scene caused by the
relative motion
between an observer and a scene. Optical flow can also be defined as the distribution of apparent velocities of movement of brightness patterns in an image.
The concept of optical flow was introduced by the American psychologist
James J. Gibson in the 1940s to describe the visual stimulus provided to animals moving through the world. Gibson stressed the importance of optic flow for
affordance perception, the ability to discern possibilities for action within the environment. Followers of Gibson and his
ecological approach to psychology have further demonstrated the role of the optical flow stimulus for the perception of movement by the observer in the world; perception of the shape, distance and movement of objects in the world; and the control of
locomotion.
The term optical flow is also used by roboticists, encompassing related techniques from image processing and control of navigation including
motion detection,
object segmentation, time-to-contact information, focus of expansion calculations, luminance,
motion compensated encoding, and stereo disparity measurement.
Estimation
Optical flow can be estimated in a number of ways. Broadly, estimation approaches fall into three groups: learning-based models (sometimes called data-driven models), classical models (sometimes called knowledge-driven models), which do not use machine learning, and hybrid models, which combine aspects of both.
Classical Models
Many classical models use the intuitive assumption of ''brightness constancy''; that even if a point moves between frames, its brightness stays constant.
To formalise this intuitive assumption, consider two consecutive frames from a video sequence, with intensity $I(x, y, t)$, where $(x, y)$ refer to pixel coordinates and $t$ refers to time.
In this case, the brightness constancy constraint is
: $I(x, y, t) - I(x + u, y + v, t + 1) = 0,$
where $\mathbf{w} := (u, v)$ is the displacement vector between a point in the first frame and the corresponding point in the second frame.
By itself, the brightness constancy constraint cannot be solved for $u$ and $v$ at each pixel, since there is only one equation and two unknowns.
This is known as the ''aperture problem''.
Therefore, additional constraints must be imposed to estimate the flow field.
Regularized Models
Perhaps the most natural approach to addressing the aperture problem is to apply a smoothness constraint or a ''regularization constraint'' to the flow field.
One can combine both of these constraints to formulate estimating optical flow as an optimization problem, where the goal is to minimize a cost function of the form,
: $E = \iint_\Omega \Psi(I(x + u, y + v, t + 1) - I(x, y, t)) + \alpha \Psi(|\nabla u|) + \alpha \Psi(|\nabla v|) \, dx \, dy,$
where $\Omega$ is the extent of the images $I(x, y)$, $\nabla$ is the gradient operator, $\alpha$ is a constant, and $\Psi()$ is a loss function.
This optimisation problem is difficult to solve owing to its non-linearity.
To address this issue, one can use a ''variational approach'' and linearise the brightness constancy constraint using a first order Taylor series approximation. Specifically, the brightness constancy constraint is approximated as,
: $\frac{\partial I}{\partial x} u + \frac{\partial I}{\partial y} v + \frac{\partial I}{\partial t} = 0.$
For convenience, the derivatives of the image, $\tfrac{\partial I}{\partial x}$, $\tfrac{\partial I}{\partial y}$ and $\tfrac{\partial I}{\partial t}$, are often condensed to become $I_x$, $I_y$ and $I_t$.
Doing so allows one to rewrite the linearised brightness constancy constraint as,
: $I_x u + I_y v + I_t = 0.$
The optimization problem can now be rewritten as
: $E = \iint_\Omega \Psi(I_x u + I_y v + I_t) + \alpha \Psi(|\nabla u|) + \alpha \Psi(|\nabla v|) \, dx \, dy.$
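As a rough numerical check of the linearised constraint $I_x u + I_y v + I_t = 0$, the sketch below builds a synthetic frame pair with a known sub-pixel translation and evaluates the left-hand side at the true displacement and at zero displacement. The test pattern, the finite-difference derivatives from `np.gradient`, and the function name `linearised_residual` are illustrative assumptions, not part of any particular published method.

```python
import numpy as np

def linearised_residual(frame1, frame2, u, v):
    """Evaluate I_x*u + I_y*v + I_t at interior pixels for a candidate (u, v)."""
    frame1 = np.asarray(frame1, dtype=float)
    frame2 = np.asarray(frame2, dtype=float)
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    I_y, I_x = np.gradient(frame1)
    I_t = frame2 - frame1  # temporal difference between consecutive frames
    residual = I_x * u + I_y * v + I_t
    # Drop the border, where np.gradient falls back to one-sided differences.
    return residual[1:-1, 1:-1], I_t[1:-1, 1:-1]

# Hypothetical test pattern: a smooth image translated by a sub-pixel (u, v).
yy, xx = np.mgrid[0:64, 0:64].astype(float)
pattern = lambda x, y: np.sin(0.2 * x) + np.cos(0.15 * y)
u_true, v_true = 0.3, 0.2
frame1 = pattern(xx, yy)
frame2 = pattern(xx - u_true, yy - v_true)  # so that I(x+u, y+v, t+1) = I(x, y, t)

res_true, _ = linearised_residual(frame1, frame2, u_true, v_true)
res_zero, _ = linearised_residual(frame1, frame2, 0.0, 0.0)
print(np.abs(res_true).max(), np.abs(res_zero).max())
```

For a small displacement on a smooth image, the residual at the true displacement should be much smaller than the residual at zero displacement; the gap narrows as the displacement grows or the image becomes less smooth.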
For the choice of $\Psi(x) = x^2$, this method is the same as the Horn-Schunck method.
Other choices of cost function have also been used, such as $\Psi(x) = \sqrt{x^2 + \epsilon^2}$, which is a differentiable variant of the $L^1$ norm.
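To make the difference between these penalty choices concrete, the following sketch evaluates a quadratic penalty and the differentiable $L^1$ variant $\sqrt{x^2 + \epsilon^2}$ (commonly called the Charbonnier penalty) on a few residual values. The function names and the value of `eps` are assumptions chosen for illustration.

```python
import numpy as np

def quadratic_penalty(x):
    """Psi(x) = x^2, the Horn-Schunck choice."""
    return np.asarray(x, dtype=float) ** 2

def charbonnier_penalty(x, eps=1e-3):
    """Psi(x) = sqrt(x^2 + eps^2), a smooth approximation of |x|."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(x ** 2 + eps ** 2)

residuals = np.array([0.0, 0.1, 1.0, 10.0])
print(quadratic_penalty(residuals))    # grows quadratically with the residual
print(charbonnier_penalty(residuals))  # grows roughly linearly for |x| >> eps
```

Because the second penalty grows only linearly for large residuals, large errors such as those at occlusions or motion boundaries contribute less to the total cost than they would under the quadratic penalty.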
To solve the aforementioned optimization problem, one can use the Euler-Lagrange equations to provide a system of partial differential equations for each point in $I(x, y, t)$. In the simplest case of using $\Psi(x) = x^2$, these equations are,
: $I_x (I_x u + I_y v + I_t) - \alpha \Delta u = 0,$
: $I_y (I_x u + I_y v + I_t) - \alpha \Delta v = 0,$
where $\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$ denotes the Laplace operator.
Since the image data is made up of discrete pixels, these equations are discretised.
Doing so yields a system of linear equations which can be solved for $(u, v)$ at each pixel, using an iterative scheme such as Gauss-Seidel.
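A minimal sketch of such an iterative solver for the quadratic case is given below. It makes several assumptions that are not taken from the text: the Laplacian is approximated by the difference between a four-neighbour average and the pixel value (so `alpha` absorbs the constant factor of the discrete Laplacian), image derivatives come from `np.gradient`, and all pixels are updated simultaneously in a Jacobi-style sweep rather than in Gauss-Seidel order, which keeps the code vectorised. The function names are illustrative.

```python
import numpy as np

def neighbour_mean(f):
    """Average of the four nearest neighbours, replicating values at the border."""
    p = np.pad(f, 1, mode="edge")
    return 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])

def horn_schunck(frame1, frame2, alpha=1.0, n_iter=200):
    """Estimate a dense flow field (u, v) with a Jacobi-style Horn-Schunck iteration."""
    frame1 = np.asarray(frame1, dtype=float)
    frame2 = np.asarray(frame2, dtype=float)
    I_y, I_x = np.gradient(frame1)  # axis 0 is y (rows), axis 1 is x (columns)
    I_t = frame2 - frame1
    u = np.zeros_like(frame1)
    v = np.zeros_like(frame1)
    denom = alpha + I_x ** 2 + I_y ** 2
    for _ in range(n_iter):
        u_bar = neighbour_mean(u)
        v_bar = neighbour_mean(v)
        # Residual of the linearised constraint at the locally averaged flow.
        r = (I_x * u_bar + I_y * v_bar + I_t) / denom
        u = u_bar - I_x * r
        v = v_bar - I_y * r
    return u, v
```

Each sweep pulls the flow towards the local average (the smoothness term) and then corrects it along the image gradient (the data term), so flow information spreads gradually from textured regions into regions with weak gradients.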
Although linearising the brightness constancy constraint simplifies the optimisation problem significantly, the linearisation is only valid for small displacements and/or smooth images. To avoid this problem, a multi-scale or coarse-to-fine approach is often used. In such a scheme, the images are initially downsampled and the linearised Euler-Lagrange equations are solved at the reduced resolution. The estimated flow field at this scale is then used to initialise the process at the next scale. This initialisation is often performed by warping one frame using the current estimate of the flow field so that it is as similar to the other frame as possible.
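The coarse-to-fine idea can be sketched for the simplest possible motion model. The example below is a deliberate simplification: instead of solving the Euler-Lagrange equations at each scale, it estimates a single global translation by least squares on the linearised constraint, and it uses 2x2 block averaging for downsampling and bilinear interpolation for warping. All function names and parameter defaults are assumptions made for illustration.

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def shift_bilinear(img, u, v):
    """Return img sampled at (x + u, y + v), clamping coordinates at the border."""
    h, w = img.shape
    y, x = np.mgrid[0:h, 0:w].astype(float)
    xs = np.clip(x + u, 0, w - 1)
    ys = np.clip(y + v, 0, h - 1)
    x0 = np.floor(xs).astype(int)
    y0 = np.floor(ys).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    fx, fy = xs - x0, ys - y0
    top = (1 - fx) * img[y0, x0] + fx * img[y0, x1]
    bottom = (1 - fx) * img[y1, x0] + fx * img[y1, x1]
    return (1 - fy) * top + fy * bottom

def global_translation(frame1, frame2):
    """Least-squares solution of I_x*u + I_y*v + I_t = 0 over the whole image."""
    I_y, I_x = np.gradient(frame1)
    I_t = frame2 - frame1
    A = np.stack([I_x.ravel(), I_y.ravel()], axis=1)
    (du, dv), *_ = np.linalg.lstsq(A, -I_t.ravel(), rcond=None)
    return du, dv

def coarse_to_fine_translation(frame1, frame2, levels=3, iters=3):
    """Estimate a global (u, v), starting at the coarsest scale and refining."""
    pyramid = [(np.asarray(frame1, float), np.asarray(frame2, float))]
    for _ in range(levels - 1):
        a, b = pyramid[-1]
        pyramid.append((downsample(a), downsample(b)))
    u, v = 0.0, 0.0
    for a, b in reversed(pyramid):
        u, v = 2.0 * u, 2.0 * v  # a displacement doubles when the resolution doubles
        for _ in range(iters):
            warped = shift_bilinear(b, u, v)  # undo the current estimate
            du, dv = global_translation(a, warped)
            u, v = u + du, v + dv
    return u, v
```

At the coarsest level the displacement is a fraction of its full-resolution size, so the linearisation is more accurate there; each finer level then only has to estimate the small residual motion left after warping.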
An alternative approach is to discretize the optimisation problem and then search over the possible $(u, v)$ values without linearising it.
This search is often performed using max-flow min-cut algorithms, linear programming or belief propagation methods.
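The idea of searching over discrete displacements can be illustrated with a brute-force sketch. Note that this is a much simpler technique than the graph-based or message-passing solvers named above: it ignores the smoothness term entirely and exhaustively tests every integer displacement for one block, using the sum of squared differences as the data cost. The function name and parameters are illustrative assumptions.

```python
import numpy as np

def best_integer_displacement(frame1, frame2, top, left, size=8, radius=3):
    """Exhaustively search integer (u, v) minimising the SSD for one block."""
    block = frame1[top:top + size, left:left + size].astype(float)
    best, best_cost = (0, 0), np.inf
    for v in range(-radius, radius + 1):      # vertical displacement (rows)
        for u in range(-radius, radius + 1):  # horizontal displacement (columns)
            r, c = top + v, left + u
            if r < 0 or c < 0 or r + size > frame2.shape[0] or c + size > frame2.shape[1]:
                continue  # candidate block would fall outside the image
            candidate = frame2[r:r + size, c:c + size].astype(float)
            cost = np.sum((candidate - block) ** 2)
            if cost < best_cost:
                best, best_cost = (u, v), cost
    return best
```

Because no derivative of the image is taken, the search is not restricted to small displacements, but its cost grows with the square of the search radius and the result is limited to integer precision.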
Parametric Models
Instead of applying the regularization constraint on a point by point basis as per a regularized model, one can group pixels into regions and estimate the motion of these regions.
This is known as a ''parametric model'', since the motion of these regions is parameterized.
In formulating optical flow estimation in this way, one assumes that the motion field in each region can be fully characterised by a set of parameters.
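Therefore, the goal of a parametric model is to estimate the motion parameters that minimise a loss function which can be written as,
: $\hat{\boldsymbol{\alpha}} = \arg\min_{\boldsymbol{\alpha}} \sum_{(x, y) \in \mathcal{R}} g(x, y) \, \rho(x, y, I_1, I_2, u_{\boldsymbol{\alpha}}, v_{\boldsymbol{\alpha}}),$
where $\boldsymbol{\alpha}$ is the set of parameters determining the motion in the region $\mathcal{R}$, $\rho()$ is the data cost term, $g()$ is a weighting function that determines the influence of pixel $(x, y)$ on the total cost, and $I_1$ and $I_2$ are frames 1 and 2 from a pair of consecutive frames.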
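The simplest parametric model is the Lucas-Kanade method. This uses rectangular regions and parameterises the motion as purely translational. The Lucas-Kanade method uses the brightness constancy constraint, in its linearised form, as the data cost term and selects $g(x, y) = 1$.
This yields the local loss function,
: $\hat{\boldsymbol{\alpha}} = \arg\min_{\boldsymbol{\alpha}} \sum_{(x, y) \in \mathcal{R}} (I_x u_{\boldsymbol{\alpha}} + I_y v_{\boldsymbol{\alpha}} + I_t)^2.$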
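For a purely translational model with uniform weights, minimising the sum of squared linearised residuals $(I_x u + I_y v + I_t)^2$ over a window is an ordinary least-squares problem in the two unknowns $(u, v)$. The sketch below solves it for one rectangular window; the derivative estimates from `np.gradient`, the window layout, and the function name are assumptions for illustration rather than a reference implementation.

```python
import numpy as np

def lucas_kanade_window(frame1, frame2, top, left, size=15):
    """Estimate one translational (u, v) for a square window by least squares."""
    frame1 = np.asarray(frame1, dtype=float)
    frame2 = np.asarray(frame2, dtype=float)
    I_y, I_x = np.gradient(frame1)  # axis 0 is y (rows), axis 1 is x (columns)
    I_t = frame2 - frame1
    window = (slice(top, top + size), slice(left, left + size))
    # One row [I_x, I_y] per pixel; the right-hand side is -I_t.
    A = np.stack([I_x[window].ravel(), I_y[window].ravel()], axis=1)
    b = -I_t[window].ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(u), float(v)
```

The system is well conditioned only when the window contains gradients in more than one direction; in a window with a single edge orientation the two columns of `A` are nearly parallel, and the estimate along the edge is unreliable.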
Other possible local loss functions include the negative normalized
cross-correlation
between the two frames.
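As an illustration of such an alternative data cost, the following sketch computes the negative normalized cross-correlation between two equally sized patches. The zero-mean normalisation, the small stabilising constant `eps`, and the function name are assumptions; other normalisations are also in use.

```python
import numpy as np

def negative_ncc(patch1, patch2, eps=1e-12):
    """Negative zero-mean normalized cross-correlation of two equally sized patches."""
    a = np.asarray(patch1, dtype=float)
    b = np.asarray(patch2, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b)) + eps  # eps guards against flat patches
    return -float(np.sum(a * b) / denom)
```

The cost is close to -1 for patches that match up to a gain and offset in brightness, and close to +1 for patches with inverted contrast, which makes it less sensitive to illumination changes than a squared-difference cost.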
Learning-Based Models
Instead of seeking to model optical flow directly, one can train a
machine learning
system to estimate optical flow. Since 2015, when FlowNet was proposed, learning based models have been applied to optical flow and have gained prominence. Initially, these approaches were based on
Convolutional Neural Networks
arranged in a
U-Net architecture. However, with the advent of
transformer architecture in 2017, transformer based models have gained prominence.
Most learning-based approaches to optical flow use
supervised learning
. In this case, many frame pairs of video data and their corresponding
ground-truth flow fields are used to optimise the parameters of the learning-based model to accurately estimate optical flow. This process often relies on vast training datasets due to the number of parameters involved.
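In supervised training, the predicted flow field is compared with the ground-truth field through a loss function. One commonly used measure is the average endpoint error, the mean Euclidean distance between predicted and true flow vectors. The sketch below assumes flow fields stored as arrays of shape (height, width, 2); both the choice of this loss and the array layout are assumptions for illustration, and individual models may use different losses.

```python
import numpy as np

def average_endpoint_error(flow_pred, flow_true):
    """Mean Euclidean distance between predicted and ground-truth flow vectors.

    Both inputs are assumed to have shape (height, width, 2), holding (u, v) per pixel.
    """
    flow_pred = np.asarray(flow_pred, dtype=float)
    flow_true = np.asarray(flow_true, dtype=float)
    return float(np.mean(np.linalg.norm(flow_pred - flow_true, axis=-1)))
```

For example, a prediction of zero flow everywhere against a ground truth of $(u, v) = (3, 4)$ at every pixel gives an average endpoint error of 5.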
Uses
Motion estimation
and
video compression
have developed as a major aspect of optical flow research. While the optical flow field is superficially similar to a dense motion field derived from the techniques of motion estimation, optical flow research covers not only the determination of the optical flow field itself, but also its use in estimating the three-dimensional nature and structure of the scene, as well as the 3D motion of objects and the observer relative to the scene. Most of these approaches use the image Jacobian.
Optical flow has been used by robotics researchers in many areas, such as object detection and tracking, image dominant plane extraction, movement detection, robot navigation and visual odometry.
Optical flow information has been recognized as being useful for controlling micro air vehicles.
The application of optical flow includes the problem of inferring not only the motion of the observer and objects in the scene, but also the
structure
of objects and the environment. Since awareness of motion and the generation of mental maps of the structure of our environment are critical components of animal (and human)
vision
, the conversion of this innate ability to a computer capability is similarly crucial in the field of
machine vision
.
Consider a five-frame clip of a ball moving from the bottom left of a field of vision, to the top right. Motion estimation techniques can determine that on a two dimensional plane the ball is moving up and to the right and vectors describing this motion can be extracted from the sequence of frames. For the purposes of video compression (e.g.,
MPEG
), the sequence is now described as well as it needs to be. However, in the field of machine vision, the question of whether the ball is moving to the right or the observer is moving to the left is unknowable yet critical information. Even if a static, patterned background were present in the five frames, we could not confidently state that the ball was moving to the right, because the pattern might be at an infinite distance from the observer.
Optical flow sensor
Various configurations of optical flow sensors exist. One configuration is an image sensor chip connected to a processor programmed to run an optical flow algorithm. Another configuration uses a vision chip, which is an integrated circuit having both the
image sensor
and the processor on the same die, allowing for a compact implementation. An example of this is a generic optical mouse sensor used in an
optical mouse. In some cases the processing circuitry may be implemented using analog or mixed-signal circuits to enable fast optical flow computation using minimal current consumption.
One area of contemporary research is the use of
neuromorphic engineering techniques to implement circuits that respond to optical flow, and thus may be appropriate for use in an optical flow sensor. Such circuits may draw inspiration from biological neural circuitry that similarly responds to optical flow.
Optical flow sensors are used extensively in computer
optical mice, as the main sensing component for measuring the motion of the mouse across a surface.
Optical flow sensors are also being used in
robotics
applications, primarily where there is a need to measure visual motion or relative motion between the robot and other objects in the vicinity of the robot. The use of optical flow sensors in
unmanned aerial vehicles (UAVs), for stability and obstacle avoidance, is also an area of current research.
See also
* Ambient optic array
* Optical mouse
* Range imaging
* Vision processing unit
* Continuity equation
* Motion field
External links
* Finding Optic Flow
* Art of Optical Flow, article on fxguide.com (using optical flow in visual effects)
* Optical flow evaluation and ground truth sequences
* Middlebury optical flow evaluation and ground truth sequences
* mrf-registration.net: optical flow estimation through MRF
* The French Aerospace Lab: GPU implementation of a Lucas-Kanade based optical flow
* CUDA implementation by CUVI (CUDA Vision & Imaging Library)
* Horn and Schunck Optical Flow: online demo and source code of the Horn and Schunck method
* TV-L1 Optical Flow: online demo and source code of the Zach et al. method
* Robust Optical Flow: online demo and source code of the Brox et al. method