
Video tracking is the process of locating a moving object (or multiple objects) over time using a camera. It has a variety of uses, including human-computer interaction, security and surveillance, video communication and compression, augmented reality, traffic control, medical imaging and video editing. Video tracking can be a time-consuming process because of the amount of data contained in video. Adding further to the complexity is the possible need to use object recognition techniques for tracking, a challenging problem in its own right.


Objective

The objective of video tracking is to associate target objects in consecutive video frames. The association can be especially difficult when the objects are moving fast relative to the frame rate. Another situation that increases the complexity of the problem is when the tracked object changes orientation over time. For these situations, video tracking systems usually employ a motion model, which describes how the image of the target might change for different possible motions of the object.

Examples of simple motion models are:
* When tracking planar objects, the motion model is a 2D transformation (affine transformation or homography) of an image of the object (e.g. the initial frame).
* When the target is a rigid 3D object, the motion model defines its aspect depending on its 3D position and orientation.
* For video compression, key frames are divided into macroblocks. The motion model is a block-wise displacement of a key frame, in which each macroblock is translated by a motion vector given by the motion parameters.
* The image of a deformable object can be covered with a mesh; the motion of the object is then defined by the positions of the nodes of the mesh.
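
The planar motion models listed above can be illustrated with a short sketch. It is written in plain Python and is only an illustration under stated assumptions: the function names, the 2x1 target and the matrix values are hypothetical and do not come from any particular tracking library.

```python
def apply_affine(point, A, t):
    """Map a 2D point with x' = A x + t (A is 2x2, t has length 2)."""
    x, y = point
    return (A[0][0] * x + A[0][1] * y + t[0],
            A[1][0] * x + A[1][1] * y + t[1])


def apply_homography(point, H):
    """Map a 2D point with a 3x3 homography H via homogeneous coordinates."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]  # projective scale factor
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


# Corners of a hypothetical 2x1 planar target in the initial frame.
corners = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]

# Affine motion (assumed values): rotate by 90 degrees, then shift by (5, 3).
A = [[0.0, -1.0], [1.0, 0.0]]
t = [5.0, 3.0]
moved_affine = [apply_affine(c, A, t) for c in corners]

# The same motion written as a homography whose last row is (0, 0, 1).
H_affine = [[0.0, -1.0, 5.0], [1.0, 0.0, 3.0], [0.0, 0.0, 1.0]]
moved_homography = [apply_homography(c, H_affine) for c in corners]

# A homography with a non-trivial last row adds perspective foreshortening.
H_perspective = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.0, 1.0]]
foreshortened = apply_homography((2.0, 1.0), H_perspective)
```

An affine transformation is the special case of a homography whose last row is (0, 0, 1), which is why both functions give the same corner positions for the first motion. A tracker would estimate the entries of `A`, `t` or `H` for each frame rather than fix them in advance.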


Algorithms

To perform video tracking, an algorithm analyzes sequential video frames and outputs the movement of targets between the frames. There are a variety of algorithms, each having strengths and weaknesses. Considering the intended use is important when choosing which algorithm to use. A visual tracking system has two major components: target representation and localization, and filtering and data association.

''Target representation and localization'' is mostly a bottom-up process. These methods give a variety of tools for identifying the moving object. Locating and tracking the target object successfully is dependent on the algorithm. For example, blob tracking is useful for identifying human movement because a person's profile changes dynamically. Typically the computational complexity for these algorithms is low. The following are some common ''target representation and localization'' algorithms:
* Kernel-based tracking (mean-shift tracking): an iterative localization procedure based on the maximization of a similarity measure (Bhattacharyya coefficient).
* Contour tracking: detection of the object boundary (e.g. active contours or the Condensation algorithm). Contour tracking methods iteratively evolve an initial contour, initialized from the previous frame, to its new position in the current frame. This approach to contour tracking directly evolves the contour by minimizing the contour energy using gradient descent.

''Filtering and data association'' is mostly a top-down process, which involves incorporating prior information about the scene or object, dealing with object dynamics, and evaluating different hypotheses. These methods allow the tracking of complex objects along with more complex object interactions, such as tracking objects moving behind obstructions. The complexity increases further if the video tracker (also named TV tracker or target tracker) is mounted not on a rigid foundation (on-shore) but on a moving ship (off-shore), where an inertial measurement system is typically used to pre-stabilize the video tracker and so reduce the required dynamics and bandwidth of the camera system. The computational complexity for these algorithms is usually much higher. The following are some common filtering algorithms:
* Kalman filter: an optimal recursive Bayesian filter for linear functions subjected to Gaussian noise. It uses a series of measurements observed over time, containing noise (random variations) and other inaccuracies, and produces estimates of unknown variables that tend to be more precise than those based on a single measurement alone.
* Particle filter: useful for sampling the underlying state-space distribution of nonlinear and non-Gaussian processes. (J. Martinez-del-Rincon, D. Makris, C. Orrite-Urunuela and J.-C. Nebel (2010). "Tracking Human Position and Lower Body Parts Using Kalman and Particle Filters Constrained by Human Biomechanics". IEEE Transactions on Systems, Man, and Cybernetics, Part B, 40(4).)
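
As a rough illustration of the Kalman filter described in this section, the following sketch tracks a target's position along one image axis with a constant-velocity model. It is a minimal example under stated assumptions, not a reference implementation: the function name `kalman_step`, the noise variances `q` and `r`, the diagonal process noise and the synthetic measurements are all hypothetical choices made for illustration.

```python
def kalman_step(state, cov, z, dt=1.0, q=1e-3, r=1.0):
    """One predict/update cycle of a constant-velocity Kalman filter.

    state: (position, velocity) estimate
    cov:   2x2 covariance as nested tuples ((a, b), (c, d))
    z:     measured position in the current frame
    q, r:  assumed process and measurement noise variances
    """
    p, v = state
    (a, b), (c, d) = cov

    # Predict: x = F x and P = F P F^T + Q,
    # with F = [[1, dt], [0, 1]] and Q = q * I (a simplifying assumption).
    p, v = p + dt * v, v
    a, b, c, d = (a + dt * (b + c) + dt * dt * d + q,
                  b + dt * d,
                  c + dt * d,
                  d + q)

    # Update with measurement matrix H = [1, 0]: only position is observed.
    s = a + r                  # innovation variance
    k0, k1 = a / s, c / s      # Kalman gain
    y = z - p                  # innovation (measurement minus prediction)
    p, v = p + k0 * y, v + k1 * y
    cov = ((a - k0 * a, b - k0 * b),
           (c - k1 * a, d - k1 * b))   # P = (I - K H) P
    return (p, v), cov


# Synthetic target moving 2 pixels per frame; start with an uncertain guess.
state, cov = (0.0, 0.0), ((100.0, 0.0), (0.0, 100.0))
for frame in range(1, 51):
    measurement = 2.0 * frame
    state, cov = kalman_step(state, cov, measurement)

position, velocity = state
```

Although only positions are measured, the filter also recovers the velocity, which is what lets a tracker predict where to search in the next frame. Real measurements would be noisy detections rather than the exact values used here.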

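The similarity measure named in the kernel-based tracking item, the Bhattacharyya coefficient, can be sketched as follows. This is a minimal illustration that compares normalized histograms; the 4-bin histograms and the candidate names are hypothetical, and a full mean-shift tracker would additionally weight pixels with a kernel and iterate the window position toward the maximum of this score.

```python
import math


def normalize(hist):
    """Scale bin counts so that they sum to 1."""
    total = float(sum(hist))
    return [h / total for h in hist]


def bhattacharyya_coefficient(p, q):
    """Sum of sqrt(p_i * q_i) over bins; 1.0 means identical distributions."""
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))


# Hypothetical 4-bin color histograms of the target and two candidate windows.
target = normalize([8, 4, 2, 2])
candidate_a = normalize([7, 5, 2, 2])   # similar appearance to the target
candidate_b = normalize([1, 1, 6, 8])   # different appearance

score_self = bhattacharyya_coefficient(target, target)
score_a = bhattacharyya_coefficient(target, candidate_a)
score_b = bhattacharyya_coefficient(target, candidate_b)

# The candidate window with the larger coefficient is the better match.
best = "candidate_a" if score_a > score_b else "candidate_b"
```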

See also

* Match moving
* Motion capture
* Motion estimation
* Optical flow
* SwisTrack
* Single particle tracking
* Teknomo–Fernandez algorithm




External links

* Interesting historical example (1980) of a Cromemco Cyclops camera used to track a ball going through a maze.