In visual effects, match moving is a technique that allows the insertion of 2D elements, other live-action elements or computer graphics into live-action footage with correct position, scale, orientation, and motion relative to the photographed objects in the shot. It also allows for the removal of live-action elements from the live-action shot. The term is used loosely to describe several different methods of extracting camera motion information from a motion picture. Also referred to as motion tracking or camera solving, match moving is related to rotoscoping and photogrammetry.
Match moving is sometimes confused with motion capture, which records the motion of objects, often human actors, rather than the camera. Typically, motion capture requires special cameras and sensors and a controlled environment (although recent developments such as the Kinect camera and Apple's Face ID have begun to change this). Match moving is also distinct from motion control photography, which uses mechanical hardware to execute multiple identical camera moves. Match moving, by contrast, is typically a software-based technology, applied after the fact to normal footage recorded in uncontrolled environments with an ordinary camera.
Match moving is primarily used to track the movement of a camera through a shot so that an identical virtual camera move can be reproduced in a 3D animation program. When new animated elements are composited back into the original live-action shot, they will appear in perfectly matched perspective and therefore appear seamless.
As it is mostly software-based, match moving has become increasingly affordable as the cost of computer power has declined; it is now an established visual-effects tool and is even used in live television broadcasts as part of providing effects such as the yellow virtual down-line in American football.
Principle
The process of match moving can be broken down into two steps.
Tracking
The first step is identifying and tracking features. A feature is a specific point in the image that a tracking algorithm can lock onto and follow through multiple frames (SynthEyes calls them ''blips''). Often features are selected because they are bright/dark spots, edges or corners, depending on the particular tracking algorithm. Popular programs use template matching based on NCC (normalized cross-correlation) score and RMS error. What is important is that each feature represents a specific point on the surface of a real object. As a feature is tracked it becomes a series of two-dimensional coordinates that represent the position of the feature across a series of frames. This series is referred to as a "track". Once tracks have been created they can be used immediately for 2-D motion tracking, or can then be used to calculate 3-D information.
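The template-matching idea can be sketched directly: a patch around the feature in one frame is compared against candidate positions in the next frame, and the position with the highest NCC score wins. The following is a minimal, illustrative sketch in pure Python (grayscale images as nested lists; the function names are our own, not any particular tracker's API):

```python
import math

def ncc(patch_a, patch_b):
    """Normalized cross-correlation between two equal-sized grayscale patches."""
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0

def patch(img, x, y, r):
    """Square patch of radius r centred at column x, row y."""
    return [row[x - r:x + r + 1] for row in img[y - r:y + r + 1]]

def track_feature(prev, cur, x, y, r=1, search=2):
    """Return the position in `cur` whose patch best matches the feature at (x, y) in `prev`."""
    template = patch(prev, x, y, r)
    best_score, best_pos = -2.0, (x, y)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            nx, ny = x + dx, y + dy
            # Skip candidates whose patch would fall outside the frame.
            if nx - r < 0 or ny - r < 0 or ny + r >= len(cur) or nx + r >= len(cur[0]):
                continue
            score = ncc(template, patch(cur, nx, ny, r))
            if score > best_score:
                best_score, best_pos = score, (nx, ny)
    return best_pos

# A bright feature at (2, 2) moves one pixel to the right between frames.
prev = [[0] * 6 for _ in range(6)]
cur = [[0] * 6 for _ in range(6)]
prev[2][2] = 255
cur[2][3] = 255
print(track_feature(prev, cur, 2, 2))  # (3, 2)
```

Repeating this per frame yields the series of 2-D coordinates that constitutes a track. Production trackers refine the match to sub-pixel accuracy and update the template as lighting and perspective change.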
Calibration
The second step involves solving for 3D motion. This process attempts to derive the motion of the camera by solving the inverse projection of the 2-D paths for the position of the camera. This process is referred to as calibration.
When a point on the surface of a three-dimensional object is photographed, its position in the 2-D frame can be calculated by a
3-D projection function. We can consider a camera to be an abstraction that holds all the parameters necessary to model a camera in a real or virtual world. Therefore, a camera is a vector that includes as its elements the position of the camera, its orientation, focal length, and other possible parameters that define how the camera focuses light onto the
film plane. Exactly how this vector is constructed is not important as long as there is a compatible projection function ''P''.
The projection function ''P'' takes as its input a camera vector (denoted ''camera'') and another vector, the position of a 3-D point in space (denoted ''xyz''), and returns a 2-D point that has been projected onto a plane in front of the camera (denoted ''XY''). We can express this:
:''XY'' = P(''camera'', ''xyz'')
The projection function transforms the 3-D point and strips away the component of depth. Without knowing the depth component, an inverse projection function can only return a set of possible 3-D points that form a line emanating from the nodal point of the camera lens and passing through the projected 2-D point. We can express the inverse projection as:
:''xyz'' ∈ P'(''camera'', ''XY'')
or
:{''xyz'' : P(''camera'', ''xyz'') = ''XY''}
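The forward projection P and its one-to-many inverse can be made concrete with a toy pinhole model. In this sketch the camera vector is reduced to just a position and a focal length, with the camera looking down the +z axis; a real camera vector carries orientation and lens parameters as well, and the names here are illustrative, not any solver's API:

```python
def project(camera, xyz):
    """P: map a 3-D point to the 2-D image plane of a pinhole camera.
    `camera` holds a position (cx, cy, cz) and a focal length f;
    for simplicity the camera looks down the +z axis with no rotation."""
    cx, cy, cz = camera["pos"]
    f = camera["f"]
    x, y, z = xyz[0] - cx, xyz[1] - cy, xyz[2] - cz
    return (f * x / z, f * y / z)  # depth z is divided out -- and lost

def inverse_project(camera, XY, depth):
    """P': without a known depth, only a ray of candidate points exists.
    Supplying a depth picks one point on that ray."""
    cx, cy, cz = camera["pos"]
    f = camera["f"]
    X, Y = XY
    return (cx + X * depth / f, cy + Y * depth / f, cz + depth)

cam = {"pos": (0.0, 0.0, 0.0), "f": 50.0}
pt = (2.0, 1.0, 100.0)
XY = project(cam, pt)
print(XY)  # (1.0, 0.5)
# Every depth along the inverse-projection ray re-projects to the same 2-D point:
for d in (10.0, 100.0, 1000.0):
    assert project(cam, inverse_project(cam, XY, d)) == XY
```

The loop at the end demonstrates exactly why one tracked point cannot pin down the camera: an entire ray of 3-D positions is consistent with a single 2-D observation.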
Let's say we are in a situation where the features we are tracking are on the surface of a rigid object such as a building. Since we know that the real point
''xyz'' will remain in the same place in real space from one frame of the image to the next we can make the point a constant even though we do not know where it is. So:
:''xyz''''i'' = ''xyz''''j''
where the subscripts ''i'' and ''j'' refer to arbitrary frames in the shot we are analyzing. Since this is always true then we know that:
:P'(''camera''''i'', ''XY''''i'') ∩ P'(''camera''''j'', ''XY''''j'') ≠ {}
Because the value of ''XY''''i'' has been determined for all frames that the feature is tracked through by the tracking program, we can solve the reverse projection function between any two frames as long as P'(''camera''''i'', ''XY''''i'') ∩ P'(''camera''''j'', ''XY''''j'') is a small set. The set of possible camera vector pairs that solve the equation at frames ''i'' and ''j'' is denoted C''ij''.
:C''ij'' = {(''camera''''i'', ''camera''''j''): P'(''camera''''i'', ''XY''''i'') ∩ P'(''camera''''j'', ''XY''''j'') ≠ {}}
So there is a set of camera vector pairs C''ij'' for which the intersection of the inverse projections of two points ''XY''''i'' and ''XY''''j'' is a non-empty, hopefully small, set centering on a theoretical stationary point ''xyz''.
In other words, imagine a black point floating in a white void and a camera. For any position in space that we place the camera, there is a set of corresponding parameters (orientation, focal length, etc.) that will photograph that black point exactly the same way. Since ''C'' has an infinite number of members, one point is never enough to determine the actual camera position.
As we start adding tracking points, we can narrow the possible camera positions. For example, if we have a set of points {''xyz''''i,0'', ..., ''xyz''''i,n''} and {''xyz''''j,0'', ..., ''xyz''''j,n''}, where ''i'' and ''j'' still refer to frames and ''n'' is an index to one of many tracking points we are following, we can derive a set of camera vector pair sets {C''i,j,0'', ..., C''i,j,n''}.
In this way multiple tracks allow us to narrow the possible camera parameters. The set of possible camera parameters that fit, F, is the intersection of all sets:
:F = C''i,j,0'' ∩ ... ∩ C''i,j,n''
The fewer elements in this set, the closer we can come to extracting the actual parameters of the camera. In reality, errors introduced in the tracking process require a more statistical approach to determining a good camera vector for each frame; optimization algorithms and bundle block adjustment are often utilized. Unfortunately, there are so many elements to a camera vector that when every parameter is free we still might not be able to narrow F down to a single possibility, no matter how many features we track. The more we can restrict the various parameters, especially focal length, the easier it becomes to pinpoint the solution.
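In practice, the statistical approach amounts to minimizing reprojection error: choosing the camera vector whose predicted 2-D positions best fit the observed tracks. A toy illustration follows, recovering a single free camera parameter (its z position) by brute-force search over a simplified pinhole camera that looks down the +z axis; real solvers instead run nonlinear least squares and bundle adjustment over all parameters at once:

```python
def reprojection_rms(cam_z, f, points3d, observed2d):
    """RMS error between observed 2-D tracks and the projections
    predicted by a candidate camera at (0, 0, cam_z) with focal length f."""
    total = 0.0
    for (x, y, z), (X, Y) in zip(points3d, observed2d):
        Xp = f * x / (z - cam_z)  # simplified pinhole projection
        Yp = f * y / (z - cam_z)
        total += (Xp - X) ** 2 + (Yp - Y) ** 2
    return (total / len(points3d)) ** 0.5

# Synthetic "ground truth": a camera at z = -20 photographed these points.
f = 50.0
points3d = [(2.0, 1.0, 100.0), (-3.0, 4.0, 80.0), (5.0, -2.0, 120.0)]
true_z = -20.0
observed = [(f * x / (z - true_z), f * y / (z - true_z)) for x, y, z in points3d]

# Narrow the solution by scoring candidate cameras against the tracks.
best_z = min((z / 10.0 for z in range(-500, 1)),
             key=lambda cz: reprojection_rms(cz, f, points3d, observed))
print(best_z)  # -20.0
```

With noisy real-world tracks the error never reaches zero; the solver instead reports the camera with the lowest residual, and outlier tracks are down-weighted or discarded statistically.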
In all, the 3D solving process is the process of narrowing down the possible solutions to the motion of the camera until we reach one that suits the needs of the composite we are trying to create.
Point-cloud projection
Once the camera position has been determined for every frame it is then possible to estimate the position of each feature in real space by inverse projection. The resulting set of points is often referred to as a point cloud because of its raw appearance, like a nebula. Since point clouds often reveal some of the shape of the 3-D scene they can be used as a reference for placing synthetic objects or by a reconstruction program to create a 3-D version of the actual scene.
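Estimating a feature's 3-D position comes down to intersecting its inverse-projection rays from two (or more) frames. The sketch below, illustrative rather than production code, uses the classic midpoint-of-closest-approach construction between two rays:

```python
def closest_point_between_rays(o1, d1, o2, d2):
    """Midpoint of the closest approach of rays o1 + t*d1 and o2 + s*d2.
    With noise-free tracks the rays intersect and the midpoint is exact."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))
    def add_scaled(o, d, t):
        return tuple(oo + t * dd for oo, dd in zip(o, d))
    # Solve for t, s minimizing |(o1 + t*d1) - (o2 + s*d2)|^2.
    w = sub(o1, o2)
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b  # zero only for parallel rays
    t = (b * e - c * d) / denom
    s = (a * e - b * d) / denom
    p1 = add_scaled(o1, d1, t)
    p2 = add_scaled(o2, d2, s)
    return tuple((x + y) / 2 for x, y in zip(p1, p2))

# Two camera positions both see the real point (2, 1, 10).
target = (2.0, 1.0, 10.0)
cam1, cam2 = (0.0, 0.0, 0.0), (5.0, 0.0, 0.0)
ray1 = tuple(t - c for t, c in zip(target, cam1))  # direction from camera 1
ray2 = tuple(t - c for t, c in zip(target, cam2))  # direction from camera 2
print(closest_point_between_rays(cam1, ray1, cam2, ray2))  # (2.0, 1.0, 10.0)
```

Running this for every track produces the point cloud; with noisy tracks the two rays miss each other slightly, and the length of the miss is a useful per-point quality measure.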
Ground-plane determination
The camera and point cloud need to be oriented in some kind of space. Therefore, once calibration is complete, it is necessary to define a ground plane. Normally, this is a unit plane that determines the scale, orientation and origin of the projected space. Some programs attempt to do this automatically, though more often the user defines this plane. Since shifting the ground plane applies a simple transformation to all of the points, the actual position of the plane is really a matter of convenience.
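Defining a ground plane amounts to choosing a rigid transform. For instance, three tracked points known to lie on the floor can fix the origin and orientation, as in this pure-Python sketch (function names are illustrative):

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(v):
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def ground_frame(p0, p1, p2):
    """Orthonormal frame with origin p0, x-axis toward p1,
    and up-axis perpendicular to the plane through p0, p1, p2."""
    x_axis = norm(sub(p1, p0))
    up = norm(cross(x_axis, sub(p2, p0)))
    y_axis = cross(up, x_axis)
    return p0, (x_axis, y_axis, up)

def to_ground(point, frame):
    """Re-express a point-cloud point in ground-plane coordinates."""
    origin, axes = frame
    rel = sub(point, origin)
    return tuple(dot(rel, axis) for axis in axes)

# Three floor points sit at height z = 3 in arbitrary solver coordinates.
frame = ground_frame((0.0, 0.0, 3.0), (1.0, 0.0, 3.0), (0.0, 1.0, 3.0))
# A point one unit above the floor gets an up-coordinate of 1:
print(to_ground((0.0, 0.0, 4.0), frame))  # (0.0, 0.0, 1.0)
```

Applying the same transform to every point and to each per-frame camera re-orients the whole solve without changing any of the relationships within it, which is why the plane's placement is a matter of convenience.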
Reconstruction
3D reconstruction is the interactive process of recreating a photographed object using tracking data. This technique is related to photogrammetry. In this particular case we are referring to using match moving software to reconstruct a scene from incidental footage.
A reconstruction program can create three-dimensional objects that mimic the real objects from the photographed scene. Using data from the point cloud and the user's estimation, the program can create a virtual object and then extract a texture from the footage that can be projected onto the virtual object as a surface texture.
2D vs. 3D
Match moving has two forms. Some compositing programs, such as Shake, Adobe Substance, Adobe After Effects, and Discreet Combustion, include two-dimensional motion tracking capabilities. Two-dimensional match moving only tracks features in two-dimensional space, without any concern for camera movement or distortion. It can be used to add motion blur or image stabilization effects to footage. This technique is sufficient to create realistic effects when the original footage does not include major changes in camera perspective. For example, a billboard deep in the background of a shot can often be replaced using two-dimensional tracking.
Three-dimensional match moving tools make it possible to extrapolate three-dimensional information from two-dimensional photography. These tools allow users to derive camera movement and other relative motion from arbitrary footage. The tracking information can be transferred to computer graphics software and used to animate virtual cameras and simulated objects. Programs capable of 3-D match moving include:
* 3DEqualizer from Science.D.Visions (which won an Academy Award for Technical Achievement at the 74th Scientific & Technical Awards)
* Blender (open source; uses libmv)
* Voodoo
* ACTS, an automatic camera tracking system with dense depth recovery for handling image/video sequences
* LS-ACTS, a robust and efficient structure-from-motion system which can handle large image/video sequence datasets in near real-time and work robustly in challenging cases (e.g. loopback sequences and multiple sequences)
* VISCODA VooCAT
* Icarus (University of Manchester research project, now discontinued but still popular)
* Maya MatchMover
* The Pixel Farm PFTrack, PFMatchit, and PFHoe (based on PFTrack algorithms)
* KeenTools GeoTracker and PinTool
* SynthEyes by Andersson Technologies
* Boujou (which won an Emmy award in 2002)
* NukeX from The Foundry
* CameraTracker (a plug-in for Adobe After Effects) from The Foundry
* VideoTrace from Punchcard (software for generating 3D models from video and images)
* IXIR 2D Track Editor, capable of handling 2D tracks and mask files from software such as 3DEqualizer, PFTrack, Boujou, SynthEyes, MatchMover, Movimento, Nuke, Shake, Fusion, After Effects, Combustion, Mocha, and Silhouette
* mocha Pro from Imagineer Systems, a planar-tracker-based utility for post-production
* fayIN, a plug-in for Adobe After Effects from fayteq (acquired by Facebook in 2017)
* Meshroom from AliceVision, a free and open-source photogrammetry application that also allows users to export an animated camera along with a 3D reconstruction of a scene
Automatic vs. interactive tracking
There are two methods by which motion information can be extracted from an image. Interactive tracking, sometimes referred to as "supervised tracking", relies on the user to follow features through a scene. Automatic tracking relies on computer algorithms to identify and track features through a shot. The tracked points' movements are then used to calculate a "solution". This solution is composed of all the camera's information such as its motion, focal length, and lens distortion.
The advantage of automatic tracking is that the computer can create many points faster than a human can. A large number of points can be analyzed with statistics to determine the most reliable data. The disadvantage of automatic tracking is that, depending on the algorithm, the computer can be easily confused as it tracks objects through the scene. Automatic tracking methods are particularly ineffective in shots involving fast camera motion such as that seen with hand-held camera work and in shots with repetitive subject matter like small tiles or any sort of regular pattern where one area is not very distinct. This tracking method also suffers when a shot contains a large amount of motion blur, making the small details it needs harder to distinguish.
The advantage of interactive tracking is that a human user can follow features through an entire scene and will not be confused by features that are not rigid. A human user can also determine where features are in a shot that suffers from motion blur; it is extremely difficult for an automatic tracker to correctly find features with high amounts of motion blur. The disadvantage of interactive tracking is that the user will inevitably introduce small errors as they follow objects through the scene, which can lead to what is called "drift".
Professional-level motion tracking is usually achieved using a combination of interactive and automatic techniques. An artist can remove points that are clearly anomalous and use "tracking mattes" to block confusing information out of the automatic tracking process. Tracking mattes are also employed to cover areas of the shot which contain moving elements such as an actor or a spinning ceiling fan.
Tracking mattes
A tracking matte is similar in concept to a garbage matte used in traveling matte compositing. However, the purpose of a tracking matte is to prevent tracking algorithms from using unreliable, irrelevant, or non-rigid tracking points. For example, in a scene where an actor walks in front of a background, the tracking artist will want to use only the background to track the camera through the scene, knowing that motion of the actor will throw off the calculations. In this case, the artist will construct a tracking matte to follow the actor through the scene, blocking that information from the tracking process.
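The effect of a tracking matte on the solver's input can be sketched simply: any track sample that falls inside the matte on a given frame is withheld from calibration. The data shapes and names below are illustrative, not any tracker's actual format:

```python
def apply_tracking_matte(tracks, matte):
    """Drop tracked points that fall inside the matte region, per frame.
    `tracks` maps feature id -> {frame: (x, y)}.
    `matte`  maps frame -> (x0, y0, x1, y1), a rectangle covering e.g. an actor."""
    filtered = {}
    for fid, frames in tracks.items():
        kept = {}
        for frame, (x, y) in frames.items():
            x0, y0, x1, y1 = matte.get(frame, (0, 0, 0, 0))
            if not (x0 <= x <= x1 and y0 <= y <= y1):
                kept[frame] = (x, y)
        if kept:
            filtered[fid] = kept
    return filtered

# Feature "a" sits on the static background; feature "b" crosses the actor in frame 2.
tracks = {
    "a": {1: (10, 10), 2: (11, 10)},
    "b": {1: (50, 40), 2: (60, 40)},
}
matte = {2: (55, 30, 80, 50)}  # actor's bounding region in frame 2
print(apply_tracking_matte(tracks, matte))  # "b" loses its frame-2 sample
```

Real mattes are animated rotoshapes rather than rectangles, but the principle is identical: masked samples never reach the calibration stage, so a moving actor cannot pull the camera solve off course.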
Refining
Since there are often multiple possible solutions to the calibration process and a significant amount of error can accumulate, the final step to match moving often involves refining the solution by hand. This could mean altering the camera motion itself or giving hints to the calibration mechanism. This interactive calibration is referred to as "refining".
Most match moving applications are based on similar algorithms for tracking and calibration. Often, the initial results obtained are similar. However, each program has different refining capabilities.
Real time
On-set, real-time camera tracking is becoming more widely used in feature film production to allow elements that will be inserted in post-production to be visualised live on set. This has the benefit of helping the director and actors improve performances by actually seeing set extensions or CGI characters whilst (or shortly after) they do a take. They no longer need to perform to green/blue screens with no feedback of the result. Eye-line references, actor positioning, and CGI interaction can now be done live on set, giving everyone confidence that the shot is correct and going to work in the final composite.
To achieve this, a number of components from hardware to software need to be combined. Software collects all six degrees of freedom of camera movement as well as metadata such as zoom, focus, iris and shutter settings from many different types of hardware devices, ranging from motion capture systems such as the active-LED-marker-based system from PhaseSpace, passive systems such as Motion Analysis or Vicon, to rotary encoders fitted to camera cranes and dollies such as Technocranes and Fisher dollies, or inertial and gyroscopic sensors mounted directly to the camera. There are also laser-based tracking systems that can be attached to anything, including Steadicams, to track cameras outside in the rain at distances of up to 30 meters.
Motion control cameras can also be used as a source or destination for 3D camera data. Camera moves can be pre-visualised in advance and then converted into motion control data that drives a camera crane along precisely the same path as the 3-D camera. Encoders on the crane can also be used in real time on set to reverse this process and generate live 3D cameras. The data can be sent to any number of different 3D applications, allowing 3D artists to modify their CGI elements live on set as well. The main advantage is that set design issues that would be time-consuming and costly later down the line can be sorted out during the shooting process, ensuring that the actors "fit" within each environment for each shot whilst they do their performances.
Real-time motion capture systems can also be mixed with the camera data stream, allowing virtual characters to be inserted into live shots on set. This dramatically improves the interaction between real and MoCap-driven CGI characters, as both plate and CGI performances can be choreographed together.
See also
* 1st & Ten (graphics system)
* PVI Virtual Media Services
* Structure from motion
* Virtual studio
References
* ''Matchmoving: The Invisible Art of Camera Tracking'', by Tim Dobbert, Sybex, Feb 2005
* 3-D Estimation and Applications to Match Move, an early paper on match moving which goes in depth about the mathematics
External links
* Comparison of matchmoving and tracking applications
* Tracking and 3D Matchmoving Tutorials
{{DEFAULTSORT:Match Moving}}
Computer animation
Video processing
Visual effects
Motion in computer vision