In the field of computer vision, any two images of the same planar surface in space are related by a homography (assuming a pinhole camera model). This has many practical applications, such as image rectification, image registration, or the estimation of camera motion (rotation and translation) between two images. Once camera resectioning has been done from an estimated homography matrix, this information may be used for navigation, or to insert models of 3D objects into an image or video, so that they are rendered with the correct perspective and appear to have been part of the original scene (see Augmented reality).


3D plane to plane equation

We have two cameras ''a'' and ''b'', looking at points P_i in a plane. Passing from the projection ^bp_i = \left(^bu_i; ^bv_i; 1\right) of P_i in ''b'' to the projection ^ap_i = \left(^au_i; ^av_i; 1\right) of P_i in ''a'':

: ^ap_i = \frac{^bz_i}{^az_i} K_a \cdot H_{ab} \cdot K_b^{-1} \cdot ^bp_i

where ^az_i and ^bz_i are the z coordinates of P_i in each camera frame and where the homography matrix H_{ab} is given by

: H_{ab} = R - \frac{t n^T}{d}.

R is the rotation matrix by which ''b'' is rotated in relation to ''a''; ''t'' is the translation vector from ''a'' to ''b''; ''n'' and ''d'' are the normal vector of the plane and the distance from the origin to the plane respectively. K_a and K_b are the cameras' intrinsic parameter matrices.

(Figure: camera ''b'' looking at the plane at distance ''d''.)

Note: assuming n^T P_i + d = 0 as the plane model, n^T P_i is the projection of the vector P_i along n, and is equal to -d. So

: t = t \cdot 1 = t \left(-\frac{n^T P_i}{d}\right),

and we have H_{ab} P_i = R P_i + t where H_{ab} = R - \frac{t n^T}{d}.

This formula is only valid if camera ''b'' has no rotation and no translation with respect to the reference frame. In the general case, where R_a, R_b and t_a, t_b are the respective rotations and translations of cameras ''a'' and ''b'' (assuming each pose maps a world point P into the camera frame, P_a = R_a P + t_a and P_b = R_b P + t_b), the relative rotation is R = R_a R_b^T, the relative translation is t = -R_a R_b^T t_b + t_a, and the homography matrix H_{ab} becomes

: H_{ab} = R_a R_b^T - \frac{\left(-R_a R_b^T t_b + t_a\right) n^T}{d}

where ''d'' is the distance of camera ''b'' to the plane.

A homography relates two images exactly only when the scene points lie on a common plane, as above, or when the two views differ by a pure rotation of the camera about its optical center (t = 0, so H_{ab} = R); in the latter case it does not matter what is present in the images. The matrix itself holds no image data: it describes the projective warp that maps pixel coordinates of one image onto the other.
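The relation H_ab = R - t n^T / d can be checked numerically. The following is a minimal sketch using only the Python standard library, not a reference implementation. All numeric values (rotation angle, translation, intrinsics, plane z = 5 in camera ''b'''s frame) and all helper names are illustrative assumptions, and the intrinsic matrices are assumed to have zero skew.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, v):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def inv_intrinsics(K):
    """Inverse of a zero-skew intrinsic matrix [[fx,0,cx],[0,fy,cy],[0,0,1]]."""
    fx, fy, cx, cy = K[0][0], K[1][1], K[0][2], K[1][2]
    return [[1 / fx, 0.0, -cx / fx], [0.0, 1 / fy, -cy / fy], [0.0, 0.0, 1.0]]

def plane_homography(R, t, n, d):
    """H_ab = R - t n^T / d for the plane n^T P + d = 0 in camera b's frame."""
    return [[R[i][j] - t[i] * n[j] / d for j in range(3)] for i in range(3)]

def project(K, P):
    """Pinhole projection to homogeneous pixel coordinates (u, v, 1)."""
    q = matvec(K, P)
    return [q[0] / q[2], q[1] / q[2], 1.0]

# Assumed relative pose: P_a = R P_b + t (10 degree rotation about the y axis).
theta = math.radians(10)
R = [[math.cos(theta), 0.0, math.sin(theta)],
     [0.0, 1.0, 0.0],
     [-math.sin(theta), 0.0, math.cos(theta)]]
t = [0.5, -0.1, 0.2]
n, d = [0.0, 0.0, -1.0], 5.0          # plane z = 5, so n^T P = -d on the plane
K_a = [[700.0, 0.0, 320.0], [0.0, 700.0, 240.0], [0.0, 0.0, 1.0]]
K_b = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]

H_ab = plane_homography(R, t, n, d)
G = matmul(matmul(K_a, H_ab), inv_intrinsics(K_b))   # pixel-to-pixel map

def max_transfer_error(points_b):
    """Largest deviation between (z_b / z_a) G p_b and the true projection p_a."""
    worst = 0.0
    for P_b in points_b:
        P_a = [r + ti for r, ti in zip(matvec(R, P_b), t)]
        p_b, p_a = project(K_b, P_b), project(K_a, P_a)
        scale = P_b[2] / P_a[2]                       # z_b / z_a
        mapped = [scale * x for x in matvec(G, p_b)]
        worst = max(worst, max(abs(m - p) for m, p in zip(mapped, p_a)))
    return worst

on_plane = [[x, y, 5.0] for x in (-1.0, 0.0, 2.0) for y in (-1.0, 1.5)]
off_plane = [[0.0, 0.0, 8.0]]
print("max error, points on the plane: ", max_transfer_error(on_plane))
print("max error, point off the plane:", max_transfer_error(off_plane))
```

For points on the plane the error is at the level of floating-point round-off, while the off-plane point is not transferred correctly, which illustrates that the homography only holds for the plane it was built from.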


Affine homography

When the image region in which the homography is computed is small or the image has been acquired with a large focal length, an ''affine homography'' is a more appropriate model of image displacements. An affine homography is a special type of a general homography whose last row is fixed to

: h_{31} = h_{32} = 0, \; h_{33} = 1.
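The practical effect of fixing the last row to (0, 0, 1) is that the third homogeneous coordinate stays equal to 1, so no perspective division is needed. The sketch below illustrates this with two made-up matrices; `apply_homography` and `is_affine` are hypothetical helper names chosen for this example.

```python
def apply_homography(H, u, v):
    """Map pixel (u, v) through H; returns (x, y, w) before dividing by w."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return x, y, w

def is_affine(H, tol=1e-12):
    """True when the last row is (0, 0, 1) after scaling so that h33 = 1."""
    h31, h32, h33 = H[2]
    return h33 != 0 and abs(h31 / h33) <= tol and abs(h32 / h33) <= tol

# Illustrative matrices: same upper two rows, different last row.
H_affine = [[1.2, 0.1, 5.0], [-0.2, 0.9, -3.0], [0.0, 0.0, 1.0]]
H_general = [[1.2, 0.1, 5.0], [-0.2, 0.9, -3.0], [1e-3, 2e-3, 1.0]]

pixels = [(0, 0), (100, 0), (0, 100)]
for name, H in (("affine", H_affine), ("general", H_general)):
    ws = [apply_homography(H, u, v)[2] for u, v in pixels]
    print(name, "is_affine =", is_affine(H), "w values =", ws)
```

With the affine matrix every `w` is exactly 1, so the mapping is linear in (u, v) plus an offset; with the general matrix `w` varies across the image, which is what produces perspective foreshortening.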


See also

* Direct linear transformation
* Epipolar geometry
* Feature (computer vision)
* Fundamental matrix (computer vision)
* Pose (computer vision)
* Photogrammetry


References

* O. Chum, T. Pajdla and P. Sturm (2005). "The Geometric Error for Homographies". Computer Vision and Image Understanding 97 (1): 86–102. doi:10.1016/j.cviu.2004.03.004. http://perception.inrialpes.fr/Publications/2005/CPS05/ChumPajdlaSturm-cviu05.pdf


Toolboxes


* homest is a GPL C/C++ library for robust, non-linear (based on the Levenberg–Marquardt algorithm) homography estimation from matched point pairs (Manolis Lourakis).
* OpenCV is a complete (''open and free'') computer vision software library that has many routines related to homography estimation (cvFindHomography) and re-projection.
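To give a feel for what "homography estimation from matched point pairs" involves, here is a minimal sketch that recovers a homography from exactly four correspondences by fixing h_33 = 1 and solving the resulting 8×8 linear system (an inhomogeneous variant of the direct linear transformation). It is not how homest or OpenCV are implemented: practical estimators typically normalize coordinates, use many more points, and add robust outlier rejection and non-linear refinement. All function names and test values are assumptions made for this example, and the approach fails if the true h_33 is zero.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate configuration (e.g. collinear points)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def homography_from_4_points(src, dst):
    """Homography H (with h33 = 1) such that dst ~ H src for four point pairs."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        # From xp * (h31 x + h32 y + 1) = h11 x + h12 y + h13, and same for yp.
        A.append([x, y, 1.0, 0.0, 0.0, 0.0, -xp * x, -xp * y])
        b.append(xp)
        A.append([0.0, 0.0, 0.0, x, y, 1.0, -yp * x, -yp * y])
        b.append(yp)
    h = solve(A, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def transfer(H, x, y):
    """Apply H to the point (x, y), including the perspective division."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

# Synthetic check: generate correspondences from a known matrix, then recover it.
H_true = [[1.1, 0.05, 10.0], [-0.03, 0.95, 20.0], [1e-4, 2e-4, 1.0]]
src = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
dst = [transfer(H_true, x, y) for x, y in src]
H_est = homography_from_4_points(src, dst)
print(H_est)
```

Because the correspondences here are noise-free, the recovered matrix agrees with `H_true` up to round-off; with real, noisy matches a least-squares or robust estimator over many points would be the more appropriate choice.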


External links

* Serge Belongie & David Kriegman (2007) Explanation of Homography Estimation, from Department of Computer Science and Engineering, University of California, San Diego.
* A. Criminisi, I. Reid & A. Zisserman (1997) "A Plane Measuring Device", §3 Computing the Plane to Plane Homography, from Visual Geometry Group, Department of Engineering Science, University of Oxford.
* Elan Dubrofsky (2009) Homography Estimation, Master's thesis, from Department of Computer Science, University of British Columbia.
* Richard Hartley & Andrew Zisserman (2004) Multiple View Geometry, from Visual Geometry Group, Oxford. Includes Matlab functions for calculating a homography and the fundamental matrix.
* GIMP Tutorial – using the Perspective Tool, by Billy Kerr on YouTube. Shows how to do a perspective transform using GIMP.
* Allan Jepson (2010) Planar Homographies, from Department of Computer Science, University of Toronto. Includes 2D homography from four pairs of corresponding points, mosaics in image processing, removing perspective distortion in computer vision, rendering textures in computer graphics, and computing planar shadows.
* Plane transfer homography. Course notes from CSE576 at University of Washington in Seattle.
* Etienne Vincent & Robert Laganiere (2000) Detecting Planar Homographies in an Image Pair, from School of Information Technology and Engineering, University of Ottawa. Describes an algorithm for detecting planes in images, uses the random sample consensus (RANSAC) method, and describes heuristics and iteration.