The pinhole camera model describes the mathematical relationship between the coordinates of a point in three-dimensional space and its projection onto the image plane of an ''ideal'' pinhole camera, where the camera aperture is described as a point and no lenses are used to focus light. The model does not include, for example, geometric distortions or blurring of unfocused objects caused by lenses and finite-sized apertures. It also does not take into account that most practical cameras have only discrete image coordinates. This means that the pinhole camera model can only be used as a first-order approximation of the mapping from a 3D scene to a 2D image. Its validity depends on the quality of the camera and, in general, decreases from the center of the image to the edges as lens distortion effects increase.
Some of the effects that the pinhole camera model does not take into account can be compensated for, for example by applying suitable coordinate transformations to the image coordinates; other effects are small enough to be neglected if a high-quality camera is used. This means that the pinhole camera model can often be used as a reasonable description of how a camera depicts a 3D scene, for example in computer vision and computer graphics.
Geometry
The geometry related to the mapping of a pinhole camera is illustrated in the figure. The figure contains the following basic objects:
* A 3D orthogonal coordinate system with its origin at O. This is also where the ''camera aperture'' is located. The three axes of the coordinate system are referred to as X1, X2, X3. Axis X3 points in the viewing direction of the camera and is referred to as the ''optical axis'', ''principal axis'', or ''principal ray''. The plane spanned by axes X1 and X2 is the front side of the camera, or ''principal plane''.
* An image plane, where the 3D world is projected through the aperture of the camera. The image plane is parallel to axes X1 and X2 and is located at distance ''f'' from the origin O in the negative direction of the X3 axis, where ''f'' is the focal length of the pinhole camera. A practical implementation of a pinhole camera implies that the image plane is located such that it intersects the X3 axis at coordinate ''-f'' where ''f > 0''.
* A point R at the intersection of the optical axis and the image plane. This point is referred to as the ''principal point'' or ''image center''.
* A point P somewhere in the world at coordinate <math>(x_1, x_2, x_3)</math> relative to the axes X1, X2, and X3.
* The ''projection line'' of point P into the camera. This is the green line which passes through point P and the point O.
* The projection of point P onto the image plane, denoted Q. This point is given by the intersection of the projection line (green) and the image plane. In any practical situation we can assume that <math>x_3 > 0</math>, which means that the intersection point is well defined.
* There is also a 2D coordinate system in the image plane, with origin at R and with axes Y1 and Y2 which are parallel to X1 and X2, respectively. The coordinates of point Q relative to this coordinate system are <math>(y_1, y_2)</math>.
The ''pinhole'' aperture of the camera, through which all projection lines must pass, is assumed to be infinitely small, a point. In the literature this point in 3D space is referred to as the ''optical (or lens or camera) center''.
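The construction of Q as the intersection of the projection line with the image plane can be sketched numerically. The following is a minimal illustration, not a reference implementation: the function name `intersect_image_plane` and the sample values are hypothetical, and it assumes the setup described in the list (aperture at the origin O, image plane at X3 = -''f'', point P with positive X3 coordinate).

```python
def intersect_image_plane(p, f):
    """Intersect the line through O = (0, 0, 0) and P with the plane X3 = -f.

    Every point on the projection line can be written as t * P for a scalar t.
    The line meets the image plane where t * x3 == -f, i.e. t = -f / x3.
    (Hypothetical helper, written for illustration only.)
    """
    x1, x2, x3 = p
    if f <= 0:
        raise ValueError("focal length f must be positive")
    if x3 <= 0:
        raise ValueError("P must lie in front of the camera (x3 > 0)")
    t = -f / x3
    return (t * x1, t * x2, t * x3)


# Sample point chosen for illustration.
q = intersect_image_plane((2.0, 1.0, 4.0), f=2.0)
print(q)  # (-1.0, -0.5, -2.0)
```

The third component of the result is always -''f'', confirming that Q lies in the image plane; the first two components are the coordinates of Q along Y1 and Y2.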
Formulation
Next we want to understand how the coordinates <math>(y_1, y_2)</math> of point Q depend on the coordinates <math>(x_1, x_2, x_3)</math> of point P. This can be done with the help of the following figure, which shows the same scene as the previous figure but now from above, looking down in the negative direction of the X2 axis.
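In this figure we see two similar triangles, both having parts of the projection line (green) as their hypotenuses. The catheti of the left triangle are <math>-y_1</math> and ''f'' and the catheti of the right triangle are <math>x_1</math> and <math>x_3</math>. Since the two triangles are similar it follows that
: <math>\frac{-y_1}{f} = \frac{x_1}{x_3}</math> or <math>y_1 = -\frac{f \, x_1}{x_3}</math>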
A similar investigation, looking in the negative direction of the X1 axis, gives
: <math>\frac{-y_2}{f} = \frac{x_2}{x_3}</math> or <math>y_2 = -\frac{f \, x_2}{x_3}</math>
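This can be summarized as
: <math>\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = -\frac{f}{x_3} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}</math>
which is an expression that describes the relation between the 3D coordinates <math>(x_1, x_2, x_3)</math> of point P and its image coordinates <math>(y_1, y_2)</math> given by point Q in the image plane.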
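As a small illustration of this relation, the sketch below evaluates <math>y_i = -f \, x_i / x_3</math> for a point at two different depths. The function name `project_pinhole` and the sample values are assumptions made for the example, not part of the model's standard vocabulary.

```python
def project_pinhole(p, f):
    """Map a 3D point (x1, x2, x3) to image coordinates (y1, y2).

    Implements y_i = -f * x_i / x3 for the image plane located at X3 = -f.
    (Hypothetical helper, written for illustration only.)
    """
    x1, x2, x3 = p
    if x3 <= 0:
        raise ValueError("the point must lie in front of the camera (x3 > 0)")
    return (-f * x1 / x3, -f * x2 / x3)


# The same lateral offset seen at two depths (sample values).
near = project_pinhole((1.0, 2.0, 4.0), f=2.0)
far = project_pinhole((1.0, 2.0, 8.0), f=2.0)
print(near)  # (-0.5, -1.0)
print(far)   # (-0.25, -0.5)
```

Doubling the depth <math>x_3</math> halves both image coordinates, which is the perspective effect encoded by the division by <math>x_3</math>. The negative signs reflect that the image on the rear image plane is rotated 180°.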
Rotated image and the virtual image plane
The mapping from 3D to 2D coordinates described by a pinhole camera is a perspective projection followed by a 180° rotation in the image plane. This corresponds to how a real pinhole camera operates: the resulting image is rotated 180°, the relative size of projected objects depends on their distance to the focal point, and the overall size of the image depends on the distance ''f'' between the image plane and the focal point. In order to produce an unrotated image, which is what we expect from a camera, there are two possibilities:
* Rotate the coordinate system in the image plane 180° (in either direction). This is the way any practical implementation of a pinhole camera would solve the problem; for a photographic camera we rotate the image before looking at it, and for a digital camera we read out the pixels in such an order that it becomes rotated.
* Place the image plane so that it intersects the X3 axis at ''f'' instead of at ''-f'' and rework the previous calculations. This would generate a ''virtual (or front) image plane'' which cannot be implemented in practice, but provides a theoretical camera which may be simpler to analyse than the real one.
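In both cases, the resulting mapping from 3D coordinates to 2D image coordinates is given by the expression above, but without the negation, thus
: <math>\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \frac{f}{x_3} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}</math>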
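The equivalence of the two possibilities can be checked numerically. The sketch below uses hypothetical helper names and sample values. It assumes the rear image plane gives <math>y_i = -f \, x_i / x_3</math>, that a 180° rotation in the image plane negates both image coordinates, and that the virtual image plane at X3 = ''f'' gives <math>y_i = f \, x_i / x_3</math>.

```python
def project_rear(p, f):
    """Projection onto the real image plane at X3 = -f (image rotated 180 degrees)."""
    x1, x2, x3 = p
    return (-f * x1 / x3, -f * x2 / x3)


def rotate_180(y):
    """Rotate 2D image coordinates by 180 degrees about the principal point R."""
    return (-y[0], -y[1])


def project_front(p, f):
    """Projection onto the virtual (front) image plane at X3 = +f."""
    x1, x2, x3 = p
    return (f * x1 / x3, f * x2 / x3)


p, f = (1.0, 2.0, 4.0), 2.0  # sample values, with x3 > 0
print(rotate_180(project_rear(p, f)))  # (0.5, 1.0)
print(project_front(p, f))             # (0.5, 1.0)
```

Both routes produce the same unrotated image coordinates, which is why the virtual image plane is a convenient stand-in for the physical one in analysis.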
In homogeneous coordinates
The mapping from 3D coordinates of points in space to 2D image coordinates can also be represented in homogeneous coordinates. Let <math>\mathbf{x}</math> be a representation of a 3D point in homogeneous coordinates (a 4-dimensional vector), and let <math>\mathbf{y}</math> be a representation of the image of this point in the pinhole camera (a 3-dimensional vector). Then the following relation holds
: <math>\mathbf{y} \sim \mathbf{C} \, \mathbf{x}</math>
where <math>\mathbf{C}</math> is the <math>3 \times 4</math> camera matrix and <math>\sim</math> means equality between elements of projective spaces. This implies that the left and right hand sides are equal up to a non-zero scalar multiplication. A consequence of this relation is that <math>\mathbf{C}</math> can also be seen as an element of a projective space; two camera matrices are equivalent if they are equal up to a scalar multiplication. This description of the pinhole camera mapping, as a linear transformation <math>\mathbf{C}</math> instead of as a fraction of two linear expressions, makes it possible to simplify many derivations of relations between 3D and 2D coordinates.
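The homogeneous formulation can be illustrated with a small sketch. It assumes one particular camera matrix, with rows (''f'', 0, 0, 0), (0, ''f'', 0, 0) and (0, 0, 1, 0), chosen because it reproduces the unnegated mapping <math>y_i = f \, x_i / x_3</math> of the virtual image plane; the text above does not fix a specific matrix. The helper names are hypothetical and plain Python lists are used in place of a linear algebra library.

```python
def camera_matrix(f):
    """A 3x4 camera matrix for the virtual image plane (an assumed, illustrative form)."""
    return [
        [f, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
    ]


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dehomogenize(y):
    """Convert homogeneous image coordinates (y1, y2, w) to (y1 / w, y2 / w)."""
    if y[2] == 0:
        raise ValueError("point at infinity has no finite image coordinates")
    return (y[0] / y[2], y[1] / y[2])


C = camera_matrix(2.0)
x = [1.0, 2.0, 4.0, 1.0]          # homogeneous form of the 3D point (1, 2, 4)
print(dehomogenize(matvec(C, x)))  # (0.5, 1.0)

# Scaling x or C by a non-zero factor leaves the image point unchanged.
x_scaled = [3.0 * c for c in x]
C_scaled = [[5.0 * c for c in row] for row in C]
print(dehomogenize(matvec(C, x_scaled)))  # (0.5, 1.0)
print(dehomogenize(matvec(C_scaled, x)))  # (0.5, 1.0)
```

The division by the third component appears only in the final dehomogenization step; everything before it is a linear operation, and this is what simplifies derivations.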
See also
* Camera resectioning
* Collinearity equation
* Entrance pupil, the equivalent location of the pinhole in relation to object space in a real camera.
* Exit pupil, the equivalent location of the pinhole in relation to the image plane in a real camera.
* Ibn al-Haytham
* Pinhole camera, the practical implementation of the mathematical model described in this article.
* Rectilinear lens