Camera resectioning is the process of estimating the parameters of a pinhole camera model approximating the camera that produced a given photograph or video; it determines which incoming light ray is associated with each pixel on the resulting image. Basically, the process determines the pose of the pinhole camera.

Usually, the camera parameters are represented in a 3 × 4 projection matrix called the ''camera matrix''. The extrinsic parameters define the camera ''pose'' (position and orientation) while the intrinsic parameters specify the camera image format (focal length, pixel size, and image origin).

This process is often called geometric camera calibration or simply camera calibration, although that term may also refer to photometric camera calibration or be restricted to the estimation of the intrinsic parameters only. Exterior orientation and interior orientation refer to the determination of only the extrinsic and intrinsic parameters, respectively.

Classic camera calibration requires special objects in the scene, which is not required in ''camera auto-calibration''. Camera resectioning is often used in stereo vision, where the camera projection matrices of two cameras are used to calculate the 3D world coordinates of a point viewed by both cameras.


Formulation

The camera projection matrix is derived from the intrinsic and extrinsic parameters of the camera, and is often represented by a series of transformations; e.g., a matrix of camera intrinsic parameters, a 3 × 3 rotation matrix, and a translation vector. The camera projection matrix can be used to associate points in a camera's image space with locations in 3D world space.
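As a concrete illustration, the composition of the projection matrix from these three pieces can be sketched in a few lines of numpy. All numeric values below (focal length, principal point, pose) are invented for the example:

```python
import numpy as np

# Illustrative intrinsic matrix: 1000 px focal length, principal point (320, 240).
K = np.array([[1000.0,    0.0, 320.0],
              [   0.0, 1000.0, 240.0],
              [   0.0,    0.0,   1.0]])

# Illustrative extrinsics: identity rotation, world origin 10 units
# in front of the camera along the optical axis.
R = np.eye(3)
T = np.array([[0.0], [0.0], [10.0]])

# Camera projection matrix M = K [R | T], a 3x4 matrix.
M = K @ np.hstack([R, T])
print(M.shape)  # (3, 4)
```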


Homogeneous coordinates

In this context, we use [u\ v\ 1]^T to represent a 2D point position in ''pixel'' coordinates and [x_w\ y_w\ z_w\ 1]^T to represent a 3D point position in ''world'' coordinates. In both cases, they are represented in homogeneous coordinates (i.e. they have an additional last component, which is initially, by convention, a 1), which is the most common notation in robotics and rigid body transforms.
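A minimal sketch of this convention in plain Python (the two helper names are ours, not a standard API):

```python
def to_homogeneous(p):
    """Append the conventional trailing 1 to a Cartesian point."""
    return list(p) + [1]

def from_homogeneous(p):
    """Divide through by the last component and drop it."""
    w = p[-1]
    return [c / w for c in p[:-1]]

print(to_homogeneous([2.0, 3.0]))         # [2.0, 3.0, 1]
print(from_homogeneous([4.0, 6.0, 2.0]))  # [2.0, 3.0]
```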


Projection

Referring to the pinhole camera model, a camera matrix M is used to denote a projective mapping from ''world'' coordinates to ''pixel'' coordinates.

:\begin{bmatrix} wu \\ wv \\ w \end{bmatrix} = K\, \begin{bmatrix} R & T \end{bmatrix} \begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix} = M \begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix}

where M = K\, \begin{bmatrix} R & T \end{bmatrix}. By convention, u and v are the x and y coordinates of the pixel in the camera, K is the intrinsic matrix as described below, and R and T form the extrinsic matrix as described below. x_w, y_w, z_w are the coordinates of the source of the light ray which hits the camera sensor, expressed in world coordinates relative to the origin of the world. By dividing the matrix product by w, the theoretical value for the pixel coordinates can be found.
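The projective mapping and the division by w can be sketched as follows; the intrinsics, pose, and world point are illustrative values, not from a real camera:

```python
import numpy as np

# Illustrative parameters.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)
T = np.array([[0.0], [0.0], [10.0]])
M = K @ np.hstack([R, T])

# A world point in homogeneous coordinates.
X = np.array([1.0, 0.5, 10.0, 1.0])

# Apply M, then divide out w to obtain pixel coordinates.
wu, wv, w = M @ X
u, v = wu / w, wv / w
print(u, v)  # 360.0 260.0
```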


Intrinsic parameters

:K = \begin{bmatrix} \alpha_x & \gamma & u_0 \\ 0 & \alpha_y & v_0 \\ 0 & 0 & 1 \end{bmatrix}

The matrix K contains 5 intrinsic parameters of the specific camera model. These parameters encompass focal length, image sensor format, and camera principal point. The parameters \alpha_x = f \cdot m_x and \alpha_y = f \cdot m_y represent focal length in terms of pixels, where m_x and m_y are the inverses of the width and height of a pixel on the projection plane and f is the focal length in terms of distance. \gamma represents the skew coefficient between the x and the y axis, and is often 0. u_0 and v_0 represent the principal point, which would ideally be in the center of the image.

Nonlinear intrinsic parameters such as lens distortion are also important, although they cannot be included in the linear camera model described by the intrinsic parameter matrix. Many modern camera calibration algorithms estimate these nonlinear intrinsics as well, by jointly optimising the camera and distortion parameters in what is generally known as bundle adjustment.
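A small helper, with hypothetical parameter values, showing how the pixel focal lengths \alpha_x and \alpha_y arise from f, m_x and m_y:

```python
import numpy as np

def intrinsic_matrix(f, m_x, m_y, u0, v0, gamma=0.0):
    """Build K from focal length f (in distance units), pixel densities
    m_x, m_y (pixels per distance unit), principal point (u0, v0),
    and skew gamma (usually 0)."""
    return np.array([[f * m_x, gamma, u0],
                     [0.0, f * m_y, v0],
                     [0.0, 0.0, 1.0]])

# e.g. a 4 mm lens on a sensor with 500 px/mm density gives alpha = 2000 px.
K = intrinsic_matrix(f=4.0, m_x=500.0, m_y=500.0, u0=320.0, v0=240.0)
print(K[0, 0])  # 2000.0
```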


Extrinsic parameters

:\begin{bmatrix} R_{3 \times 3} & T_{3 \times 1} \\ 0_{1 \times 3} & 1 \end{bmatrix}_{4 \times 4}

R and T are the extrinsic parameters which denote the coordinate system transformations from 3D world coordinates to 3D camera coordinates. Equivalently, the extrinsic parameters define the position of the camera center and the camera's heading in world coordinates. T is the position of the origin of the world coordinate system expressed in coordinates of the camera-centered coordinate system. T is often mistakenly considered the position of the camera. The position, C, of the camera expressed in world coordinates is C = -R^{-1} T = -R^T T (since R is a rotation matrix). This can be verified by checking that the point \begin{bmatrix} C \\ 1 \end{bmatrix} is transformed to \begin{bmatrix} 0 & 0 & 0 & 1 \end{bmatrix}^T, which is what is expected (since the camera's location is, in the camera's coordinates, the origin).

Camera calibration is often used as an early stage in computer vision. When a camera is used, light from the environment is focused on an image plane and captured. This process reduces the dimensions of the data taken in by the camera from three to two (light from a 3D scene is stored on a 2D image). Each pixel on the image plane therefore corresponds to a shaft of light from the original scene.
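The identity C = -R^T T, and the fact that the extrinsic transform sends the camera center to the origin, can be checked numerically; the rotation and translation below are arbitrary illustrative values:

```python
import numpy as np

# Illustrative extrinsics: a 90-degree rotation about z, plus a translation.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
T = np.array([1.0, 2.0, 3.0])

# Camera center in world coordinates: C = -R^{-1} T = -R^T T.
C = -R.T @ T

# The 4x4 extrinsic transform maps [C, 1] to the camera origin [0, 0, 0, 1].
E = np.eye(4)
E[:3, :3] = R
E[:3, 3] = T
print(E @ np.append(C, 1.0))  # → [0, 0, 0, 1]
```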


Algorithms

There are many different approaches to calculate the intrinsic and extrinsic parameters for a specific camera setup. The most common ones are:
# Direct linear transformation (DLT) method
# Zhang's method
# Tsai's method
# Selby's method (for X-ray cameras)


Zhang's method

Zhang's method is a camera calibration method that uses traditional calibration techniques (known calibration points) and self-calibration techniques (correspondence between the calibration points when they are in different positions). To perform a full calibration by Zhang's method, at least three different images of the calibration target/gauge are required, either by moving the gauge or the camera itself. If some of the intrinsic parameters are given as data (orthogonality of the image or optical center coordinates), the number of images required can be reduced to two. In a first step, an approximation of the estimated projection matrix H between the calibration target and the image plane is determined using the DLT method. Subsequently, self-calibration techniques are applied to obtain the matrix of the image of the absolute conic. The main contribution of Zhang's method is how to, given n poses of the calibration target, extract a constrained intrinsic matrix K, along with n instances of the R and T calibration parameters.
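The linear step of Zhang's method can be sketched on synthetic, noise-free homographies: each homography contributes two linear constraints on the image of the absolute conic ω, the stacked system is solved by SVD, and K is factored back out of ω (here via a Cholesky factorisation, one of several equivalent closed forms). All numeric values are invented for the demonstration:

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def v_ij(H, i, j):
    # Coefficient vector of the bilinear form h_i^T B h_j in the
    # unknowns b = [B11, B12, B22, B13, B23, B33].
    hi, hj = H[:, i], H[:, j]
    return np.array([hi[0]*hj[0],
                     hi[0]*hj[1] + hi[1]*hj[0],
                     hi[1]*hj[1],
                     hi[2]*hj[0] + hi[0]*hj[2],
                     hi[2]*hj[1] + hi[1]*hj[2],
                     hi[2]*hj[2]])

def calibrate_from_homographies(Hs):
    # Stack Zhang's two constraints per homography:
    # h1^T w h2 = 0  and  h1^T w h1 = h2^T w h2.
    V = []
    for H in Hs:
        V.append(v_ij(H, 0, 1))
        V.append(v_ij(H, 0, 0) - v_ij(H, 1, 1))
    _, _, vt = np.linalg.svd(np.array(V))
    b11, b12, b22, b13, b23, b33 = vt[-1]   # null vector = omega up to scale
    B = np.array([[b11, b12, b13], [b12, b22, b23], [b13, b23, b33]])
    if B[0, 0] < 0:                         # B is determined only up to sign
        B = -B
    # B = K^{-T} K^{-1}; Cholesky recovers K up to scale.
    L = np.linalg.cholesky(B)
    K = np.linalg.inv(L.T)
    return K / K[2, 2]

# Synthetic ground truth (illustrative values).
K_true = np.array([[1000.0, 0.0, 320.0],
                   [0.0, 1000.0, 240.0],
                   [0.0, 0.0, 1.0]])
Hs = []
for ax, ay, t in [(0.2, 0.1, [0.1, 0.0, 3.0]),
                  (-0.3, 0.25, [0.0, 0.2, 4.0]),
                  (0.1, -0.4, [0.2, -0.1, 3.5])]:
    R = rot_x(ax) @ rot_y(ay)
    H = K_true @ np.column_stack([R[:, 0], R[:, 1], t])
    Hs.append(H / H[2, 2])   # homographies are defined only up to scale

K_est = calibrate_from_homographies(Hs)
print(np.round(K_est, 2))    # close to K_true
```

In practice the input homographies come from detected target points and are noisy, so this linear estimate is followed by nonlinear refinement (bundle adjustment) over all parameters.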


Derivation

Assume we have a homography \textbf{H} that maps points x_\pi on a "probe plane" \pi to points x on the image. The circular points I, J = \begin{bmatrix} 1 & \pm j & 0 \end{bmatrix}^T lie on both our probe plane \pi and on the absolute conic \Omega_\infty. Lying on \Omega_\infty of course means they are also projected onto the ''image'' of the absolute conic (IAC) \omega, thus x_1^T \omega x_1 = 0 and x_2^T \omega x_2 = 0. The circular points project as

:\begin{align} x_1 &= \textbf{H} I = \begin{bmatrix} h_1 & h_2 & h_3 \end{bmatrix} \begin{bmatrix} 1 \\ j \\ 0 \end{bmatrix} = h_1 + j h_2 \\ x_2 &= \textbf{H} J = \begin{bmatrix} h_1 & h_2 & h_3 \end{bmatrix} \begin{bmatrix} 1 \\ -j \\ 0 \end{bmatrix} = h_1 - j h_2 \end{align}

Since x_2 is the complex conjugate of x_1, its constraint carries the same information, so we can ignore it and substitute the new expression for x_1:

:\begin{align} x_1^T \omega x_1 &= \left( h_1 + j h_2 \right)^T \omega \left( h_1 + j h_2 \right) \\ &= \left( h_1^T + j h_2^T \right) \omega \left( h_1 + j h_2 \right) \\ &= h_1^T \omega h_1 - h_2^T \omega h_2 + j \left( h_1^T \omega h_2 + h_2^T \omega h_1 \right) \\ &= 0 \end{align}

Setting the real and imaginary parts to zero separately, and using the symmetry of \omega, yields Zhang's two constraints per homography: h_1^T \omega h_2 = 0 and h_1^T \omega h_1 = h_2^T \omega h_2.


Tsai's algorithm

Tsai's algorithm, a significant method in camera calibration, involves several detailed steps for accurately determining a camera's orientation and position in 3D space. The procedure, while technical, can be generally broken down into three main stages:


Initial Calibration

The process begins with the initial calibration stage, where a series of images are captured by the camera. These images, often featuring a known calibration pattern like a checkerboard, are used to estimate intrinsic camera parameters such as focal length and optical center (Roger Y. Tsai, "A Versatile Camera Calibration Technique for High-Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses", ''IEEE Journal of Robotics and Automation'', Vol. RA-3, No. 4, August 1987). In some applications, variants of the checkerboard target are used which are robust to partial occlusions. Such targets, like the ChArUco (OpenCV, https://docs.opencv.org/3.4/df/d4a/tutorial_charuco_detection.html) and PuzzleBoard (P. Stelldinger et al., "PuzzleBoard: A New Camera Calibration Pattern with Position Encoding", German Conference on Pattern Recognition, 2024, https://users.informatik.haw-hamburg.de/~stelldinger/pub/PuzzleBoard/) targets, simplify the measurement of distortions in the corners of the camera sensor.


Pose Estimation

Following initial calibration, the algorithm undertakes pose estimation. This involves calculating the camera's position and orientation relative to a known object in the scene. The process typically requires identifying specific points in the calibration pattern and solving for the camera's rotation and translation vectors.
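Pose estimation can take several forms; as one hedged illustration (not Tsai's specific formulation), the pose of a planar calibration target can be recovered from a single homography H = K [r_1\ r_2\ t] once K is known. All values below are synthetic, and the noise-free shortcut skips the re-orthonormalization of R that real data would require:

```python
import numpy as np

# Known intrinsics (illustrative values).
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Ground-truth pose, used here only to synthesise the homography.
a = 0.3
R_true = np.array([[np.cos(a), 0.0, np.sin(a)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(a), 0.0, np.cos(a)]])
t_true = np.array([0.1, -0.2, 5.0])
H = K @ np.column_stack([R_true[:, 0], R_true[:, 1], t_true])
H /= H[2, 2]                        # the homography is known only up to scale

# Recover the pose: undo K, fix the scale from ||r1|| = 1,
# and complete the rotation with r3 = r1 x r2.
A = np.linalg.inv(K) @ H
scale = 1.0 / np.linalg.norm(A[:, 0])
r1, r2, t = scale * A[:, 0], scale * A[:, 1], scale * A[:, 2]
R = np.column_stack([r1, r2, np.cross(r1, r2)])
```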


Refinement of Parameters

The final phase is the refinement of parameters. In this stage, the algorithm refines the lens distortion coefficients, addressing radial and tangential distortions. Further optimization of internal and external camera parameters is performed to enhance the calibration accuracy. This structured approach has positioned Tsai's algorithm as a pivotal technique in both academic research and practical applications within robotics and industrial metrology.


Selby's method (for X-ray cameras)

Selby's camera calibration method (Boris Peter Selby et al., "Patient positioning with X-ray detector self-calibration for image guided therapy", ''Australasian Physical & Engineering Sciences in Medicine'', Vol. 34, No. 3, pages 391–400, 2011) addresses the auto-calibration of X-ray camera systems. X-ray camera systems, consisting of the X-ray generating tube and a solid state detector, can be modelled as pinhole camera systems comprising 9 intrinsic and extrinsic camera parameters. Intensity-based registration of an arbitrary X-ray image against a reference model (such as a tomographic dataset) can then be used to determine the relative camera parameters without the need for a special calibration body or any ground-truth data.


See also

* 3D pose estimation
* Augmented reality
* Augmented virtuality
* Eight-point algorithm
* Mixed reality
* Pinhole camera model
* Perspective-n-Point
* Rational polynomial coefficient


References


External links


Zhang's Camera Calibration Method with Software

Camera Calibration - Augmented reality lecture at TU Muenchen, Germany