Camera Calibration Theory

Single Camera Model

We assume the camera adheres to the “pin-hole” model. With this model, there are certain quantities we need to find to characterize the “intrinsics” of the system:

  1. position of image center
  2. focal length
  3. scaling factors to convert height and width of pixels to physical units
  4. lens distortion

Note that 2) and 3) are typically combined because, mathematically, it's not necessary to know the focal length explicitly. This is discussed in the next section.

The diagram below describes the model:


Where:

  • image point: (x_i, y_i)
  • scene point: (x_s, y_s, z_s)

We can compute image points from scene points via:

x_i=f\frac{x_s}{z_s}

y_i=f\frac{y_s}{z_s}

If you delay the division by z_s, the equations become linear:

\begin{bmatrix}u \\v\\w\end{bmatrix} = \begin{bmatrix}f & 0 & 0\\0 & f & 0\\0 & 0 & 1\end{bmatrix}\begin{bmatrix}x_s \\y_s\\z_s\end{bmatrix}

x_i and y_i can be recovered by:

x_i=\frac{u}{w}

y_i=\frac{v}{w}
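The delayed-division projection above can be sketched in a few lines of Python (a minimal sketch with numpy; the function name is my own):

```python
import numpy as np

def project_pinhole(points_s, f):
    """Project Nx3 camera-frame scene points to Nx2 image-plane points
    using the linear pinhole model, delaying the division by z_s."""
    K = np.array([[f, 0.0, 0.0],
                  [0.0, f, 0.0],
                  [0.0, 0.0, 1.0]])
    uvw = points_s @ K.T              # [u, v, w] per point
    return uvw[:, :2] / uvw[:, 2:3]   # x_i = u/w, y_i = v/w

# scene point (1, 2, 4) with f = 2: x_i = 2*1/4 = 0.5, y_i = 2*2/4 = 1.0
print(project_pinhole(np.array([[1.0, 2.0, 4.0]]), f=2.0))
```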

Transformation from physical units in the image plane to pixels

The diagram below describes this transformation:

Where:

  • image center: (x_o, y_o)
  • scaling factors: k_x, k_y

This leads to:

x_{pix} = k_x x_i + x_o = k_x f \frac{x_s}{z_s} + x_o

y_{pix} = k_y y_i + y_o = k_y f \frac{y_s}{z_s} + y_o

Again, delaying the division by z_s results in a linear equation:

\begin{bmatrix}u^{\prime} \\v^{\prime}\\w^{\prime}\end{bmatrix} = \begin{bmatrix}\alpha_x & 0 & x_o\\0 & \alpha_y & y_o\\0 & 0 & 1\end{bmatrix}\begin{bmatrix}x_s \\y_s\\z_s\end{bmatrix} = \begin{bmatrix}A\end{bmatrix} \begin{bmatrix}x_s \\y_s\\z_s\end{bmatrix}

where:

  • \alpha_x = f k_x
  • \alpha_y = f k_y
  • x_{pix} = \frac{u^{\prime}}{w^{\prime}}
  • y_{pix} = \frac{v^{\prime}}{w^{\prime}}

As discussed in the previous section, f, k_x, and k_y are typically not found separately; instead, \alpha_x and \alpha_y are found.

Camera Coordinates to World Coordinates

The diagram below describes this transformation:

Where:

    • world point: (X_w, Y_w, Z_w)

This leads to:

\begin{bmatrix}x_s \\y_s\\z_s\end{bmatrix} = \begin{bmatrix} \begin{bmatrix} R \end{bmatrix} \vec{t} \end{bmatrix}\begin{bmatrix}X_w \\ Y_w \\Z_w \\ 1\end{bmatrix}

If we left-multiply both sides by \begin{bmatrix}A\end{bmatrix}:

\begin{bmatrix}u^{\prime} \\v^{\prime}\\w^{\prime}\end{bmatrix} = \begin{bmatrix}\alpha_x & 0 & x_o\\0 & \alpha_y & y_o\\0 & 0 & 1\end{bmatrix} \begin{bmatrix} \begin{bmatrix} R \end{bmatrix} \vec{t} \end{bmatrix}\begin{bmatrix}X_w \\ Y_w \\Z_w \\ 1\end{bmatrix}

This is a linear equation which transforms world coordinates to pixel coordinates (recall: x_{pix} = \frac{u^{\prime}}{w^{\prime}} and y_{pix} = \frac{v^{\prime}}{w^{\prime}} ). However, this does not yet take lens distortion into account.
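The full world-to-pixel chain (without distortion) can be sketched as follows; the function name and the sample matrix values are illustrative assumptions, not values from the article:

```python
import numpy as np

def world_to_pixel(A, R, t, pts_w):
    """Map Nx3 world points to Nx2 pixel coords via [A][R|t] (no distortion)."""
    Rt = np.hstack([R, t.reshape(3, 1)])                   # 3x4 extrinsics
    pts_h = np.hstack([pts_w, np.ones((len(pts_w), 1))])   # homogeneous world points
    uvw = (A @ Rt @ pts_h.T).T
    return uvw[:, :2] / uvw[:, 2:3]                        # divide by w'

# hypothetical intrinsics: alpha = 800, image center (320, 240)
A = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 10.0])
# the world origin lands at the image center
print(world_to_pixel(A, R, t, np.array([[0.0, 0.0, 0.0]])))  # → [[320. 240.]]
```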

Lens Distortion

We model lens distortion to occur in the image plane; we now must make a distinction between distorted and undistorted coordinates:

x_{pix}^{ideal} = k_x x_i + x_o

y_{pix}^{ideal} = k_y y_i + y_o

and:

x_{pix}^{distorted} = k_x x_i^{distorted} + x_o

y_{pix}^{distorted} = k_y y_i^{distorted} + y_o

We assume there are radial and tangential lens distortions:

x_i^{distorted} = x_i + \delta x_i^{(r)} + \delta x_i^{(t)}

y_i^{distorted} = y_i + \delta y_i^{(r)} + \delta y_i^{(t)}

Where:

    • \delta x_i^{(r)} = x_i(k_1 r_i^{2} + k_2 r_i^{4})
    • \delta y_i^{(r)} = y_i(k_1 r_i^{2} + k_2 r_i^{4})
    • \delta x_i^{(t)} = 2 p_1 x_i y_i + p_2(r_i^{2} +2 x_i^{2})
    • \delta y_i^{(t)} = p_1(r_i^{2} +2 y_i^{2}) + 2 p_2 x_i y_i
    •  r_i = \sqrt{x_i^{2} + y_i^{2}}

All together:

x_{pix}^{distorted} = k_x (x_i + \delta x_i^{(r)} + \delta x_i^{(t)}) + x_o

y_{pix}^{distorted} = k_y (y_i + \delta y_i^{(r)} + \delta y_i^{(t)}) + y_o

The following diagram describes the lens distortion:
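The radial and tangential terms above can be sketched as a small function (the name and argument order are my own):

```python
def distort_image_point(x_i, y_i, k1, k2, p1, p2):
    """Apply radial (k1, k2) and tangential (p1, p2) distortion to an
    undistorted image-plane point, following the model above."""
    r2 = x_i**2 + y_i**2                      # r_i^2
    radial = k1 * r2 + k2 * r2**2             # k1*r^2 + k2*r^4
    dx = x_i * radial + 2 * p1 * x_i * y_i + p2 * (r2 + 2 * x_i**2)
    dy = y_i * radial + p1 * (r2 + 2 * y_i**2) + 2 * p2 * x_i * y_i
    return x_i + dx, y_i + dy

# with all coefficients zero, the point is unchanged
print(distort_image_point(0.3, -0.2, 0.0, 0.0, 0.0, 0.0))  # → (0.3, -0.2)
```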

Single Camera Calibration Numerical Implementation

We now have a full camera model to work with. Some things to note:

  1. Camera calibration involves using a calibration target with known world coordinates (X_w, Y_w, and Z_w). We take pictures of this calibration target, find these known coordinates in “pixel space”, and then find the “intrinsic” and “extrinsic” parameters, through linear and non-linear optimization, which make:

    x_{pix}^{measured} \approx x_{pix}^{distorted}

    y_{pix}^{measured} \approx y_{pix}^{distorted}

    i.e. we’re trying to make the model fit our measurements.

  2. We re-parameterized things to be linear in the previous section so that linear optimization can be used. This is possible up until lens distortion is involved. So, an initial guess can be made (usually assuming zero lens distortion, a reasonable assumption for most applications) with linear optimization, and then it is followed up with non-linear refinement with the full camera model.
  3. For non-linear optimization, the rotation matrix needs to be re-parameterized so that it has 3 parameters. This is because the rotation matrix has 9 elements, but only 3 degrees of freedom. I use Euler angles here, but Rodrigues' formula is also popular.
  4. If the calibration target is flat, then Z_w=0 and:

    \begin{bmatrix}u^{\prime} \\v^{\prime}\\w^{\prime}\end{bmatrix} = \begin{bmatrix} A \end{bmatrix} \begin{bmatrix} \vec{r_1} & \vec{r_2} & \vec{t} \end{bmatrix}\begin{bmatrix}X_w \\ Y_w \\ 1\end{bmatrix} = \begin{bmatrix} H \end{bmatrix} \begin{bmatrix}X_w \\ Y_w \\ 1\end{bmatrix}

    \begin{bmatrix} H \end{bmatrix} is known as a homography and acts as a projection operator for homogeneous coordinates. Often, it is found up to a scale factor.

For this article, I’m assuming a 2D checkerboard pattern is used for calibration. The next sections explain how calibration (i.e. estimation of “intrinsic” and “extrinsic” parameters) is done for this case.

Sub-pixel corner detection

This is based on the fact that a vector with its tail at the corner \vec{q} and its tip at a nearby point \vec{p} always has a zero dot product with the intensity gradient at \vec{p}:

  • in flat regions: <\nabla I(\vec{p}),\vec{p}-\vec{q}> = 0
  • in boundary regions: <\nabla I(\vec{p}),\vec{p}-\vec{q}> = 0

The dot product is zero in flat regions because the gradient is zero in that region. In boundary regions, it’s zero because \nabla I(\vec{p}) and  \vec{p}-\vec{q} are orthogonal:

You can use this constraint to set up a linear set of equations to solve for  \vec{q} by sampling points around an initial guess of  \vec{q} :

One way to get an initial guess for  \vec{q} is to get the four corners of the entire checkerboard, then compute a homography, and then reproject the world coordinates of the checkerboard into pixel space. If the lens distortion is low, this should be a reasonable initial guess for  \vec{q} , which can then be refined with this method.

Here’s the math:

\nabla I_x (p_x-q_x) + \nabla I_y (p_y - q_y) = 0

\nabla I_x p_x- \nabla I_x q_x + \nabla I_y p_y - \nabla I_y q_y = 0

\nabla I_x q_x + \nabla I_y q_y = \nabla I_x p_x + \nabla I_y p_y

In matrix form:

\begin{bmatrix}\nabla I_{x_1} & \nabla I_{y_1} \\ \vdots & \vdots \\ \nabla I_{x_N} & \nabla I_{y_N} \end{bmatrix}\begin{bmatrix} q_x \\ q_y\end{bmatrix}=\begin{bmatrix} \nabla I_{x_1} p_{x_1} + \nabla I_{y_1} p_{y_1} \\ \vdots \\ \nabla I_{x_N} p_{x_N} + \nabla I_{y_N} p_{y_N} \end{bmatrix}

So now  \vec{q} can be solved for using a linear solver.

Some things you can do to improve this algorithm:

  • Apply a Gaussian kernel to the sampling points around  \vec{q} to give less weight to points further away from the corner.
  • Apply multiple iterations (will require interpolation) until convergence is reached (i.e. position of  \vec{q} moves less than a certain threshold)

Note that a large \vec{p}-\vec{q} and a large \nabla I(\vec{p}) both increase a row's influence on a linear least squares solver. So, if there happens to be a large gradient far away in the sampling grid, the solver will prioritize making \vec{q} orthogonal to it over a closer gradient with a smaller magnitude (perhaps due to blur), which may not be desirable. Using a Gaussian kernel should help mitigate this potential problem, but it is still an important consideration to keep in mind.
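The refinement loop can be sketched as below. This is a minimal sketch under a simplifying assumption of my own: gradients are sampled at the nearest integer pixels, whereas a real implementation would interpolate at sub-pixel sample locations.

```python
import numpy as np

def refine_corner(grad_x, grad_y, q0, win=5, sigma=2.0, iters=10, tol=1e-4):
    """Refine a corner estimate q0 = (qx, qy) given gradient images,
    using Gaussian-weighted linear least squares over a window."""
    q = np.asarray(q0, dtype=float)
    dy, dx = np.mgrid[-win:win + 1, -win:win + 1]
    w = np.exp(-(dx**2 + dy**2) / (2.0 * sigma**2)).ravel()  # Gaussian weights
    for _ in range(iters):
        cx, cy = np.rint(q).astype(int)                      # re-center window
        px, py = (dx + cx).ravel(), (dy + cy).ravel()
        gx, gy = grad_x[py, px], grad_y[py, px]
        # each sample contributes the row:  gx*qx + gy*qy = gx*px + gy*py
        Aw = np.stack([gx, gy], axis=1) * w[:, None]
        bw = (gx * px + gy * py) * w
        q_new, *_ = np.linalg.lstsq(Aw, bw, rcond=None)
        done = np.linalg.norm(q_new - q) < tol               # convergence check
        q = q_new
        if done:
            break
    return q
```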

Computing Homographies

If we have two sets of points, for example a set of corner points on our checkerboard in "world coordinates" (i.e. in real units with respect to the checkerboard) and the corresponding set of points in "pixel space" (i.e. we have detected these corner points in an image we've taken of the checkerboard), then we can approximate the transformation between the two sets of points with a homography. The assumption here is that the pin-hole camera model holds and there is very little lens distortion.

Let:

\vec{\tilde{m}} = \begin{bmatrix}u^{\prime} & v^{\prime} & w^{\prime}\end{bmatrix}^T

\vec{\tilde{M}} = \begin{bmatrix}X_w & Y_w & 1\end{bmatrix}^T

\begin{bmatrix} H \end{bmatrix} = \begin{bmatrix} A \end{bmatrix} \begin{bmatrix} \vec{r_1} & \vec{r_2} & \vec{t} \end{bmatrix}

Then:

\vec{\tilde{m}} = \begin{bmatrix} H \end{bmatrix} \vec{\tilde{M}}

Let:

\begin{bmatrix} H \end{bmatrix}=\begin{bmatrix} \vec{\bar{h_1}}^T \\ \vec{\bar{h_2}}^T \\ \vec{\bar{h_3}}^T\end{bmatrix}

Then:

u^{\prime} = \vec{\bar{h_1}}^T \vec{\tilde{M}}=\frac{u^{\prime}}{w^{\prime}}\vec{\bar{h_3}}^T \vec{\tilde{M}}

v^{\prime} = \vec{\bar{h_2}}^T \vec{\tilde{M}}=\frac{v^{\prime}}{w^{\prime}}\vec{\bar{h_3}}^T \vec{\tilde{M}}

w^{\prime} = \vec{\bar{h_3}}^T \vec{\tilde{M}}

We form homogeneous equations:

\vec{\bar{h_1}}^T \vec{\tilde{M}}-x_{pix}\vec{\bar{h_3}}^T \vec{\tilde{M}} = 0

\vec{\bar{h_2}}^T \vec{\tilde{M}}-y_{pix}\vec{\bar{h_3}}^T \vec{\tilde{M}} = 0

Then, we put them in matrix form:

\begin{bmatrix} \vec{\tilde{M}}^T & \vec{0} & -x_{pix} \vec{\tilde{M}}^T \\ \vec{0} & \vec{\tilde{M}}^T & -y_{pix} \vec{\tilde{M}}^T \end{bmatrix} \begin{bmatrix} \vec{\bar{h_1}} \\ \vec{\bar{h_2}} \\ \vec{\bar{h_3}} \end{bmatrix} = \begin{bmatrix} L \end{bmatrix} \vec{x} = \vec{0}

To solve:

\begin{bmatrix} L \end{bmatrix} \vec{x} = \vec{0}

The solution is the eigenvector of \begin{bmatrix} L \end{bmatrix}^T \begin{bmatrix} L \end{bmatrix} associated with the smallest eigenvalue. To compute it, use the SVD:

\begin{bmatrix} L \end{bmatrix} = \begin{bmatrix} U \end{bmatrix} \begin{bmatrix} D \end{bmatrix} \begin{bmatrix} V \end{bmatrix}^T

\begin{bmatrix} L \end{bmatrix}^T \begin{bmatrix} L\end{bmatrix}=\begin{bmatrix}V \end{bmatrix}\begin{bmatrix}D \end{bmatrix}^T\begin{bmatrix}U \end{bmatrix}^T\begin{bmatrix}U \end{bmatrix}\begin{bmatrix}D \end{bmatrix}\begin{bmatrix}V \end{bmatrix}^T=\begin{bmatrix}V \end{bmatrix}\begin{bmatrix}D \end{bmatrix}^T\begin{bmatrix}D \end{bmatrix}\begin{bmatrix}V \end{bmatrix}^T

Assuming the singular values are sorted in descending order (the usual convention), the last column of \begin{bmatrix} V \end{bmatrix} contains the eigenvector associated with the smallest eigenvalue.

Note that since this is a homogeneous equation, once we solve for \vec{x}, what we will actually find is \begin{bmatrix} \hat{H} \end{bmatrix} = \lambda \begin{bmatrix} H \end{bmatrix} , i.e. a scaled version of \begin{bmatrix} A \end{bmatrix} \begin{bmatrix} \vec{r_1} & \vec{r_2} & \vec{t} \end{bmatrix}. It’s important to keep this in mind.

Also, there is a normalized version of the above which makes it more stable and accounts for the fact that (x_{pix},y_{pix}) and (X_w,Y_w) may have very different scales, which may make  \begin{bmatrix} L \end{bmatrix} ill conditioned.

The normalization step does two things:

  1. translates both sets of points so their centroids are at the origin
  2. scales both sets of points so the average distance from the origin is \sqrt{2}, which I believe is done to keep the x and y components close to 1

A “normalization matrix” is used:

 \begin{bmatrix}T_m\end{bmatrix} = \begin{bmatrix} s_m & 0 & -\bar{x_{pix}}s_m \\ 0 & s_m & -\bar{y_{pix}}s_m \\ 0 & 0 & 1 \end{bmatrix}

where:

\bar{x_{pix}}= mean of x_{pix}

\bar{y_{pix}}= mean of y_{pix}

s_m=\frac{\sqrt{2}N}{\sum_{i=1}^{N}\sqrt{(x_{pix_i} - \bar{x_{pix}})^2+(y_{pix_i} - \bar{y_{pix}})^2}}

The same matrix is also constructed for the set of points in world coordinates, which we call \begin{bmatrix} T_M \end{bmatrix} .

Note:

  • Original homography: \vec{\tilde{m}} = \begin{bmatrix} H \end{bmatrix} \vec{\tilde{M}}
  • Preconditioned homography: \vec{\tilde{m^{\prime}}} = \begin{bmatrix} H^{\prime} \end{bmatrix} \vec{\tilde{M^{\prime}}}

where:

\vec{\tilde{m^{\prime}}} = \begin{bmatrix}T_m\end{bmatrix}\vec{\tilde{m}}

\vec{\tilde{M^{\prime}}}= \begin{bmatrix}T_M\end{bmatrix}\vec{\tilde{M}}

If we do the math to see the updated \begin{bmatrix} L \end{bmatrix}:

\begin{gathered} \begin{bmatrix} s_m & 0 & -\bar{x_{pix}}s_m \\ 0 & s_m & -\bar{y_{pix}}s_m \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix}u^{\prime} \\v^{\prime}\\w^{\prime}\end{bmatrix} = \begin{bmatrix}s_m u^{\prime}-\bar{x_{pix}}s_m w^{\prime} \\s_m v^{\prime}-\bar{y_{pix}}s_m w^{\prime} \\ w^{\prime} \end{bmatrix}=\begin{bmatrix}s_m (u^{\prime}-\bar{x_{pix}} w^{\prime}) \\s_m (v^{\prime}-\bar{y_{pix}} w^{\prime}) \\ w^{\prime} \end{bmatrix}=\vec{\tilde{m^{\prime}}} \end{gathered}

\begin{gathered} \begin{bmatrix} s_M & 0 & -\bar{X_w}s_M \\ 0 & s_M & -\bar{Y_w}s_M \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix}X_w \\ Y_w \\ 1 \end{bmatrix} = \begin{bmatrix}s_M X_w-\bar{X_w}s_M \\s_M Y_w-\bar{Y_w}s_M \\ 1 \end{bmatrix}=\begin{bmatrix}s_M (X_w-\bar{X_w}) \\s_M (Y_w-\bar{Y_w}) \\ 1 \end{bmatrix}=\vec{\tilde{M^{\prime}}} \end{gathered}

Then:

s_m (u^{\prime}-\bar{x_{pix}} w^{\prime}) = \vec{\bar{h_1^{\prime}}}^T \vec{\tilde{M^{\prime}}}=\frac{s_m (u^{\prime}-\bar{x_{pix}} w^{\prime})}{w^{\prime}}\vec{\bar{h_3^{\prime}}}^T \vec{\tilde{M^{\prime}}}

s_m (v^{\prime}-\bar{y_{pix}} w^{\prime}) = \vec{\bar{h_2^{\prime}}}^T \vec{\tilde{M^{\prime}}}=\frac{s_m (v^{\prime}-\bar{y_{pix}} w^{\prime}) }{w^{\prime}}\vec{\bar{h_3^{\prime}}}^T \vec{\tilde{M^{\prime}}}

w^{\prime} = \vec{\bar{h_3^{\prime}}}^T \vec{\tilde{M^{\prime}}}

We form homogeneous equations:

\vec{\bar{h_1^{\prime}}}^T \vec{\tilde{M^{\prime}}}-s_m(x_{pix}-\bar{x_{pix}})\vec{\bar{h_3^{\prime}}}^T \vec{\tilde{M^{\prime}}} = 0

\vec{\bar{h_2^{\prime}}}^T \vec{\tilde{M^{\prime}}}-s_m(y_{pix}-\bar{y_{pix}})\vec{\bar{h_3^{\prime}}}^T \vec{\tilde{M^{\prime}}} = 0

Then, we put them in matrix form:

\begin{bmatrix} \vec{\tilde{M^{\prime}}}^T & \vec{0} & -s_m(x_{pix}-\bar{x_{pix}}) \vec{\tilde{M^{\prime}}}^T \\ \vec{0} & \vec{\tilde{M^{\prime}}}^T & -s_m(y_{pix}-\bar{y_{pix}}) \vec{\tilde{M^{\prime}}}^T \end{bmatrix} \begin{bmatrix} \vec{\bar{h_1^{\prime}}} \\ \vec{\bar{h_2^{\prime}}} \\ \vec{\bar{h_3^{\prime}}} \end{bmatrix} = \begin{bmatrix} L^{\prime} \end{bmatrix} \vec{x^{\prime}}= \vec{0}

We can now solve for \begin{bmatrix} \hat{H^{\prime}} \end{bmatrix} using \begin{bmatrix} L^{\prime} \end{bmatrix} \vec{x^{\prime}} = \vec{0} . But, we must recover the original homography after we have computed the preconditioned homography; to recover it:

\begin{bmatrix}T_m\end{bmatrix}\vec{\tilde{m}}=\begin{bmatrix} {H}^{\prime}\end{bmatrix} \begin{bmatrix}T_M\end{bmatrix}\vec{\tilde{M}}

\vec{\tilde{m}}=\begin{bmatrix}T_m\end{bmatrix}^{-1}\begin{bmatrix} H^{\prime}\end{bmatrix} \begin{bmatrix}T_M\end{bmatrix}\vec{\tilde{M}}

This means:

\begin{bmatrix}H\end{bmatrix} = \begin{bmatrix}T_m\end{bmatrix}^{-1}\begin{bmatrix} H^{\prime}\end{bmatrix} \begin{bmatrix}T_M\end{bmatrix}

Which implies:

\begin{bmatrix}\hat{H}\end{bmatrix} = \begin{bmatrix}T_m\end{bmatrix}^{-1}\begin{bmatrix} \hat{H^{\prime}}\end{bmatrix} \begin{bmatrix}T_M\end{bmatrix}

So, to use the preconditioned approach:

  1. Use \begin{bmatrix} L^{\prime} \end{bmatrix} \vec{x^{\prime}} = \vec{0} and the SVD method to solve for \begin{bmatrix} \hat{H^{\prime}} \end{bmatrix}
  2. Recover \begin{bmatrix}\hat{H}\end{bmatrix} with \begin{bmatrix}T_m\end{bmatrix}^{-1}\begin{bmatrix}\hat{ H^{\prime}}\end{bmatrix} \begin{bmatrix}T_M\end{bmatrix}
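The preconditioned approach can be sketched end to end as below (function names are mine). Note that numpy's SVD returns singular values in descending order, so the least singular vector is the last row of V^T:

```python
import numpy as np

def normalization_matrix(pts):
    """Translate the centroid to the origin and scale the average
    distance from the origin to sqrt(2)."""
    mean = pts.mean(axis=0)
    s = np.sqrt(2) * len(pts) / np.sum(np.linalg.norm(pts - mean, axis=1))
    return np.array([[s, 0.0, -mean[0] * s],
                     [0.0, s, -mean[1] * s],
                     [0.0, 0.0, 1.0]])

def homography_dlt(world_xy, pix_xy):
    """Estimate H (up to scale) from N >= 4 correspondences using the
    normalized direct linear transform."""
    Tm = normalization_matrix(pix_xy)
    TM = normalization_matrix(world_xy)
    Wh = np.hstack([world_xy, np.ones((len(world_xy), 1))]) @ TM.T  # M'
    ph = np.hstack([pix_xy, np.ones((len(pix_xy), 1))]) @ Tm.T      # m'
    L = []
    for (X, Y, _), (x, y, _) in zip(Wh, ph):
        L.append([X, Y, 1, 0, 0, 0, -x * X, -x * Y, -x])
        L.append([0, 0, 0, X, Y, 1, -y * X, -y * Y, -y])
    _, _, Vt = np.linalg.svd(np.asarray(L))
    Hn = Vt[-1].reshape(3, 3)            # least singular vector -> H'
    H = np.linalg.inv(Tm) @ Hn @ TM      # undo the preconditioning
    return H / H[2, 2]
```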

So now we have a linear method for computing homographies. Another refinement you can do is a nonlinear optimization of the homography using the linear method as the initial guess. This is more or less optional but included for completeness.

The first step is to divide the initial linear homography guess by \begin{bmatrix}\hat{H}\end{bmatrix}_{3,3} so that \begin{bmatrix}\hat{H}\end{bmatrix}_{3,3}=1. Homographies have 8 degrees of freedom and this is a typical constraint added to parameterize the homography for nonlinear optimization.

Next, we derive the Jacobian so we can perform Gauss-Newton iterations (note the initial guess is typically very good and the optimization is well behaved, so in my experience more robust methods like Levenberg-Marquardt aren’t necessary).

We start with:

\begin{bmatrix}u^{\prime} \\v^{\prime}\\w^{\prime}\end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{bmatrix} \begin{bmatrix}X_w \\ Y_w \\ 1\end{bmatrix}

Let:

x_{pix}^{ideal} = \frac{u^{\prime}}{w^{\prime}} = \frac{h_{11}X_w+h_{12}Y_w+h_{13}}{h_{31}X_w+h_{32}Y_w+1}

y_{pix}^{ideal} = \frac{v^{\prime}}{w^{\prime}} = \frac{h_{21}X_w+h_{22}Y_w+h_{23}}{h_{31}X_w+h_{32}Y_w+1}

Set the cost function to:

 C_{LS}=\sum_{i=1}^{N}(x^i_p-x^m_p)^2+(y^i_p-y^m_p)^2

where i = ideal, m = measured, and p = pix.

To compute the Jacobian, we start off with the residuals:

res_x=x^i(\vec{h})-x^m

res_y=y^i(\vec{h})-y^m

Take the partial derivative:

\frac{\partial{res_x}}{\partial{\vec{h}}} = \frac{\partial{x^i}}{\partial{\vec{h}}}

\frac{\partial{res_y}}{\partial{\vec{h}}} = \frac{\partial{y^i}}{\partial{\vec{h}}}

The Jacobian has the form:

 \begin{bmatrix}J\end{bmatrix}_{ij} = \begin{bmatrix} \frac{\partial{res_i}}{\partial{\vec{h_j}}}\end{bmatrix}

For a single point, the two rows of the Jacobian (with columns ordered as \partial h_{11}, \partial h_{21}, \partial h_{31}, \partial h_{12}, \partial h_{22}, \partial h_{32}, \partial h_{13}, \partial h_{23}) are:

\begin{bmatrix} \frac{\partial{res_x}}{\partial{\vec{h}}} \\ \frac{\partial{res_y}}{\partial{\vec{h}}} \end{bmatrix} = \begin{bmatrix} \frac{X_w}{w^{\prime}} & 0 & \frac{-u^{\prime}X_w}{w^{\prime 2}} & \frac{Y_w}{w^{\prime}} & 0 & \frac{-u^{\prime}Y_w}{w^{\prime 2}} & \frac{1}{w^{\prime}} & 0 \\ 0 & \frac{X_w}{w^{\prime}} & \frac{-v^{\prime}X_w}{w^{\prime 2}} & 0 & \frac{Y_w}{w^{\prime}} & \frac{-v^{\prime}Y_w}{w^{\prime 2}} & 0 & \frac{1}{w^{\prime}} \end{bmatrix}

The full Jacobian stacks these rows for all N points, giving a 2N \times 8 matrix:

\begin{bmatrix}J\end{bmatrix} = \begin{bmatrix} \frac{\partial{res_{x_1}}}{\partial{\vec{h}}} \\ \vdots \\ \frac{\partial{res_{x_N}}}{\partial{\vec{h}}} \\ \frac{\partial{res_{y_1}}}{\partial{\vec{h}}} \\ \vdots \\ \frac{\partial{res_{y_N}}}{\partial{\vec{h}}} \end{bmatrix}

And you can perform Gauss-Newton updates using the pseudoinverse (although in practice you should use a direct linear solver):

 \Delta\vec{h} = -\begin{bmatrix}J\end{bmatrix}^+\vec{res}

\vec{h_{new}} = \vec{h_{old}} + \Delta\vec{h}

Until some convergence condition.

Now, we have a method to determine a fully optimized homography given two sets of points.
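The Gauss-Newton refinement can be sketched as follows. Note this sketch orders the parameter vector row-major as (h_{11}, h_{12}, h_{13}, h_{21}, \ldots, h_{32}), which differs from the column ordering shown above but is otherwise equivalent:

```python
import numpy as np

def refine_homography(H0, world_xy, pix_xy, iters=20):
    """Gauss-Newton refinement of a homography with H[2,2] fixed at 1."""
    h = (H0 / H0[2, 2]).ravel()[:8].copy()  # [h11,h12,h13,h21,h22,h23,h31,h32]
    X, Y = world_xy[:, 0], world_xy[:, 1]
    for _ in range(iters):
        w = h[6] * X + h[7] * Y + 1.0
        u = h[0] * X + h[1] * Y + h[2]
        v = h[3] * X + h[4] * Y + h[5]
        res = np.concatenate([u / w - pix_xy[:, 0], v / w - pix_xy[:, 1]])
        Z = np.zeros_like(X)
        Jx = np.stack([X / w, Y / w, 1 / w, Z, Z, Z,
                       -u * X / w**2, -u * Y / w**2], axis=1)
        Jy = np.stack([Z, Z, Z, X / w, Y / w, 1 / w,
                       -v * X / w**2, -v * Y / w**2], axis=1)
        J = np.vstack([Jx, Jy])
        dh, *_ = np.linalg.lstsq(J, -res, rcond=None)  # Gauss-Newton step
        h += dh
        if np.linalg.norm(dh) < 1e-10:                 # convergence condition
            break
    return np.append(h, 1.0).reshape(3, 3)
```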

Initial guess of intrinsic parameters

After homographies for each image of a calibration board are computed, we need to compute an initial guess for the intrinsic parameters.

We can only find an initial linear guess for the parameters in \begin{bmatrix}A\end{bmatrix}: x_o,  y_o , \alpha_x, and  \alpha_y . We assume that the lens distortion parameters are zero, which is reasonable for most camera/lens combinations.

Furthermore, we assume the following:

  • x_o = \frac{image \textunderscore width+1}{2}
  • y_o = \frac{image \textunderscore height+1}{2}
  • \alpha_x = \alpha_y

We assume the first two because principal point estimation is usually difficult and unstable, so we assume it's at the middle of the image (this also assumes the first pixel has a position of (1,1); i.e. one-based indexing). It's also a good assumption for most cameras that k_x=k_y (i.e. the pixels are square) and hence \alpha_x = \alpha_y = \alpha. These constraints will allow us to get a better initial guess for \alpha.

Let:

\begin{bmatrix}\bar{H}\end{bmatrix}=\begin{bmatrix}1 & 0 & -x_o \\ 0 & 1 & -y_o \\0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \hat{H} \end{bmatrix} = \lambda \begin{bmatrix} \alpha & 0 & 0 \\ 0 & \alpha & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \vec{r_1} & \vec{r_2} & \vec{t} \end{bmatrix}

Now, let:

 \vec{v_1} = \begin{bmatrix} \bar{H} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \lambda \begin{bmatrix} \alpha & 0 & 0 \\ 0 & \alpha & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \vec{r_1} & \vec{r_2} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix}

 \vec{v_2} = \begin{bmatrix} \bar{H} \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \lambda \begin{bmatrix} \alpha & 0 & 0 \\ 0 & \alpha & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \vec{r_1} & \vec{r_2} \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix}

 \vec{v_3} = \begin{bmatrix} \bar{H} \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \lambda \begin{bmatrix} \alpha & 0 & 0 \\ 0 & \alpha & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \vec{r_1} & \vec{r_2} \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix}

 \vec{v_4} = \begin{bmatrix} \bar{H} \end{bmatrix} \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} = \lambda \begin{bmatrix} \alpha & 0 & 0 \\ 0 & \alpha & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \vec{r_1} & \vec{r_2} \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix}

Then:

 \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \frac{1}{\lambda} \begin{bmatrix} \vec{r_1}^T \\ \vec{r_2}^T \end{bmatrix} \begin{bmatrix} \frac{1}{\alpha} & 0 & 0 \\ 0 & \frac{1}{\alpha} & 0 \\ 0 & 0 & 1 \end{bmatrix} \vec{v_1}

 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \frac{1}{\lambda} \begin{bmatrix} \vec{r_1}^T \\ \vec{r_2}^T \end{bmatrix} \begin{bmatrix} \frac{1}{\alpha} & 0 & 0 \\ 0 & \frac{1}{\alpha} & 0 \\ 0 & 0 & 1 \end{bmatrix} \vec{v_2}

 \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \frac{1}{\lambda} \begin{bmatrix} \vec{r_1}^T \\ \vec{r_2}^T \end{bmatrix} \begin{bmatrix} \frac{1}{\alpha} & 0 & 0 \\ 0 & \frac{1}{\alpha} & 0 \\ 0 & 0 & 1 \end{bmatrix} \vec{v_3}

 \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \frac{1}{\lambda} \begin{bmatrix} \vec{r_1}^T \\ \vec{r_2}^T \end{bmatrix} \begin{bmatrix} \frac{1}{\alpha} & 0 & 0 \\ 0 & \frac{1}{\alpha} & 0 \\ 0 & 0 & 1 \end{bmatrix} \vec{v_4}

Then:

 \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \bigg( \frac{1}{\lambda} \vec{v_1}^T \begin{bmatrix} \frac{1}{\alpha} & 0 & 0 \\ 0 & \frac{1}{\alpha} & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \vec{r_1} & \vec{r_2} \end{bmatrix}\bigg) \bigg(\frac{1}{\lambda} \begin{bmatrix} \vec{r_1}^T \\ \vec{r_2}^T \end{bmatrix} \begin{bmatrix} \frac{1}{\alpha} & 0 & 0 \\ 0 & \frac{1}{\alpha} & 0 \\ 0 & 0 & 1 \end{bmatrix} \vec{v_2} \bigg) = 0

 \vec{v_1}^T \begin{bmatrix} \frac{1}{\alpha^2} & 0 & 0 \\ 0 & \frac{1}{\alpha^2} & 0 \\ 0 & 0 & 1 \end{bmatrix} \vec{v_2} = 0

 \frac{1}{\alpha^2} v_{11} v_{21} + \frac{1}{\alpha^2} v_{12} v_{22} + v_{13} v_{23} = 0

We arrive at the following constraint:

 \frac{1}{\alpha^2} ( v_{11} v_{21} + v_{12} v_{22} ) = - v_{13} v_{23}

If you repeat the same steps for:  \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix} , you get the additional constraint:

 \frac{1}{\alpha^2} ( v_{31} v_{41} + v_{32} v_{42} ) = - v_{33} v_{43}

We put these constraints into matrix form:

 \begin{bmatrix} v_{11} v_{21} + v_{12} v_{22} \\ v_{31} v_{41} + v_{32} v_{42} \end{bmatrix} (\frac{1}{\alpha^2}) = \begin{bmatrix} -v_{13} v_{23} \\ -v_{33} v_{43} \end{bmatrix}

Let:

 \vec{a} x = \vec{b}

where:

  •  \vec{a} = \begin{bmatrix} v_{11} v_{21} + v_{12} v_{22} \\ v_{31} v_{41} + v_{32} v_{42} \end{bmatrix}
  •  x = \frac{1}{\alpha^2}
  •  \vec{b} = \begin{bmatrix} -v_{13} v_{23} \\ -v_{33} v_{43} \end{bmatrix}

Then, you can solve for  x and take the inverse and square root to recover  \alpha .

Some things:

  • You can normalize \vec{v_1}, \vec{v_2}, \vec{v_3}, and \vec{v_4} by dividing each by its norm before solving for  x . This does not affect the orthogonality constraint and prevents issues in case one of the  \vec{v} 's is very large.
  • You can solve for  x using least squares. However, you can also solve for the  x  that makes the projection of  x\vec{a}  onto  \vec{b}  equal to  \vec{b} :

    x \frac{<\vec{a}, \vec{b}>}{\sqrt{<\vec{b},\vec{b}>}} = \sqrt{<\vec{b},\vec{b}>}

    x = \frac{<\vec{b},\vec{b}>}{<\vec{a},\vec{b}>}

    \alpha = \sqrt{\frac{<\vec{a},\vec{b}>}{<\vec{b},\vec{b}>}}

    This is what Bouguet does in his Matlab toolbox, and from my tests it appears to work slightly better than the least squares result. I have a feeling it might be because you obtain a larger x when \vec{a} and  \vec{b} are not collinear, which in turn results in a smaller \alpha and prevents overestimates, but beyond that I'm not really sure.
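The \alpha estimate can be sketched as follows (the helper name is mine; it normalizes the \vec{v}'s and uses the projection-based solve described above):

```python
import numpy as np

def estimate_alpha(homographies, xo, yo):
    """Initial guess for alpha = f*k from a set of homographies, assuming
    principal point (xo, yo), square pixels, and zero distortion."""
    C = np.array([[1.0, 0.0, -xo], [0.0, 1.0, -yo], [0.0, 0.0, 1.0]])
    a, b = [], []
    for H in homographies:
        Hb = C @ H                       # remove the principal point
        v1 = Hb @ [1.0, 0.0, 0.0]
        v2 = Hb @ [0.0, 1.0, 0.0]
        v3 = Hb @ [1.0, 1.0, 0.0]
        v4 = Hb @ [1.0, -1.0, 0.0]
        for p, q in ((v1, v2), (v3, v4)):
            p, q = p / np.linalg.norm(p), q / np.linalg.norm(q)
            a.append(p[0] * q[0] + p[1] * q[1])   # entries of vector a
            b.append(-p[2] * q[2])                # entries of vector b
    a, b = np.asarray(a), np.asarray(b)
    return np.sqrt(np.dot(a, b) / np.dot(b, b))   # Bouguet-style solve
```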

Initial guess of extrinsic parameters

The next step is to get an initial guess of the extrinsic parameters per calibration board image.

Since we now have an initial guess for \begin{bmatrix} A \end{bmatrix}:

\begin{bmatrix} \hat{H} \end{bmatrix} = \begin{bmatrix} \vec{\hat{h_1}} & \vec{\hat{h_2}} & \vec{\hat{h_3}} \end{bmatrix}= \lambda \begin{bmatrix} A \end{bmatrix} \begin{bmatrix} \vec{r_1} & \vec{r_2} & \vec{t} \end{bmatrix}

\begin{bmatrix} A \end{bmatrix}^{-1} \begin{bmatrix} \vec{\hat{h_1}} & \vec{\hat{h_2}} & \vec{\hat{h_3}} \end{bmatrix}= \lambda \begin{bmatrix} \vec{r_1} & \vec{r_2} & \vec{t} \end{bmatrix}

This results in the following:

\vec{r_1} = \frac{1}{\lambda} \begin{bmatrix} A \end{bmatrix}^{-1} \vec{\hat{h_1}}

\vec{r_2} = \frac{1}{\lambda} \begin{bmatrix} A \end{bmatrix}^{-1} \vec{\hat{h_2}}

\vec{t} = \frac{1}{\lambda} \begin{bmatrix} A \end{bmatrix}^{-1} \vec{\hat{h_3}}

Since  \vec{r_1} and  \vec{r_2} are supposed to be unit vectors:

\lambda_1 = \| \begin{bmatrix} A \end{bmatrix}^{-1} \vec{\hat{h_1}} \|

\lambda_2 = \| \begin{bmatrix} A \end{bmatrix}^{-1} \vec{\hat{h_2}} \|

We can then solve for the extrinsics:

\vec{r_1} = \frac{1}{\lambda_1} \begin{bmatrix} A \end{bmatrix}^{-1} \vec{\hat{h_1}}

\vec{r_2} = \frac{1}{\lambda_2} \begin{bmatrix} A \end{bmatrix}^{-1} \vec{\hat{h_2}}

\vec{r_3} = \vec{r_1} \times \vec{r_2}

\vec{t} = \frac{1}{(\lambda_1+\lambda_2)/2} \begin{bmatrix} A \end{bmatrix}^{-1} \vec{\hat{h_3}}

A few things:

  • The parameterization of the homography is only unique because we constrained  \lambda to be positive. To see the effect this has, look at \vec{t}; in particular t_z:

    t_z = \frac{\hat{h}_{3,3}}{\lambda}

     \hat{h}_{3,3} is 1 (recall we set this constraint when we computed the homography). So, if \lambda is constrained to be positive, the effect this has is to ensure  t_z is positive. This is correct, as the calibration board is in the positive z direction.

  • {\lambda_1} and {\lambda_2} should be equal, but will probably differ slightly due to some combination of model misfit, noise, etc., so we take the average of the two when computing  \vec{t} .
  • \vec{r_1} and \vec{r_2} should be orthogonal but, again, will probably be slightly off due to model misfit, noise, etc. To make  \begin{bmatrix} R \end{bmatrix} an orthogonal matrix, we can compute the best orthogonal approximation:

    \begin{bmatrix} \vec{r_1} & \vec{r_2} & \vec{r_3} \end{bmatrix} = \begin{bmatrix} R \end{bmatrix} \approx \begin{bmatrix} \hat{R} \end{bmatrix}

    Take the SVD of \begin{bmatrix} R\end{bmatrix}:

     \begin{bmatrix} R \end{bmatrix} = \begin{bmatrix} U \end{bmatrix}\begin{bmatrix} D \end{bmatrix}\begin{bmatrix} V \end{bmatrix}^T

    Then:

     \begin{bmatrix} \hat{R} \end{bmatrix} = \begin{bmatrix} U \end{bmatrix} \begin{bmatrix} V \end{bmatrix}^T

    Solves:

    \min_{\begin{bmatrix} \hat{R} \end{bmatrix}} \| \begin{bmatrix} \hat{R} \end{bmatrix} - \begin{bmatrix} R \end{bmatrix} \|^2_F \quad \text{subject to} \quad \begin{bmatrix} \hat{R}\end{bmatrix}^T \begin{bmatrix} \hat{R}\end{bmatrix} = \begin{bmatrix}I\end{bmatrix}

    I won’t give a proof, but it should be somewhat intuitive.

  • You can (optionally) do nonlinear refinement of just the extrinsics (on a per calibration board image basis) before optimizing all parameters, but this only requires modifying the global optimization problem slightly, which I’ll talk about next.
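The extrinsics recovery, including the SVD orthogonalization, can be sketched as below (the function name is mine):

```python
import numpy as np

def extrinsics_from_homography(A, H):
    """Recover R (orthogonalized via SVD) and t from H ≈ λ A [r1 r2 t]."""
    Ainv = np.linalg.inv(A)
    h1, h2, h3 = (Ainv @ H).T                 # columns of A^-1 H
    lam1, lam2 = np.linalg.norm(h1), np.linalg.norm(h2)
    r1, r2 = h1 / lam1, h2 / lam2             # r1, r2 are unit vectors
    r3 = np.cross(r1, r2)
    t = h3 / ((lam1 + lam2) / 2.0)            # average the two scale estimates
    R = np.column_stack([r1, r2, r3])
    U, _, Vt = np.linalg.svd(R)               # nearest orthogonal matrix
    return U @ Vt, t
```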

Nonlinear Refinement of All Parameters (Single Camera)

Note that by now we have initial guesses for all intrinsic and extrinsic parameters. We can now define a cost function and optimize it across all intrinsic and extrinsic parameters.

But, before we do that, it makes things easier to parameterize the camera model with “normalized” coordinates.

Let:

 x_n = \frac{x_s}{z_s}

 y_n = \frac{y_s}{z_s}

 r_n = \sqrt{x_n^2 + y_n^2}

Then:

 x_i = f x_n

 y_i = f y_n

 r_i = \sqrt{f^2(x_n^2 + y_n^2)} = f\sqrt{x_n^2+y_n^2} = f r_n

Let's parameterize our camera model with normalized coordinates:

x_{p}^{d} = k_x (x_i + \delta x_i^{(r)} + \delta x_i^{(t)}) + x_o

x_{p}^{d} = k_x (x_i + x_i(k_1 r_i^{2} + k_2 r_i^{4}) + 2 p_1 x_i y_i + p_2(r_i^{2} +2 x_i^{2})) + x_o

x_{p}^{d} = k_x (f x_n (1 + f^2 k_1  r_n^{2} + f^4 k_2  r_n^{4}) + 2 f^2 p_1 x_n y_n + f^2 p_2 (r_n^{2} +2 x_n^{2})) + x_o

x_{p}^{d} = f k_x ( x_n (1 + f^2 k_1 r_n^{2} + f^4 k_2 r_n^{4}) + 2 f p_1 x_n y_n + f p_2 (r_n^{2} +2 x_n^{2})) + x_o

x_{p}^{d} = \alpha_x ( x_n (1 + \beta_1 r_n^{2} + \beta_2 r_n^{4}) + 2 \beta_3 x_n y_n + \beta_4 (r_n^{2} +2 x_n^{2})) + x_o

Doing the same for  y_p^d , we obtain:

y_{p}^{d} = \alpha_y ( y_n (1 + \beta_1 r_n^{2} + \beta_2 r_n^{4}) + \beta_3 (r_n^{2} +2 y_n^{2}) + 2 \beta_4 x_n y_n ) + y_o

Where:

  •  \beta_1 = f^2 k_1
  •  \beta_2 = f^4 k_2
  •  \beta_3 = f p_1
  •  \beta_4 = f p_2
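The normalized-coordinate model with the \beta parameterization can be sketched as a small function (the name and argument order are my own):

```python
def project_normalized(xn, yn, ax, ay, xo, yo, b1, b2, b3, b4):
    """Distorted pixel coordinates from normalized coordinates (xn, yn),
    using the beta parameterization of the distortion terms."""
    r2 = xn**2 + yn**2                      # r_n^2
    radial = 1.0 + b1 * r2 + b2 * r2**2     # 1 + beta1*r^2 + beta2*r^4
    xd = ax * (xn * radial + 2 * b3 * xn * yn + b4 * (r2 + 2 * xn**2)) + xo
    yd = ay * (yn * radial + b3 * (r2 + 2 * yn**2) + 2 * b4 * xn * yn) + yo
    return xd, yd
```

With all betas zero this reduces to x_p = \alpha_x x_n + x_o, y_p = \alpha_y y_n + y_o, matching the undistorted model.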

Let:

 \vec{m} = parameterization of extrinsic parameters (expanded on soon)

Now, take the partial derivatives of x_{p}^{d}:

 \frac{\partial{x^d_p}}{\partial{\alpha_x}} = x_n(1+\beta_1 r_n^2 + \beta_2 r_n^4 ) + 2 \beta_3 x_n y_n + \beta_4 (r_n^2 +2 x_n^2)

 \frac{\partial{x^d_p}}{\partial{\alpha_y}} = 0

 \frac{\partial{x^d_p}}{\partial{x_o}} = 1

 \frac{\partial{x^d_p}}{\partial{y_o}} = 0

 \frac{\partial{x^d_p}}{\partial{\beta_1}} = \alpha_x x_n r_n^2

 \frac{\partial{x^d_p}}{\partial{\beta_2}} = \alpha_x x_n r_n^4

 \frac{\partial{x^d_p}}{\partial{\beta_3}} = 2 \alpha_x x_n y_n

 \frac{\partial{x^d_p}}{\partial{\beta_4}} = \alpha_x(r_n^2 + 2 x_n^2)

 \begin{gathered}\frac{\partial{x^d_p}}{\partial{\vec{m}}} = \alpha_x[\frac{\partial{x_n}}{\partial{\vec{m}}}(1+\beta_1 r_n^2 + \beta_2 r_n^4) + x_n(2 \beta_1 r_n \frac{\partial{r_n}} {\partial{\vec{m}}} + 4 \beta_2 r_n^3 \frac{\partial{r_n}}{\partial{\vec{m}}}) + 2 \beta_3 (\frac{\partial{x_n}}{\partial{\vec{m}}} y_n + x_n\frac{\partial{y_n}}{\partial{\vec{m}}}) + \beta_4(2 r_n \frac{\partial{r_n}}{\partial{\vec{m}}} + 4 x_n \frac{\partial{x_n}}{\partial{\vec{m}}})] \end{gathered}

Do the same for y_{p}^{d}:

 \frac{\partial{y^d_p}}{\partial{\alpha_x}} = 0

 \frac{\partial{y^d_p}}{\partial{\alpha_y}} = y_n(1+\beta_1 r_n^2 + \beta_2 r_n^4 ) + \beta_3 (r_n^2 +2 y_n^2) + 2 \beta_4 x_n y_n

 \frac{\partial{y^d_p}}{\partial{x_o}} = 0

 \frac{\partial{y^d_p}}{\partial{y_o}} = 1

 \frac{\partial{y^d_p}}{\partial{\beta_1}} = \alpha_y y_n r_n^2

 \frac{\partial{y^d_p}}{\partial{\beta_2}} = \alpha_y y_n r_n^4

 \frac{\partial{y^d_p}}{\partial{\beta_3}} = \alpha_y(r_n^2 + 2 y_n^2)

 \frac{\partial{y^d_p}}{\partial{\beta_4}} = 2 \alpha_y x_n y_n

 \begin{gathered}\frac{\partial{y^d_p}}{\partial{\vec{m}}} = \alpha_y[\frac{\partial{y_n}}{\partial{\vec{m}}}(1+\beta_1 r_n^2 + \beta_2 r_n^4) + y_n(2 \beta_1 r_n \frac{\partial{r_n}} {\partial{\vec{m}}} + 4 \beta_2 r_n^3 \frac{\partial{r_n}}{\partial{\vec{m}}}) + \beta_3(2 r_n \frac{\partial{r_n}}{\partial{\vec{m}}} + 4 y_n \frac{\partial{y_n}}{\partial{\vec{m}}})+ 2 \beta_4 (\frac{\partial{x_n}}{\partial{\vec{m}}} y_n + x_n\frac{\partial{y_n}}{\partial{\vec{m}}})] \end{gathered}

Note that:

r_n = (x_n^2 + y_n^2)^{\frac{1}{2}}

\frac{\partial{r_n}}{\partial{\vec{m}}} = \frac{1}{2}(x_n^2 + y_n^2)^{-\frac{1}{2}}(2 x_n \frac{\partial{x_n}}{\partial{\vec{m}}} + 2 y_n \frac{\partial{y_n}}{\partial{\vec{m}}}) = \frac{x_n\frac{\partial{x_n}}{\partial{\vec{m}}} + y_n \frac{\partial{y_n}}{\partial{\vec{m}}}}{r_n}
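Analytic partials like these are easy to get wrong, so it's worth verifying them numerically. A quick sketch (all values here are illustrative) that checks \frac{\partial{x^d_p}}{\partial{\beta_1}} = \alpha_x x_n r_n^2 against a central finite difference:

```python
import numpy as np

def x_d(x_n, y_n, alpha_x, x_o, b1, b2, b3, b4):
    """Distorted x pixel coordinate, as derived above."""
    r2 = x_n**2 + y_n**2
    return alpha_x * (x_n * (1 + b1*r2 + b2*r2**2)
                      + 2*b3*x_n*y_n + b4*(r2 + 2*x_n**2)) + x_o

# Analytic partial derivative: dx_d/db1 = alpha_x * x_n * r_n^2
x_n, y_n = 0.2, -0.1
alpha_x, x_o = 1200.0, 310.0
b = [0.05, -0.01, 0.002, -0.003]
r2 = x_n**2 + y_n**2
analytic = alpha_x * x_n * r2

# Central finite difference in beta_1
eps = 1e-6
hi = x_d(x_n, y_n, alpha_x, x_o, b[0] + eps, b[1], b[2], b[3])
lo = x_d(x_n, y_n, alpha_x, x_o, b[0] - eps, b[1], b[2], b[3])
numeric = (hi - lo) / (2 * eps)
assert abs(analytic - numeric) < 1e-5
```

The same trick applies to every partial in this section, including the chain-rule expressions involving \vec{m}.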

Now we must find  \frac{\partial{x_n}}{\partial{\vec{m}}} and  \frac{\partial{y_n}}{\partial{\vec{m}}} .

But before we can do that, we must parameterize the rotation. For that, I’ve elected to use Euler angles with an X-Y-Z rotation sequence:

 \begin{bmatrix} R \end{bmatrix} = \begin{bmatrix} R_z(\theta_z)\end{bmatrix} \begin{bmatrix} R_y(\theta_y) \end{bmatrix} \begin{bmatrix} R_x(\theta_x) \end{bmatrix}

 \begin{bmatrix} R \end{bmatrix} = \begin{bmatrix} cos(\theta_z) & -sin(\theta_z) & 0 \\ sin(\theta_z) & cos(\theta_z) & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} cos(\theta_y) & 0 & sin(\theta_y) \\ 0 & 1 & 0 \\ -sin(\theta_y) & 0 & cos(\theta_y) \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & cos(\theta_x) & -sin(\theta_x) \\ 0 & sin(\theta_x) & cos(\theta_x) \end{bmatrix}

 \begin{bmatrix} R \end{bmatrix} = \begin{bmatrix} cos(\theta_y) cos(\theta_z) & -cos(\theta_x)sin(\theta_z) + sin(\theta_x)sin(\theta_y)cos(\theta_z) & sin(\theta_x)sin(\theta_z)+cos(\theta_x)sin(\theta_y)cos(\theta_z) \\ cos(\theta_y)sin(\theta_z) & cos(\theta_x)cos(\theta_z) + sin(\theta_x)sin(\theta_y)sin(\theta_z) & -sin(\theta_x)cos(\theta_z)+cos(\theta_x)sin(\theta_y)sin(\theta_z) \\ -sin(\theta_y) & sin(\theta_x)cos(\theta_y) & cos(\theta_x) cos(\theta_y) \end{bmatrix}

Note that the inverse is:

 \theta_x = atan2(R_{32}, R_{33})

 \theta_y = atan2(-R_{31}, \sqrt{R_{11}^2 + R_{21}^2})

 \theta_z = atan2(R_{21}, R_{11})
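This parameterization and its inverse are easy to sketch in Python (illustrative names); the round trip holds away from the \theta_y = \pm\frac{\pi}{2} singularity:

```python
import numpy as np

def euler_to_R(tx, ty, tz):
    """Build R = Rz(tz) @ Ry(ty) @ Rx(tx), the X-Y-Z sequence above."""
    cx, sx = np.cos(tx), np.sin(tx)
    cy, sy = np.cos(ty), np.sin(ty)
    cz, sz = np.cos(tz), np.sin(tz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def R_to_euler(R):
    """Invert the parameterization (valid away from theta_y = +/- pi/2)."""
    tx = np.arctan2(R[2, 1], R[2, 2])
    ty = np.arctan2(-R[2, 0], np.hypot(R[0, 0], R[1, 0]))
    tz = np.arctan2(R[1, 0], R[0, 0])
    return tx, ty, tz

# Round trip: angles -> R -> angles
angles = (0.3, -0.5, 1.2)
assert np.allclose(R_to_euler(euler_to_R(*angles)), angles)
```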

We take the Jacobian:

   \begin{bmatrix} \frac{\partial{\vec{R}}}{\partial{\vec{\theta}}} \end{bmatrix} = \begin{blockarray}{cccc} & \partial{\theta_x} & \partial{\theta_y} & \partial{\theta_z} \\ \begin{block}{c[ccc@{\hspace*{5pt}}]} \partial{R_{11}} & 0 & $-sin(\theta_y)cos(\theta_z)$ & $-cos(\theta_y)sin(\theta_z)$ \\ \partial{R_{21}} & 0 & $-sin(\theta_y)sin(\theta_z)$ & $cos(\theta_y)cos(\theta_z)$ \\ \partial{R_{31}} & 0 & $-cos(\theta_y)$ & 0 \\ \partial{R_{12}} & $sin(\theta_x)sin(\theta_z)+cos(\theta_x)sin(\theta_y)cos(\theta_z)$ & $sin(\theta_x)cos(\theta_y)cos(\theta_z)$ & $-cos(\theta_x)cos(\theta_z)-sin(\theta_x)sin(\theta_y)sin(\theta_z)$ \\ \partial{R_{22}} & $-sin(\theta_x)cos(\theta_z)+cos(\theta_x)sin(\theta_y)sin(\theta_z)$ & $sin(\theta_x)cos(\theta_y)sin(\theta_z)$ & $-cos(\theta_x)sin(\theta_z) + sin(\theta_x) sin(\theta_y) cos(\theta_z)$ \\ \partial{R_{32}} & $cos(\theta_x)cos(\theta_y)$ & $-sin(\theta_x)sin(\theta_y)$ & 0 \\ \partial{R_{13}} & $cos(\theta_x)sin(\theta_z)-sin(\theta_x)sin(\theta_y)cos(\theta_z)$ & $cos(\theta_x)cos(\theta_y)cos(\theta_z)$ & $sin(\theta_x)cos(\theta_z)-cos(\theta_x)sin(\theta_y)sin(\theta_z)$ \\ \partial{R_{23}} & $-cos(\theta_x)cos(\theta_z)-sin(\theta_x)sin(\theta_y)sin(\theta_z)$ & $cos(\theta_x)cos(\theta_y)sin(\theta_z)$ & $sin(\theta_x)sin(\theta_z) + cos(\theta_x)sin(\theta_y)cos(\theta_z)$ \\ \partial{R_{33}} & $-sin(\theta_x)cos(\theta_y)$ & $-cos(\theta_x)sin(\theta_y)$ & 0 \\ \end{block} \end{blockarray}

Now, we can compute the full Jacobian:

\begin{bmatrix} \frac{\partial{\vec{p_n}}}{\partial{\vec{m}}} \end{bmatrix} = \begin{bmatrix} \frac{\partial{\vec{p_n}}}{\partial{\vec{Rt}}} \end{bmatrix} \begin{bmatrix} \frac{\partial{\vec{Rt}}}{\partial{\vec{m}}} \end{bmatrix}

Where:

   \begin{bmatrix} \frac{\partial{\vec{p_n}}}{\partial{\vec{Rt}}} \end{bmatrix} = \begin{blockarray}{cccccccccc} & \partial{R_{11}} & \partial{R_{21}} & \partial{R_{31}} & \partial{R_{12}} & \partial{R_{22}} & \partial{R_{32}} & \partial{t_x} & \partial{t_y} & \partial{t_z} \\ \begin{block}{c[ccccccccc@{\hspace*{5pt}}]} \partial{x_n} & $\frac{X_w}{z_s}$ & 0 & $\frac{-x_s X_w}{z_s^2}$ & $\frac{Y_w}{z_s}$ & 0 & $\frac{-x_s Y_w}{z_s^2}$ & $\frac{1}{z_s}$ & 0 & $\frac{-x_s}{z_s^2}$ \\ \partial{y_n} & 0 & $\frac{X_w}{z_s}$ & $\frac{-y_s X_w}{z_s^2}$ & 0 & $\frac{Y_w}{z_s}$ & $\frac{-y_s Y_w}{z_s^2}$ & 0 & $\frac{1}{z_s}$ & $\frac{-y_s}{z_s^2}$ \\ \end{block} \end{blockarray}

and:

  \begin{bmatrix} \frac{\partial{\vec{Rt}}}{\partial{\vec{m}}} \end{bmatrix} = \begin{blockarray}{ccccccc} & \partial{\theta_x} & \partial{\theta_y} & \partial{\theta_z} & \partial{t_x} & \partial{t_y} & \partial{t_z} \\ \begin{block}{c[ccc|ccc@{\hspace*{5pt}}]} \partial{R_{11}} & \BAmulticolumn{3}{c|}{\multirow{6}{*}{$From\ \begin{bmatrix} \frac{\partial{\vec{R}}}{\partial{\vec{\theta}}} \end{bmatrix}$}} & \BAmulticolumn{3}{c}{\multirow{6}{*}{$0$}} \\ \partial{R_{21}} & & & & & & \\ \partial{R_{31}} & & & & & & \\ \partial{R_{12}} & & & & & & \\ \partial{R_{22}} & & & & & & \\ \partial{R_{32}} & & & & & & \\ \cline{2-7}% don't use \hline \partial{t_x} & \BAmulticolumn{3}{c|}{\multirow{3}{*}{$0$}} & 1 & 0 & 0 \\ \partial{t_y} & & & & 0 & 1 & 0 \\ \partial{t_z} & & & & 0 & 0 & 1 \\ \end{block} \end{blockarray}

We can now define our cost function for single camera calibration:

 C_{LS} = \sum_{j=1}^{M}\sum_{i=1}^{N} (x_{ij}^d-x_{ij}^m)^2 + (y_{ij}^d-y_{ij}^m)^2 = \sum_{j=1}^{M}\sum_{i=1}^{N}res_{x_{ij}}^2 + res_{y_{ij}}^2

Where:

  • M = number of calibration board images
  • N = number of points on calibration board

We put all of our parameters (intrinsic + extrinsic) in a single vector,  \vec{p} :

 \vec{p} = (\overbrace{\alpha_x,\ \alpha_y,\ x_o,\ y_o,\ \beta_1,\ \beta_2,\ \beta_3,\ \beta_4}^8,\ \overbrace{\theta_{x_1},\ \theta_{y_1},\ \theta_{z_1},\ t_{x_1},\ t_{y_1},\ t_{z_1},\ ...,\ \theta_{x_M},\ \theta_{y_M},\ \theta_{z_M},\ t_{x_M},\ t_{y_M},\ t_{z_M}}^{6M})

Which results in 8+6M total parameters.
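Packing and unpacking this parameter vector is simple bookkeeping; a minimal sketch with illustrative names:

```python
import numpy as np

def pack_params(a, extrinsics):
    """Concatenate the 8 intrinsic parameters with the 6 extrinsic
    parameters (theta_x, theta_y, theta_z, t_x, t_y, t_z) per image."""
    return np.concatenate([np.asarray(a)] + [np.asarray(e) for e in extrinsics])

def unpack_params(p, M):
    """Split the parameter vector back into intrinsics and an (M, 6)
    array of per-image extrinsics."""
    return p[:8], p[8:].reshape(M, 6)

M = 3
p = pack_params(np.arange(8), [np.full(6, j) for j in range(1, M + 1)])
assert p.size == 8 + 6 * M
a, ext = unpack_params(p, M)
assert np.all(a == np.arange(8)) and np.all(ext[1] == 2)
```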

To compute the Jacobian, we start off with the residuals:

 res_x = x^d(\vec{p})-x^m

 res_y = y^d(\vec{p})-y^m

Take the partial derivative:

 \frac{\partial{res_x}}{\partial{\vec{p}}} = \frac{\partial{x}^d}{\partial{\vec{p}}}

 \frac{\partial{res_y}}{\partial{\vec{p}}} = \frac{\partial{y}^d}{\partial{\vec{p}}}

The Jacobian has the form:

\begin{bmatrix}J\end{bmatrix}_{ij} = \begin{bmatrix} \frac{\partial{res_i}}{\partial{\vec{p_j}}}\end{bmatrix}

Where:

  \newcommand\overmat[2]{% \makebox[0pt][l]{$\smash{\color{white}\overbrace{\phantom{% \begin{matrix}#2 \end{matrix}}}^{\text{\color{black}#1}}}$}#2} \newcommand\bovermat[2]{% \makebox[0pt][l]{$\smash{\overbrace{\phantom{% \begin{matrix}#2\end{matrix}}}^{\text{#1}}}$}#2} \newcommand\coolrightbrace[2]{% \left.\vphantom{\begin{matrix} #1 \end{matrix}}\right\}#2}  \begin{bmatrix} J \end{bmatrix} = \begin{blockarray}{cccccc} & \bovermat{8+6M}{\partial{\vec{a}} & \partial{\vec{m_1}} & \partial{\vec{m_2}} & \hdots & \partial{\vec{m_M}}} \\ \begin{block}{c[ccccc@{\hspace*{5pt}}]} \partial{\vec{res_1}} & X & X & 0 & \hdots & 0 \\ \partial{\vec{res_2}} & X & 0 & X & \hdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ \partial{\vec{res_M}} & X & 0 & 0 & \hdots & X \\ \end{block} \end{blockarray}% \begin{matrix} \\ \coolrightbrace{\begin{blockarray}{cccccc} & \bovermat{8+6M}{\partial{\vec{a}} & \partial{\vec{m_1}} & \partial{\vec{m_2}} & \hdots & \partial{\vec{m_M}}} \\ \begin{block}{c[ccccc@{\hspace*{5pt}}]} \partial{\vec{res_1}} & X & X & 0 & \hdots & 0 \\ \partial{\vec{res_2}} & X & 0 & X & \hdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ \partial{\vec{res_M}} & X & 0 & 0 & \hdots & X \\ \end{block} \end{blockarray}}{2NM} \\ \end{matrix}

Where the X's represent filled blocks and the 0's are empty blocks. This structure makes intuitive sense: the intrinsic parameters affect the residuals for all calibration board images, while the extrinsic parameters for a specific calibration board image affect only that image's residuals.

To solve this, I’d recommend using the Levenberg-Marquardt method since the nonlinear optimization is not well-behaved under certain circumstances (In my tests, if very few calibration boards are provided, it can cause the Gauss-Newton method to fail):

 \Delta\vec{p} = -(\begin{bmatrix}J\end{bmatrix}^T \begin{bmatrix} J \end{bmatrix} + \lambda\begin{bmatrix}I\end{bmatrix})^{-1}\begin{bmatrix}J\end{bmatrix}^T \vec{res}

\vec{p_{new}} = \vec{p_{old}} + \Delta\vec{p}

As usual, it’s recommended to use a sparse linear solver instead of directly taking the inverse.

Some things:

  • All initial guesses are taken from the previous sections. \beta_1, \beta_2, \beta_3, and \beta_4 are all initialized to zero.
  • \lambda is usually initialized to 0.01 and will either grow or shrink every iteration depending on whether the mean squared error (MSE) increases or decreases. If MSE decreases, then \lambda is decreased and we proceed to the next iteration. If MSE increases, then we increase \lambda until MSE decreases. Increasing lambda has two effects: 1) it makes the step closer to the gradient direction and 2) it reduces the step size. Decreasing lambda makes the step closer to the Gauss-Newton method, which is an aggressive optimization. Taking small steps in the direction of the gradient is incredibly robust, but slow.
  • If there is only a single calibration board image, the principal point is removed from the optimization (simply remove the columns from the Jacobian corresponding to those two parameters, and do not update them during the iterations), as it is very unstable.
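The update and the \lambda schedule described above can be sketched as follows. This is a toy sketch, not a full calibration; the factor of 10 for adjusting \lambda is a common convention, not the only choice:

```python
import numpy as np

def lm_step(res_fn, jac_fn, p, lam):
    """One Levenberg-Marquardt update: solve
    (J^T J + lam * I) dp = -J^T res, then accept or reject the step,
    adjusting lam as described above."""
    r = res_fn(p)
    J = jac_fn(p)
    A = J.T @ J + lam * np.eye(p.size)
    dp = np.linalg.solve(A, -J.T @ r)   # prefer a sparse solver in practice
    p_new = p + dp
    if np.sum(res_fn(p_new)**2) < np.sum(r**2):
        return p_new, lam / 10.0        # MSE decreased: move toward Gauss-Newton
    return p, lam * 10.0                # MSE increased: move toward the gradient

# Toy problem: drive the residuals res = p - target to zero
target = np.array([1.0, -2.0])
res_fn = lambda p: p - target
jac_fn = lambda p: np.eye(2)
p, lam = np.zeros(2), 0.01
for _ in range(50):
    p, lam = lm_step(res_fn, jac_fn, p, lam)
assert np.allclose(p, target, atol=1e-6)
```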

We now have enough information to fully calibrate and optimize a camera using a checkerboard calibration target.

Stereo Vision

The first thing we need to consider is how to incorporate two cameras into the model. We do this by simply adding the transformation between the two cameras as follows:

\begin{bmatrix} \begin{bmatrix} R_R \end{bmatrix} & \vec{t_R} \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} R_s \end{bmatrix} & \vec{t_s} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \begin{bmatrix} R_L \end{bmatrix} & \vec{t_L} \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} R_s \end{bmatrix} \begin{bmatrix} R_L \end{bmatrix} & \begin{bmatrix} R_s \end{bmatrix} \vec{t_L} + \vec{t_s} \\ 0 & 1 \end{bmatrix}

This results in:

 \begin{bmatrix} R_R \end{bmatrix} = \begin{bmatrix} R_s \end{bmatrix} \begin{bmatrix} R_L \end{bmatrix}

 \vec{t_R} = \begin{bmatrix} R_s \end{bmatrix} \vec{t_L} + \vec{t_s}

We can solve for \begin{bmatrix} R_s \end{bmatrix} and \vec{t_s}:

 \begin{bmatrix} R_s \end{bmatrix} = \begin{bmatrix} R_R \end{bmatrix} \begin{bmatrix} R_L \end{bmatrix}^T

 \vec{t_s} = \vec{t_R} - \begin{bmatrix} R_R \end{bmatrix} \begin{bmatrix} R_L \end{bmatrix}^T\vec{t_L}
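In code, recovering the stereo transform from the two single-camera extrinsics is a one-liner each (an illustrative sketch):

```python
import numpy as np

def relative_pose(R_L, t_L, R_R, t_R):
    """Recover the left-to-right transform: R_s = R_R R_L^T and
    t_s = t_R - R_s t_L, as derived above."""
    R_s = R_R @ R_L.T
    t_s = t_R - R_s @ t_L
    return R_s, t_s

# Consistency check: rebuilding the right pose from (R_s, t_s)
# recovers the inputs (test poses here are arbitrary).
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
R_L = Q * np.sign(np.linalg.det(Q))     # a random proper rotation
R_R = np.eye(3)
t_L, t_R = rng.standard_normal(3), rng.standard_normal(3)
R_s, t_s = relative_pose(R_L, t_L, R_R, t_R)
assert np.allclose(R_s @ R_L, R_R) and np.allclose(R_s @ t_L + t_s, t_R)
```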

In order to optimize the stereo model, we do the following:

  • We get initial guesses for the intrinsics, \begin{bmatrix}R_L\end{bmatrix}, \begin{bmatrix}R_R\end{bmatrix}, \vec{t_L} and \vec{t_R} by doing single camera optimization on each camera separately.
  • Instead of optimizing \begin{bmatrix}R_L\end{bmatrix}, \begin{bmatrix}R_R\end{bmatrix}, \vec{t_L}, and \vec{t_R} for each calibration board image, we instead optimize \begin{bmatrix}R_L\end{bmatrix}, \begin{bmatrix}R_s\end{bmatrix}, \vec{t_L}, and \vec{t_s}. This reduces the number of extrinsic parameters from 12M to 6M+6, where M is the number of calibration board image pairs.
  • To get an initial guess for  \begin{bmatrix} R_s \end{bmatrix} and  \vec{t_s} , we do the following:

     \begin{bmatrix} r_{R_{11}} & r_{R_{12}} & r_{R_{13}} \\ r_{R_{21}} & r_{R_{22}} & r_{R_{23}} \\ r_{R_{31}} & r_{R_{32}} & r_{R_{33}} \\ \end{bmatrix} = \begin{bmatrix} r_{s_{11}} & r_{s_{12}} & r_{s_{13}} \\ r_{s_{21}} & r_{s_{22}} & r_{s_{23}} \\ r_{s_{31}} & r_{s_{32}} & r_{s_{33}} \\ \end{bmatrix} \begin{bmatrix} r_{L_{11}} & r_{L_{12}} & r_{L_{13}} \\ r_{L_{21}} & r_{L_{22}} & r_{L_{23}} \\ r_{L_{31}} & r_{L_{32}} & r_{L_{33}} \\ \end{bmatrix}

    Then:

     r_{R_{11}} = r_{s_{11}}r_{L_{11}} + r_{s_{12}}r_{L_{21}} + r_{s_{13}}r_{L_{31}}
     r_{R_{21}} = r_{s_{21}}r_{L_{11}} + r_{s_{22}}r_{L_{21}} + r_{s_{23}}r_{L_{31}}
     r_{R_{31}} = r_{s_{31}}r_{L_{11}} + r_{s_{32}}r_{L_{21}} + r_{s_{33}}r_{L_{31}}
     r_{R_{12}} = r_{s_{11}}r_{L_{12}} + r_{s_{12}}r_{L_{22}} + r_{s_{13}}r_{L_{32}}
     r_{R_{22}} = r_{s_{21}}r_{L_{12}} + r_{s_{22}}r_{L_{22}} + r_{s_{23}}r_{L_{32}}
     \vdots

    and:

     t_{R_x} = r_{s_{11}}t_{L_x} + r_{s_{12}}t_{L_y} + r_{s_{13}}t_{L_z} + t_{s_x}
     t_{R_y} = r_{s_{21}}t_{L_x} + r_{s_{22}}t_{L_y} + r_{s_{23}}t_{L_z} + t_{s_y}
     t_{R_z} = r_{s_{31}}t_{L_x} + r_{s_{32}}t_{L_y} + r_{s_{33}}t_{L_z} + t_{s_z}

    If we put these in matrix form:

     \begin{bmatrix} r_{L_{11}} & 0 & 0 & r_{L_{21}} & 0 & 0 & r_{L_{31}} & 0 & 0 \\ 0 & r_{L_{11}} & 0 & 0 & r_{L_{21}} & 0 & 0 & r_{L_{31}} & 0 \\ 0 & 0 & r_{L_{11}} & 0 & 0 & r_{L_{21}} & 0 & 0 & r_{L_{31}} \\ r_{L_{12}} & 0 & 0 & r_{L_{22}} & 0 & 0 & r_{L_{32}} & 0 & 0 \\ 0 & r_{L_{12}} & 0 & 0 & r_{L_{22}} & 0 & 0 & r_{L_{32}} & 0 \\ 0 & 0 & r_{L_{12}} & 0 & 0 & r_{L_{22}} & 0 & 0 & r_{L_{32}} \\ r_{L_{13}} & 0 & 0 & r_{L_{23}} & 0 & 0 & r_{L_{33}} & 0 & 0 \\ 0 & r_{L_{13}} & 0 & 0 & r_{L_{23}} & 0 & 0 & r_{L_{33}} & 0 \\ 0 & 0 & r_{L_{13}} & 0 & 0 & r_{L_{23}} & 0 & 0 & r_{L_{33}} \\ \end{bmatrix} \begin{bmatrix} r_{s_{11}} \\ r_{s_{21}} \\ r_{s_{31}} \\ r_{s_{12}} \\ r_{s_{22}} \\ r_{s_{32}} \\ r_{s_{13}} \\ r_{s_{23}} \\ r_{s_{33}} \\ \end{bmatrix} = \begin{bmatrix} r_{R_{11}} \\ r_{R_{21}} \\ r_{R_{31}} \\ r_{R_{12}} \\ r_{R_{22}} \\ r_{R_{32}} \\ r_{R_{13}} \\ r_{R_{23}} \\ r_{R_{33}} \end{bmatrix}

    and:

     \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} t_{s_x} \\ t_{s_y} \\ t_{s_z} \end{bmatrix} = \begin{bmatrix} \vec{t_R}-\begin{bmatrix}R_R\end{bmatrix}\begin{bmatrix}R_L\end{bmatrix}^T\vec{t_L} \end{bmatrix}
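The stacked system above, repeated over all M image pairs and solved in least squares, gives the initial guess. A sketch (the 9x9 blocks are exactly \begin{bmatrix}R_L\end{bmatrix}^T \otimes I_3 in Kronecker form); note that with noisy data the least-squares result is generally not exactly a rotation matrix, so in practice it may need re-orthogonalization:

```python
import numpy as np

def initial_stereo_guess(R_Ls, t_Ls, R_Rs, t_Rs):
    """Least-squares initial guess for (R_s, t_s) over M image pairs by
    stacking R_R = R_s R_L per pair, then averaging t_s = t_R - R_s t_L."""
    # vec(R_s R_L) = (R_L^T kron I_3) vec(R_s), column-major vec as above
    A = np.vstack([np.kron(R_L.T, np.eye(3)) for R_L in R_Ls])
    b = np.concatenate([R_R.flatten(order='F') for R_R in R_Rs])
    vec_R_s, *_ = np.linalg.lstsq(A, b, rcond=None)
    R_s = vec_R_s.reshape(3, 3, order='F')
    t_s = np.mean([t_R - R_s @ t_L for t_L, t_R in zip(t_Ls, t_Rs)], axis=0)
    return R_s, t_s

# One exact pair: R_L = I, R_R a 90-degree rotation about z
Rz = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
R_s, t_s = initial_stereo_guess([np.eye(3)], [np.zeros(3)],
                                [Rz], [np.array([1.0, 2.0, 3.0])])
assert np.allclose(R_s, Rz) and np.allclose(t_s, [1.0, 2.0, 3.0])
```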

We now have a model for stereo vision and a way to provide an initial guess for this model.

Some things:

\frac{\partial{x^d_L}}{\partial{\vec{a_L}}}, \frac{\partial{y^d_L}}{\partial{\vec{a_L}}}, \frac{\partial{x^d_R}}{\partial{\vec{a_R}}}, \frac{\partial{y^d_R}}{\partial{\vec{a_R}}} have the same form as single camera calibration.

\frac{\partial{x^d_L}}{\partial{\vec{m_L}}} and \frac{\partial{y^d_L}}{\partial{\vec{m_L}}} have the same form as single camera calibration.

\frac{\partial{x^d_R}}{\partial{\vec{m_L}}}, \frac{\partial{y^d_R}}{\partial{\vec{m_L}}}, \frac{\partial{x^d_R}}{\partial{\vec{m_s}}}, \frac{\partial{y^d_R}}{\partial{\vec{m_s}}} are new quantities that we need to derive.

We need to compute the Jacobians:

 \begin{bmatrix} \frac{\partial{\vec{p_{n_R}}}}{\partial{\vec{m_L}}} \end{bmatrix} = \begin{bmatrix} \frac{\partial{\vec{p_{n_R}}}}{\partial{\vec{Rt_R}}} \end{bmatrix} \begin{bmatrix} \frac{\partial{\vec{Rt_R}}}{\partial{\vec{Rt_L}}} \end{bmatrix} \begin{bmatrix} \frac{\partial{\vec{Rt_L}}}{\partial{\vec{m_L}}} \end{bmatrix}

 \begin{bmatrix} \frac{\partial{\vec{p_{n_R}}}}{\partial{\vec{m_s}}} \end{bmatrix} = \begin{bmatrix} \frac{\partial{\vec{p_{n_R}}}}{\partial{\vec{Rt_R}}} \end{bmatrix} \begin{bmatrix} \frac{\partial{\vec{Rt_R}}}{\partial{\vec{Rt_s}}} \end{bmatrix} \begin{bmatrix} \frac{\partial{\vec{Rt_s}}}{\partial{\vec{m_s}}} \end{bmatrix}

The only quantities we haven’t derived yet are:

   \begin{bmatrix} \frac{\partial{\vec{Rt_R}}}{\partial{\vec{Rt_L}}} \end{bmatrix} =  \begin{blockarray}{cccccccccc} & \partial{r_{L_{11}}} & \partial{r_{L_{21}}} & \partial{r_{L_{31}}} & \partial{r_{L_{12}}} & \partial{r_{L_{22}}} & \partial{r_{L_{32}}} & \partial{t_{L_x}} & \partial{t_{L_y}} & \partial{t_{L_z}} \\ \begin{block}{c[ccccccccc@{\hspace*{5pt}}]} \partial{r_{R_{11}}} & $r_{s_{11}}$ & $r_{s_{12}}$ & $r_{s_{13}}$ & 0 & 0 & 0 & 0 & 0 & 0 \\ \partial{r_{R_{21}}} & $r_{s_{21}}$ & $r_{s_{22}}$ & $r_{s_{23}}$ & 0 & 0 &0 & 0 & 0 & 0 \\ \partial{r_{R_{31}}} & $r_{s_{31}}$ & $r_{s_{32}}$ & $r_{s_{33}}$ & 0 & 0 & 0 & 0 & 0 & 0 \\ \partial{r_{R_{12}}} & 0 & 0 & 0 & $r_{s_{11}}$ & $r_{s_{12}}$ & $r_{s_{13}}$ & 0 & 0 & 0 \\ \partial{r_{R_{22}}} & 0 & 0 & 0 & $r_{s_{21}}$ & $r_{s_{22}}$ & $r_{s_{23}}$ & 0 & 0 & 0 \\ \partial{r_{R_{32}}} & 0 & 0 & 0 & $r_{s_{31}}$ & $r_{s_{32}}$ & $r_{s_{33}}$ & 0 & 0 & 0 \\ \partial{t_{R_x}} & 0 & 0 & 0 & 0 & 0 & 0 & $r_{s_{11}}$ & $r_{s_{12}}$ & $r_{s_{13}}$ \\ \partial{t_{R_y}} & 0 & 0 & 0 & 0 & 0 & 0 & $r_{s_{21}}$ & $r_{s_{22}}$ & $r_{s_{23}}$ \\ \partial{t_{R_z}} & 0 & 0 & 0 & 0 & 0 & 0 & $r_{s_{31}}$ & $r_{s_{32}}$ & $r_{s_{33}}$ \\ \end{block} \end{blockarray}

and

    \begin{bmatrix} \frac{\partial{\vec{Rt_R}}}{\partial{\vec{Rt_s}}} \end{bmatrix} =  \begin{blockarray}{ccccccccccccc} & \partial{r_{s_{11}}} & \partial{r_{s_{21}}} & \partial{r_{s_{31}}} & \partial{r_{s_{12}}} & \partial{r_{s_{22}}} & \partial{r_{s_{32}}} & \partial{r_{s_{13}}} & \partial{r_{s_{23}}} & \partial{r_{s_{33}}} & \partial{t_{s_x}} & \partial{t_{s_y}} & \partial{t_{s_z}} \\ \begin{block}{c[cccccccccccc@{\hspace*{5pt}}]} \partial{r_{R_{11}}} & $r_{L_{11}}$ & 0 & 0 & $r_{L_{21}}$ & 0 & 0 & $r_{L_{31}}$ & 0 & 0 & 0 & 0 & 0 \\ \partial{r_{R_{21}}} & 0 & $r_{L_{11}}$ & 0 & 0 & $r_{L_{21}}$ &0 & 0 & $r_{L_{31}}$ & 0 & 0 & 0 & 0 \\ \partial{r_{R_{31}}} & 0 & 0 & $r_{L_{11}}$ & 0 & 0 & $r_{L_{21}}$ & 0 & 0 & $r_{L_{31}}$ & 0 & 0 & 0 \\ \partial{r_{R_{12}}} & $r_{L_{12}}$ & 0 & 0 & $r_{L_{22}}$ & 0 & 0 & $r_{L_{32}}$ & 0 & 0 & 0 & 0 & 0 \\ \partial{r_{R_{22}}} & 0 & $r_{L_{12}}$ & 0 & 0 & $r_{L_{22}}$ & 0 & 0 & $r_{L_{32}}$ & 0 & 0 & 0 & 0 \\ \partial{r_{R_{32}}} & 0 & 0 & $r_{L_{12}}$ & 0 & 0 & $r_{L_{22}}$ & 0 & 0 & $r_{L_{32}}$ & 0 & 0 & 0 \\ \partial{t_{R_x}} & $t_{L_x}$ & 0 & 0 & $t_{L_y}$ & 0 & 0 & $t_{L_z}$ & 0 & 0 & 1 & 0 & 0 \\ \partial{t_{R_y}} & 0 & $t_{L_x}$ & 0 & 0 & $t_{L_y}$ & 0 & 0 & $t_{L_z}$ & 0 & 0 & 1 & 0  \\ \partial{t_{R_z}} & 0 & 0 & $t_{L_x}$ & 0 & 0 & $t_{L_y}$ & 0 & 0 & $t_{L_z}$ & 0 & 0 & 1  \\ \end{block} \end{blockarray}

The rest of the Jacobians have already been derived in the single camera calibration section.

Now, we’re ready to define our cost function for stereo calibration:

 C_{LS} = \sum_{j=1}^{M}\sum_{i=1}^{N} (x_{L_{ij}}^d-x_{L_{ij}}^m)^2 + (y_{L_{ij}}^d-y_{L_{ij}}^m)^2 + (x_{R_{ij}}^d-x_{R_{ij}}^m)^2 + (y_{R_{ij}}^d-y_{R_{ij}}^m)^2

  C_{LS} = \sum_{j=1}^{M}\sum_{i=1}^{N} res_{x_{L_{ij}}}^2 + res_{y_{L_{ij}}}^2 + res_{x_{R_{ij}}}^2 + res_{y_{R_{ij}}}^2

Where:

  • M = number of calibration board image pairs
  • N = number of points on calibration board

We put all of our parameters (intrinsic + extrinsic) in a single vector,  \vec{p} :

 \vec{p} = (\overbrace{\alpha_{x_L},\ \alpha_{y_L},\ x_{o_L},\ y_{o_L},\ \beta_{1_L},\ \beta_{2_L},\ \beta_{3_L},\ \beta_{4_L}}^8,\overbrace{\alpha_{x_R},\ \alpha_{y_R},\ x_{o_R},\ y_{o_R},\ \beta_{1_R},\ \beta_{2_R},\ \beta_{3_R},\ \beta_{4_R}}^8,\ \overbrace{\theta_{{x_L}_1},\ \theta_{{y_L}_1},\ \theta_{{z_L}_1},\ t_{{x_L}_1},\ t_{{y_L}_1},\ t_{{z_L}_1},\ ...,\ \theta_{{x_L}_M},\ \theta_{{y_L}_M},\ \theta_{{z_L}_M},\ t_{{x_L}_M},\ t_{{y_L}_M},\ t_{{z_L}_M}}^{6M},\overbrace{\theta_{x_s},\ \theta_{y_s},\ \theta_{z_s},\ t_{x_s},\ t_{y_s},\ t_{z_s}}^6)

Which results in 22+6M total parameters.

To compute the Jacobian, we start off with the residuals:

 res_{x_L} = x_L^d(\vec{p})-x_L^m

 res_{y_L} = y_L^d(\vec{p})-y_L^m

 res_{x_R} = x_R^d(\vec{p})-x_R^m

 res_{y_R} = y_R^d(\vec{p})-y_R^m

Take the partial derivative:

 \frac{\partial{res_{x_L}}}{\partial{\vec{p}}} = \frac{\partial{x_L^d}}{\partial{\vec{p}}}

 \frac{\partial{res_{y_L}}}{\partial{\vec{p}}} = \frac{\partial{y_L^d}}{\partial{\vec{p}}}

 \frac{\partial{res_{x_R}}}{\partial{\vec{p}}} = \frac{\partial{x_R^d}}{\partial{\vec{p}}}

 \frac{\partial{res_{y_R}}}{\partial{\vec{p}}} = \frac{\partial{y_R^d}}{\partial{\vec{p}}}

The Jacobian has the form:

\begin{bmatrix}J\end{bmatrix}_{ij} = \begin{bmatrix} \frac{\partial{res_i}}{\partial{\vec{p_j}}}\end{bmatrix}

Where:

  \newcommand\overmat[2]{% \makebox[0pt][l]{$\smash{\color{white}\overbrace{\phantom{% \begin{matrix}#2 \end{matrix}}}^{\text{\color{black}#1}}}$}#2} \newcommand\bovermat[2]{% \makebox[0pt][l]{$\smash{\overbrace{\phantom{% \begin{matrix}#2\end{matrix}}}^{\text{#1}}}$}#2} \newcommand\coolrightbrace[2]{% \left.\vphantom{\begin{matrix} #1 \end{matrix}}\right\}#2}  \begin{bmatrix} J \end{bmatrix} = \begin{blockarray}{cccccccc} & \bovermat{22+6M}{\partial{\vec{a_L}} & \partial{\vec{a_R}} & \partial{\vec{m_{L_1}}} & \partial{\vec{m_{L_2}}} & \hdots & \partial{\vec{m_{L_M}}} & \partial{\vec{m_s}}} \\ \begin{block}{c[ccccccc@{\hspace*{5pt}}]} \partial{\vec{res_{L_1}}} & X & 0 & X & 0 & \hdots & 0 & 0 \\ \partial{\vec{res_{R_1}}} & 0 & X & X & 0 & \hdots & 0 & X \\ \partial{\vec{res_{L_2}}} & X & 0 & 0 & X & \hdots & 0 & 0 \\ \partial{\vec{res_{R_2}}} & 0 & X & 0 & X & \hdots & 0 & X \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ \partial{\vec{res_{L_M}}} & X & 0 & 0 & 0 & \hdots & X & 0 \\ \partial{\vec{res_{R_M}}} & 0 & X & 0 & 0 & \hdots & X & X \\ \end{block} \end{blockarray}% \begin{matrix} \\ \coolrightbrace{\begin{blockarray}{cccccccc} & \bovermat{22+6M}{\partial{\vec{a_L}} & \partial{\vec{a_R}} & \partial{\vec{m_{L_1}}} & \partial{\vec{m_{L_2}}} & \hdots & \partial{\vec{m_{L_M}}} & \partial{\vec{m_s}}} \\ \begin{block}{c[ccccccc@{\hspace*{5pt}}]} \partial{\vec{res_{L_1}}} & X & 0 & X & 0 & \hdots & 0 & 0 \\ \partial{\vec{res_{R_1}}} & 0 & X & X & 0 & \hdots & 0 & X \\ \partial{\vec{res_{L_2}}} & X & 0 & 0 & X & \hdots & 0 & 0 \\ \partial{\vec{res_{R_2}}} & 0 & X & 0 & X & \hdots & 0 & X \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ \partial{\vec{res_{L_M}}} & X & 0 & 0 & 0 & \hdots & X & 0 \\ \partial{\vec{res_{R_M}}} & 0 & X & 0 & 0 & \hdots & X & X \\ \end{block} \end{blockarray}}{4NM} \\ \end{matrix}

Where, like before, the X's represent filled blocks and the 0's are empty blocks.

To solve this, again, I’d recommend using the Levenberg-Marquardt method, since the optimization is not well-behaved under certain circumstances:

 \Delta\vec{p} = -(\begin{bmatrix}J\end{bmatrix}^T \begin{bmatrix} J \end{bmatrix} + \lambda\begin{bmatrix}I\end{bmatrix})^{-1}\begin{bmatrix}J\end{bmatrix}^T\vec{res}

\vec{p_{new}} = \vec{p_{old}} + \Delta\vec{p}

These updates are repeated until some convergence condition is met.

Some things:

  • \lambda is set and used as described in the single camera calibration section.
  • If there is only a single calibration board image, the principal point is again removed from the optimization, as in the single camera calibration section.

Now, we can perform camera calibration for stereo cameras using a checkerboard calibration target. The next article will go over how to do this in Matlab.
