Overview
Pose estimation determines the transformation (rotation and translation) from the object coordinate system to the camera coordinate system. This is essential for:
- Augmented reality applications
- Robot navigation and manipulation
- 3D scene reconstruction
- Object tracking and localization
Core Functions
solvePnP
Finds an object pose from 3D-2D point correspondences.
Parameters:
- objectPoints: Array of object points in the object coordinate space, Nx3 1-channel or 1xN/Nx1 3-channel, where N is the number of points. vector<Point3d> can also be passed.
- imagePoints: Array of corresponding image points in pixel coordinates, Nx2 1-channel or 1xN/Nx1 2-channel. vector<Point2d> can also be passed.
- cameraMatrix: Input camera intrinsic matrix (3x3).
- distCoeffs: Input vector of distortion coefficients. If the vector is NULL/empty, zero distortion coefficients are assumed.
- rvec: Output rotation vector (see Rodrigues) that, together with tvec, brings points from the model coordinate system to the camera coordinate system.
- tvec: Output translation vector.
- useExtrinsicGuess: If true, the function uses the provided rvec and tvec values as initial approximations and further optimizes them (used with SOLVEPNP_ITERATIVE).
- flags: Method for solving the PnP problem (see SolvePnP Methods below).
Coordinate Systems:
- Input objectPoints: 3D points in world coordinate frame
- Output rvec/tvec: Transformation from world to camera coordinate frame
- The transformation Xc = R * Xw + t brings world points to camera coordinates, where R is the rotation matrix corresponding to rvec (see Rodrigues) and t is tvec
SolvePnP Methods
| Method | Value | Description | Requirements |
|---|---|---|---|
| SOLVEPNP_ITERATIVE | 0 | Levenberg-Marquardt optimization. DLT for non-planar (≥6 pts), homography for planar (≥4 pts) | ≥4 points |
| SOLVEPNP_EPNP | 1 | Efficient PnP based on Lepetit et al. 2009 | ≥4 points |
| SOLVEPNP_P3P | 2 | P3P algorithm based on Ding et al. 2023 | Exactly 4 points (3 for estimation, 1 for validation) |
| SOLVEPNP_DLS | 3 | Broken - fallback to EPnP | Not recommended |
| SOLVEPNP_UPNP | 4 | Broken - fallback to EPnP | Not recommended |
| SOLVEPNP_AP3P | 5 | Efficient algebraic solution by Ke & Roumeliotis 2017 | Exactly 4 points |
| SOLVEPNP_IPPE | 6 | Infinitesimal Plane-Based Pose Estimation | ≥4 coplanar points |
| SOLVEPNP_IPPE_SQUARE | 7 | IPPE for square markers (returns 2 solutions) | Exactly 4 coplanar points in specific order |
| SOLVEPNP_SQPNP | 8 | SQPnP: Fast and globally optimal solution | ≥3 points |
solvePnPRansac
Finds an object pose from 3D-2D point correspondences using RANSAC to handle outliers.
Parameters:
- iterationsCount: Number of RANSAC iterations.
- reprojectionError: Inlier threshold value in pixels. The maximum allowed distance between observed and computed point projections to consider it an inlier.
- confidence: The probability that the algorithm produces a useful result (typically 0.99).
- inliers: Output vector that contains indices of inliers in objectPoints and imagePoints.
Algorithm:
- Randomly selects minimal subsets of points
- Estimates pose for each subset
- Counts inliers (points within reprojectionError threshold)
- Refines final pose using all inliers
Minimal Sample Sets:
- Default method uses SOLVEPNP_EPNP for minimal sample estimation
- If you choose SOLVEPNP_P3P or SOLVEPNP_AP3P, these methods are used
- If exactly 4 input points are given, SOLVEPNP_P3P is used automatically
- The final pose is refined using all inliers with the method specified in flags; if flags is SOLVEPNP_P3P or SOLVEPNP_AP3P, SOLVEPNP_EPNP is used for this refinement instead
USAC-based solvePnPRansac
Advanced robust estimation using the USAC (Universal RANSAC) framework.
solvePnPGeneric
Returns all possible solutions for pose estimation (multiple solutions from P3P methods).
Parameters:
- rvecs: Vector of output rotation vectors. P3P methods return 0-4 solutions, SOLVEPNP_IPPE returns 2 solutions, others return 1 solution.
- tvecs: Vector of output translation vectors corresponding to rvecs.
- reprojectionError: Optional output array of reprojection error (RMSE) for each solution.
P3P solutions are sorted by reprojection errors (lowest to highest).
solveP3P
Finds an object pose from three 3D-2D point correspondences.
Parameters:
- objectPoints: Array of object points, 3x3 1-channel or 1x3/3x1 3-channel. Exactly 3 points required.
- imagePoints: Array of corresponding image points, 3x2 1-channel or 1x3/3x1 2-channel. Exactly 3 points required.
- flags: Method for solving P3P:
  - SOLVEPNP_P3P: Based on Ding et al. 2023
  - SOLVEPNP_AP3P: Based on Ke & Roumeliotis 2017
Pose Refinement
solvePnPRefineLM
Refines a pose using Levenberg-Marquardt optimization.
Parameters:
- rvec: Input/Output rotation vector. Input values are used as the initial solution.
- tvec: Input/Output translation vector. Input values are used as the initial solution.
- criteria: Termination criteria for the iterative optimization algorithm.
solvePnPRefineVVS
Refines a pose using Virtual Visual Servoing (VVS).
Parameters:
- VVSlambda: Gain for the virtual visual servoing control law, equivalent to the α gain in the Damped Gauss-Newton formulation.
Homography-Based Methods
findHomography
Finds a perspective transformation between two planes.
Parameters:
- srcPoints: Coordinates of points in the original plane, CV_32FC2 or vector<Point2f>.
- dstPoints: Coordinates of points in the target plane, CV_32FC2 or vector<Point2f>.
- method: Method for computing homography:
  - 0: Regular method using all points (least squares)
  - RANSAC (8): RANSAC-based robust method
  - LMEDS (4): Least-Median robust method
  - RHO (16): PROSAC-based robust method
- ransacReprojThreshold: Maximum allowed reprojection error to treat a point pair as an inlier (pixels). Used in RANSAC and RHO methods.
- mask: Optional output mask set by robust methods. Input mask values are ignored.
- maxIters: Maximum number of RANSAC iterations.
- confidence: Confidence level, between 0 and 1.
Typical uses:
- Finding initial intrinsic and extrinsic matrices
- Planar object tracking
- Image rectification
If H cannot be estimated, an empty matrix is returned.
USAC-based findHomography
Decomposition Methods
decomposeProjectionMatrix
Decomposes a projection matrix into a rotation matrix and a camera intrinsic matrix.
Parameters:
- projMatrix: 3x4 input projection matrix P.
- cameraMatrix: Output 3x3 camera intrinsic matrix.
- rotMatrix: Output 3x3 external rotation matrix R.
- transVect: Output 4x1 translation vector T.
- rotMatrixX: Optional 3x3 rotation matrix around x-axis.
- rotMatrixY: Optional 3x3 rotation matrix around y-axis.
- rotMatrixZ: Optional 3x3 rotation matrix around z-axis.
- eulerAngles: Optional three-element vector containing three Euler angles of rotation in degrees.
RQDecomp3x3
Computes an RQ decomposition of 3x3 matrices.
Parameters:
- src: 3x3 input matrix.
- mtxR: Output 3x3 upper-triangular matrix.
- mtxQ: Output 3x3 orthogonal matrix.
Helper Functions
projectPoints
Projects 3D points to an image plane.
Parameters:
- objectPoints: Array of object points in world coordinate frame, 3xN/Nx3 1-channel or 1xN/Nx1 3-channel.
- rvec: Rotation vector (Rodrigues) that performs change of basis from world to camera coordinate system.
- tvec: Translation vector.
- imagePoints: Output array of image points in pixel coordinates, 1xN/Nx1 2-channel, or vector<Point2f>.
- jacobian: Optional output 2Nx(10+numDistCoeffs) Jacobian matrix of derivatives of image points with respect to rotation, translation, focal lengths, principal point, and distortion coefficients.
- aspectRatio: Optional fixed aspect ratio parameter. If not 0, the function assumes aspect ratio (fx/fy) is fixed.
drawFrameAxes
Draws axes of the world/object coordinate system from pose estimation.
Parameters:
- length: Length of the painted axes in the same unit as tvec (usually meters).
- thickness: Line thickness of the painted axes.
Axis colors:
- OX is drawn in red
- OY is drawn in green
- OZ is drawn in blue
See Also
- Camera Calibration - calibrateCamera for obtaining camera intrinsics
- Stereo Vision - Stereo calibration and rectification
- OpenCV samples:
  - plane_ar.py (planar augmented reality)
