Skip to main content

Overview

Pose estimation determines the transformation (rotation and translation) from the object coordinate system to the camera coordinate system. This is essential for:
  • Augmented reality applications
  • Robot navigation and manipulation
  • 3D scene reconstruction
  • Object tracking and localization

Core Functions

solvePnP

Finds an object pose from 3D-2D point correspondences.
bool cv::solvePnP(
    InputArray objectPoints,
    InputArray imagePoints,
    InputArray cameraMatrix,
    InputArray distCoeffs,
    OutputArray rvec,
    OutputArray tvec,
    bool useExtrinsicGuess = false,
    int flags = SOLVEPNP_ITERATIVE
)
objectPoints
InputArray
required
Array of object points in the object coordinate space, Nx3 1-channel or 1xN/Nx1 3-channel, where N is the number of points. vector<Point3d> can also be passed.
imagePoints
InputArray
required
Array of corresponding image points in pixel coordinates, Nx2 1-channel or 1xN/Nx1 2-channel. vector<Point2d> can also be passed.
cameraMatrix
InputArray
required
Input camera intrinsic matrix (3x3).
distCoeffs
InputArray
required
Input vector of distortion coefficients. If the vector is NULL/empty, zero distortion coefficients are assumed.
rvec
OutputArray
required
Output rotation vector (see Rodrigues) that, together with tvec, brings points from the model coordinate system to the camera coordinate system.
tvec
OutputArray
required
Output translation vector.
useExtrinsicGuess
bool
default:"false"
If true, the function uses the provided rvec and tvec values as initial approximations and further optimizes them (used with SOLVEPNP_ITERATIVE).
flags
int
default:"SOLVEPNP_ITERATIVE"
Method for solving the PnP problem (see SolvePnP Methods below).
Returns: True if a solution is found, false otherwise.
Coordinate Systems:
  • Input objectPoints: 3D points in world coordinate frame
  • Output rvec/tvec: Transformation from world to camera coordinate frame
  • The transformation Xc = R * Xw + t brings world points to camera coordinates

SolvePnP Methods

MethodValueDescriptionRequirements
SOLVEPNP_ITERATIVE0Levenberg-Marquardt optimization. DLT for non-planar (≥6 pts), homography for planar (≥4 pts)≥4 points
SOLVEPNP_EPNP1Efficient PnP based on Lepetit et al. 2009≥4 points
SOLVEPNP_P3P2P3P algorithm based on Ding et al. 2023Exactly 4 points (3 for estimation, 1 for validation)
SOLVEPNP_DLS3Broken - fallback to EPnPNot recommended
SOLVEPNP_UPNP4Broken - fallback to EPnPNot recommended
SOLVEPNP_AP3P5Efficient algebraic solution by Ke & Roumeliotis 2017Exactly 4 points
SOLVEPNP_IPPE6Infinitesimal Plane-Based Pose Estimation≥4 coplanar points
SOLVEPNP_IPPE_SQUARE7IPPE for square markers (returns 2 solutions)Exactly 4 coplanar points in specific order
SOLVEPNP_SQPNP8SQPnP: Fast and globally optimal solution≥3 points
For SOLVEPNP_IPPE_SQUARE, object points must be defined in this exact order:
  • point 0: [-squareLength/2, squareLength/2, 0]
  • point 1: [ squareLength/2, squareLength/2, 0]
  • point 2: [ squareLength/2, -squareLength/2, 0]
  • point 3: [-squareLength/2, -squareLength/2, 0]
Example:
// Define object points (e.g., 3D corners of a marker)
vector<Point3f> objectPoints = {
    {-0.05f,  0.05f, 0.0f},
    { 0.05f,  0.05f, 0.0f},
    { 0.05f, -0.05f, 0.0f},
    {-0.05f, -0.05f, 0.0f}
};

// Corresponding 2D image points (detected in image)
vector<Point2f> imagePoints = {
    {234.5f, 156.3f},
    {456.2f, 167.8f},
    {443.1f, 389.4f},
    {221.7f, 378.9f}
};

Mat cameraMatrix = ...; // 3x3 camera intrinsic matrix
Mat distCoeffs = ...;   // Distortion coefficients

Mat rvec, tvec;
bool success = solvePnP(objectPoints, imagePoints, 
                        cameraMatrix, distCoeffs,
                        rvec, tvec, false, SOLVEPNP_IPPE_SQUARE);

solvePnPRansac

Finds object pose from 3D-2D point correspondences using RANSAC to handle outliers.
bool cv::solvePnPRansac(
    InputArray objectPoints,
    InputArray imagePoints,
    InputArray cameraMatrix,
    InputArray distCoeffs,
    OutputArray rvec,
    OutputArray tvec,
    bool useExtrinsicGuess = false,
    int iterationsCount = 100,
    float reprojectionError = 8.0,
    double confidence = 0.99,
    OutputArray inliers = noArray(),
    int flags = SOLVEPNP_ITERATIVE
)
iterationsCount
int
default:"100"
Number of RANSAC iterations.
reprojectionError
float
default:"8.0"
Inlier threshold value in pixels. The maximum allowed distance between observed and computed point projections to consider it an inlier.
confidence
double
default:"0.99"
The probability that the algorithm produces a useful result (typically 0.99).
inliers
OutputArray
Output vector that contains indices of inliers in objectPoints and imagePoints.
Returns: True if a solution is found. This function estimates an object pose and is resistant to outliers using RANSAC. The algorithm:
  1. Randomly selects minimal subsets of points
  2. Estimates pose for each subset
  3. Counts inliers (points within reprojectionError threshold)
  4. Refines final pose using all inliers
Minimal Sample Sets:
  • Default method uses SOLVEPNP_EPNP for minimal sample estimation
  • If you choose SOLVEPNP_P3P or SOLVEPNP_AP3P, these methods are used
  • If exactly 4 input points, SOLVEPNP_P3P is automatically used
  • Final pose is refined using all inliers with the method specified in flags (unless P3P/AP3P, then EPNP is used)

USAC-based solvePnPRansac

Advanced robust estimation using USAC (Universal RANSAC) framework.
bool cv::solvePnPRansac(
    InputArray objectPoints,
    InputArray imagePoints,
    InputOutputArray cameraMatrix,
    InputArray distCoeffs,
    OutputArray rvec,
    OutputArray tvec,
    OutputArray inliers,
    const UsacParams& params = UsacParams()
)
USAC provides several advanced RANSAC variants with configurable parameters for better performance and accuracy.

solvePnPGeneric

Returns all possible solutions for pose estimation (multiple solutions from P3P methods).
int cv::solvePnPGeneric(
    InputArray objectPoints,
    InputArray imagePoints,
    InputArray cameraMatrix,
    InputArray distCoeffs,
    OutputArrayOfArrays rvecs,
    OutputArrayOfArrays tvecs,
    bool useExtrinsicGuess = false,
    SolvePnPMethod flags = SOLVEPNP_ITERATIVE,
    InputArray rvec = noArray(),
    InputArray tvec = noArray(),
    OutputArray reprojectionError = noArray()
)
rvecs
OutputArrayOfArrays
required
Vector of output rotation vectors. P3P methods return 0-4 solutions, SOLVEPNP_IPPE returns 2 solutions, others return 1 solution.
tvecs
OutputArrayOfArrays
required
Vector of output translation vectors corresponding to rvecs.
reprojectionError
OutputArray
Optional output array of reprojection error (RMSE) for each solution.
Returns: Number of solutions found.
P3P solutions are sorted by reprojection errors (lowest to highest).

solveP3P

Finds an object pose from 3 3D-2D point correspondences.
int cv::solveP3P(
    InputArray objectPoints,
    InputArray imagePoints,
    InputArray cameraMatrix,
    InputArray distCoeffs,
    OutputArrayOfArrays rvecs,
    OutputArrayOfArrays tvecs,
    int flags
)
objectPoints
InputArray
required
Array of object points, 3x3 1-channel or 1x3/3x1 3-channel. Exactly 3 points required.
imagePoints
InputArray
required
Array of corresponding image points, 3x2 1-channel or 1x3/3x1 2-channel. Exactly 3 points required.
flags
int
required
Method for solving P3P:
  • SOLVEPNP_P3P: Based on Ding et al. 2023
  • SOLVEPNP_AP3P: Based on Ke & Roumeliotis 2017
Returns: Number of solutions (0-4). Solutions are sorted by reprojection errors.

Pose Refinement

solvePnPRefineLM

Refines a pose using Levenberg-Marquardt optimization.
void cv::solvePnPRefineLM(
    InputArray objectPoints,
    InputArray imagePoints,
    InputArray cameraMatrix,
    InputArray distCoeffs,
    InputOutputArray rvec,
    InputOutputArray tvec,
    TermCriteria criteria = TermCriteria(TermCriteria::EPS + TermCriteria::COUNT, 20, FLT_EPSILON)
)
rvec
InputOutputArray
required
Input/Output rotation vector. Input values used as initial solution.
tvec
InputOutputArray
required
Input/Output translation vector. Input values used as initial solution.
criteria
TermCriteria
default:"TermCriteria(EPS+COUNT, 20, FLT_EPSILON)"
Termination criteria for the iterative optimization algorithm.
Minimizes projection error using Levenberg-Marquardt iterative minimization. Requires at least 3 object points and an initial pose estimate.

solvePnPRefineVVS

Refines a pose using Virtual Visual Servoing (VVS).
void cv::solvePnPRefineVVS(
    InputArray objectPoints,
    InputArray imagePoints,
    InputArray cameraMatrix,
    InputArray distCoeffs,
    InputOutputArray rvec,
    InputOutputArray tvec,
    TermCriteria criteria = TermCriteria(TermCriteria::EPS + TermCriteria::COUNT, 20, FLT_EPSILON),
    double VVSlambda = 1
)
VVSlambda
double
default:"1"
Gain for the virtual visual servoing control law, equivalent to the α gain in the Damped Gauss-Newton formulation.
Minimizes projection error using Virtual Visual Servoing scheme (Chaumette 2006, Marchand 2016).

Homography-Based Methods

findHomography

Finds a perspective transformation between two planes.
Mat cv::findHomography(
    InputArray srcPoints,
    InputArray dstPoints,
    int method = 0,
    double ransacReprojThreshold = 3,
    OutputArray mask = noArray(),
    const int maxIters = 2000,
    const double confidence = 0.995
)
srcPoints
InputArray
required
Coordinates of points in the original plane, CV_32FC2 or vector<Point2f>.
dstPoints
InputArray
required
Coordinates of points in the target plane, CV_32FC2 or vector<Point2f>.
method
int
default:"0"
Method for computing homography:
  • 0: Regular method using all points (least squares)
  • RANSAC (8): RANSAC-based robust method
  • LMEDS (4): Least-Median robust method
  • RHO (16): PROSAC-based robust method
ransacReprojThreshold
double
default:"3"
Maximum allowed reprojection error to treat a point pair as an inlier (pixels). Used in RANSAC and RHO methods.
mask
OutputArray
Optional output mask set by robust methods. Input mask values are ignored.
maxIters
int
default:"2000"
Maximum number of RANSAC iterations.
confidence
double
default:"0.995"
Confidence level, between 0 and 1.
Returns: The 3x3 homography matrix H such that:
s_i [x'_i, y'_i, 1]^T ≈ H [x_i, y_i, 1]^T
The function finds the perspective transformation between source and destination planes. Useful for:
  • Finding initial intrinsic and extrinsic matrices
  • Planar object tracking
  • Image rectification
If H cannot be estimated, an empty matrix is returned.

USAC-based findHomography

Mat cv::findHomography(
    InputArray srcPoints,
    InputArray dstPoints,
    OutputArray mask,
    const UsacParams& params
)
Uses USAC framework for robust homography estimation with configurable parameters.

Decomposition Methods

decomposeProjectionMatrix

Decomposes a projection matrix into rotation matrix and camera intrinsic matrix.
void cv::decomposeProjectionMatrix(
    InputArray projMatrix,
    OutputArray cameraMatrix,
    OutputArray rotMatrix,
    OutputArray transVect,
    OutputArray rotMatrixX = noArray(),
    OutputArray rotMatrixY = noArray(),
    OutputArray rotMatrixZ = noArray(),
    OutputArray eulerAngles = noArray()
)
projMatrix
InputArray
required
3x4 input projection matrix P.
cameraMatrix
OutputArray
required
Output 3x3 camera intrinsic matrix.
rotMatrix
OutputArray
required
Output 3x3 external rotation matrix R.
transVect
OutputArray
required
Output 4x1 translation vector T.
rotMatrixX
OutputArray
Optional 3x3 rotation matrix around x-axis.
rotMatrixY
OutputArray
Optional 3x3 rotation matrix around y-axis.
rotMatrixZ
OutputArray
Optional 3x3 rotation matrix around z-axis.
eulerAngles
OutputArray
Optional three-element vector containing three Euler angles of rotation in degrees.
Decomposes a projection matrix into calibration and rotation matrix and the position of a camera. Based on RQDecomp3x3.

RQDecomp3x3

Computes an RQ decomposition of 3x3 matrices.
Vec3d cv::RQDecomp3x3(
    InputArray src,
    OutputArray mtxR,
    OutputArray mtxQ,
    OutputArray Qx = noArray(),
    OutputArray Qy = noArray(),
    OutputArray Qz = noArray()
)
src
InputArray
required
3x3 input matrix.
mtxR
OutputArray
required
Output 3x3 upper-triangular matrix.
mtxQ
OutputArray
required
Output 3x3 orthogonal matrix.
Returns: Three Euler angles in degrees. Used in decomposeProjectionMatrix to decompose the left 3x3 submatrix of a projection matrix.

Helper Functions

projectPoints

Projects 3D points to an image plane.
void cv::projectPoints(
    InputArray objectPoints,
    InputArray rvec,
    InputArray tvec,
    InputArray cameraMatrix,
    InputArray distCoeffs,
    OutputArray imagePoints,
    OutputArray jacobian = noArray(),
    double aspectRatio = 0
)
objectPoints
InputArray
required
Array of object points in world coordinate frame, 3xN/Nx3 1-channel or 1xN/Nx1 3-channel.
rvec
InputArray
required
Rotation vector (Rodrigues) that performs change of basis from world to camera coordinate system.
tvec
InputArray
required
Translation vector.
imagePoints
OutputArray
required
Output array of image points in pixel coordinates, 1xN/Nx1 2-channel, or vector<Point2f>.
jacobian
OutputArray
Optional output 2Nx(10+numDistCoeffs) Jacobian matrix of derivatives of image points with respect to rotation, translation, focal lengths, principal point, and distortion coefficients.
aspectRatio
double
default:"0"
Optional fixed aspect ratio parameter. If not 0, the function assumes aspect ratio (fx/fy) is fixed.
Computes 2D projections of 3D points given intrinsic and extrinsic camera parameters. Used during optimization in calibrateCamera, solvePnP, and stereoCalibrate.

drawFrameAxes

Draws axes of the world/object coordinate system from pose estimation.
void cv::drawFrameAxes(
    InputOutputArray image,
    InputArray cameraMatrix,
    InputArray distCoeffs,
    InputArray rvec,
    InputArray tvec,
    float length,
    int thickness = 3
)
length
float
required
Length of the painted axes in the same unit as tvec (usually meters).
thickness
int
default:"3"
Line thickness of the painted axes.
Draws the world/object coordinate system axes w.r.t. the camera frame:
  • OX is drawn in red
  • OY is drawn in green
  • OZ is drawn in blue

See Also

  • Camera Calibration - calibrateCamera for obtaining camera intrinsics
  • Stereo Vision - Stereo calibration and rectification
  • OpenCV samples: plane_ar.py (planar augmented reality)