Stereo Vision - OpenCV

Overview

Stereo vision uses two cameras to compute depth information by triangulation. OpenCV provides functions for:

Calibrating stereo camera systems
Computing rectification transformations
Stereo correspondence algorithms (StereoBM, StereoSGBM)
3D reconstruction from disparity maps

Stereo Calibration

stereoCalibrate

Calibrates a stereo camera setup by finding intrinsic parameters for each camera and extrinsic parameters between them.

double cv::stereoCalibrate(
    InputArrayOfArrays objectPoints,
    InputArrayOfArrays imagePoints1,
    InputArrayOfArrays imagePoints2,
    InputOutputArray cameraMatrix1,
    InputOutputArray distCoeffs1,
    InputOutputArray cameraMatrix2,
    InputOutputArray distCoeffs2,
    Size imageSize,
    OutputArray R,
    OutputArray T,
    OutputArray E,
    OutputArray F,
    int flags = CALIB_FIX_INTRINSIC,
    TermCriteria criteria = TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 30, 1e-6)
)

objectPoints

InputArrayOfArrays

required

Vector of vectors of calibration pattern points. Both cameras need to see the same object points. The structure is the same as in calibrateCamera.

imagePoints1

InputArrayOfArrays

required

Vector of vectors of projections of calibration pattern points observed by the first camera.

imagePoints2

InputArrayOfArrays

required

Vector of vectors of projections of calibration pattern points observed by the second camera.

cameraMatrix1

InputOutputArray

required

Input/output camera intrinsic matrix for the first camera.

distCoeffs1

InputOutputArray

required

Input/output vector of distortion coefficients for the first camera.

cameraMatrix2

InputOutputArray

required

Input/output camera intrinsic matrix for the second camera.

distCoeffs2

InputOutputArray

required

Input/output vector of distortion coefficients for the second camera.

imageSize

Size

required

Size of the image used only to initialize camera intrinsic matrices.

R

OutputArray

required

Output rotation matrix between the first and second camera coordinate systems. This matrix brings points from the first camera’s coordinate system to the second camera’s coordinate system.

T

OutputArray

required

Output translation vector between the coordinate systems of the cameras. Equivalent to the position of the first camera with respect to the second camera.

E

OutputArray

required

Output essential matrix.

F

OutputArray

required

Output fundamental matrix.

flags

int

default:"CALIB_FIX_INTRINSIC"

Different flags for stereo calibration (see Stereo Calibration Flags below).

criteria

TermCriteria

default:"TermCriteria(COUNT+EPS, 30, 1e-6)"

Termination criteria for the iterative optimization algorithm.

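Returns: The overall RMS re-projection error.

The function estimates the transformation between the two cameras. If (R1, T1) and (R2, T2) are the poses of the calibration pattern relative to the first and second camera, then R and T satisfy: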

R2 = R * R1
T2 = R * T1 + T

Optionally computes the essential matrix E:

E = [T]_x * R

where [T]_x is the skew-symmetric (cross-product) matrix of T. The function can also compute the fundamental matrix F:

F = cameraMatrix2^(-T) * E * cameraMatrix1^(-1)

Due to high dimensionality and noise, the function can diverge. If intrinsic parameters can be estimated with high accuracy for each camera individually (using calibrateCamera), it’s recommended to pass CALIB_FIX_INTRINSIC flag with the computed intrinsic parameters.
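The relation E = [T]_x * R given above can be checked numerically. The following is a minimal sketch in plain C++ with no OpenCV dependency; the `Vec3` and `Mat3` aliases and all helper names are illustrative and are not OpenCV types. It builds E from a given R and T and evaluates the epipolar residual x2^T * E * x1, which should vanish for a point observed by both cameras when normalized image coordinates are used.

```cpp
#include <array>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;

// Skew-symmetric matrix [t]_x, so that skew(t) * v equals the cross product t x v.
Mat3 skew(const Vec3& t) {
    return {{{0.0, -t[2], t[1]},
             {t[2], 0.0, -t[0]},
             {-t[1], t[0], 0.0}}};
}

// 3x3 matrix product a * b.
Mat3 mul(const Mat3& a, const Mat3& b) {
    Mat3 c{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// Matrix-vector product a * v.
Vec3 mul(const Mat3& a, const Vec3& v) {
    Vec3 r{};
    for (int i = 0; i < 3; ++i)
        for (int k = 0; k < 3; ++k)
            r[i] += a[i][k] * v[k];
    return r;
}

// E = [T]_x * R
Mat3 essentialFromRT(const Mat3& R, const Vec3& T) { return mul(skew(T), R); }

// Residual x2^T * E * x1 for normalized homogeneous image points (z = 1).
double epipolarResidual(const Mat3& E, const Vec3& x1, const Vec3& x2) {
    const Vec3 Ex1 = mul(E, x1);
    return x2[0] * Ex1[0] + x2[1] * Ex1[1] + x2[2] * Ex1[2];
}
```

A residual that stays far from zero for detected pattern corners, after undistortion and normalization, is one hint that R and T did not converge to a sensible solution.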

Extended Version

double cv::stereoCalibrate(
    InputArrayOfArrays objectPoints,
    InputArrayOfArrays imagePoints1,
    InputArrayOfArrays imagePoints2,
    InputOutputArray cameraMatrix1,
    InputOutputArray distCoeffs1,
    InputOutputArray cameraMatrix2,
    InputOutputArray distCoeffs2,
    Size imageSize,
    InputOutputArray R,
    InputOutputArray T,
    OutputArray E,
    OutputArray F,
    OutputArrayOfArrays rvecs,
    OutputArrayOfArrays tvecs,
    OutputArray perViewErrors,
    int flags = CALIB_FIX_INTRINSIC,
    TermCriteria criteria = TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 30, 1e-6)
)

rvecs

OutputArrayOfArrays

Output vector of rotation vectors (Rodrigues) estimated for each pattern view in the coordinate system of the first camera.

tvecs

OutputArrayOfArrays

Output vector of translation vectors estimated for each pattern view.

perViewErrors

OutputArray

Output vector of the RMS re-projection error estimated for each pattern view.

Stereo Calibration Flags

Flags control the calibration behavior (in addition to single camera flags):

Flag	Value	Description
`CALIB_FIX_INTRINSIC`	0x00100	Fix cameraMatrix1/2 and distCoeffs1/2 so that only R, T, E, F are estimated
`CALIB_USE_INTRINSIC_GUESS`	0x00001	Optimize some or all intrinsic parameters according to specified flags
`CALIB_USE_EXTRINSIC_GUESS`	1 << 22	R and T contain valid initial values that are optimized further
`CALIB_FIX_PRINCIPAL_POINT`	0x00004	Fix principal points during optimization
`CALIB_FIX_FOCAL_LENGTH`	0x00010	Fix fx and fy for both cameras
`CALIB_FIX_ASPECT_RATIO`	0x00002	Optimize fy, fix ratio fx/fy
`CALIB_SAME_FOCAL_LENGTH`	0x00200	Enforce fx^(0) = fx^(1) and fy^(0) = fy^(1)
`CALIB_ZERO_TANGENT_DIST`	0x00008	Set tangential distortion coefficients to zero for each camera
`CALIB_FIX_K1` … `CALIB_FIX_K6`	Various	Do not change corresponding radial distortion coefficient
`CALIB_RATIONAL_MODEL`	0x04000	Enable k4, k5, k6 coefficients (8 coefficients total)
`CALIB_THIN_PRISM_MODEL`	0x08000	Enable s1, s2, s3, s4 coefficients (12 coefficients total)
`CALIB_FIX_S1_S2_S3_S4`	0x10000	Thin prism distortion coefficients are not changed
`CALIB_TILTED_MODEL`	0x40000	Enable tauX and tauY coefficients (14 coefficients)
`CALIB_FIX_TAUX_TAUY`	0x80000	Tilted sensor model coefficients are not changed

It’s usually reasonable to restrict some parameters, e.g., pass CALIB_SAME_FOCAL_LENGTH and CALIB_ZERO_TANGENT_DIST flags.
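Flags are combined with bitwise OR. The sketch below copies a few values from the table above so that it compiles without OpenCV; real code would use the `cv::CALIB_*` constants from the OpenCV headers. The helper names `hasFlag` and `restrictedFlags` are hypothetical.

```cpp
// Values copied from the flag table; in real code use the cv::CALIB_* constants.
constexpr int CALIB_USE_INTRINSIC_GUESS = 0x00001;
constexpr int CALIB_ZERO_TANGENT_DIST   = 0x00008;
constexpr int CALIB_FIX_INTRINSIC       = 0x00100;
constexpr int CALIB_SAME_FOCAL_LENGTH   = 0x00200;

// Returns true when every bit of `flag` is set in `flags`.
constexpr bool hasFlag(int flags, int flag) { return (flags & flag) == flag; }

// The restriction suggested in the text: shared focal length, no tangential distortion.
constexpr int restrictedFlags() {
    return CALIB_SAME_FOCAL_LENGTH | CALIB_ZERO_TANGENT_DIST;
}

// Passing an explicit flag set replaces the default argument, so
// CALIB_FIX_INTRINSIC is no longer set unless it is OR-ed in again.
static_assert(!hasFlag(restrictedFlags(), CALIB_FIX_INTRINSIC),
              "intrinsics are optimized with this flag set");
```

To keep the individually calibrated intrinsics fixed, OR `CALIB_FIX_INTRINSIC` into the value explicitly.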

Stereo Rectification

stereoRectify

Computes rectification transforms for each head of a calibrated stereo camera.

void cv::stereoRectify(
    InputArray cameraMatrix1,
    InputArray distCoeffs1,
    InputArray cameraMatrix2,
    InputArray distCoeffs2,
    Size imageSize,
    InputArray R,
    InputArray T,
    OutputArray R1,
    OutputArray R2,
    OutputArray P1,
    OutputArray P2,
    OutputArray Q,
    int flags = CALIB_ZERO_DISPARITY,
    double alpha = -1,
    Size newImageSize = Size(),
    CV_OUT Rect* validPixROI1 = 0,
    CV_OUT Rect* validPixROI2 = 0
)

cameraMatrix1

InputArray

required

First camera intrinsic matrix.

distCoeffs1

InputArray

required

First camera distortion parameters.

cameraMatrix2

InputArray

required

Second camera intrinsic matrix.

distCoeffs2

InputArray

required

Second camera distortion parameters.

imageSize

Size

required

Size of the image used for stereo calibration.

R

InputArray

required

Rotation matrix from the coordinate system of the first camera to the second camera (from stereoCalibrate).

T

InputArray

required

Translation vector from the coordinate system of the first camera to the second camera (from stereoCalibrate).

R1

OutputArray

required

Output 3x3 rectification transform (rotation matrix) for the first camera. Performs change of basis from unrectified to rectified first camera’s coordinate system.

R2

OutputArray

required

Output 3x3 rectification transform (rotation matrix) for the second camera.

P1

OutputArray

required

Output 3x4 projection matrix in the new (rectified) coordinate systems for the first camera. Projects points given in the rectified first camera coordinate system into the rectified first camera’s image.

P2

OutputArray

required

Output 3x4 projection matrix in the new (rectified) coordinate systems for the second camera.

Q

OutputArray

required

Output 4x4 disparity-to-depth mapping matrix (see reprojectImageTo3D).

flags

int

default:"CALIB_ZERO_DISPARITY"

Operation flags:

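CALIB_ZERO_DISPARITY (0x00400): Makes the principal points of each camera have the same pixel coordinates in the rectified views. If the flag is not set, the function may still shift the images horizontally or vertically (depending on the orientation of the epipolar lines) to maximize the useful image area.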

alpha

double

default:"-1"

Free scaling parameter between 0 and 1:

alpha=0: Rectified images are zoomed and shifted so only valid pixels are visible (no black areas)
alpha=1: Rectified images are decimated and shifted so all pixels from original images are retained
-1: Default scaling

newImageSize

Size

default:"Size()"

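New image resolution after rectification. When (0,0) is passed (the default), it is set to the original imageSize. Setting it to a larger value can help preserve detail from the original image, especially when there is strong radial distortion.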

validPixROI1

Rect*

Optional output rectangle inside the rectified first image where all pixels are valid.

validPixROI2

Rect*

Optional output rectangle inside the rectified second image where all pixels are valid.

The function computes rotation matrices for each camera that make both camera image planes the same plane. This makes all epipolar lines parallel, which simplifies dense stereo correspondence.

Horizontal Stereo: For cameras shifted mainly along the x-axis, the projection matrices are:

P1 = [f  0  cx1  0   ]
     [0  f  cy   0   ]
     [0  0  1    0   ]

P2 = [f  0  cx2  Tx*f]
     [0  f  cy   0   ]
     [0  0  1    0   ]

Q  = [1  0  0    -cx1       ]
     [0  1  0    -cy        ]
     [0  0  0    f          ]
     [0  0  -1/Tx (cx1-cx2)/Tx]

where Tx is the horizontal shift between the cameras and cx1=cx2 if CALIB_ZERO_DISPARITY is set.

Vertical Stereo: For cameras shifted mainly along the y-axis:

P1 = [f  0  cx  0   ]
     [0  f  cy1 0   ]
     [0  0  1   0   ]

P2 = [f  0  cx  0   ]
     [0  f  cy2 Ty*f]
     [0  0  1   0   ]

Q  = [1  0  0    -cx        ]
     [0  1  0    -cy1       ]
     [0  0  0    f          ]
     [0  0  -1/Ty (cy1-cy2)/Ty]

The first three columns of P1 and P2 are the new “rectified” camera matrices. Pass these with R1 and R2 to initUndistortRectifyMap to initialize rectification maps.
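To make the horizontal-stereo Q matrix concrete, the sketch below applies it by hand to a single pixel. It is plain C++ with no OpenCV dependency, and `Point3` and `reprojectPixel` are illustrative names. The disparity d is assumed to be in pixels (already converted from any fixed-point representation), and Tx is assumed to be in the same units as the desired output coordinates.

```cpp
#include <cassert>

struct Point3 { double x, y, z; };

// Applies the horizontal-stereo Q matrix to pixel (x, y) with disparity d:
//   [X Y Z W]^T = Q * [x y d 1]^T, followed by division by W.
Point3 reprojectPixel(double x, double y, double d,
                      double f, double cx1, double cx2, double cy, double Tx) {
    const double X = x - cx1;                          // row 1 of Q
    const double Y = y - cy;                           // row 2 of Q
    const double Z = f;                                // row 3 of Q
    const double W = -d / Tx + (cx1 - cx2) / Tx;       // row 4 of Q
    assert(W != 0.0 && "this disparity maps to a point at infinity");
    return {X / W, Y / W, Z / W};
}
```

With f = 700, cx1 = cx2 = 320, cy = 240 and Tx = -0.12, a pixel at (420, 240) with disparity 42 gives W = 350 and therefore a depth of Z = 700 / 350 = 2. The negative Tx is an assumption about the rig: it is what a left-right pair with the left camera as the first camera typically yields.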

stereoRectifyUncalibrated

Computes a rectification transform for an uncalibrated stereo camera.

bool cv::stereoRectifyUncalibrated(
    InputArray points1,
    InputArray points2,
    InputArray F,
    Size imgSize,
    OutputArray H1,
    OutputArray H2,
    double threshold = 5
)

points1

InputArray

required

Array of feature points in the first image.

points2

InputArray

required

Corresponding points in the second image.

F

InputArray

required

Input fundamental matrix. Can be computed from the same point pairs using findFundamentalMat.

imgSize

Size

required

Size of the image.

H1

OutputArray

required

Output rectification homography matrix for the first image.

H2

OutputArray

required

Output rectification homography matrix for the second image.

threshold

double

default:"5"

Optional threshold to filter outliers. If >0, point pairs not complying with epipolar geometry are rejected. Otherwise all points are considered inliers.

Computes rectification transformations without knowing intrinsic parameters. Implements the algorithm from Hartley99.

The algorithm depends heavily on epipolar geometry. If the camera lenses have significant distortion, correct it before computing the fundamental matrix and calling this function.

Utility Functions

getOptimalNewCameraMatrix

Returns the new camera intrinsic matrix based on the free scaling parameter.

Mat cv::getOptimalNewCameraMatrix(
    InputArray cameraMatrix,
    InputArray distCoeffs,
    Size imageSize,
    double alpha,
    Size newImgSize = Size(),
    CV_OUT Rect* validPixROI = 0,
    bool centerPrincipalPoint = false
)

cameraMatrix

InputArray

required

Input camera intrinsic matrix.

distCoeffs

InputArray

required

Input vector of distortion coefficients. If NULL/empty, zero distortion is assumed.

imageSize

Size

required

Original image size.

alpha

double

required

Free scaling parameter between 0 (only valid pixels) and 1 (retain all source pixels). See stereoRectify for details.

newImgSize

Size

default:"Size()"

Image size after rectification. By default, set to imageSize.

validPixROI

Rect*

Optional output rectangle outlining all-good-pixels region in undistorted image.

centerPrincipalPoint

bool

default:"false"

Optional flag indicating whether the principal point should be at image center or chosen to best fit source image (determined by alpha).

Returns: New camera intrinsic matrix.

By varying the alpha parameter, you can retrieve only sensible pixels (alpha=0), keep all original pixels (alpha=1), or get something in between. When alpha>0, the undistorted result is likely to have black pixels corresponding to “virtual” pixels outside the captured distorted image.

rectify3Collinear

Computes rectification transforms for 3-head camera where all heads are on the same line.

float cv::rectify3Collinear(
    InputArray cameraMatrix1,
    InputArray distCoeffs1,
    InputArray cameraMatrix2,
    InputArray distCoeffs2,
    InputArray cameraMatrix3,
    InputArray distCoeffs3,
    InputArrayOfArrays imgpt1,
    InputArrayOfArrays imgpt3,
    Size imageSize,
    InputArray R12,
    InputArray T12,
    InputArray R13,
    InputArray T13,
    OutputArray R1,
    OutputArray R2,
    OutputArray R3,
    OutputArray P1,
    OutputArray P2,
    OutputArray P3,
    OutputArray Q,
    double alpha,
    Size newImgSize,
    CV_OUT Rect* roi1,
    CV_OUT Rect* roi2,
    int flags
)

Computes rectification transformations for tri-focal stereo camera systems with collinear arrangement.

Stereo Matching Classes

StereoBM

Class for computing stereo correspondence using the block matching algorithm.

class CV_EXPORTS_W StereoBM : public StereoMatcher
{
public:
    static Ptr<StereoBM> create(int numDisparities = 0, int blockSize = 21);
    
    // Parameters
    CV_WRAP virtual int getPreFilterType() const = 0;
    CV_WRAP virtual void setPreFilterType(int preFilterType) = 0;
    
    CV_WRAP virtual int getPreFilterSize() const = 0;
    CV_WRAP virtual void setPreFilterSize(int preFilterSize) = 0;
    
    CV_WRAP virtual int getPreFilterCap() const = 0;
    CV_WRAP virtual void setPreFilterCap(int preFilterCap) = 0;
    
    CV_WRAP virtual int getTextureThreshold() const = 0;
    CV_WRAP virtual void setTextureThreshold(int textureThreshold) = 0;
    
    CV_WRAP virtual int getUniquenessRatio() const = 0;
    CV_WRAP virtual void setUniquenessRatio(int uniquenessRatio) = 0;
    
    CV_WRAP virtual int getSmallerBlockSize() const = 0;
    CV_WRAP virtual void setSmallerBlockSize(int blockSize) = 0;
    
    CV_WRAP virtual Rect getROI1() const = 0;
    CV_WRAP virtual void setROI1(Rect roi1) = 0;
    
    CV_WRAP virtual Rect getROI2() const = 0;
    CV_WRAP virtual void setROI2(Rect roi2) = 0;
};

Key Parameters:

numDisparities

int

default:"0"

Maximum disparity minus minimum disparity. Must be divisible by 16. Typical value: 16, 32, 48, 64, etc.

blockSize

int

default:"21"

Matched block size. Must be an odd number in the range 5..255 and smaller than the image width and height. Typical values: 5-21. Larger blocks produce smoother but less detailed disparity maps.

preFilterType

int

Type of the prefilter:

PREFILTER_NORMALIZED_RESPONSE: Normalized response
PREFILTER_XSOBEL: Sobel prefilter

preFilterSize

int

Prefilter window size (5-255, must be odd).

preFilterCap

int

Truncation value for prefiltered image pixels (1-63).

textureThreshold

int

Minimum texture for disparity computation. Areas with low texture are filtered out.

uniquenessRatio

int

Margin in percentage by which best computed cost function value should “win” second best value. Typically 5-15.
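The constraints listed above are easy to violate when parameters come from a UI slider or a config file. The sketch below is one way to check them before they reach StereoBM. It is plain C++ with hypothetical helper names. The 5..255 block-size range is an assumption based on the range check StereoBM performs when computing a disparity map; the other ranges are the ones stated in the parameter descriptions.

```cpp
// Rounds a desired disparity search range up to the next positive multiple of 16.
int roundUpNumDisparities(int range) {
    if (range <= 0) return 16;
    return (range + 15) / 16 * 16;
}

// True when v is odd and lies in [lo, hi].
bool isOddInRange(int v, int lo, int hi) {
    return v % 2 == 1 && v >= lo && v <= hi;
}

// Checks the parameter ranges before the values are handed to StereoBM.
bool validStereoBMParams(int numDisparities, int blockSize,
                         int preFilterSize, int preFilterCap) {
    return numDisparities > 0 && numDisparities % 16 == 0
        && isOddInRange(blockSize, 5, 255)       // assumed range, see lead-in
        && isOddInRange(preFilterSize, 5, 255)   // "5-255, must be odd"
        && preFilterCap >= 1 && preFilterCap <= 63;
}
```

For instance, a desired search range of 50 pixels rounds up to numDisparities = 64.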

Example:

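// leftImage and rightImage must be rectified 8-bit single-channel images
Ptr<StereoBM> stereo = StereoBM::create(64, 15); // numDisparities, blockSize
stereo->setPreFilterCap(31);
stereo->setUniquenessRatio(10);

// The output is a CV_16S fixed-point disparity map (values scaled by 16)
Mat disparity;
stereo->compute(leftImage, rightImage, disparity);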

StereoSGBM

Class for computing stereo correspondence using Semi-Global Block Matching algorithm.

class CV_EXPORTS_W StereoSGBM : public StereoMatcher
{
public:
    enum {
        MODE_SGBM = 0,
        MODE_HH   = 1,
        MODE_SGBM_3WAY = 2,
        MODE_HH4  = 3
    };
    
    static Ptr<StereoSGBM> create(
        int minDisparity = 0,
        int numDisparities = 16,
        int blockSize = 3,
        int P1 = 0,
        int P2 = 0,
        int disp12MaxDiff = 0,
        int preFilterCap = 0,
        int uniquenessRatio = 0,
        int speckleWindowSize = 0,
        int speckleRange = 0,
        int mode = MODE_SGBM
    );
    
    CV_WRAP virtual int getPreFilterCap() const = 0;
    CV_WRAP virtual void setPreFilterCap(int preFilterCap) = 0;
    
    CV_WRAP virtual int getUniquenessRatio() const = 0;
    CV_WRAP virtual void setUniquenessRatio(int uniquenessRatio) = 0;
    
    CV_WRAP virtual int getP1() const = 0;
    CV_WRAP virtual void setP1(int P1) = 0;
    
    CV_WRAP virtual int getP2() const = 0;
    CV_WRAP virtual void setP2(int P2) = 0;
    
    CV_WRAP virtual int getMode() const = 0;
    CV_WRAP virtual void setMode(int mode) = 0;
};

Key Parameters:

minDisparity

int

default:"0"

Minimum possible disparity value. Typically 0, but can be adjusted.

numDisparities

int

default:"16"

Maximum disparity minus minimum disparity. Must be divisible by 16. Values: 16, 32, 48, 64, 96, 128, etc.

blockSize

int

default:"3"

Matched block size. Must be odd number ≥1. Values: 3, 5, 7, etc. SGBM works well with smaller blocks than BM.

P1

int

default:"0"

First parameter controlling disparity smoothness. Penalty for a disparity change of ±1 between neighboring pixels. A commonly recommended value is 8 * channels * blockSize^2.

P2

int

default:"0"

Second parameter controlling disparity smoothness. Penalty for a disparity change of more than 1 between neighboring pixels. A commonly recommended value is 32 * channels * blockSize^2. The algorithm requires P2 > P1.

disp12MaxDiff

int

default:"0"

Maximum allowed difference (in integer pixel units) in the left-right disparity check. Set to a non-positive value to disable the check.

preFilterCap

int

default:"0"

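Truncation value for prefiltered image pixels. The algorithm computes the x-derivative at each pixel and clips it to the interval [-preFilterCap, preFilterCap]. The create() default is 0; the example below uses 63.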

uniquenessRatio

int

default:"0"

Margin by which best cost function value should “win” second best. Typically 5-15.

speckleWindowSize

int

default:"0"

Maximum size of smooth disparity regions to consider noise speckles and invalidate. Set to 0 to disable. Typical: 50-200.

speckleRange

int

default:"0"

Maximum disparity variation within connected component. Typical: 1-2.

mode

int

default:"MODE_SGBM"

Algorithm mode:

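MODE_SGBM: Default single-pass variant that aggregates costs along 5 directions
MODE_HH: Full-scale two-pass dynamic programming variant (8 directions); consumes O(W*H*numDisparities) bytes of memory
MODE_SGBM_3WAY: Faster variant that aggregates costs along 3 directions
MODE_HH4: Variant that aggregates costs along 4 directions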
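P1 and P2 are usually derived from the block size and the number of image channels with the 8x and 32x formulas given in their descriptions. A minimal sketch of that arithmetic, in plain C++ with the hypothetical names `SgbmPenalties` and `recommendedPenalties`:

```cpp
#include <cassert>

struct SgbmPenalties { int p1; int p2; };

// Smoothness penalties from the formulas above:
//   P1 = 8 * channels * blockSize^2, P2 = 32 * channels * blockSize^2.
SgbmPenalties recommendedPenalties(int channels, int blockSize) {
    assert(channels > 0 && blockSize > 0 && blockSize % 2 == 1);
    const int base = channels * blockSize * blockSize;
    return {8 * base, 32 * base};
}
```

For a 3-channel image and blockSize = 5 this gives P1 = 600 and P2 = 2400, and P2 > P1 holds by construction.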

SGBM produces smoother and more accurate disparity maps than BM but is computationally more expensive.

Example:

Ptr<StereoSGBM> stereo = StereoSGBM::create(
    0,    // minDisparity
    96,   // numDisparities
    5,    // blockSize
    600,  // P1
    2400, // P2
    1,    // disp12MaxDiff
    63,   // preFilterCap
    10,   // uniquenessRatio
    100,  // speckleWindowSize
    32,   // speckleRange
    StereoSGBM::MODE_SGBM_3WAY
);

Mat disparity;
stereo->compute(leftImage, rightImage, disparity);

// Convert to float disparity
disparity.convertTo(disparity, CV_32F, 1.0/16.0);
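The 1.0/16.0 factor in the conversion above comes from the fixed-point output format: each raw 16-bit value carries 4 fractional bits. The sketch below shows the same conversion for a single sample in plain C++. The helper names are illustrative, and the invalid-pixel convention (raw value equal to (minDisparity - 1) * 16) is stated here as an assumption about the matcher's output.

```cpp
#include <cstdint>

constexpr int kDispScale = 16; // 4 fractional bits in the raw 16-bit output

// Converts one raw fixed-point disparity sample to pixels.
float rawToPixels(std::int16_t raw) {
    return static_cast<float>(raw) / kDispScale;
}

// Assumption: pixels without a valid match hold (minDisparity - 1) * 16,
// so any raw value below minDisparity * 16 is treated as invalid.
bool isInvalid(std::int16_t raw, int minDisparity) {
    return raw < minDisparity * kDispScale;
}
```

A raw value of 672 therefore corresponds to a disparity of 42 pixels. Filtering invalid samples before converting to depth avoids negative or near-infinite Z values.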

BM is faster and is the usual choice for real-time applications; SGBM generally produces better results, especially in weakly textured regions. Consider MODE_SGBM_3WAY when speed matters and MODE_HH when quality matters more than memory use.
