SolvePnP for Camera Pose - 【OpenCV 學習筆記】

OpenCV 的 SolvePnP 可以求解一物點 3D 物空間坐標與其對應像點 2D 像坐標的轉換關係，在某些條件已知時，這個轉換關係可以延伸為相機的 pose，使其成為一種定位定向方法，與攝影測量的後方交會有異曲同工之妙。

2023 年 7 月 31 日

SolvePnP 簡介

學生時期用過攝影測量的後方交會法 (resection)，基於共線式 (collinearity equation) 、已知的內方位參數( Interior Orientation )“與三個以上物空間坐標已知的特徵點，解算相機的外方位參數 ( Exterior Orientation )，做為人/車的 pose 位置與姿態，發展導航定位應用。事實上，這就是 SolvePnP 方法。

Photogrammetry 攝影測量 -【知識筆記】

攝影測量 Photogrammetry 介紹

共線式是基於共線條件的數學式，該條件是一個物點，與其影像上對應的像點，會和攝影中心一起，三點構成一條直線，如下圖。因此相機攝影中心、物點與其對應像點，雖然各自處於不同的坐標系統：Camera coordinate system、World coordinate system 與 Image coordinate system，但彼此間具備共線數學關係。此一關係可由同一特徵點的物空間坐標、像點坐標、相機內方位參數，以及相機外方位參數構成。相機內方位參數包含像主點 (影像中心點) 、焦距跟透鏡畸變參數；相機外方位參數包含相機攝影中心的物空間坐標與其姿態 (roll, pitch, yaw)。相關內容會在攝影測量篇介紹，本篇就不多加贅述。

https://docs.opencv.org/4.x/d5/d1f/calib3d_solvePnP.html

SolvePnP 基於類似的概念，使用本質矩陣 (Essential Matrix)，以矩陣運算方法，求解特徵點的物空間坐標，轉換到像坐標所需的旋轉與平移矩陣 (R & T Matrix)。詳細的說，這兩個矩陣可以把物點的物空間坐標，轉換到相機坐標系，或是再配合內方位參數，轉換到影像坐標系。 SolvePnP 核心公式即構建此一關係，如下圖標示，其中 u, v 表示像點坐標，f_x, f_y 為相機焦距，c_x, c_y 為像主點座標。 r₁₁~r₃₃ 是旋轉矩陣 R 或 rotation vector (rvec) 的元素，t_x, t_y, t_z 是平移向量 T 或 translation vector (tvec) 的元素，X_w, Y_w, Z_w 為物空間坐標。

將物空間坐標乘上 rvec 與 tvec 構成的矩陣，即轉換為相機坐標系 X_c, Y_c, Z_c ：

其他官方說明可以參考：

https://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html

SolvePnP方法的變種可以參考：

https://docs.opencv.org/4.x/d5/d1f/calib3d_solvePnP.html

SolvePnP 函數

如同前面的說明，OpenCV SolvePnP函數需要的輸入包含：

InputArray objectPoints：特徵點物空間坐標
InputArray imagePoints：特徵點像坐標
InputArray cameraMatrix：包含焦距與像主點坐標的矩陣
InputArray distCoeffs：透鏡畸變參數

原則上需要提供四組特徵點對，才能求解 rvec 與 tvec：

OutputArray rvec：物空間坐標至相機坐標的旋轉向量
OutputArray tvec：物空間坐標至相機坐標的平移向量

使用範例如下，須留意 objectPoints 與 imagePoints 內的坐標對應順序是否正確：

cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, raux, taux, false, PnPMethod)

該函數的其他 option 可以參考官網：

https://docs.opencv.org/3.4/d9/d0c/group__calib3d.html#ga549c2075fac14829ff4a58bc931c033d

求解相機 Pose

使用SolvePnP求解相機位姿的應用情境，首先該環境需要有已知物空間坐標的特徵物，再按以下步驟解算相機拍照時的位姿：

載入一事先儲存的參考影像 (reference)，該影像包含四個以上，已知物空間坐標的特徵點，例如某特定物件 A 的四個角點 coner points，並以此建立物空間坐標矩陣 InputArray objectPoints
拍攝一目標影像 (target) 包含特定物件 A，使用 ORB / SURF/ SIFT 等 detector，計算目標影像與參考影像的 keypoints 與其對應的 descriptors，ORB 範例代碼 (C++) 如下：

Ptr<ORB> detector = ORB::create();

/* Configure ORB detector parameters */
//decrease for more pyramid levels for small target
detector->setScaleFactor(ScaleFactor); 

//decrease for more features
detector->setEdgeThreshold(EdgeThreshold); 

//increase slide window size for small target
detector->setPatchSize(PatchSize); 

//increase for up-limitiation for pyramid levels for small target
detector->setNLevels(NLevels); 

//increase for scale-up image before this level
detector->setFirstLevel(FirstLevel); 

detector->detectAndCompute(img_reference, noArray(), keypoints_reference, descriptors_reference);
detector->detectAndCompute(img_target, noArray(), keypoints_target, descriptors_target);

OpenCV ORB 官方介紹：

https://docs.opencv.org/3.4/d1/d89/tutorial_py_orb.html

實做參考：

https://www.796t.com/content/1545506044.html

透過 FLANN 等匹配器，基於步驟 2 目標影像與參考影像的 keypoints descriptors，匹配目標影像與參考影像，並透過 Ratio test 挑選較佳的匹配點對放入 good_matches，找出相同的物件 A，範例代碼 (C++) 如下：

std::vector<DMatch> matches;
FlannBasedMatcher fbmatcher(new flann::LshIndexParams(20, 10, 2));
fbmatcher.match(descriptors_reference, descriptors_target, matches);

//Ratio test
double minDist = 1000;
double maxDist = 0;
for (int i = 0; i < descriptors_reference.rows; i++){
	double dist = matches[i].distance;
	if (dist < minDist){
		minDist = dist ;
	}
	if (dist > maxDist){
		maxDist = dist;
	}
}

for (int i = 0; i < descriptors_reference.rows; i++){
	double dist = matches[i].distance;
	if(dist < max(MinDisThres * minDist, 0.02)){
		good_matches.push_back(matches[i]);
	}
}

OpenCV Feature Matching 官方介紹：

https://docs.opencv.org/4.x/dc/dc3/tutorial_py_matcher.html

找出相同的物件 A 後，使用 OpenCV findHomography 找出參考影像與目標影像匹配點對 good_matches 間的透視投影關係 (Perspective Transformation)，再取出特定物件 A 在參考影像中四個角點的像坐標，經由 OpenCV perspectiveTransform，取得四個角點投影在目標影像上的像坐標，構成 InputArray imagePoints

std::vector<Point2f> ref;
std::vector<Point2f> tar;

for( size_t i = 0; i < good_matches.size(); i++ ){
    ref.push_back( keypoints_reference[ good_matches[i].queryIdx ].pt );
    tar.push_back( keypoints_target[ good_matches[i].trainIdx ].pt );
}
	
//larger than ransacReprojThreshold(1~10) considered as outlier  
Mat H = findHomography( ref, tar, RANSAC, ransacReprojThreshold); 
	
//Get the corners from the img_reference ( the object to be "detected" )
std::vector<Point2f> ref_corners(4);
ref_corners[0] = Point2f(0, 0);
ref_corners[1] = Point2f( (float)img_reference.cols, 0 );
ref_corners[2] = Point2f( (float)img_reference.cols, (float)img_reference.rows );
ref_corners[3] = Point2f( 0, (float)img_reference.rows );
	
//Perspective Transform to the img_target ( the object to be "detected" )
perspectiveTransform( ref_corners, tar_corners, H);

OpenCV find Homography 官方說明：

https://docs.opencv.org/3.4/d1/de0/tutorial_py_feature_homography.html

基於已知規格或 Calibration 報告，建立相機內方位參數矩陣 cameraMatrix 與 distCoeffs
此時SolvePnP所需的輸入皆已齊備，可以解算 rvec 與 tvec，並將 rvec 變換為 3D 物空間坐標至 2D 相機坐標的旋轉矩陣 rotMat：

//Solve PnP
cv::Mat rvec;
cv::Mat_<float> tvec;
cv::Mat raux, taux;
cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, raux, taux, false, PnPMethod);//imagePoints is tar_corners
		  
//Rotation vector to rotation matrix
raux.convertTo(rvec, CV_32F);   
taux.convertTo(tvec, CV_32F);  
cv::Mat_<float> rotMat(3, 3);
cv::Rodrigues(Rvec, rotMat);

如前所述，rotMat 與 tvec 可把物空間坐標系，轉換為相機坐標系，將 2D 相機坐標系的原點 (0, 0) 視為相機在相機坐標系中的位置，則對其逆轉換 -rotMat.t() * tvec 可得相機在 3D 物空間的坐標 P_oc ：

//Estimate pose
cv::Mat_<float> P_oc(3, 1);
Camera_world = -rotMat.t()*Tvec; //[Xw, Yw, Zw]

程式碼參考：

https://blog.csdn.net/cocoaqin/article/details/77848588

對 3D 物空間坐標至 2D 相機坐標的旋轉矩陣，使用 OpenCV 函數decomposeProjectionMatrix 取出 Euler angle 即為相機姿態：

cv::Vec3d eulerAngles;
getEulerAngles(rotMat,eulerAngles);
float yaw   = eulerAngles[1]; 
float pitch = eulerAngles[0];
float roll  = eulerAngles[2];

程式碼可參考下篇：

https://answers.opencv.org/question/16796/computing-attituderoll-pitch-yaw-from-solvepnp/?answer=52913#post-id-52913

decomposeProjectionMatrix 的官方說明：

https://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html#decomposeprojectionmatrix

姿態的坐標系統

留意 Camera coordinate system 與 World coodinate system 坐標軸定義不同，造成 roll, pitch, yaw 定義的差異。如下圖：

solvepnp — https://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html

從上圖可以得知，ROS 定義的相機坐標系統 X_C, Y_C, Z_C ，與下圖一般習用的導航或載具坐標系統不同。因此 decomposeProjectionMatrix 對 rotMat (3D 物空間坐標至 2D 相機坐標的旋轉矩陣) 解出的 eulerAngles 元素，是依據繞相機坐標系 X_C, Y_C, Z_C 排序，因此首位元素對應到 World coodinate system 是 pitch (圍繞 X_C 軸旋轉，等同 Y_V )，第二個元素是 yaw (圍繞 Y_C 軸旋轉，等同 Z_V) ，最後是 roll (圍繞 Z_C 軸旋轉，等同 X_V)。

下面連結文章以另一種方式說明這個關係：

https://www.twblogs.net/a/5db28793bd9eee310ee61b3a

參考文獻

坐標系統參考：https://stackoverflow.com/questions/44726404/camera-pose-from-solvepnp/camera-pose-from-solvepnp
相關源碼參考1：https://github.com/xukeqin/PnP/blob/master/pnp/main.cpp
相關源碼參考2：https://github.com/daviddoria/Examples/blob/master/c%2B%2B/OpenCV/SolvePNP/SolvePNP.cxx
SolvePnP精度分析：https://zhuanlan.zhihu.com/p/294787981
SolvePnP方法比較：https://www.cnblogs.com/pine-apple/p/15350400.html
SolvePnP實做參考：https://docs.opencv.org/3.4/dc/d2c/tutorial_real_time_pose.html
ORB 與 Image Resolution 的關係：https://stackoverflow.com/questions/52689296/is-orb-depending-on-image-resolution

IMU Inertial Measurement Unit – 【知識筆記】

https://mapostech.com/imu/

SolvePnP for Camera Pose – 【OpenCV 學習筆記】

SolvePnP 簡介

SolvePnP 函數

求解相機 Pose

姿態的坐標系統

參考文獻

相關

在〈SolvePnP for Camera Pose – 【OpenCV 學習筆記】〉中有 1 則留言