多视图几何与 运动恢复结构 章国锋 浙江大学 CAD&CG 国家重点实验室
视频场景重建的流程 运动恢复结构 深度恢复 三维重建
针孔相机模型 投影方程 : 齐次坐标表示 : Richard Hartley and Andrew Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, Second Edition 2004.
1 0 1 0 1 0 1 1 Z Y X f f Z fy fx 针孔相机模型 K [R t] Richard Hartley and Andrew Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, Second Edition 2004.
主点的偏移 1 0 1 0 0 ~ 1 / / 0 0 0 0 0 0 Z Y X y f x f Z Zy fy Zx fx y Z fy x Z fx Richard Hartley and Andrew Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, Second Edition 2004.
相机的外部参数 Richard Hartley and Andrew Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, Second Edition 2004.
透视相机模型 K f x x 0 f y cy 0 s 0 c 1 P K R t 11 DoF (5+3+3)
径向畸变 比如鱼眼镜头 : 数学模型 : R R 2 2 2 2 2 ( x, y) (1 K ( x y ) K ( x y )...) x 1 2 y (Marc Pollefeys)
径向畸变矫正例子 (Marc Pollefeys)
Multi-View Geometry Structure-from-Motion Automatically recover the camera parameters and 3D structure from multiple images or video sequences. Noah Snavely, Steven M. Seitz, Richard Szeliski. "Photo tourism: Exploring photo collections in 3D". 2016.
Two-View Geometry 3D???
Two-View Geometry 3D???
Two-View Geometry 3D: Epipolar Geometry
极线几何
基础矩阵 只跟两个视图的相对相机姿态和内参有关 F 是一个 3 3 秩为 2 的矩阵 Fe = 0 7 个自由度 最少 7 对匹配点就可以求解 F 七点法八点法 K [ t T 2 ] RK 1 1 OpenCV: cvfindfundamentalmat()
八点法求解基础矩阵 根据对极几何关系, 基本矩阵 F 满足 若设 那么对极几何关系又可以写作 : 若存在 n 对对应点,F 应满足如下的线性系统 :
八点法求解基础矩阵 f 为 9 维向量, 若要有解,rank(A) 至多为 8 在 rank(a) = 8 时,f 的方向是唯一的 通过至少 8 对对应点, 可恰好得到使 f 方向唯一的 A f 为 A 的右零空间的基向量, 可用 svd(a) 求得 真实数据存在噪音, 大于 8 组对应点得到的 A 满秩即 rank(a) = 9 此时同样可计算 (U,Σ,V) = svd(a) 令 f 为 V 中对应最小奇异值的列向量
多视图几何 投影函数
Structure from Motion Pipeline Feature Tracking Obtain a set of feature tracks Structure from Motion Solve the camera parameters and 3D points of tracks
图像特征 图像中显著 容易区分和匹配的内容 不变性 点角点线 : 直线, 曲线, 边 : 二维边, 三维边形状 : 长方形, 圆, 椭圆, 球, 纹理 视角不变 ( 尺度, 方向, 平移 ) 光照不变 物体变形 部分遮挡
Harris 角点检测 核心思想 : 统计图像梯度的分布 平滑区域 : 梯度不明显 边缘区域 : 梯度明显, 方向一致 角落区域 : 梯度明显, 方向不一致 方法 : 计算像素邻域的梯度二阶矩 计算上述矩阵的角点响应指标 对 R 进行阈值过滤和非极大值抑制
FAST 通过直接的阈值和判断来加速角点提取 考虑中心点周围的 16 个像素, 设中心点亮度为 p 如果有连续 n 个像素亮度都大于 p+t, 或者都小于 p-t ( 如图中的 14~16, 1 ~ 6) 检查 1 5 9 13 四个位置, 如果是角点, 四个位置中应当有三个满足上面的条件 速度快, 但对噪音不鲁棒 Edward Rosten, Tom Drummond. Machine Learning for High-Speed Corner Detection. ECCV (1) 2006: 430-443.
SIFT Scale-Invariant Feature Transform SIFT 通过在不同级别的图像 DoG 上寻找极大 / 极小值来确定特征的位置和对应的尺度, 后续的特征提取在与其尺度最邻近的图像 DoG 上进行 这使它有良好的尺度不变性 David G. Lowe.Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision 60(2): 91-110 (2004).
More Invariant Features SIFT 之后陆续出现了各种尺度不变特征描述量提取算法 如 RIFT GLOH SURF 等 其中 SURF 性能上接近 SIFT SURF 使用了 Haar 小波卷积替代 SIFT 中的高斯核 用积分图像进行了加速, 使得计算速度达到 SIFT 的 3~7 倍 ORB 由于其良好的匹配性能和极快的提取速度也得到了广泛使用
特征提取 精度高 SIFT SURF ORB 速度快 SIFT 极佳的尺度不变性, 能一定程度上适应视角变化和亮度变化 SURF 能够处理严重的图像模糊, 速度要高于 SIFT, 但精度不如 SIFT ORB 极快的提取速度, 在实时应用中常用来替代 SIFT 以上三种特征提取算法均在 OpenCV 中有实现
特征匹配 模板匹配直接在目标图像中寻找给定的图像块
特征匹配 在小运动假设下, 可以采用 KLT 跟踪方法 : I(x,y,t) I(x,y,t+1) 一个等式, 两个未知量
特征匹配 进一步假设 : 相邻像素运动一致 ( 单个像素 ) ( 邻域窗口 )
特征匹配 大运动情况下的匹配 通过比较特征描述量的距离进行匹配 SIFT = 128 维 SURF = 64 维 ORB = 256bits 暴力匹配 快速最近邻匹配 OpenCV 中提供了相应的匹配算法
Loopback Sequences and Multiple Sequences How to efficiently match the common features among different subsequences?
Non-Consecutive Feature Tracking
Framework Overview 1. Detect SIFT features over the entire sequence. 2. Consecutive point tracking: 2.1 Match features between consecutive frames with descriptor comparison. 2.2 Perform the second-pass matching to extend track lifetime. 3. Non-consecutive track matching: 3.1 Use hierachical k-means to cluster the constructed tracks. 3.2 Estimate the matching matrix with the grouped tracks. 3.3 Detect overlapping subsequences and join the matched tracks.
Two-Pass Matching for Consecutive Tracking SIFT Feature Extraction First-Pass Matching by Descriptor Comparison Global distinctive
Two-View Geometry 3D???
Two-View Geometry 3D: Epipolar Geometry
Not enough! How to handle image distortion? Naïve window-based matching becomes unreliable! How to give a good position initializaton? Whole line searching is still time-consuming and ambiguous with many potential correspondences.???
Second-Pass Matching by Planar Motion Segmentation Estimate a set of homographies Using inlier matches in first-pass matching frame t 1 2 H t H, t1 t, t1 frame t+1 Alignment 3 H t, t1 4 H t, t1
Second-Pass Matching by Planar Motion Segmentation Guided matching Epipolar constraint Homography constraint
Second-Pass Matching with Multi- Homographies First-Pass Matching (53 matches) Direct Searching (11 matches added) Our Second-Pass Matching (346 matches added)
Non-Consecutive track matching Fast Matching Matrix Estimation Detect overlapping subsequences and join the matched tracks.
Fast Matching Matrix Estimation Each track has a group of description vectors Track descriptor Use a hierarchical K-means approach to cluster the track descriptors
Fast Matching Matrix Estimation
Non-Consecutive Track Matching Simultaneously Match Images and Refine Matching Matrix Refine the matching matrix after matching the common features of the selected image pairs. More reliably find the best matching images with the updated matching matrix.
Traditional SfM Framework Feature tracking over whole sequence Structure & motion initialization Compute F between two initial images Compute P 1 and P 2 Triangulate 3D points of the matched features For each additional view Compute the camera pose Refine and extend 3D points Self-Calibration Upgrade the projective reconstruction to metric one. Refine structure and motion Bundle adjustment
三角化 已知 F, 计算 P 和 P 已知 x 和 x 计算 X: x= PX x'= P 'X Richard Hartley and Andrew Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, Second Edition 2004.
有噪声情况下的三角化 由于存在噪声, 反投到三维空间上的射线并不会严格相交 优化投影点到对应极线的距离 Richard Hartley and Andrew Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, Second Edition 2004.
线性三角化方法 给定方程 x= PX x'= P 'X p it 表示 P 的第 i 行. 写成矩阵和向量相乘的形式 直接解析求解. 没有几何意义 不是最优.
优化几何误差 Cost function 用 Levenberg-Marquart 算法求解
Knowing 3D points, Compute Camera Motion Compute Projection Matrix Decomposition for Metric Projection Matrix P K[ R t] [ KR Kt] [ M Kt] Decompose M into K, R by QR decomposition 1 t K p, p, p ) ( 14 24 34 T
Bundle Adjustment Definition Refining a visual reconstruction to produce jointly optimal 3D structure and viewing parameter (camera pose and/or calibration) estimates. B. Triggs, P. F. McLauchlan, R. I. Hartley, and A. W. Fitzgibbon. Bundle adjustment - a modern synthesis. In Workshop on Vision Algorithms, pages 298-372, 1999.
Geometric Ambiguities Projective Self-Calibration Metric Reconstruction Reconstruction Marc Pollefeys. Visual 3D Modeling from Images
Self-Calibration State-of-the-Art References R.I. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, second ed. Cambridge Univ. Press, 2004. M. Pollefeys, L.J. Van Gool, M. Vergauwen, F. Verbiest, K. Cornelis, J. Tops, and R. Koch, Visual Modeling with a Hand-Held Camera, Int l J. Computer Vision, vol. 59, no. 3, pp. 207-232, 2004. G. Zhang, X. Qin, W. Hua, T.-T. Wong, P.-A. Heng, and H. Bao, Robust Metric Reconstruction from Challenging Video Sequences, Proc. IEEE CS Conf. Computer Vision and Pattern Recognition, 2007.
推荐 SfM 开源系统 ENFT-SFM or LS-ACTS http://www.zjucvg.net/ls-acts/ls-acts.html OpenMVG https://github.com/openmvg/openmv VisualSFM http://ccwu.me/vsfm/