Paper Title

Fast-MVSNet: Sparse-to-Dense Multi-View Stereo With Learned Propagation and Gauss-Newton Refinement

Paper Authors

Zehao Yu, Shenghua Gao

Paper Abstract

Almost all previous deep learning-based multi-view stereo (MVS) approaches focus on improving reconstruction quality. Besides quality, efficiency is also a desirable feature for MVS in real scenarios. Towards this end, this paper presents Fast-MVSNet, a novel sparse-to-dense, coarse-to-fine framework for fast and accurate depth estimation in MVS. Specifically, in our Fast-MVSNet, we first construct a sparse cost volume for learning a sparse, high-resolution depth map. Then we leverage a small-scale convolutional neural network to encode the depth dependencies of pixels within a local region and densify the sparse high-resolution depth map. Finally, a simple but efficient Gauss-Newton layer is proposed to further optimize the depth map. On the one hand, the high-resolution depth map, the data-adaptive propagation method, and the Gauss-Newton layer jointly guarantee the effectiveness of our method. On the other hand, all modules in our Fast-MVSNet are lightweight and thus guarantee the efficiency of our approach. In addition, our approach is memory-friendly because of the sparse depth representation. Extensive experimental results show that our method is 5$\times$ and 14$\times$ faster than Point-MVSNet and R-MVSNet, respectively, while achieving comparable or even better results on the challenging Tanks and Temples dataset as well as the DTU dataset. Code is available at https://github.com/svip-lab/FastMVSNet.
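The Gauss-Newton layer is the most distinctive step of the pipeline, and it admits a compact closed form because the unknown at each pixel is a single scalar depth: the normal equations reduce to a per-pixel division. Below is a minimal PyTorch sketch of one such update. It is not the authors' implementation; warp_to_ref is an assumed helper that samples a source-view feature map at the reprojection of each reference pixel for a given depth map, and the Jacobian is approximated here by finite differences rather than derived analytically as in the paper.

import torch

def gauss_newton_step(ref_feat, src_feats, depth, warp_to_ref, eps=1e-2):
    """One Gauss-Newton update of a per-pixel depth map.

    ref_feat:  (C, H, W) reference-view feature map
    src_feats: list of (C, H, W) source-view feature maps
    depth:     (H, W) current depth estimate
    warp_to_ref(feat, depth): assumed helper returning the (C, H, W)
        source features warped into the reference view at `depth`
    """
    jtj = torch.zeros_like(depth)  # sum over views of J^T J, shape (H, W)
    jtr = torch.zeros_like(depth)  # sum over views of J^T r, shape (H, W)
    for feat in src_feats:
        warped = warp_to_ref(feat, depth)
        residual = warped - ref_feat  # photometric feature residual, (C, H, W)
        # Finite-difference stand-in for the analytic Jacobian
        # dF(p'(d))/dd derived in the paper.
        jac = (warp_to_ref(feat, depth + eps) - warped) / eps
        jtj += (jac * jac).sum(dim=0)
        jtr += (jac * residual).sum(dim=0)
    # Scalar normal equation per pixel: delta_d = -(J^T J)^{-1} J^T r.
    return depth - jtr / (jtj + 1e-6)

Because J^T J is a scalar per pixel, no matrix factorization is needed and the update is a single elementwise division, which is consistent with the abstract's claim that every module in the framework stays lightweight.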
