P$^{2}$Net: Patch-match and Plane-regularization for Unsupervised Indoor Depth Estimation

Paper title

P$^{2}$Net: Patch-match and Plane-regularization for Unsupervised Indoor Depth Estimation

Authors

Zehao Yu, Lei Jin, Shenghua Gao

Abstract

This paper tackles unsupervised depth estimation in indoor environments. The task is extremely challenging because indoor scenes contain large textureless regions. These regions can overwhelm the optimization process in the commonly used unsupervised depth estimation frameworks, which were proposed for outdoor environments. However, even when those regions are masked out, performance remains unsatisfactory. In this paper, we argue that the poor performance stems from non-discriminative point-based matching. To this end, we propose P$^2$Net. We first extract points with large local gradients and use the patch centered at each point as its representation. A multi-view consistency loss is then defined over patches. This operation significantly improves the robustness of network training. Furthermore, because the textureless regions in indoor scenes (e.g., wall, floor, roof, etc.) usually correspond to planar regions, we propose to leverage superpixels as a plane prior. We enforce that the predicted depth is well fitted by a plane within each superpixel. Extensive experiments on NYUv2 and ScanNet show that P$^2$Net outperforms existing approaches by a large margin. Code is available at https://github.com/svip-lab/Indoor-SfMLearner.
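The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the 3x3 patch size, the plane parameterization a^T X = 1, and the pinhole intrinsics matrix `K` are all assumptions made for illustration. The first two functions pick the pixels with the largest gradient magnitude and cut a small patch around each one. The third fits a plane by least squares to the back-projected points of one superpixel and measures how far the predicted depth deviates from that plane.

```python
import numpy as np

def select_high_gradient_points(gray, num_points=3000, border=2):
    """Return (row, col) coordinates of the pixels with the largest gradient magnitude."""
    gy, gx = np.gradient(gray.astype(np.float64))
    mag = np.hypot(gx, gy)
    # Exclude a border so that patches centred on a point stay inside the image.
    h, w = mag.shape
    valid = np.zeros_like(mag, dtype=bool)
    valid[border:h - border, border:w - border] = True
    mag[~valid] = -1.0
    flat = np.argsort(mag, axis=None)[::-1][:num_points]
    rows, cols = np.unravel_index(flat, mag.shape)
    return np.stack([rows, cols], axis=1)

def extract_patches(gray, points, half=1):
    """Cut a (2*half+1) x (2*half+1) patch around each point (requires border >= half)."""
    offsets = np.arange(-half, half + 1)
    dr, dc = np.meshgrid(offsets, offsets, indexing="ij")
    rows = points[:, 0, None, None] + dr
    cols = points[:, 1, None, None] + dc
    return gray[rows, cols]  # shape (N, 2*half+1, 2*half+1)

def planar_depth_residual(depth, K, mask):
    """Fit a plane a^T X = 1 to the 3D points of one superpixel (assumed form)
    and return the mean absolute gap between the depth and the plane's depth."""
    rows, cols = np.nonzero(mask)
    pix = np.stack([cols, rows, np.ones_like(rows)], axis=0).astype(np.float64)
    rays = np.linalg.inv(K) @ pix           # 3 x N viewing rays, K^{-1} p
    d = depth[rows, cols]
    pts = (rays * d).T                      # N x 3 back-projected points
    a, *_ = np.linalg.lstsq(pts, np.ones(len(pts)), rcond=None)
    plane_depth = 1.0 / (rays.T @ a)        # depth implied by the fitted plane
    return float(np.mean(np.abs(d - plane_depth)))
```

In a training setting the residual would be computed with differentiable tensor operations and summed over superpixels; the NumPy version above only shows the geometry.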
