Paper Title

Few-Shot Single-View 3-D Object Reconstruction with Compositional Priors

Authors

Michalkiewicz, Mateusz, Parisot, Sarah, Tsogkas, Stavros, Baktashmotlagh, Mahsa, Eriksson, Anders, Belilovsky, Eugene

Abstract

The impressive performance of deep convolutional neural networks in single-view 3D reconstruction suggests that these models perform non-trivial reasoning about the 3D structure of the output space. However, recent work has challenged this belief, showing that complex encoder-decoder architectures perform similarly to nearest-neighbor baselines or simple linear decoder models that exploit large amounts of per-category data on standard benchmarks. On the other hand, settings where 3D shape must be inferred for new categories from only a few examples are more natural and require models that generalize across shapes. In this work we demonstrate experimentally that naive baselines fail when the goal is to reconstruct novel objects from very few examples, and that in a few-shot learning setting the network must learn concepts that can be applied to new categories, avoiding rote memorization. To address deficiencies in existing approaches to this problem, we propose three approaches that efficiently integrate a class prior into a 3D reconstruction model, allowing it to account for intra-class variability and imposing an implicit compositional structure that the model should learn. Experiments on the popular ShapeNet database demonstrate that our approaches significantly outperform existing baselines on this task in the few-shot setting.
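The abstract describes integrating a class prior into a single-view 3D reconstruction model. As an illustration only (this is not the paper's architecture), the PyTorch sketch below shows one simple way such conditioning could work: a learned per-category embedding is concatenated with the image encoding before decoding to a voxel occupancy grid. All module names, layer sizes, and the 32^3 voxel resolution are assumptions made for the sketch.

```python
# Minimal sketch (not the paper's method): a single-view 3D reconstruction
# model whose voxel decoder is conditioned on a learned class-prior embedding.
# Layer sizes, names, and the 32^3 output resolution are illustrative assumptions.
import torch
import torch.nn as nn

class PriorConditionedReconstructor(nn.Module):
    def __init__(self, num_classes: int, prior_dim: int = 64, latent_dim: int = 256):
        super().__init__()
        # Image encoder: 128x128 RGB image -> latent vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 64x64
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 32x32
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(), # 16x16
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, latent_dim),
        )
        # Class prior: one learned embedding per category, meant to capture
        # shape structure shared within that category.
        self.class_prior = nn.Embedding(num_classes, prior_dim)
        # Voxel decoder: (image latent + class prior) -> 32^3 occupancy grid.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + prior_dim, 512), nn.ReLU(),
            nn.Linear(512, 32 * 32 * 32),
        )

    def forward(self, image: torch.Tensor, class_id: torch.Tensor) -> torch.Tensor:
        z_img = self.encoder(image)                    # (B, latent_dim)
        z_prior = self.class_prior(class_id)           # (B, prior_dim)
        z = torch.cat([z_img, z_prior], dim=1)         # condition decoder on the prior
        logits = self.decoder(z).view(-1, 32, 32, 32)  # occupancy logits
        return torch.sigmoid(logits)

# Example usage with random data (13 is the common ShapeNet category count).
model = PriorConditionedReconstructor(num_classes=13)
images = torch.randn(4, 3, 128, 128)
class_ids = torch.randint(0, 13, (4,))
voxels = model(images, class_ids)  # (4, 32, 32, 32) occupancy probabilities
```

In a few-shot setting, the prior embedding for a novel category would have to be estimated from the handful of available examples rather than trained from scratch; the sketch only illustrates the conditioning mechanism itself.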
