RendNet: Unified 2D/3D Recognizer With Latent Space Rendering

Paper Title

RendNet: Unified 2D/3D Recognizer With Latent Space Rendering

Authors

Ruoxi Shi, Xinyang Jiang, Caihua Shan, Yansen Wang, Dongsheng Li

Abstract

Vector graphics (VG) are ubiquitous in daily life, with wide application in engineering, architecture, design, and other fields. Most existing methods recognize VG by first rendering it into raster graphics (RG) and then performing recognition on the RG format. However, this procedure discards the geometric structure and loses the high resolution of VG. Recently, another category of algorithms has been proposed that recognizes directly from the original VG format, but these methods are affected by topological errors that RG rendering can filter out. Instead of relying on a single format, a good solution is to use the VG and RG formats together to avoid these shortcomings. We further argue that the VG-to-RG rendering process is essential for combining VG and RG information effectively: by specifying the rules for how VG primitives are transferred to RG pixels, the rendering process captures the interaction and correlation between VG and RG. We therefore propose RendNet, a unified architecture for recognition in both 2D and 3D scenarios, which considers both VG and RG representations and exploits their interaction by incorporating the VG-to-RG rasterization process. Experiments show that RendNet achieves state-of-the-art performance on 2D and 3D object recognition tasks on various VG datasets.
