Paper Title
3DCTN: 3D Convolution-Transformer Network for Point Cloud Classification
Authors
Abstract
Accurate and fast point cloud classification is a fundamental task in 3D applications, but it is hard to achieve: point clouds are irregular and unordered, which makes it challenging to learn global discriminative features both effectively and efficiently. Recently, 3D Transformers have been adopted to improve point cloud processing. However, stacking many Transformer layers tends to incur large computational and memory costs. This paper presents a novel hierarchical framework that combines convolution with Transformers for point cloud classification, named the 3D Convolution-Transformer Network (3DCTN). It pairs the strong and efficient local feature learning of convolution with the global context modeling capability of Transformers. The method has two main modules that operate on downsampled point sets. Each module consists of a multi-scale local feature aggregating (LFA) block, implemented with graph convolution, and a global feature learning (GFL) block, implemented with a Transformer. We also investigate a series of Transformer variants in detail to find a better-performing configuration for our network. Experiments on ModelNet40 demonstrate that our method achieves state-of-the-art classification performance in terms of both accuracy and efficiency.
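The two-module layout described in the abstract can be illustrated with a small, weight-free sketch. This is not the authors' implementation. It assumes farthest point sampling for the downsampling step, a k-nearest-neighbor graph with max-pooled edge features as a stand-in for graph convolution, and single-head self-attention with identity projections as a stand-in for the Transformer block. None of these details are stated in the abstract. All function names, neighborhood sizes, and point counts are hypothetical.

```python
import math
import random


def farthest_point_sample(points, m):
    """Greedily pick m indices, each one farthest from the points already chosen.
    Assumption: the abstract does not say how the point sets are downsampled."""
    chosen = [0]
    dist = [math.dist(p, points[0]) for p in points]
    while len(chosen) < m:
        idx = max(range(len(points)), key=dist.__getitem__)
        chosen.append(idx)
        dist = [min(d, math.dist(p, points[idx])) for d, p in zip(dist, points)]
    return chosen


def local_feature_aggregation(points, feats, centers, ks):
    """LFA stand-in: for each center and each scale k, max-pool the edge
    features (neighbor minus center) over the k nearest neighbors, then
    concatenate the pooled vectors of all scales. No learned weights."""
    out = []
    for c in centers:
        order = sorted(range(len(points)),
                       key=lambda j: math.dist(points[c], points[j]))
        pooled = []
        for k in ks:
            edges = [[fj - fc for fj, fc in zip(feats[j], feats[c])]
                     for j in order[:k]]
            pooled += [max(col) for col in zip(*edges)]
        out.append(pooled)
    return out


def global_feature_learning(feats):
    """GFL stand-in: single-head scaled dot-product self-attention with
    identity query/key/value projections and a residual connection."""
    d = len(feats[0])
    out = []
    for q in feats:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in feats]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        attended = [sum(w * v[i] for w, v in zip(weights, feats)) / total
                    for i in range(d)]
        out.append([x + a for x, a in zip(q, attended)])
    return out


def module(points, feats, m, ks):
    """One module: downsample, aggregate local features, then mix globally."""
    centers = farthest_point_sample(points, m)
    local = local_feature_aggregation(points, feats, centers, ks)
    return [points[i] for i in centers], global_feature_learning(local)


random.seed(0)
cloud = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(64)]
pts1, f1 = module(cloud, cloud, m=16, ks=(4, 8))  # xyz coordinates as input features
pts2, f2 = module(pts1, f1, m=4, ks=(2, 4))
global_descriptor = [max(col) for col in zip(*f2)]  # max-pool over remaining points
print(len(pts1), len(f1[0]), len(pts2), len(f2[0]), len(global_descriptor))
```

In this sketch the feature width doubles in each module because two scales are concatenated (3 to 6 to 12), and a final max-pool over the remaining points gives a single descriptor that a classifier head could consume. A real network would insert learned linear layers at every stage.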