Paper Title
Focal Sparse Convolutional Networks for 3D Object Detection
Paper Authors
Paper Abstract
Non-uniform 3D sparse data, e.g., point clouds or voxels at different spatial positions, contribute to the task of 3D object detection in different ways. Existing basic components in sparse convolutional networks (Sparse CNNs) process all sparse data equally, regardless of whether regular or submanifold sparse convolution is used. In this paper, we introduce two new modules to enhance the capability of Sparse CNNs, both of which are based on making feature sparsity learnable with position-wise importance prediction. They are focal sparse convolution (Focals Conv) and its multi-modal variant, focal sparse convolution with fusion (Focals Conv-F for short). The new modules can readily substitute for their plain counterparts in existing Sparse CNNs and be jointly trained in an end-to-end fashion. For the first time, we show that spatially learnable sparsity in sparse convolution is essential for sophisticated 3D object detection. Extensive experiments on the KITTI, nuScenes and Waymo benchmarks validate the effectiveness of our approach. Without bells and whistles, our results outperform all existing single-model entries on the nuScenes test benchmark at the time of paper submission. Code and models are available at https://github.com/dvlab-research/FocalsConv.
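The core idea of the abstract — making feature sparsity learnable via position-wise importance prediction — can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy, not the paper's implementation: the linear importance predictor, the sigmoid threshold `tau`, and the function name `focal_sparsify` are all illustrative; the actual Focals Conv modules operate inside a sparse CNN and also control how output positions dilate, rather than merely dropping input voxels.

```python
import numpy as np

def focal_sparsify(features, weights, bias=0.0, tau=0.5):
    """Toy position-wise importance gating over sparse voxel features.

    features : (N, C) array, one feature vector per active voxel.
    weights  : (C,) linear importance predictor (hypothetical stand-in
               for the paper's learned convolutional branch).
    tau      : importance threshold; voxels below it are pruned.

    Returns the kept features and the boolean keep-mask, so sparsity
    becomes a function of the data instead of a fixed pattern.
    """
    logits = features @ weights + bias          # (N,) importance scores
    importance = 1.0 / (1.0 + np.exp(-logits))  # sigmoid to [0, 1]
    mask = importance > tau                     # position-wise keep decision
    return features[mask], mask

# Toy usage: 6 active voxels with 4-channel features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 4))
w = rng.normal(size=4)
kept, mask = focal_sparsify(feats, w)
```

In a real Sparse CNN the mask would be trained end-to-end (e.g., with a straight-through or soft gating scheme) so that the network learns which spatial positions matter for detection.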