Paper Title
How semantic and geometric information mutually reinforce each other in ToF object localization
Paper Authors
Paper Abstract
We propose a novel approach to localize a 3D object from the intensity and depth images provided by a Time-of-Flight (ToF) sensor. Our method uses two CNNs. The first one takes the raw depth and intensity images as input to segment the floor pixels, from which the extrinsic parameters of the camera are estimated. The second CNN is in charge of segmenting the object-of-interest. As its main innovation, it exploits the calibration estimated from the first CNN's prediction to represent the geometric depth information in a coordinate system attached to the ground, which is thus independent of the camera elevation. In practice, both the height of each pixel with respect to the ground and the orientation of the normals to the point cloud are provided as input to the second CNN. Given the segmentation predicted by the second CNN, the object is localized by aligning its point cloud with a reference model. Our experiments demonstrate that our two-step approach improves segmentation and localization accuracy by a significant margin, both compared to a conventional CNN architecture that ignores calibration and height maps, and compared to PointNet++.
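To make the ground-attached geometric representation concrete, here is a minimal sketch of how a ground plane could be fitted to the floor pixels predicted by the first CNN, and how each pixel's height above the ground could then be derived as input for the second CNN. This is not code from the paper: the NumPy-based helpers `backproject`, `fit_ground_plane`, and `height_map`, and the assumption of a pinhole ToF camera model, are our own illustrative choices.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Back-project a depth map (H, W) into a camera-frame point map (H, W, 3),
    assuming a pinhole camera with focal lengths fx, fy and principal point (cx, cy)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)

def fit_ground_plane(points):
    """Least-squares plane fit to (M, 3) points: returns unit normal n and offset d
    such that n . p + d = 0 on the plane."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    n = vt[-1]            # direction of smallest variance = plane normal
    d = -n @ centroid
    if d < 0:             # orient the normal so the camera (origin) has positive height
        n, d = -n, -d
    return n, d

def height_map(point_map, n, d):
    """Signed distance of every pixel's 3D point to the ground plane (H, W)."""
    return point_map @ n + d

# Usage (hypothetical variable names): floor_mask is the boolean floor
# segmentation from the first CNN; depth and the intrinsics come from the sensor.
#   points  = backproject(depth, fx, fy, cx, cy)
#   n, d    = fit_ground_plane(points[floor_mask])
#   heights = height_map(points, n, d)  # camera-elevation-independent input to the second CNN
```

Fitting the plane only on pixels the first CNN labels as floor is what ties the two networks together: the resulting height map stays the same regardless of how high the camera is mounted, which is the invariance the abstract attributes to the two-step design.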