Title
"What's This?" -- Learning to Segment Unknown Objects from Manipulation Sequences
Authors
Abstract
We present a novel framework for self-supervised grasped object segmentation with a robotic manipulator. Our method successively learns an agnostic foreground segmentation followed by a distinction between manipulator and object solely by observing the motion between consecutive RGB frames. In contrast to previous approaches, we propose a single, end-to-end trainable architecture which jointly incorporates motion cues and semantic knowledge. Furthermore, while the motion of the manipulator and the object provides substantial cues for our algorithm, we present means to robustly deal with distractor objects moving in the background, as well as with completely static scenes. Our method neither depends on any visual registration of a kinematic robot or 3D object models, nor on precise hand-eye calibration or any additional sensor data. Through extensive experimental evaluation we demonstrate the superiority of our framework and provide detailed insights into its capability of dealing with the aforementioned extreme cases of motion. We also show that training a semantic segmentation network with the automatically labeled data achieves results on par with manually annotated training data. Code and pretrained models are available at https://github.com/DLR-RM/DistinctNet.
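The abstract's central cue is motion between consecutive RGB frames. The paper learns this signal with an end-to-end trainable network; purely as a toy illustration of the raw motion cue (not the authors' method), a naive frame-differencing foreground mask could be sketched as follows, where the function name and threshold are illustrative assumptions:

```python
import numpy as np

def motion_foreground_mask(frame_a, frame_b, threshold=25):
    """Toy motion cue: mark pixels whose color changed between two
    consecutive RGB frames (uint8 arrays of shape H x W x 3).

    This simple differencing stands in for the learned, agnostic
    foreground segmentation described in the abstract; it cannot by
    itself distinguish manipulator from object, handle distractors,
    or cope with static scenes.
    """
    # Cast to a signed type so the subtraction does not wrap around.
    diff = np.abs(frame_a.astype(np.int16) - frame_b.astype(np.int16))
    # A pixel counts as "moving" if any channel changed enough.
    return diff.max(axis=-1) > threshold

# Minimal usage: a static background with one moving pixel.
prev = np.zeros((4, 4, 3), dtype=np.uint8)
curr = prev.copy()
curr[1, 1] = 200  # simulate motion at pixel (1, 1)
mask = motion_foreground_mask(prev, curr)
```

On this toy input, only the changed pixel is flagged; the learned model in the paper additionally separates the moving region into manipulator and grasped object.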