Paper Title

DytanVO: Joint Refinement of Visual Odometry and Motion Segmentation in Dynamic Environments

Paper Authors

Shihao Shen, Yilin Cai, Wenshan Wang, Sebastian Scherer

Paper Abstract

Learning-based visual odometry (VO) algorithms achieve remarkable performance on common static scenes, benefiting from high-capacity models and massive annotated data, but tend to fail in dynamic, populated environments. Semantic segmentation is widely used to discard dynamic associations before estimating camera motion, but it comes at the cost of discarding static features and is hard to scale to unseen categories. In this paper, we leverage the mutual dependence between camera ego-motion and motion segmentation and show that both can be jointly refined in a single learning-based framework. In particular, we present DytanVO, the first supervised learning-based VO method that deals with dynamic environments. It takes two consecutive monocular frames in real time and predicts camera ego-motion in an iterative fashion. Our method achieves an average improvement of 27.7% in ATE over state-of-the-art VO solutions in real-world dynamic environments, and even performs competitively among dynamic visual SLAM systems that optimize the trajectory on the backend. Experiments on plentiful unseen environments also demonstrate our method's generalizability.
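The abstract describes an iterative scheme in which ego-motion estimation and motion segmentation take turns refining each other. The sketch below illustrates that loop in minimal form; it is not DytanVO's actual implementation, and the module names (`matching_net`, `motion_net`, `seg_net`) and the fixed iteration count are assumptions introduced for illustration only.

```python
# A minimal sketch of the joint-refinement idea from the abstract.
# Assumptions (not from the paper's code): three callable sub-networks and a
# fixed number of refinement iterations.
import torch

class JointRefinementVO:
    def __init__(self, matching_net, motion_net, seg_net, num_iters=3):
        self.matching_net = matching_net  # dense flow between the two frames
        self.motion_net = motion_net      # camera ego-motion from masked flow
        self.seg_net = seg_net            # motion segmentation given ego-motion
        self.num_iters = num_iters

    @torch.no_grad()
    def estimate(self, frame_prev, frame_curr):
        # flow: (B, 2, H, W) correspondence field between consecutive frames.
        flow = self.matching_net(frame_prev, frame_curr)
        # Start with an empty mask: every pixel is assumed static at first.
        mask = torch.zeros_like(flow[:, :1])
        for _ in range(self.num_iters):
            # 1) Estimate ego-motion from flow, suppressing dynamic pixels.
            pose = self.motion_net(flow * (1.0 - mask))
            # 2) Re-segment moving regions given the current ego-motion:
            #    flow not explained by camera motion marks dynamic pixels.
            mask = self.seg_net(flow, pose)
        return pose, mask
```

The fixed iteration count here is a simplification; a convergence test on successive pose estimates would be a natural alternative stopping criterion.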
