Paper Title
Towards Reliable Image Outpainting: Learning Structure-Aware Multimodal Fusion with Depth Guidance
Paper Authors
Paper Abstract
Image outpainting technology generates visually plausible content regardless of authenticity, making it unreliable for practical applications. We therefore propose a reliable image outpainting task that introduces sparse depth from LiDARs to extrapolate authentic RGB scenes. The large field of view of LiDARs allows them to serve data augmentation and further multimodal tasks. Concretely, we propose a Depth-Guided Outpainting Network to model the distinct feature representations of the two modalities and learn structure-aware cross-modal fusion. Two components are designed: 1) the Multimodal Learning Module produces distinct depth and RGB feature representations from the perspective of each modality's characteristics; 2) the Depth Guidance Fusion Module leverages the complete depth modality to guide the establishment of RGB content through progressive multimodal feature fusion. Furthermore, we design an additional constraint strategy consisting of a Cross-modal Loss and an Edge Loss to sharpen ambiguous contours and expedite reliable content generation. Extensive experiments on the KITTI and Waymo datasets demonstrate our superiority over state-of-the-art methods, both quantitatively and qualitatively.
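The abstract describes depth features guiding the construction of RGB content through progressive multimodal fusion, trained with a combined constraint strategy. Below is a minimal PyTorch-style sketch of that idea, assuming a per-pixel gating design; all module names, channel widths, and loss weights are hypothetical illustrations, not the authors' implementation.

import torch
import torch.nn as nn

class DepthGuidedFusionBlock(nn.Module):
    """Fuse RGB and depth features at one scale; depth guides RGB via a learned gate."""
    def __init__(self, channels: int):
        super().__init__()
        # Project the concatenated features to a per-pixel gate in [0, 1].
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )
        self.refine = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, rgb_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        g = self.gate(torch.cat([rgb_feat, depth_feat], dim=1))
        # Depth-guided blend: where the gate is high, follow depth structure.
        fused = g * depth_feat + (1.0 - g) * rgb_feat
        return self.refine(fused)

# Progressive fusion over three scales (channel widths are arbitrary here).
blocks = nn.ModuleList(DepthGuidedFusionBlock(c) for c in (64, 128, 256))

rgb = torch.randn(1, 64, 32, 32)
depth = torch.randn(1, 64, 32, 32)
out = blocks[0](rgb, depth)  # -> shape (1, 64, 32, 32)

# Hypothetical total objective mirroring the abstract's constraint strategy:
# L_total = L_reconstruction + lambda_cm * L_cross_modal + lambda_edge * L_edge

The gating design is one plausible reading of "depth guides RGB": it lets the network lean on the complete depth modality exactly where the extrapolated RGB features are ambiguous, which matches the paper's stated goal of enhancing unclear contours.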