Title

Few-Shot Object Detection in Unseen Domains

Authors

Karim Guirguis, George Eskandar, Matthias Kayser, Bin Yang, Juergen Beyerer

Abstract

Few-shot object detection (FSOD) has thrived in recent years to learn novel object classes with limited data by transferring knowledge gained on abundant base classes. FSOD approaches commonly assume that both the scarcely provided examples of novel classes and the test-time data belong to the same domain. However, this assumption does not hold in various industrial and robotics applications, where a model can learn novel classes from a source domain while inferring on classes from a target domain. In this work, we address the task of zero-shot domain adaptation, also known as domain generalization, for FSOD. Specifically, we assume that neither images nor labels of the novel classes in the target domain are available during training. Our approach to bridging the domain gap is two-fold. First, we leverage a meta-training paradigm, where we learn the domain shift on the base classes and then transfer the domain knowledge to the novel classes. Second, we propose various data augmentation techniques on the few shots of novel classes to account for all possible domain-specific information. To constrain the network to encode only domain-agnostic, class-specific representations, a contrastive loss is proposed that maximizes the mutual information between foreground proposals and class embeddings and reduces the network's bias toward background information from the target domain. Our experiments on the T-LESS, PASCAL-VOC, and ExDark datasets show that the proposed approach considerably alleviates the domain gap without utilizing labels or images of novel categories from the target domain.

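The abstract describes a contrastive loss that aligns foreground proposal features with class embeddings to keep the learned representations class-specific rather than domain-specific. Below is a minimal sketch of such an objective, not the authors' implementation: an InfoNCE-style loss in PyTorch where each proposal's own class embedding is the positive and all other class embeddings act as negatives. The feature dimension, the temperature value, and the use of cosine similarity are assumptions made for illustration.

```python
# Minimal sketch (not the paper's code) of a proposal-to-class-embedding
# contrastive loss, as described in the abstract. Shapes, temperature, and
# cosine similarity are illustrative assumptions.
import torch
import torch.nn.functional as F

def proposal_class_contrastive_loss(proposal_feats, class_embeddings, labels, temperature=0.1):
    """InfoNCE-style loss between foreground proposals and class embeddings.

    proposal_feats:   (N, D) features of foreground proposals
    class_embeddings: (C, D) one embedding per class
    labels:           (N,) ground-truth class index of each proposal
    """
    # L2-normalize so the dot product becomes a cosine similarity
    p = F.normalize(proposal_feats, dim=-1)
    c = F.normalize(class_embeddings, dim=-1)

    # Similarity of every proposal to every class embedding, scaled by temperature
    logits = p @ c.t() / temperature  # (N, C)

    # Cross-entropy with the true class as the positive is a standard InfoNCE-style
    # lower bound on the mutual information between proposals and class embeddings.
    return F.cross_entropy(logits, labels)

# Example usage with random tensors
feats = torch.randn(8, 256)            # 8 foreground proposal features
embeds = torch.randn(5, 256)           # 5 class embeddings
labels = torch.randint(0, 5, (8,))     # class index for each proposal
loss = proposal_class_contrastive_loss(feats, embeds, labels)
```

Minimizing this loss pulls each foreground proposal toward its class embedding and away from the other classes, which is one common way to encourage the domain-agnostic, class-specific representations the abstract refers to.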