Domain-Adaptive Self-Supervised Pre-Training for Face & Body Detection in Drawings

Paper Title

Domain-Adaptive Self-Supervised Pre-Training for Face & Body Detection in Drawings

Authors

Barış Batuhan Topal, Deniz Yuret, Tevfik Metin Sezgin

Abstract

Drawings are a powerful means of pictorial abstraction and communication. Understanding diverse forms of drawings, including digital arts, cartoons, and comics, has been a major problem of interest for the computer vision and computer graphics communities. Although there are large amounts of digitized drawings from comic books and cartoons, they contain vast stylistic variations, which necessitate expensive manual labeling for training domain-specific recognizers. In this work, we show how self-supervised learning, based on a teacher-student network with a modified student network update design, can be used to build face and body detectors. Our setup allows exploiting large amounts of unlabeled data from the target domain when labels are provided for only a small subset of it. We further demonstrate that style transfer can be incorporated into our learning pipeline to bootstrap detectors using a vast number of labeled out-of-domain natural images (i.e., images from the real world). Our combined architecture yields detectors with state-of-the-art (SOTA) and near-SOTA performance using minimal annotation effort. Our code can be accessed from https://github.com/barisbatuhan/DASS_Detector.
