Paper Title

Attribute Prototype Network for Any-Shot Learning

Authors

Wenjia Xu, Yongqin Xian, Jiuniu Wang, Bernt Schiele, Zeynep Akata

Abstract

Any-shot image classification aims to recognize novel classes with only a few or even zero samples. For the task of zero-shot learning, visual attributes have been shown to play an important role, while in the few-shot regime the effect of attributes is under-explored. To better transfer attribute-based knowledge from seen to unseen classes, we argue that an image representation with integrated attribute localization ability would be beneficial for any-shot, i.e. zero-shot and few-shot, image classification tasks. To this end, we propose a novel representation learning framework that jointly learns discriminative global and local features using only class-level attributes. While a visual-semantic embedding layer learns global features, local features are learned through an attribute prototype network that simultaneously regresses and decorrelates attributes from intermediate features. Furthermore, we introduce a zoom-in module that localizes and crops informative regions to encourage the network to explicitly learn informative features. We show that our locality-augmented image representations achieve a new state of the art on the challenging CUB, AWA2, and SUN benchmarks. As an additional benefit, our model points to the visual evidence of the attributes in an image, confirming the improved attribute localization ability of our image representation. The attribute localization is evaluated quantitatively with ground-truth part annotations, qualitatively with visualizations, and through well-designed user studies.
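
The abstract describes the attribute prototype idea only at a high level. The following is a minimal PyTorch sketch of that idea as stated in the abstract, not the authors' released implementation: all names (`AttributePrototypeSketch`, `num_attributes`, `feat_dim`) are illustrative assumptions. Each attribute gets a learnable prototype that is matched against every spatial location of an intermediate CNN feature map; max-pooling the resulting similarity map regresses the attribute value, and the argmax location points to its visual evidence.

```python
import torch
import torch.nn as nn


class AttributePrototypeSketch(nn.Module):
    """Hypothetical sketch of attribute regression via learnable prototypes.

    Not the paper's implementation: one prototype per attribute is compared
    with every spatial location of a CNN feature map; the best-matching
    location predicts (and localizes) the attribute.
    """

    def __init__(self, num_attributes: int, feat_dim: int = 2048):
        super().__init__()
        # One learnable prototype per attribute, sized to the channel
        # dimension of the backbone's intermediate feature map.
        self.prototypes = nn.Parameter(torch.randn(num_attributes, feat_dim))

    def forward(self, feat_map: torch.Tensor) -> torch.Tensor:
        # feat_map: (B, C, H, W) intermediate features from a CNN backbone.
        b, c, h, w = feat_map.shape
        feats = feat_map.view(b, c, h * w)  # (B, C, HW)
        # Similarity map: every prototype vs. every spatial location.
        sim = torch.einsum("ac,bcl->bal", self.prototypes, feats)  # (B, A, HW)
        # Max over locations regresses each attribute; the argmax index
        # marks the region providing its visual evidence.
        attr_pred, _ = sim.max(dim=2)  # (B, A)
        return attr_pred


# Usage sketch: regress class-level attribute vectors with an MSE loss
# (312 is CUB's attribute count; shapes are illustrative).
if __name__ == "__main__":
    apn = AttributePrototypeSketch(num_attributes=312, feat_dim=2048)
    feat_map = torch.randn(4, 2048, 7, 7)   # e.g. a ResNet conv5 feature map
    target_attrs = torch.rand(4, 312)       # class-level attribute labels
    loss = nn.functional.mse_loss(apn(feat_map), target_attrs)
    loss.backward()
```

Note that this sketch covers only the local branch; per the abstract, the full framework additionally decorrelates the regressed attributes, learns a global visual-semantic embedding, and uses a zoom-in module to crop informative regions.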
