Paper Title

Compositional Embeddings for Multi-Label One-Shot Learning

Paper Authors

Li, Zeqian, Mozer, Michael C., Whitehill, Jacob

Paper Abstract

We present a compositional embedding framework that infers not just a single class per input image, but a set of classes, in the setting of one-shot learning. Specifically, we propose and evaluate several novel models consisting of (1) an embedding function f trained jointly with a "composition" function g that computes set union operations between the classes encoded in two embedding vectors; and (2) embedding f trained jointly with a "query" function h that computes whether the classes encoded in one embedding subsume the classes encoded in another embedding. In contrast to prior work, these models must both perceive the classes associated with the input examples and encode the relationships between different class label sets, and they are trained using only weak one-shot supervision consisting of the label-set relationships among training examples. Experiments on the OmniGlot, Open Images, and COCO datasets show that the proposed compositional embedding models outperform existing embedding methods. Our compositional embedding models have applications to multi-label object recognition for both one-shot and supervised learning.
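The abstract describes three learned components: an embedding function f, a composition function g that maps two embeddings to an embedding of the union of their label sets, and a query function h that scores whether one embedding's classes are subsumed by another's. The following is a minimal structural sketch of those interfaces only — the dimensions, names, and untrained random-weight networks standing in for f, g, and h are all illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_EMB = 64, 16  # hypothetical input and embedding dimensions

def make_net(d_in, d_out):
    # A single random-weight layer with tanh, standing in for a trained network.
    W = rng.normal(scale=0.1, size=(d_in, d_out))
    return lambda x: np.tanh(x @ W)

f = make_net(D_IN, D_EMB)       # f: input example -> embedding of its label set
g = make_net(2 * D_EMB, D_EMB)  # g: (e_a, e_b) -> embedding of the union of their label sets
h = make_net(2 * D_EMB, 1)      # h: scores whether e_q's classes are contained in e_r's

def compose(e_a, e_b):
    # g operates on the pair of embeddings; concatenation is one simple choice.
    return g(np.concatenate([e_a, e_b]))

def query(e_q, e_r):
    # Sigmoid score in (0, 1); above 0.5 would be read as "subsumed".
    return 1.0 / (1.0 + np.exp(-h(np.concatenate([e_q, e_r]))[0]))

# Toy usage: embed two "images", compose their label sets, then query containment.
x_a, x_b = rng.normal(size=D_IN), rng.normal(size=D_IN)
e_a, e_b = f(x_a), f(x_b)
e_union = compose(e_a, e_b)
score = query(e_a, e_union)
```

In the paper's one-shot setting, f, g, and h would be trained jointly from label-set relationships among training examples rather than initialized randomly as above; this sketch only fixes the shapes of the three functions.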
