Person Re-identification: Implicitly Defining the Receptive Fields of Deep Learning Classification Frameworks

Paper title

Person Re-identification: Implicitly Defining the Receptive Fields of Deep Learning Classification Frameworks

Authors

Yaghoubi, Ehsan; Borza, Diana; Kumar, Aruna; Proença, Hugo

Abstract

The \emph{receptive fields} of deep learning classification models determine the regions of the input data that are most significant for reaching correct decisions. The primary way to learn such receptive fields is to train the models on masked data, which helps the networks ignore unwanted regions but has two major drawbacks: 1) it often yields edge-sensitive decision processes; and 2) it considerably increases the computational cost of the inference phase. This paper describes a solution for implicitly driving the inference of the networks' receptive fields, by creating synthetic learning data composed of interchanged segments that should be \emph{a priori} important/irrelevant for the network decision. In practice, we use a segmentation module to distinguish between the foreground (important) and background (irrelevant) parts of each learning instance, and randomly swap segments between image pairs, while keeping the class label consistent exclusively with the label of the segments deemed important. This strategy typically drives the networks to early convergence and to appropriate solutions, in which the identity and clutter descriptions are not correlated. Moreover, this data augmentation solution has several interesting properties: 1) it is parameter-free; 2) it fully preserves the label information; and 3) it is compatible with typical data augmentation techniques. In the empirical validation, we considered the person re-identification problem and evaluated the effectiveness of the proposed solution on the well-known \emph{Richly Annotated Pedestrian} (RAP) dataset in two different settings (\emph{upper-body} and \emph{full-body}), observing results that are highly competitive with the state of the art. Under a reproducible research paradigm, both the code and the empirical evaluation protocol are available at \url{https://github.com/Ehsan-Yaghoubi/reid-strong-baseline}.
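
The swap described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that); it assumes the foreground masks have already been produced by some segmentation module, that both images share the same size, and it ignores the fact that part of the donor image's own foreground may remain visible after compositing. The function names `swap_background` and `make_synthetic_pair` are hypothetical and introduced only for illustration.

```python
import numpy as np


def swap_background(image, mask, donor):
    """Keep the masked (foreground) pixels of `image` and take every
    other pixel from `donor`. Assumption: `mask` is an (H, W) array that
    is truthy on the foreground, and both images are (H, W, C)."""
    if image.shape != donor.shape:
        raise ValueError("images must share the same shape")
    if mask.shape != image.shape[:2]:
        raise ValueError("mask must match the image height and width")
    keep = mask.astype(bool)[..., None]  # (H, W, 1), broadcast over channels
    return np.where(keep, image, donor)


def make_synthetic_pair(sample_a, sample_b):
    """Each sample is a tuple (image, foreground_mask, label).
    Returns two synthetic (image, label) tuples in which the backgrounds
    are exchanged while each label follows its foreground."""
    img_a, mask_a, label_a = sample_a
    img_b, mask_b, label_b = sample_b
    new_a = swap_background(img_a, mask_a, img_b)
    new_b = swap_background(img_b, mask_b, img_a)
    # The class label stays tied to the segment deemed important.
    return (new_a, label_a), (new_b, label_b)
```

Because the label always follows the foreground, no label information is lost or mixed, and the sketch itself introduces no tunable parameters. The output keeps the input image size, so standard augmentations such as flipping or cropping could be applied afterwards.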

Paper title