Adapting Grad-CAM for Embedding Networks

Paper title

Adapting Grad-CAM for Embedding Networks

Authors

Lei Chen, Jianhui Chen, Hossein Hajimirsadeghi, Greg Mori

Abstract

The gradient-weighted class activation mapping (Grad-CAM) method can faithfully highlight important regions in images for deep model prediction in image classification, image captioning and many other tasks. It uses the gradients in back-propagation as weights (grad-weights) to explain network decisions. However, applying Grad-CAM to embedding networks raises significant challenges because embedding networks are trained by millions of dynamically paired examples (e.g. triplets). To overcome these challenges, we propose an adaptation of the Grad-CAM method for embedding networks. First, we aggregate grad-weights from multiple training examples to improve the stability of Grad-CAM. Then, we develop an efficient weight-transfer method to explain decisions for any image without back-propagation. We extensively validate the method on the standard CUB200 dataset in which our method produces more accurate visual attention than the original Grad-CAM method. We also apply the method to a house price estimation application using images. The method produces convincing qualitative results, showcasing the practicality of our approach.
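The two steps named in the abstract can be illustrated with a minimal NumPy sketch. The abstract gives no formulas, so everything beyond the standard Grad-CAM definitions is an assumption made here for illustration: the aggregation is taken to be a plain mean over examples, and the weight transfer is taken to be a nearest-neighbour lookup in embedding space. The function names (`grad_weights`, `aggregate_grad_weights`, `cam_heatmap`, `transfer_weights`) are hypothetical and do not come from the paper.

```python
import numpy as np

def grad_weights(gradients):
    """Standard Grad-CAM weights: average the gradients over the spatial axes.

    gradients: array of shape (K, H, W), the gradient of the loss with
    respect to K feature maps of one training example.
    """
    return gradients.mean(axis=(1, 2))

def aggregate_grad_weights(gradient_list):
    """Combine grad-weights from several training examples.

    Assumption: a plain mean; the abstract does not state the exact rule.
    """
    return np.mean([grad_weights(g) for g in gradient_list], axis=0)

def cam_heatmap(feature_maps, weights):
    """ReLU of the weighted sum of feature maps, scaled to [0, 1]."""
    cam = np.maximum(np.tensordot(weights, feature_maps, axes=1), 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam

def transfer_weights(query_embedding, train_embeddings, train_weights):
    """Hypothetical weight transfer: reuse the stored grad-weights of the
    training example whose embedding is closest to the query, so the query
    image needs only a forward pass.
    """
    dists = np.linalg.norm(train_embeddings - query_embedding, axis=1)
    return train_weights[np.argmin(dists)]
```

In this sketch, a query image would be explained by calling `transfer_weights` on its embedding and passing the result, together with its own feature maps, to `cam_heatmap`; no gradient is computed for the query itself.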
