Paper Title

RATT: Recurrent Attention to Transient Tasks for Continual Image Captioning

Paper Authors

Riccardo Del Chiaro, Bartłomiej Twardowski, Andrew D. Bagdanov, Joost van de Weijer

Paper Abstract

Research on continual learning has led to a variety of approaches to mitigating catastrophic forgetting in feed-forward classification networks. Until now, surprisingly little attention has been focused on continual learning of recurrent models applied to problems like image captioning. In this paper we take a systematic look at continual learning of LSTM-based models for image captioning. We propose an attention-based approach that explicitly accommodates the transient nature of vocabularies in continual image captioning tasks, i.e. that task vocabularies are not disjoint. We call our method Recurrent Attention to Transient Tasks (RATT), and also show how to adapt continual learning approaches based on weight regularization and knowledge distillation to recurrent continual learning problems. We apply our approaches to the incremental image captioning problem on two new continual learning benchmarks we define using the MS-COCO and Flickr30K datasets. Our results demonstrate that RATT is able to sequentially learn five captioning tasks while incurring no forgetting of previously learned ones.
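The abstract describes constraining an LSTM-based captioner to the vocabulary of the current task while letting vocabularies overlap across tasks. The snippet below is a minimal illustrative sketch of that idea, not the authors' RATT implementation: the class `MaskedCaptionDecoder`, the `task_vocab_mask` argument, and all layer sizes are hypothetical, and the mask is applied only to the output logits for brevity.

```python
import torch
import torch.nn as nn


class MaskedCaptionDecoder(nn.Module):
    """Hypothetical LSTM caption decoder whose per-step logits are restricted
    to the vocabulary of the current captioning task (illustrative sketch only)."""

    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512, feat_dim=2048):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.init_h = nn.Linear(feat_dim, hidden_dim)  # image feature -> initial hidden state
        self.init_c = nn.Linear(feat_dim, hidden_dim)  # image feature -> initial cell state
        self.lstm = nn.LSTMCell(embed_dim, hidden_dim)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, image_feat, captions, task_vocab_mask):
        # image_feat: (B, feat_dim); captions: (B, T) token ids;
        # task_vocab_mask: (vocab_size,) bool, True for words in the current task's vocabulary.
        h, c = self.init_h(image_feat), self.init_c(image_feat)
        step_logits = []
        for t in range(captions.size(1)):
            h, c = self.lstm(self.embed(captions[:, t]), (h, c))
            logits = self.fc(h)
            # Words outside the current task's vocabulary receive -inf (zero probability);
            # words shared with earlier tasks stay active, since task vocabularies
            # overlap rather than being disjoint.
            logits = logits.masked_fill(~task_vocab_mask, float("-inf"))
            step_logits.append(logits)
        return torch.stack(step_logits, dim=1)  # (B, T, vocab_size)
```

Because the mask is Boolean over the full vocabulary, words shared between tasks simply stay unmasked from one task to the next, which is the sense in which task vocabularies are transient rather than disjoint.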
