Paper Title
RecSal: Deep Recursive Supervision for Visual Saliency Prediction
Paper Authors
Paper Abstract
State-of-the-art saliency prediction methods develop upon model architectures or loss functions while training to generate a single target saliency map. However, publicly available saliency prediction datasets can be used to create more information for each stimulus than just a final aggregate saliency map. When used in a biologically inspired fashion, this information can contribute to better prediction performance without requiring models with a huge number of parameters. In this light, we propose to extract and use the statistics of (a) region-specific saliency and (b) the temporal order of fixations, to provide additional context to our network. We show that extra supervision using spatially or temporally sequenced fixations leads to better saliency prediction performance. Further, we design novel architectures for utilizing this extra information and show that they achieve superior performance over a base model trained without extra supervision. Our best method outperforms previous state-of-the-art methods with 50-80% fewer parameters. We also show that, unlike prior methods, our models perform consistently well across all evaluation metrics.
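The abstract describes supervising a network with several targets per image (e.g. temporally binned fixation maps plus the aggregate saliency map) rather than a single map. The sketch below is a minimal, hypothetical illustration of that multi-target supervision idea; the module and function names (SaliencyNet, kld_loss), the number of time bins, and the toy encoder are assumptions for illustration and are not the authors' architecture or loss.

```python
# Hypothetical sketch of multi-target supervision with temporally binned
# fixation maps. Not the RecSal implementation; names and sizes are assumed.
import torch
import torch.nn as nn

class SaliencyNet(nn.Module):
    """Toy encoder with one 1x1 prediction head per supervision target."""
    def __init__(self, n_targets=4):  # e.g. 3 temporal bins + 1 aggregate map
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.heads = nn.ModuleList(
            [nn.Conv2d(64, 1, 1) for _ in range(n_targets)]
        )

    def forward(self, x):
        feats = self.encoder(x)
        # One predicted map per target (each squashed to [0, 1]).
        return [torch.sigmoid(head(feats)) for head in self.heads]

def kld_loss(pred, target, eps=1e-7):
    """KL divergence between spatially normalized saliency distributions."""
    p = pred / (pred.sum(dim=(2, 3), keepdim=True) + eps)
    q = target / (target.sum(dim=(2, 3), keepdim=True) + eps)
    return (q * (torch.log(q + eps) - torch.log(p + eps))).sum(dim=(2, 3)).mean()

# Usage: each head is supervised by its own map (temporal bins + aggregate).
model = SaliencyNet(n_targets=4)
image = torch.rand(2, 3, 64, 64)
targets = [torch.rand(2, 1, 64, 64) for _ in range(4)]  # placeholder maps
preds = model(image)
loss = sum(kld_loss(p, t) for p, t in zip(preds, targets)) / len(preds)
loss.backward()
```

The design choice illustrated here is that the extra targets act as auxiliary supervision signals sharing one backbone, so the added context comes from the data rather than from a larger parameter count.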