TriPINet: Tripartite Progressive Integration Network for Image Manipulation Localization

Paper Title

TriPINet: Tripartite Progressive Integration Network for Image Manipulation Localization

Authors

Liang, Wei-Yun; Xu, Jing; Jin, Xiao

Abstract

Image manipulation localization aims to distinguish forged regions from the rest of a test image. Although many strong methods have been proposed for this task, two issues still need further study: 1) how to fuse diverse types of features carrying forgery clues; 2) how to progressively integrate multistage features for better localization performance. In this paper, we propose a tripartite progressive integration network (TriPINet) for end-to-end image manipulation localization. First, we extract both visually perceptible information, e.g., the RGB input image, and visually imperceptible features, e.g., frequency and noise traces, for forensic feature learning. Second, we develop a guided cross-modality dual-attention (gCMDA) module to fuse the different types of forgery clues. Third, we design a set of progressive integration squeeze-and-excitation (PI-SE) modules that improve localization performance by appropriately incorporating multiscale features in the decoder. Extensive experiments compare our method with state-of-the-art image forensics approaches. The proposed TriPINet obtains competitive results on several benchmark datasets.
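
The abstract does not specify how the PI-SE modules are built. As a rough illustration of the generic squeeze-and-excitation idea they are named after, the NumPy sketch below gates the channels of a fused feature map. The function names, the channel-wise concatenation step, the reduction ratio, and the random weights are all assumptions made for illustration; this is not the authors' implementation.

```python
import numpy as np


def squeeze_excite(features, w1, w2):
    """Generic squeeze-and-excitation gating on a (C, H, W) feature map.

    w1 has shape (C // r, C) and w2 has shape (C, C // r); both are
    hypothetical stand-ins for parameters that would normally be learned.
    """
    squeezed = features.mean(axis=(1, 2))          # squeeze: global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)        # excitation, step 1: reduce + ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # excitation, step 2: expand + sigmoid
    return features * gates[:, None, None], gates  # rescale each channel by its gate


def integrate(decoder_feat, skip_feat, w1, w2):
    """Assumed integration step: concatenate two same-resolution maps, then gate."""
    if decoder_feat.shape[1:] != skip_feat.shape[1:]:
        raise ValueError("feature maps must share spatial size")
    fused = np.concatenate([decoder_feat, skip_feat], axis=0)
    return squeeze_excite(fused, w1, w2)


# Toy usage with random inputs (illustrative values only).
rng = np.random.default_rng(0)
decoder_feat = rng.standard_normal((4, 8, 8))  # 4 channels from the decoder
skip_feat = rng.standard_normal((4, 8, 8))     # 4 channels from an encoder stage
channels, reduction = 8, 2
w1 = rng.standard_normal((channels // reduction, channels))
w2 = rng.standard_normal((channels, channels // reduction))
out, gates = integrate(decoder_feat, skip_feat, w1, w2)
print(out.shape, gates.shape)
```

Each gate is a sigmoid output, so it lies strictly between 0 and 1 and can only attenuate a channel; a real decoder would learn `w1` and `w2` jointly with the rest of the network.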
