Paper Title
ViT2Hash: Unsupervised Information-Preserving Hashing
Paper Authors
Paper Abstract
Unsupervised image hashing, which maps images into binary codes without supervision, is a compressor with a high compression rate. Hence, how to preserve the meaningful information of the original data is a critical problem. Inspired by the large-scale vision pre-trained model known as ViT, which has shown significant progress in learning visual representations, in this paper we propose a simple information-preserving compressor that fine-tunes the ViT model for the target unsupervised hashing task. Specifically, from pixels to continuous features, we first propose a feature-preserving module that takes a corrupted image as input and reconstructs the original feature produced by the pre-trained ViT model from the complete image, so that the feature extractor learns to preserve the meaningful information of the original data. Second, from continuous features to hash codes, we propose a hashing-preserving module, which aims to retain the semantic information of the pre-trained ViT model by using a proposed Kullback-Leibler divergence loss. In addition, a quantization loss and a similarity loss are added to minimize the quantization error. Our method is very simple and achieves significantly higher mAP on three benchmark image datasets.
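The abstract names two of the auxiliary objectives but gives no formulas, so the following is only a minimal NumPy sketch under assumptions: the KL term is taken between softmax-normalized teacher (pre-trained ViT) and student (hashing head) logits, and the quantization term pushes relaxed codes toward {-1, +1}. All variable names and the exact loss forms here are illustrative, not the paper's definitions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q), averaged over the batch; eps avoids log(0)
    return np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1))

def quantization_loss(h):
    # penalize relaxed codes for being far from the binary values {-1, +1}
    return np.mean((np.abs(h) - 1.0) ** 2)

# toy batch: 4 images, 16-dimensional logits / 16-bit codes (placeholder data)
rng = np.random.default_rng(0)
teacher_logits = rng.normal(size=(4, 16))  # stand-in for pre-trained ViT outputs
student_logits = rng.normal(size=(4, 16))  # stand-in for hashing-head outputs
codes = np.tanh(student_logits)            # relaxed hash codes in (-1, 1)

kl = kl_divergence(softmax(teacher_logits), softmax(student_logits))
q = quantization_loss(codes)
total = kl + q  # a similarity loss would be added here in the full objective
```

At retrieval time the binary codes would be obtained as `np.sign(codes)`, for which the quantization loss above is exactly zero.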