Paper Title
On Mutual Information in Contrastive Learning for Visual Representations
Paper Authors
Paper Abstract
In recent years, several unsupervised, "contrastive" learning algorithms in vision have been shown to learn representations that perform remarkably well on transfer tasks. We show that this family of algorithms maximizes a lower bound on the mutual information between two or more "views" of an image, where typical views come from a composition of image augmentations. Our bound generalizes the InfoNCE objective to support negative sampling from a restricted region of "difficult" contrasts. We find that the choice of negative samples and views is critical to the success of these algorithms. Reformulating previous learning objectives in terms of mutual information also simplifies and stabilizes them. In practice, our new objectives yield representations that outperform those learned with previous approaches for transfer to classification, bounding box detection, instance segmentation, and keypoint detection. The mutual information framework provides a unifying comparison of approaches to contrastive learning and uncovers the choices that impact representation learning.
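To make the objective described in the abstract concrete, below is a minimal sketch of an InfoNCE-style contrastive loss between two augmented views, with an optional restriction to the hardest negatives. This is an illustrative assumption, not the authors' implementation: the function name `info_nce_loss`, the `top_k_negatives` parameter, and the temperature value are all hypothetical. Minimizing this loss maximizes the standard InfoNCE lower bound on the mutual information between the two views, I(v1; v2) ≥ log(K + 1) − L, where K is the number of negatives per anchor.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.07, top_k_negatives=None):
    """InfoNCE-style contrastive loss between two batches of view embeddings.

    z1, z2: (N, D) embeddings of two augmented views of the same N images.
    If top_k_negatives is given, only the hardest (highest-similarity)
    negatives are kept for each anchor, a simple stand-in for sampling
    from a restricted region of "difficult" contrasts.
    """
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature              # (N, N) cosine similarities
    positives = logits.diag()                        # matched views on the diagonal

    eye = torch.eye(z1.size(0), dtype=torch.bool, device=logits.device)
    negatives = logits.masked_fill(eye, float("-inf"))  # exclude the positive pair

    if top_k_negatives is not None:
        # keep only the hardest negatives per anchor (hypothetical restriction)
        negatives, _ = negatives.topk(top_k_negatives, dim=1)

    # positive logit goes in column 0; cross-entropy with label 0 is InfoNCE
    all_logits = torch.cat([positives.unsqueeze(1), negatives], dim=1)
    labels = torch.zeros(z1.size(0), dtype=torch.long, device=logits.device)
    return F.cross_entropy(all_logits, labels)

# Usage with random embeddings standing in for encoder outputs of two views.
z1, z2 = torch.randn(256, 128), torch.randn(256, 128)
loss = info_nce_loss(z1, z2, top_k_negatives=64)
```

Restricting the softmax to the top-k most similar negatives is one simple way to realize "difficult" contrasts; the paper's actual restricted-region sampling may differ.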