Paper Title
Just a Matter of Scale? Reevaluating Scale Equivariance in Convolutional Neural Networks
Paper Authors
Paper Abstract
The widespread success of convolutional neural networks may largely be attributed to their intrinsic property of translation equivariance. However, convolutions are not equivariant to variations in scale and fail to generalize to objects of different sizes. Despite recent advances in this field, it remains unclear how well current methods generalize to unobserved scales on real-world data and to what extent scale equivariance plays a role. To address this, we propose the novel Scaled and Translated Image Recognition (STIR) benchmark based on four different domains. Additionally, we introduce a new family of models that applies many re-scaled kernels with shared weights in parallel and then selects the most appropriate one. Our experimental results on STIR show that both the existing and proposed approaches can improve generalization across scales compared to standard convolutions. We also demonstrate that our family of models is able to generalize well towards larger scales and improve scale equivariance. Moreover, due to their unique design we can validate that kernel selection is consistent with input scale. Even so, none of the evaluated models maintain their performance for large differences in scale, demonstrating that a general understanding of how scale equivariance can improve generalization and robustness is still lacking.
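To make the described model family more concrete, the following is a minimal, illustrative sketch of a layer that applies several re-scaled copies of a single shared kernel in parallel and keeps, per output unit, the response of the best-matching scale. The class name `ScaleSelectingConv2d`, the use of bilinear kernel resizing, and max-over-scales selection are assumptions for illustration only, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ScaleSelectingConv2d(nn.Module):
    """Parallel multi-scale convolution with a shared kernel and per-unit
    scale selection. Illustrative sketch only (assumed details: bilinear
    kernel resizing, odd kernel sizes, max-over-scales selection)."""

    def __init__(self, in_channels, out_channels, base_size=3,
                 kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.kernel_sizes = kernel_sizes
        # One shared weight tensor; re-scaled copies are derived from it.
        self.weight = nn.Parameter(
            0.1 * torch.randn(out_channels, in_channels, base_size, base_size))

    def forward(self, x):
        responses = []
        for k_size in self.kernel_sizes:
            k = self.weight
            if k_size != k.shape[-1]:
                # Re-scale the shared kernel to the target size.
                k = F.interpolate(k, size=(k_size, k_size),
                                  mode="bilinear", align_corners=False)
            # Odd kernel sizes with "same" padding keep outputs aligned.
            responses.append(F.conv2d(x, k, padding=k_size // 2))
        # Select the most appropriate scale per unit via a max over scales;
        # `selected` records which kernel size won at each location.
        stacked = torch.stack(responses, dim=0)   # (S, B, C, H, W)
        out, selected = stacked.max(dim=0)
        return out, selected


# Usage example (hypothetical shapes):
# layer = ScaleSelectingConv2d(3, 16)
# y, selected_scale = layer(torch.randn(1, 3, 64, 64))
```

Because the winning scale index is returned explicitly, a sketch like this also shows how kernel selection can be inspected and compared against the known input scale, as the abstract describes.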