Paper Title

Learning explanations that are hard to vary

Paper Authors

Giambattista Parascandolo, Alexander Neitz, Antonio Orvieto, Luigi Gresele, Bernhard Schölkopf

Paper Abstract

In this paper, we investigate the principle that `good explanations are hard to vary' in the context of deep learning. We show that averaging gradients across examples -- akin to a logical OR of patterns -- can favor memorization and `patchwork' solutions that sew together different strategies, instead of identifying invariances. To inspect this, we first formalize a notion of consistency for minima of the loss surface, which measures to what extent a minimum appears only when examples are pooled. We then propose and experimentally validate a simple alternative algorithm based on a logical AND, that focuses on invariances and prevents memorization in a set of real-world tasks. Finally, using a synthetic dataset with a clear distinction between invariant and spurious mechanisms, we dissect learning signals and compare this approach to well-established regularizers.
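The "logical AND" the abstract contrasts with gradient averaging can be pictured as a sign-agreement mask: a gradient component is kept only if its sign is consistent across examples (or environments), and zeroed otherwise. The sketch below illustrates this idea under stated assumptions; the function name `and_mask`, the threshold `tau`, and the NumPy setting are illustrative choices, not the authors' implementation.

```python
# Illustrative sketch of a sign-agreement ("logical AND") gradient mask.
# Assumptions: per-example gradients are available as rows of an array;
# `and_mask` and `tau` are hypothetical names for this demonstration.
import numpy as np

def and_mask(grads: np.ndarray, tau: float = 1.0) -> np.ndarray:
    """Combine per-example gradients, keeping only components whose
    signs agree across examples.

    grads: shape (n_examples, n_params), one gradient per example.
    tau:   agreement threshold in [0, 1]; 1.0 demands unanimous signs.
    Returns the masked average gradient, shape (n_params,).
    """
    avg = grads.mean(axis=0)                         # plain averaging ("logical OR")
    agreement = np.abs(np.sign(grads).mean(axis=0))  # 1.0 when all signs agree
    mask = (agreement >= tau).astype(avg.dtype)      # zero out disputed components
    return mask * avg

# Toy usage: two examples agree on the first component, disagree on the second.
g = np.array([[0.5,  1.0],
              [0.3, -1.0]])
print(and_mask(g))  # -> [0.4, 0.0]: only the consistent component survives
```

In this toy example, plain averaging would still propagate the second component's signal even though the two examples pull in opposite directions; the sign-agreement mask suppresses it, which matches the abstract's intuition that invariant patterns are the ones shared across examples.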
