Paper Title
BiasBed -- Rigorous Texture Bias Evaluation
Paper Authors
Paper Abstract
The well-documented presence of texture bias in modern convolutional neural networks has led to a plethora of algorithms that promote an emphasis on shape cues, often to support generalization to new domains. Yet, common datasets, benchmarks and general model selection strategies are missing, and there is no agreed, rigorous evaluation protocol. In this paper, we investigate difficulties and limitations when training networks with reduced texture bias. In particular, we also show that proper evaluation and meaningful comparisons between methods are not trivial. We introduce BiasBed, a testbed for texture- and style-biased training, including multiple datasets and a range of existing algorithms. It comes with an extensive evaluation protocol that includes rigorous hypothesis testing to gauge the significance of the results, despite the considerable training instability of some style bias methods. Our extensive experiments shed new light on the need for careful, statistically founded evaluation protocols for style bias (and beyond). For example, we find that some algorithms proposed in the literature do not significantly mitigate the impact of style bias at all. With the release of BiasBed, we hope to foster a common understanding of consistent and meaningful comparisons, and consequently faster progress towards learning methods free of texture bias. Code is available at https://github.com/D1noFuzi/BiasBed
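The abstract emphasizes hypothesis testing to decide whether one debiasing algorithm truly outperforms another despite run-to-run training instability. As a minimal sketch of that idea (not the paper's actual protocol), one can compare per-seed accuracies of two algorithms with a two-sided permutation test; the function name and the example scores below are hypothetical.

```python
import random
import statistics

def permutation_test(scores_a, scores_b, n_perm=10000, seed=0):
    """Approximate two-sided permutation test on the difference of means.

    scores_a / scores_b: per-seed accuracies of two algorithms.
    Returns an approximate p-value for the null hypothesis that both
    score sets were drawn from the same distribution.
    """
    rng = random.Random(seed)
    observed = abs(statistics.mean(scores_a) - statistics.mean(scores_b))
    pooled = list(scores_a) + list(scores_b)
    n_a = len(scores_a)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment under the null hypothesis
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            count += 1
    return count / n_perm
```

With clearly separated score sets the returned p-value is small, while heavily overlapping scores (e.g. from unstable training runs) yield a large p-value, flagging the observed gap as statistically insignificant.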