Paper title
Frequency-Tuned Universal Adversarial Attacks
Paper authors
Paper abstract
Prior work has shown that the predictions of a convolutional neural network (CNN) over an entire image set can be severely distorted by a single image-agnostic perturbation, known as a universal perturbation. Such perturbations are usually bounded by an empirically fixed threshold in the spatial domain to limit their perceptibility. Taking human perception into account, we instead propose using just-noticeable-difference (JND) thresholds to guide the perceptibility of universal adversarial perturbations. Building on this, we propose a frequency-tuned universal attack method for computing universal perturbations, and show that adapting the perturbations to the local frequency content achieves a good balance between perceptibility and effectiveness, measured by fooling rate. Compared with existing universal adversarial attack techniques, our frequency-tuned attack achieves state-of-the-art quantitative results. We demonstrate that our approach significantly improves on the baseline's performance in both white-box and black-box attacks.
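The abstract measures effectiveness by fooling rate. As a rough illustration, the sketch below assumes the common definition: the fraction of images whose predicted label changes once the same universal perturbation is added. The names `fooling_rate` and `toy_predict`, and the toy data, are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence


def fooling_rate(
    predict: Callable[[Sequence[float]], int],
    images: List[Sequence[float]],
    perturbation: Sequence[float],
) -> float:
    """Fraction of images whose predicted label changes under one shared perturbation.

    Assumption: an image counts as "fooled" when the label predicted for the
    perturbed image differs from the label predicted for the clean image.
    """
    if not images:
        raise ValueError("images must not be empty")
    fooled = 0
    for image in images:
        # The same perturbation is added to every image (image-agnostic).
        perturbed = [pixel + delta for pixel, delta in zip(image, perturbation)]
        if predict(perturbed) != predict(image):
            fooled += 1
    return fooled / len(images)


def toy_predict(image: Sequence[float]) -> int:
    """Hypothetical stand-in classifier: label 1 if the pixel sum is positive, else 0."""
    return 1 if sum(image) > 0 else 0


# Toy data: four two-pixel "images" and one shared perturbation.
images = [[-1.0, 0.5], [2.0, 1.0], [-0.2, 0.1], [-3.0, 0.0]]
delta = [0.3, 0.3]
rate = fooling_rate(toy_predict, images, delta)
print(rate)
```

In this toy setup the first and third images change label, so two of the four images are fooled. A real evaluation would replace `toy_predict` with a trained CNN and `delta` with a perturbation produced by the attack.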