Paper Title
Guided Adversarial Attack for Evaluating and Enhancing Adversarial Defenses
Paper Authors
Paper Abstract
Advances in the development of adversarial attacks have been fundamental to the progress of adversarial defense research. Efficient and effective attacks are crucial for reliable evaluation of defenses, and also for developing robust models. Adversarial attacks are often generated by maximizing standard losses such as the cross-entropy loss or maximum-margin loss within a constraint set using Projected Gradient Descent (PGD). In this work, we introduce a relaxation term to the standard loss that finds more suitable gradient directions, increases attack efficacy, and leads to more efficient adversarial training. We propose Guided Adversarial Margin Attack (GAMA), which utilizes function mapping of the clean image to guide the generation of adversaries, thereby resulting in stronger attacks. We evaluate our attack against multiple defenses and show improved performance when compared to existing attacks. Further, we propose Guided Adversarial Training (GAT), which achieves state-of-the-art performance amongst single-step defenses by utilizing the proposed relaxation term for both attack generation and training.
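The abstract does not spell out the form of the relaxation term, so the following is only a minimal sketch of the idea under stated assumptions. It assumes the term is the squared L2 distance between the softmax outputs of the clean and perturbed inputs, added to a maximum-margin loss computed on softmax probabilities, and that its weight decays linearly to zero over the first half of the attack. The toy linear classifier, the finite-difference gradient, and every name here (`guided_loss`, `guided_pgd`, `lam0`) are hypothetical stand-ins for a real network and automatic differentiation.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(W, b, x):
    """Toy linear classifier (illustrative stand-in for a network)."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    return softmax(logits)

def guided_loss(W, b, x_adv, p_clean, y, lam):
    """Maximum-margin loss plus an assumed relaxation term."""
    p = predict(W, b, x_adv)
    # Margin loss: best wrong-class probability minus true-class probability.
    margin = max(pj for j, pj in enumerate(p) if j != y) - p[y]
    # Assumed relaxation term: squared L2 distance to the clean prediction.
    relax = sum((a - c) ** 2 for a, c in zip(p, p_clean))
    return margin + lam * relax

def numeric_grad(f, x, h=1e-5):
    """Central finite differences; a real attack would use autodiff."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

def sign(v):
    return (v > 0) - (v < 0)

def guided_pgd(W, b, x, y, eps=0.5, step=0.1, iters=20, lam0=5.0):
    """PGD under an L-infinity constraint, maximizing the guided loss."""
    p_clean = predict(W, b, x)
    x_adv = list(x)
    decay_iters = max(1, iters // 2)  # assumed schedule: linear decay to zero
    for t in range(iters):
        lam = lam0 * max(0.0, 1.0 - t / decay_iters)
        g = numeric_grad(lambda z: guided_loss(W, b, z, p_clean, y, lam), x_adv)
        # Gradient-sign ascent step.
        x_adv = [v + step * sign(gi) for v, gi in zip(x_adv, g)]
        # Projection back onto the eps-ball around the clean input.
        x_adv = [min(max(v, c - eps), c + eps) for v, c in zip(x_adv, x)]
    return x_adv

if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
    b = [0.0, 0.0, 0.0]
    x, y = [1.0, 0.2], 0
    x_adv = guided_pgd(W, b, x, y)
    print("clean probs:", predict(W, b, x))
    print("adv probs:  ", predict(W, b, x_adv))
```

In this sketch the relaxation term only contributes during the early iterations; once its weight reaches zero, the loop reduces to ordinary PGD on the margin loss, so the final perturbation is still judged by the standard objective.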