DVERGE: Diversifying Vulnerabilities for Enhanced Robust Generation of Ensembles

Paper Title

DVERGE: Diversifying Vulnerabilities for Enhanced Robust Generation of Ensembles

Authors

Huanrui Yang, Jingyang Zhang, Hongliang Dong, Nathan Inkawhich, Andrew Gardner, Andrew Touchet, Wesley Wilkes, Heath Berry, Hai Li

Abstract

Recent research finds that CNN models for image classification exhibit overlapping adversarial vulnerabilities: adversarial attacks can mislead CNN models with small perturbations, and these perturbations transfer effectively between different models trained on the same dataset. Adversarial training, as a general robustness improvement technique, eliminates the vulnerability in a single model by forcing it to learn robust features. The process is hard, often requires models with large capacity, and suffers a significant loss in clean-data accuracy. Alternatively, ensemble methods have been proposed to induce sub-models with diverse outputs against a transfer adversarial example, making the ensemble robust against transfer attacks even if each sub-model is individually non-robust. Only a small drop in clean accuracy is observed in the process. However, previous ensemble training methods are not effective at inducing such diversity and are thus ineffective at producing a robust ensemble. We propose DVERGE, which isolates the adversarial vulnerability in each sub-model by distilling non-robust features, and diversifies the adversarial vulnerability to induce diverse outputs against a transfer attack. The novel diversity metric and training procedure enable DVERGE to achieve higher robustness against transfer attacks compared to previous ensemble methods, and enable improved robustness as more sub-models are added to the ensemble. The code of this work is available at https://github.com/zjysteven/DVERGE
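
The claim that an ensemble can resist transfer attacks even when every sub-model is individually non-robust can be illustrated with a toy sketch. The code below is not the DVERGE training procedure and is not taken from the linked repository. It assumes linear binary classifiers, a one-step sign attack, and score averaging, and every name in it (`linear_score`, `sign_attack`, `diverse`, `overlapping`) is hypothetical. It only contrasts sub-models whose vulnerabilities lie along different input directions with sub-models whose vulnerabilities overlap.

```python
# Toy illustration only: hypothetical names, linear models, not the DVERGE code.

def linear_score(weights, x):
    """Signed score of a linear binary classifier; a positive score means class 1."""
    return sum(w * xi for w, xi in zip(weights, x))

def predict(weights, x):
    """Class predicted by a single sub-model."""
    return 1 if linear_score(weights, x) > 0 else 0

def ensemble_predict(models, x):
    """Average the sub-model scores, then threshold the average."""
    avg = sum(linear_score(w, x) for w in models) / len(models)
    return 1 if avg > 0 else 0

def sign_attack(weights, x, eps):
    """One-step L-infinity attack on one linear model.

    Each coordinate moves by eps against the sign of its weight, which
    lowers the score; coordinates with zero weight are left unchanged.
    """
    def sign(v):
        return (v > 0) - (v < 0)
    return [xi - eps * sign(w) for w, xi in zip(weights, x)]

x = [0.3, 0.3, 0.3]   # clean input whose true class is assumed to be 1
eps = 0.5             # perturbation budget (an arbitrary toy value)

# Each sub-model relies on a different input coordinate (diverse vulnerabilities).
diverse = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# All sub-models rely mostly on coordinate 0 (overlapping vulnerabilities).
overlapping = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.8, 0.0, 0.2]]

for name, models in (("diverse", diverse), ("overlapping", overlapping)):
    # Craft the adversarial example against the first sub-model only,
    # then transfer it to the other sub-models and to the ensemble.
    x_adv = sign_attack(models[0], x, eps)
    per_model = [predict(w, x_adv) for w in models]
    print(name, "sub-models:", per_model, "ensemble:", ensemble_predict(models, x_adv))
```

In this sketch the perturbation crafted against one sub-model flips that sub-model in both ensembles, but it carries over to the remaining sub-models only when their weights overlap. The example covers transfer from a single sub-model and says nothing about an attacker who targets the averaged ensemble directly.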
