Paper Title

On the benefits of knowledge distillation for adversarial robustness

Authors

Javier Maroto, Guillermo Ortiz-Jiménez, Pascal Frossard

Abstract

Knowledge distillation is normally used to compress a big network, or teacher, onto a smaller one, the student, by training the student to match the teacher's outputs. Recently, some works have shown that robustness against adversarial attacks can also be distilled effectively, achieving good levels of robustness on mobile-friendly models. In this work, however, we take a different point of view and show that knowledge distillation can be used directly to boost the performance of state-of-the-art models in adversarial robustness. In this sense, we present a thorough analysis and provide general guidelines to distill knowledge from a robust teacher and boost the clean and adversarial performance of a student model even further. To that end, we present Adversarial Knowledge Distillation (AKD), a new framework to improve a model's robust performance, which consists of adversarially training a student on a mixture of the original labels and the teacher's outputs. Through carefully controlled ablation studies, we show that using early stopping, model ensembles, and weak adversarial training are key techniques to maximize the performance of the student, and show that these insights generalize across different robust distillation techniques. Finally, we provide insights into the effect of robust knowledge distillation on the dynamics of the student network, and show that AKD mostly improves the calibration of the network and modifies its training dynamics on samples that the model finds difficult to learn, or even memorize.
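The abstract describes the core of AKD as adversarial training of the student on a mixture of the original hard labels and the robust teacher's outputs. The sketch below illustrates that objective under stated assumptions: the PGD attack settings, the blending weight `alpha`, and the choice of querying the teacher on the adversarial input are illustrative choices, not the authors' reference implementation.

```python
# Minimal sketch of an AKD-style training loss (assumed PyTorch interface):
# adversarially perturb the input against the student, then blend the
# cross-entropy on the original labels with a KL term toward the teacher.
import torch
import torch.nn.functional as F

def akd_loss(student, teacher, x, y, alpha=0.5,
             eps=8 / 255, step_size=2 / 255, steps=10):
    """Return the AKD training loss for one batch (illustrative sketch)."""
    # PGD attack on the student; fewer steps / smaller eps gives the
    # "weak adversarial training" regime mentioned in the abstract.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        attack_loss = F.cross_entropy(student(x_adv), y)
        grad = torch.autograd.grad(attack_loss, x_adv)[0]
        x_adv = x_adv.detach() + step_size * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)

    # Soft targets from the (frozen) robust teacher.
    with torch.no_grad():
        soft_targets = F.softmax(teacher(x_adv), dim=1)

    logits = student(x_adv)
    hard_loss = F.cross_entropy(logits, y)                 # original labels
    soft_loss = F.kl_div(F.log_softmax(logits, dim=1),     # teacher outputs
                         soft_targets, reduction="batchmean")
    return alpha * hard_loss + (1.0 - alpha) * soft_loss
```

In practice the teacher could also be an ensemble of early-stopped checkpoints, in line with the ablations summarized above; that only changes how `soft_targets` is produced, not the blended loss itself.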
