Paper Title

Gradient-Free Adversarial Attacks for Bayesian Neural Networks

Authors

Matthew Yuan, Matthew Wicker, Luca Laurenti

Abstract

The existence of adversarial examples underscores the importance of understanding the robustness of machine learning models. Bayesian neural networks (BNNs), due to their calibrated uncertainty, have been shown to possess favorable adversarial robustness properties. However, when approximate Bayesian inference methods are employed, the adversarial robustness of BNNs is still not well understood. In this work, we employ gradient-free optimization methods in order to find adversarial examples for BNNs. In particular, we consider genetic algorithms, surrogate models, as well as zeroth order optimization methods and adapt them to the goal of finding adversarial examples for BNNs. In an empirical evaluation on the MNIST and Fashion MNIST datasets, we show that for various approximate Bayesian inference methods the usage of gradient-free algorithms can greatly improve the rate of finding adversarial examples compared to state-of-the-art gradient-based methods.
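To give a concrete sense of the zeroth-order approach mentioned in the abstract, below is a minimal, hedged sketch of a generic black-box attack that estimates the loss gradient with random finite differences and takes signed steps inside an L-infinity ball. This is an illustrative textbook-style construction, not the paper's exact algorithm; the `predict` function, hyperparameters, and loss (negative probability of the true label) are assumptions for the example. For a BNN, `predict` would return the posterior predictive, i.e., class probabilities averaged over weight samples.

```python
import numpy as np

def zeroth_order_attack(predict, x, label, eps=0.1, sigma=1e-3,
                        lr=0.02, n_dirs=20, steps=50, seed=None):
    """Gradient-free (zeroth-order) adversarial attack sketch.

    predict : callable mapping an input vector to class probabilities
              (for a BNN, the posterior predictive averaged over samples).
    Estimates the gradient of the loss (negative true-label probability)
    via symmetric finite differences along random directions, then takes
    signed ascent steps, clipping to the L-inf ball of radius eps around x.
    """
    rng = np.random.default_rng(seed)
    x_adv = x.copy()
    for _ in range(steps):
        grad = np.zeros_like(x)
        for _ in range(n_dirs):
            u = rng.standard_normal(x.shape)
            # Loss to *increase*: negative probability of the true label.
            f_plus = -predict(x_adv + sigma * u)[label]
            f_minus = -predict(x_adv - sigma * u)[label]
            grad += (f_plus - f_minus) / (2.0 * sigma) * u
        grad /= n_dirs
        x_adv = x_adv + lr * np.sign(grad)          # FGSM-style signed step
        x_adv = np.clip(x_adv, x - eps, x + eps)    # stay in the perturbation ball
    return x_adv
```

The same query-only loop works for any model exposing a prediction interface, which is why such methods remain applicable when gradients of an approximate Bayesian posterior are noisy or unavailable.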
