Paper Title
ManiGen: A Manifold Aided Black-box Generator of Adversarial Examples
Authors
Abstract
Machine learning models, especially neural network (NN) classifiers, achieve a level of performance and accuracy that has led to their wide adoption in many aspects of our daily lives. The underlying assumption is that these models are trained and used in attack-free scenarios. However, it has been shown that NN-based classifiers are vulnerable to adversarial examples: inputs with carefully crafted perturbations that are imperceptible to the human eye yet can mislead NN classifiers. Most existing methods for generating such perturbations require some knowledge of the target classifier, which limits their practicality. For example, some generators require access to the pre-softmax logits, while others rely on prediction scores. In this paper, we design a practical black-box adversarial example generator, dubbed ManiGen. ManiGen does not require any knowledge of the internal state of the target classifier. It generates adversarial examples by searching along the manifold, a concise representation of the input data. Through an extensive set of experiments on different datasets, we show that (1) adversarial examples generated by ManiGen mislead standalone classifiers as successfully as those from the state-of-the-art white-box generator, Carlini, and (2) adversarial examples generated by ManiGen attack classifiers with state-of-the-art defenses more effectively.
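The manifold-guided search described in the abstract can be illustrated with a minimal sketch. This is not the authors' actual algorithm: it assumes the manifold is learned by an autoencoder, and the names encoder, decoder, and classify are hypothetical stand-ins for the learned manifold mapping and the black-box target, which is queried only for its predicted label.

import numpy as np

# Hypothetical interfaces (assumptions for this sketch):
#   encoder(x)  -> latent code z on the learned manifold
#   decoder(z)  -> reconstruction of an input from a latent code
#   classify(x) -> predicted label only (black-box access, no logits/scores)

def manifold_search_attack(x, encoder, decoder, classify,
                           steps=1000, sigma=0.1, rng=None):
    """Search along the manifold: perturb the latent code of x, decode
    back to input space, and keep the misclassified candidate that stays
    closest to the original input."""
    rng = rng or np.random.default_rng(0)
    y0 = classify(x)                       # original label; only label access is used
    z = encoder(x)                         # project x onto the manifold
    best, best_dist = None, np.inf
    for _ in range(steps):
        z_cand = z + sigma * rng.standard_normal(z.shape)
        x_cand = decoder(z_cand)           # decoding keeps the candidate on the manifold
        if classify(x_cand) != y0:         # label flipped -> adversarial candidate
            dist = np.linalg.norm(x_cand - x)
            if dist < best_dist:           # prefer the least perceptible perturbation
                best, best_dist = x_cand, dist
                sigma *= 0.9               # shrink the search radius once successful
    return best                            # None if no adversarial example was found

Because only the predicted label is consumed, the target's logits and prediction scores are never touched, which is consistent with the black-box setting claimed in the abstract.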