Paper Title

Just Noticeable Difference for Machines to Generate Adversarial Images

Paper Authors

Adil Kaan Akan, Mehmet Ali Genc, Fatos T. Yarman Vural

Paper Abstract

One way of designing a robust machine learning algorithm is to generate authentic adversarial images which can trick the algorithm as much as possible. In this study, we propose a new method to generate adversarial images that are very similar to the true images, yet are discriminated from the originals and assigned to another category by the model. The proposed method is based on a popular concept from experimental psychology called Just Noticeable Difference. We define Just Noticeable Difference for a machine learning model and generate the least perceptible difference for adversarial images that can trick the model. The suggested model iteratively distorts a true image by a gradient descent method until the machine learning algorithm outputs a false label. Deep Neural Networks are trained for object detection and classification tasks. The cost function includes regularization terms to generate just noticeably different adversarial images which can be detected by the model. The adversarial images generated in this study look more natural compared to the output of state-of-the-art adversarial image generators.
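
The abstract describes an iterative, regularized gradient-descent loop: distort the true image until the classifier flips its label, while a regularization term keeps the distortion barely perceptible. Below is a minimal sketch of such a loop, assuming a PyTorch image classifier; the function name, step size, iteration budget, and the simple L2 penalty standing in for the paper's JND-based regularization terms are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def generate_jnd_adversarial(model, x, true_label, lam=0.01, lr=0.01, max_iters=500):
    """Iteratively distort `x` until `model` assigns a label other than `true_label`.

    `lam` weights a simple L2 penalty, used here as a stand-in for the paper's
    JND-based regularization terms.
    """
    x_adv = x.clone().detach().requires_grad_(True)
    optimizer = torch.optim.SGD([x_adv], lr=lr)

    for _ in range(max_iters):
        logits = model(x_adv)
        if logits.argmax(dim=1).item() != true_label.item():
            break  # the model now outputs a false label
        optimizer.zero_grad()
        # Push the prediction away from the true class ...
        adv_loss = -F.cross_entropy(logits, true_label)
        # ... while keeping the distortion of the original image small.
        reg_loss = lam * torch.norm(x_adv - x) ** 2
        (adv_loss + reg_loss).backward()
        optimizer.step()

    return x_adv.detach()
```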
