Paper Title


Layer-wise Regularized Adversarial Training using Layers Sustainability Analysis (LSA) framework

Authors

Mohammad Khalooei, Mohammad Mehdi Homayounpour, Maryam Amirmazlaghani

Abstract


Deep neural network models are used today in many artificial-intelligence applications, and strengthening them against adversarial attacks is of particular importance. An appropriate defense against adversarial attacks is adversarial training, which involves a trade-off between robustness and generalization. This paper introduces a novel framework, Layer Sustainability Analysis (LSA), for analyzing layer vulnerability in an arbitrary neural network under adversarial attack. LSA can serve as a helpful toolkit for assessing deep neural networks and for extending adversarial training approaches toward improving the sustainability of model layers via layer monitoring and analysis. The LSA framework identifies a list of the Most Vulnerable Layers (the MVL list) of a given network. The relative error, as a comparison measure, is used to evaluate the sustainability of each layer's representation against adversarial inputs. The proposed approach for obtaining robust neural networks that fend off adversarial attacks combines adversarial training (AT) with a layer-wise regularization (LR) over the layers proposed by LSA; i.e., the AT-LR procedure. AT-LR can be used with any benchmark adversarial attack to reduce the vulnerability of network layers and to improve conventional adversarial training approaches. The proposed idea performs well theoretically and experimentally for state-of-the-art multilayer perceptron and convolutional neural network architectures. Compared with the corresponding base adversarial training, AT-LR increases classification accuracy under more significant perturbations by 16.35%, 21.79%, and 10.730% on the Moon, MNIST, and CIFAR-10 benchmark datasets, respectively. The LSA framework is available at https://github.com/khalooei/LSA.
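The core LSA idea described above, scoring each layer by the relative error between its representation on a clean input and on the corresponding adversarial input, then ranking layers to obtain the MVL list, can be sketched as follows. This is a minimal illustration, not the authors' implementation (see the linked repository for that); the function names and the top-k selection are assumptions, and the relative error is taken as the norm of the representation difference divided by the norm of the clean representation, per the comparison measure named in the abstract.

```python
import numpy as np

def relative_error(clean_repr, adv_repr, eps=1e-12):
    """Relative error of one layer's representation under an adversarial input:
    ||f_l(x) - f_l(x_adv)|| / ||f_l(x)||  (eps guards against a zero denominator).
    """
    diff = np.linalg.norm(clean_repr - adv_repr)
    return diff / (np.linalg.norm(clean_repr) + eps)

def most_vulnerable_layers(clean_reprs, adv_reprs, top_k=1):
    """Rank layers by relative error (illustrative MVL selection, an assumption here).

    clean_reprs / adv_reprs: per-layer representation arrays for the same input,
    computed on the clean and adversarially perturbed versions respectively.
    Returns the indices of the top_k most vulnerable layers and all layer errors.
    """
    errors = [relative_error(c, a) for c, a in zip(clean_reprs, adv_reprs)]
    order = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    return order[:top_k], errors
```

In a real network, the per-layer representations would typically be captured with forward hooks on the model; the AT-LR procedure would then add a regularization term over the selected MVL layers to the adversarial training loss.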
