Paper Title
Dynamically Computing Adversarial Perturbations for Recurrent Neural Networks
Paper Authors
Paper Abstract
Convolutional and recurrent neural networks have been widely employed to achieve state-of-the-art performance on classification tasks. However, it has also been noted that these networks can be manipulated adversarially with relative ease, through carefully crafted additive perturbations to the input. Although several prior works experimentally establish methods for crafting and defending against attacks, it is also desirable to have theoretical guarantees on the existence of adversarial examples and on the network's robustness margins to such examples. We provide both in this paper. We focus specifically on recurrent architectures and draw inspiration from dynamical systems theory to naturally cast the problem as one of control, allowing us to dynamically compute adversarial perturbations at each timestep of the input sequence, thus resembling a feedback controller. Illustrative examples are provided to supplement the theoretical discussions.
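The feedback-controller viewpoint in the abstract can be illustrated with a minimal sketch: at each timestep the adversary observes the current hidden state and chooses a bounded input perturbation based on it, rather than precomputing a fixed attack offline. Everything below is hypothetical, using a toy linear-tanh recurrence with random weights and a one-step-lookahead gradient-sign rule (an FGSM-like heuristic) as a stand-in for the paper's actual control-theoretic computation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy recurrent classifier: h_{t+1} = tanh(W h_t + U x_t), score = v . h_T.
# Weights are random placeholders, not the paper's models.
n_h, n_x, T = 4, 3, 6
W = rng.normal(scale=0.5, size=(n_h, n_h))
U = rng.normal(scale=0.5, size=(n_h, n_x))
v = rng.normal(size=n_h)

def step(h, x):
    """One step of the toy recurrence."""
    return np.tanh(W @ h + U @ x)

def perturb_timestep(h, x, eps):
    """Feedback-style perturbation: given the current state h, pick the
    additive input perturbation (bounded by eps in the infinity norm)
    that locally decreases the one-step score proxy v . h_{t+1} the most."""
    pre = W @ h + U @ x                       # pre-activation of next state
    # Gradient of v . tanh(pre) with respect to x: U^T [ (1 - tanh^2(pre)) * v ]
    grad_x = U.T @ ((1.0 - np.tanh(pre) ** 2) * v)
    return -eps * np.sign(grad_x)             # push the score downward

eps = 0.1
xs = rng.normal(size=(T, n_x))                # clean input sequence
h = np.zeros(n_h)
xs_adv = []
for x in xs:                                  # perturbations are computed online,
    d = perturb_timestep(h, x, eps)           # one timestep at a time, from h
    xs_adv.append(x + d)
    h = step(h, x + d)
xs_adv = np.array(xs_adv)
score_adv = v @ h                             # score on the perturbed sequence
```

The key property captured here is that each perturbation is a function of the state reached so far, mirroring a feedback control law; the greedy gradient-sign rule is only one simple choice of that law.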