Paper Title
Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks
Paper Authors
Paper Abstract
Recent studies have shown that DNNs can be compromised by backdoor attacks crafted at training time. A backdoor attack installs a backdoor into the victim model by injecting a backdoor pattern into a small proportion of the training data. At test time, the victim model behaves normally on clean test data, yet consistently predicts a specific (likely incorrect) target class whenever the backdoor pattern is present in a test example. While existing backdoor attacks are effective, they are not stealthy: the modifications made to the training data or labels are often suspicious and can easily be detected by simple data filtering or human inspection. In this paper, we present a new type of backdoor attack inspired by an important natural phenomenon: reflection. Using mathematical modeling of physical reflection models, we propose reflection backdoor (Refool), which plants reflections as a backdoor into a victim model. We demonstrate on 3 computer vision tasks and 5 datasets that Refool can attack state-of-the-art DNNs with a high success rate and is resistant to state-of-the-art backdoor defenses.
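To make the poisoning mechanism described above concrete, below is a minimal sketch of reflection-style backdoor injection: a reflection image is blended into a small fraction of training examples. This is an illustration only, not the paper's exact method; the blending weight `alpha`, the poisoning `rate`, and the label-flipping step are assumptions for demonstration (the paper derives the blend from a physical reflection model, which is not reproduced here).

```python
import numpy as np

def inject_reflection(clean, reflection, alpha=0.35):
    """Blend a 'reflection' image into a clean image as a backdoor trigger.

    clean, reflection: float arrays in [0, 1] with the same shape.
    alpha: illustrative reflection strength; the paper instead models
    the blend with a physical reflection model.
    """
    return np.clip(clean + alpha * reflection, 0.0, 1.0)

def poison_dataset(images, labels, reflection, target_class, rate=0.1, seed=0):
    """Poison a small fraction of the training set.

    Relabels poisoned examples to target_class (a generic dirty-label
    setup for illustration; attack variants may keep labels unchanged).
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    images, labels = images.copy(), labels.copy()
    for i in idx:
        images[i] = inject_reflection(images[i], reflection)
        labels[i] = target_class
    return images, labels
```

A model trained on the poisoned set would behave normally on clean inputs, while any test image blended with the same reflection pattern would tend to be classified as `target_class`.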