Paper Title

UnMask: Adversarial Detection and Defense Through Robust Feature Alignment

Paper Authors

Scott Freitas, Shang-Tse Chen, Zijie J. Wang, Duen Horng Chau

Paper Abstract

Deep learning models are being integrated into a wide range of high-impact, security-critical systems, from self-driving cars to medical diagnosis. However, recent research has demonstrated that many of these deep learning architectures are vulnerable to adversarial attacks--highlighting the vital need for defensive techniques to detect and mitigate these attacks before they occur. To combat these adversarial attacks, we developed UnMask, an adversarial detection and defense framework based on robust feature alignment. The core idea behind UnMask is to protect these models by verifying that an image's predicted class ("bird") contains the expected robust features (e.g., beak, wings, eyes). For example, if an image is classified as "bird", but the extracted features are wheel, saddle and frame, the model may be under attack. UnMask detects such attacks and defends the model by rectifying the misclassification, re-classifying the image based on its robust features. Our extensive evaluation shows that UnMask (1) detects up to 96.75% of attacks, and (2) defends the model by correctly classifying up to 93% of adversarial images produced by the current strongest attack, Projected Gradient Descent, in the gray-box setting. UnMask provides significantly better protection than adversarial training across 8 attack vectors, averaging 31.18% higher accuracy. We open source the code repository and data with this paper: https://github.com/safreita1/unmask.
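
To make the feature-alignment idea concrete, below is a minimal Python sketch of the detect-and-defend logic described in the abstract. The class-to-feature map, the Jaccard overlap score, and the acceptance threshold are illustrative assumptions for exposition, not the authors' exact implementation; see the linked repository for the real one.

# Minimal sketch of UnMask-style robust feature alignment.
# CLASS_FEATURES, jaccard, and the 0.5 threshold are hypothetical
# stand-ins for exposition; see https://github.com/safreita1/unmask
# for the authors' actual implementation.

CLASS_FEATURES = {
    "bird": {"beak", "wings", "eyes", "tail"},
    "bicycle": {"wheel", "saddle", "frame", "handlebar"},
}

def jaccard(a, b):
    # Overlap between extracted and expected robust feature sets.
    return len(a & b) / len(a | b) if (a | b) else 0.0

def unmask(predicted_class, extracted_features, threshold=0.5):
    # Accept the classifier's label if its expected robust features
    # align with the features actually extracted from the image;
    # otherwise flag an attack and re-classify by best alignment.
    score = jaccard(extracted_features, CLASS_FEATURES[predicted_class])
    if score >= threshold:
        return predicted_class, False          # prediction accepted
    best = max(CLASS_FEATURES,
               key=lambda c: jaccard(extracted_features, CLASS_FEATURES[c]))
    return best, True                          # corrected label, attack flagged

# The abstract's example: the model says "bird", but the extracted
# features (wheel, saddle, frame) look like a bicycle.
label, attacked = unmask("bird", {"wheel", "saddle", "frame"})
print(label, attacked)                         # -> bicycle True

In the paper, the extracted features come from a learned feature-extraction model rather than a hand-written set; the alignment-then-reclassify flow is the same.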
