Paper Title

Addressing Mistake Severity in Neural Networks with Semantic Knowledge

Paper Authors

Natalie Abreu, Nathan Vaska, Victoria Helus

Paper Abstract

Robustness in deep neural networks and machine learning algorithms in general is an open research challenge. In particular, it is difficult to ensure algorithmic performance is maintained on out-of-distribution inputs or anomalous instances that cannot be anticipated at training time. Embodied agents will be deployed in these conditions, and are likely to make incorrect predictions. An agent will be viewed as untrustworthy unless it can maintain its performance in dynamic environments. Most robust training techniques aim to improve model accuracy on perturbed inputs; as an alternate form of robustness, we aim to reduce the severity of mistakes made by neural networks in challenging conditions. We leverage current adversarial training methods to generate targeted adversarial attacks during the training process in order to increase the semantic similarity between a model's predictions and true labels of misclassified instances. Results demonstrate that our approach performs better with respect to mistake severity compared to standard and adversarially trained models. We also find an intriguing role that non-robust features play with regard to semantic similarity.
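The abstract describes the core mechanism only at a high level: targeted adversarial attacks are generated during training so that the model's errors are pushed toward semantically similar classes. Below is a minimal sketch of one plausible reading of that idea, assuming a PyTorch setting with targeted PGD. The `similar_class` mapping (standing in for a semantic label hierarchy such as WordNet), the toy model, and the hyperparameters `eps`, `alpha`, and `steps` are all illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def targeted_pgd(model, x, target, eps=8 / 255, alpha=2 / 255, steps=10):
    """Targeted PGD: perturb x within an L-inf ball of radius eps so the
    model's prediction moves TOWARD `target` (gradient *descent* on the
    cross-entropy against the target class)."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() - alpha * grad.sign()   # step toward target class
        x_adv = x + torch.clamp(x_adv - x, -eps, eps)  # project into eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)                  # keep valid pixel range
    return x_adv.detach()

# --- hypothetical usage on a stand-in model and batch ---
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# `similar_class[c]` maps class c to a semantically close class; in the
# paper's setting this would come from a label hierarchy (assumption here).
similar_class = torch.tensor([1, 0, 3, 2, 5, 4, 7, 6, 9, 8])

x = torch.rand(8, 3, 32, 32)                 # stand-in image batch
y = torch.randint(0, 10, (8,))               # true labels
x_adv = targeted_pgd(model, x, similar_class[y])
loss = F.cross_entropy(model(x_adv), y)      # supervise with the TRUE labels
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The design point the abstract hinges on is visible in the last lines: the perturbation is aimed at a semantically close class, but the training loss still uses the true label, so the model learns to resist being pulled toward those neighbors, and the mistakes it does make tend to land on them rather than on semantically distant classes.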
