Paper Title

Background Knowledge Injection for Interpretable Sequence Classification

Authors

Severin Gsponer, Luca Costabello, Chan Le Van, Sumit Pai, Christophe Gueret, Georgiana Ifrim, Freddy Lecue

Abstract

Sequence classification is the supervised learning task of building models that predict class labels for unseen sequences of symbols. Although accuracy is paramount, in certain scenarios interpretability is a must. Unfortunately, such a trade-off is often hard to achieve, since we lack human-independent interpretability metrics. We introduce a novel sequence learning algorithm that combines (i) linear classifiers, which are known to strike a good balance between predictive power and interpretability, and (ii) background knowledge embeddings. We extend the classic subsequence feature space with groups of symbols generated from background knowledge injected via word or graph embeddings, and use this new feature space to learn a linear classifier. We also present a new measure to evaluate the interpretability of a set of symbolic features based on the symbol embeddings. Experiments on human activity recognition from wearables and on amino acid sequence classification show that our classification approach preserves predictive power while delivering more interpretable models.
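To make the idea concrete, below is a minimal sketch of the pipeline the abstract describes, not the authors' implementation. It assumes a toy alphabet, random stand-in embeddings, and KMeans grouping: symbols are clustered by their background-knowledge embeddings into groups, sequences are re-expressed over those groups to augment the classic subsequence (k-mer) feature space, and a linear classifier is fit on the combined features. The `coherence` function at the end is a hedged, hypothetical stand-in for the embedding-based interpretability measure, whose exact definition the abstract does not give.

```python
# A minimal sketch, assuming a toy alphabet, random stand-in embeddings, and
# KMeans grouping; this is NOT the authors' implementation, only the shape of
# the idea: embedding-derived symbol groups extend the k-mer feature space,
# and a linear model keeps per-feature weights inspectable.
from itertools import combinations

import numpy as np
from scipy.sparse import hstack
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical symbol embeddings; the paper injects background knowledge
# through word or graph embeddings instead of random vectors.
alphabet = list("ABCDEFGH")
embeddings = {s: rng.normal(size=16) for s in alphabet}

# Cluster symbols by embedding similarity into groups ("super-symbols").
emb_matrix = np.array([embeddings[s] for s in alphabet])
group_ids = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(emb_matrix)
sym2group = {s: f"g{g}" for s, g in zip(alphabet, group_ids)}

def to_group_seq(seq):
    """Rewrite a symbol sequence as the sequence of its symbol groups."""
    return " ".join(sym2group[s] for s in seq)

# Toy labelled sequences standing in for e.g. amino acid chains.
seqs = ["".join(rng.choice(alphabet, size=20)) for _ in range(200)]
labels = rng.integers(0, 2, size=200)

# Classic subsequence (k-mer) features, plus k-mers over the group-rewritten
# sequences: together they form the extended feature space.
char_vec = CountVectorizer(analyzer="char", ngram_range=(1, 3))
group_vec = CountVectorizer(analyzer="word", ngram_range=(1, 3))
X = hstack([char_vec.fit_transform(seqs),
            group_vec.fit_transform(to_group_seq(s) for s in seqs)])

# Linear classifier: each weight attaches to a concrete (group-)subsequence,
# which is what keeps the model interpretable.
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print("training accuracy:", clf.score(X, labels))

# Hypothetical stand-in for the interpretability measure: score a feature's
# symbols by the mean pairwise cosine similarity of their embeddings;
# semantically coherent symbol sets score higher.
def coherence(symbols):
    unit = [embeddings[s] / np.linalg.norm(embeddings[s]) for s in symbols]
    pairs = list(combinations(unit, 2))
    return float(np.mean([u @ v for u, v in pairs])) if pairs else 1.0

print("coherence of feature 'ABC':", round(coherence("ABC"), 3))
```

Under these assumptions, a feature whose symbols all come from one embedding cluster scores high, while one mixing unrelated symbols scores near zero; the paper's actual measure is likewise embedding-based, but its precise form should be taken from the paper itself.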
