Paper Title

ORCA: Interpreting Prompted Language Models via Locating Supporting Data Evidence in the Ocean of Pretraining Data

Paper Authors

Xiaochuang Han, Yulia Tsvetkov

Paper Abstract

Large pretrained language models have been performing increasingly well in a variety of downstream tasks via prompting. However, it remains unclear from where the model learns the task-specific knowledge, especially in a zero-shot setup. In this work, we want to find evidence of the model's task-specific competence from pretraining and are specifically interested in locating a very small subset of pretraining data that directly supports the model in the task. We call such a subset supporting data evidence and propose a novel method ORCA to effectively identify it, by iteratively using gradient information related to the downstream task. This supporting data evidence offers interesting insights about the prompted language models: in the tasks of sentiment analysis and textual entailment, BERT shows a substantial reliance on BookCorpus, the smaller corpus of BERT's two pretraining corpora, as well as on pretraining examples that mask out synonyms to the task verbalizers.
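The abstract describes ORCA only at a high level: supporting data evidence is located by iteratively using gradient information related to the downstream task. The snippet below is a minimal, hypothetical sketch of one gradient-similarity selection step under that reading; the toy linear model, random tensors, flat_grad helper, and top-k cutoff are all illustrative assumptions standing in for BERT, its MLM pretraining instances, and the authors' actual procedure.

```python
# Hypothetical sketch: score each "pretraining" example by the cosine similarity
# between its loss gradient and the gradient of the downstream (prompted) task
# loss, then keep a very small top-scoring subset as candidate supporting data.
import torch
import torch.nn as nn
import torch.nn.functional as F


def flat_grad(loss, params):
    """Flatten d(loss)/d(params) into a single 1-D vector."""
    grads = torch.autograd.grad(loss, params, retain_graph=False)
    return torch.cat([g.reshape(-1) for g in grads])


# Toy stand-in for a pretrained LM: a small classifier over random features.
torch.manual_seed(0)
model = nn.Linear(16, 2)
params = [p for p in model.parameters() if p.requires_grad]

# Hypothetical "downstream task" batch (the prompted zero-shot task in the paper).
task_x, task_y = torch.randn(8, 16), torch.randint(0, 2, (8,))
task_grad = flat_grad(F.cross_entropy(model(task_x), task_y), params)

# Hypothetical "pretraining" examples; in ORCA these would be MLM instances.
pretrain = [(torch.randn(1, 16), torch.randint(0, 2, (1,))) for _ in range(100)]

scores = []
for x, y in pretrain:
    ex_grad = flat_grad(F.cross_entropy(model(x), y), params)
    # Higher cosine similarity = this example's gradient aligns more with the task.
    scores.append(F.cosine_similarity(ex_grad, task_grad, dim=0).item())

# Keep a very small subset: the candidate "supporting data evidence".
top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:5]
print("indices of candidate supporting examples:", top_k)
```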
