Paper Title

How does the pre-training objective affect what large language models learn about linguistic properties?

Paper Authors

Ahmed Alajrami, Nikolaos Aletras

Paper Abstract

Several pre-training objectives, such as masked language modeling (MLM), have been proposed to pre-train language models (e.g. BERT) with the aim of learning better language representations. However, to the best of our knowledge, no previous work so far has investigated how different pre-training objectives affect what BERT learns about linguistic properties. We hypothesize that linguistically motivated objectives such as MLM should help BERT acquire better linguistic knowledge than non-linguistically motivated objectives, for which the association between the input and the label to be predicted is not intuitive or is hard for humans to guess. To this end, we pre-train BERT with two linguistically motivated objectives and three non-linguistically motivated ones. We then probe for linguistic characteristics encoded in the representations of the resulting models. We find strong evidence that there are only small differences in probing performance between the representations learned by the two types of objectives. These surprising results question the dominant narrative of linguistically informed pre-training.
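To make the probing setup described in the abstract concrete, here is a minimal sketch (not the authors' code) of probing frozen encoder representations for a linguistic property with a simple linear classifier. The model name (bert-base-uncased), the toy tense-labelled sentences, and the mean-pooling choice are illustrative assumptions, not details from the paper.

```python
# Minimal probing sketch: train a linear probe on frozen encoder
# representations to predict a linguistic property. Assumptions not
# taken from the paper: bert-base-uncased as the encoder, mean pooling,
# and a toy past-vs-present tense dataset.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def sentence_embedding(sentence: str) -> torch.Tensor:
    """Mean-pool the last-layer token representations of the frozen encoder."""
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state.mean(dim=1).squeeze(0)

# Hypothetical probing data: sentences labelled 1 (past) / 0 (present).
train = [("She walked home.", 1), ("She walks home.", 0),
         ("They played chess.", 1), ("They play chess.", 0)]
test = [("He cooked dinner.", 1), ("He cooks dinner.", 0)]

X_train = torch.stack([sentence_embedding(s) for s, _ in train]).numpy()
y_train = [y for _, y in train]
X_test = torch.stack([sentence_embedding(s) for s, _ in test]).numpy()
y_test = [y for _, y in test]

# The probe itself: a simple linear classifier over frozen representations.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probing accuracy:", accuracy_score(y_test, probe.predict(X_test)))
```

In the paper, probes of this kind are trained on representations from each differently pre-trained BERT variant and compared across the linguistically and non-linguistically motivated objectives; the snippet above only illustrates the general probing recipe.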
