Paper Title
On the Importance of Data Size in Probing Fine-tuned Models
Paper Authors
Abstract
Several studies have investigated the reasons behind the effectiveness of fine-tuning, usually through the lens of probing. However, these studies often neglect the role of the size of the dataset on which the model is fine-tuned. In this paper, we highlight the importance of this factor and its undeniable role in probing performance. We show that the extent of encoded linguistic knowledge depends on the number of fine-tuning samples. The analysis also reveals that larger training data mainly affects higher layers, and that the extent of this change depends on the number of update iterations performed during fine-tuning rather than on the diversity of the training samples. Finally, we show through a set of experiments that fine-tuning data size affects the recoverability of the changes made to the model's linguistic knowledge.