Paper Title

Linear Interpolation In Parameter Space is Good Enough for Fine-Tuned Language Models

Paper Authors

Mark Rofin, Nikita Balagansky, Daniil Gavrilov

Abstract

The simplest way to obtain continuous interpolation between two points in high dimensional space is to draw a line between them. While previous works focused on the general connectivity between model parameters, we explored linear interpolation for parameters of pre-trained models after fine-tuning. Surprisingly, we could perform linear interpolation without a performance drop in intermediate points for fine-tuned models. For controllable text generation, such interpolation could be seen as moving a model towards or against the desired text attribute (e.g., positive sentiment), which could be used as grounds for further methods for controllable text generation without inference speed overhead.
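To make the idea concrete, below is a minimal sketch (not from the paper) of linear interpolation between a pre-trained checkpoint and a fine-tuned checkpoint in parameter space. The model name, checkpoint path, and alpha grid are illustrative assumptions; the core operation is simply theta_pre + alpha * (theta_ft - theta_pre) applied per parameter tensor.

```python
# Minimal sketch, assuming a GPT-2 backbone and a locally fine-tuned checkpoint.
import torch
from transformers import GPT2LMHeadModel

pretrained = GPT2LMHeadModel.from_pretrained("gpt2")
finetuned = GPT2LMHeadModel.from_pretrained("path/to/finetuned-gpt2")  # hypothetical path

def interpolate(theta_pre, theta_ft, alpha):
    """Return theta_pre + alpha * (theta_ft - theta_pre) for every parameter.

    alpha = 0 recovers the pre-trained model, alpha = 1 the fine-tuned one;
    values outside [0, 1] move the model further towards or against the
    attribute the fine-tuning targeted (e.g., positive sentiment).
    """
    return {
        name: p_pre + alpha * (theta_ft[name] - p_pre)
        for name, p_pre in theta_pre.items()
    }

theta_pre = pretrained.state_dict()
theta_ft = finetuned.state_dict()

model = GPT2LMHeadModel.from_pretrained("gpt2")
for alpha in (0.0, 0.5, 1.0):
    model.load_state_dict(interpolate(theta_pre, theta_ft, alpha))
    # evaluate perplexity or attribute score of `model` at this alpha here
```

Because the interpolated weights are loaded once, generation with any intermediate model runs at the same speed as the original, which is the "no inference overhead" point made in the abstract.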
