Paper Title

Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation

Paper Authors

Tu Vu, Aditya Barua, Brian Lester, Daniel Cer, Mohit Iyyer, Noah Constant

Paper Abstract

In this paper, we explore the challenging problem of performing a generative task in a target language when labeled data is only available in English, using summarization as a case study. We assume a strict setting with no access to parallel data or machine translation and find that common transfer learning approaches struggle in this setting, as a generative multilingual model fine-tuned purely on English catastrophically forgets how to generate non-English. Given the recent rise of parameter-efficient adaptation techniques, we conduct the first investigation into how one such method, prompt tuning (Lester et al., 2021), can overcome catastrophic forgetting to enable zero-shot cross-lingual generation. Our experiments show that parameter-efficient prompt tuning provides gains over standard fine-tuning when transferring between less-related languages, e.g., from English to Thai. However, a significant gap still remains between these methods and fully-supervised baselines. To improve cross-lingual transfer further, we explore several approaches, including: (1) mixing in unlabeled multilingual data, and (2) explicitly factoring prompts into recombinable language and task components. Our approaches can provide further quality gains, suggesting that robust zero-shot cross-lingual generation is within reach.
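The abstract refers to prompt tuning (Lester et al., 2021) and to factoring prompts into recombinable language and task components. The sketch below is a minimal, hypothetical illustration of that idea in PyTorch: the backbone model is frozen and only a soft prompt, split into a language part and a task part, receives gradient updates, which is why English-only training cannot overwrite the model's multilingual generation abilities. The tiny stand-in Transformer, module names, prompt lengths, and learning rate are all illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of parameter-efficient prompt tuning with a factored
# (language + task) soft prompt, in the spirit of Lester et al. (2021).
# All sizes and modules are illustrative, not the paper's actual setup.
import torch
import torch.nn as nn

class FactoredSoftPrompt(nn.Module):
    """Trainable prompt = [language embeddings ; task embeddings], prepended to the input."""
    def __init__(self, d_model: int, lang_len: int = 50, task_len: int = 50):
        super().__init__()
        self.lang_prompt = nn.Parameter(torch.randn(lang_len, d_model) * 0.02)
        self.task_prompt = nn.Parameter(torch.randn(task_len, d_model) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, d_model)
        batch = input_embeds.size(0)
        prompt = torch.cat([self.lang_prompt, self.task_prompt], dim=0)
        prompt = prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

d_model, vocab = 64, 1000
embedding = nn.Embedding(vocab, d_model)
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), num_layers=2
)
head = nn.Linear(d_model, vocab)

# Freeze every backbone parameter: only the soft prompt is updated, so the
# pretrained (multilingual) weights are untouched during English-only training.
for module in (embedding, backbone, head):
    for p in module.parameters():
        p.requires_grad = False

prompt = FactoredSoftPrompt(d_model)
optimizer = torch.optim.Adam(prompt.parameters(), lr=0.3)  # illustrative lr

# One dummy training step on random token ids; a real setup would mask the
# loss to target tokens only and use actual English summarization data.
tokens = torch.randint(0, vocab, (2, 16))
targets = torch.randint(0, vocab, (2, 16 + 100))  # prompt positions + sequence
logits = head(backbone(prompt(embedding(tokens))))
loss = nn.functional.cross_entropy(logits.reshape(-1, vocab), targets.reshape(-1))
loss.backward()
optimizer.step()
```

Because the language and task prompts are separate parameters, one could in principle train a task prompt on English data and swap in a language prompt learned from unlabeled data in the target language at inference time; the factoring shown here is only a schematic stand-in for that recombination idea.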
