Paper Title
RARR: Researching and Revising What Language Models Say, Using Language Models
Paper Authors
Paper Abstract
Language models (LMs) now excel at many tasks such as few-shot learning, question answering, reasoning, and dialog. However, they sometimes generate unsupported or misleading content. A user cannot easily determine whether their outputs are trustworthy or not, because most LMs do not have any built-in mechanism for attribution to external evidence. To enable attribution while still preserving all the powerful advantages of recent generation models, we propose RARR (Retrofit Attribution using Research and Revision), a system that 1) automatically finds attribution for the output of any text generation model and 2) post-edits the output to fix unsupported content while preserving the original output as much as possible. When applied to the output of several state-of-the-art LMs on a diverse set of generation tasks, we find that RARR significantly improves attribution while otherwise preserving the original input to a much greater degree than previously explored edit models. Furthermore, the implementation of RARR requires only a handful of training examples, a large language model, and standard web search.
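The abstract describes a two-stage pipeline: a research stage that retrieves evidence for a model's output, and a revision stage that post-edits unsupported content while keeping the rest of the text intact. The sketch below is a minimal illustration of that loop under stated assumptions, not the paper's released implementation; `call_llm` and `web_search` are hypothetical stand-ins for a few-shot-prompted large language model and a standard web search API.

```python
# Minimal sketch of a research-and-revise loop in the spirit of RARR.
# `call_llm` and `web_search` are hypothetical placeholders, not real APIs.

from dataclasses import dataclass


def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around a few-shot-prompted large language model."""
    raise NotImplementedError


def web_search(query: str, k: int = 1) -> list[str]:
    """Hypothetical wrapper around a standard web search engine."""
    raise NotImplementedError


@dataclass
class RevisionResult:
    revised_text: str        # post-edited output with unsupported content fixed
    attribution: list[str]   # evidence snippets the revised text can be attributed to


def rarr(passage: str) -> RevisionResult:
    # Research stage: have the LM pose verification questions about the passage,
    # then retrieve evidence for each question from the web.
    queries = call_llm(
        f"List the factual questions one should verify in:\n{passage}"
    ).splitlines()
    evidence = [snippet for q in queries for snippet in web_search(q, k=1)]

    # Revision stage: check whether each piece of evidence supports the passage;
    # where it does not, minimally edit the passage to fix the unsupported content.
    revised = passage
    for snippet in evidence:
        agrees = call_llm(
            "Does the evidence support the passage? Answer yes or no.\n"
            f"Evidence: {snippet}\nPassage: {revised}"
        ).strip().lower().startswith("yes")
        if not agrees:
            revised = call_llm(
                "Minimally edit the passage so it is supported by the evidence, "
                "preserving everything else.\n"
                f"Evidence: {snippet}\nPassage: {revised}"
            )
    return RevisionResult(revised_text=revised, attribution=evidence)
```

The design choice worth noting is that edits are applied one evidence snippet at a time and constrained to be minimal, which is how the abstract's goal of "preserving the original output as much as possible" would be reflected in such a loop.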