Paper Title
RARR: Researching and Revising What Language Models Say, Using Language Models
Paper Authors
Paper Abstract
Language models (LMs) now excel at many tasks such as few-shot learning, question answering, reasoning, and dialog. However, they sometimes generate unsupported or misleading content. A user cannot easily determine whether their outputs are trustworthy or not, because most LMs do not have any built-in mechanism for attribution to external evidence. To enable attribution while still preserving all the powerful advantages of recent generation models, we propose RARR (Retrofit Attribution using Research and Revision), a system that 1) automatically finds attribution for the output of any text generation model and 2) post-edits the output to fix unsupported content while preserving the original output as much as possible. When applied to the output of several state-of-the-art LMs on a diverse set of generation tasks, we find that RARR significantly improves attribution while otherwise preserving the original input to a much greater degree than previously explored edit models. Furthermore, the implementation of RARR requires only a handful of training examples, a large language model, and standard web search.
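The abstract describes a two-stage pipeline: a research stage that retrieves evidence for a model's output, and a revision stage that post-edits unsupported content while keeping the rest of the text intact. The sketch below is a minimal illustration of that loop under stated assumptions, not the paper's released implementation; `call_llm` and `web_search` are hypothetical stand-ins for a few-shot-prompted large language model and a standard web search API.

```python
# Minimal sketch of a research-and-revise loop in the spirit of RARR.
# `call_llm` and `web_search` are hypothetical placeholders, not real APIs.

from dataclasses import dataclass


def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around a few-shot-prompted large language model."""
    raise NotImplementedError


def web_search(query: str, k: int = 1) -> list[str]:
    """Hypothetical wrapper around a standard web search engine."""
    raise NotImplementedError


@dataclass
class RevisionResult:
    revised_text: str        # post-edited output with unsupported content fixed
    attribution: list[str]   # evidence snippets the revised text can be attributed to


def rarr(passage: str) -> RevisionResult:
    # Research stage: have the LM pose verification questions about the passage,
    # then retrieve evidence for each question from the web.
    queries = call_llm(
        f"List the factual questions one should verify in:\n{passage}"
    ).splitlines()
    evidence = [snippet for q in queries for snippet in web_search(q, k=1)]

    # Revision stage: check whether each piece of evidence supports the passage;
    # where it does not, minimally edit the passage to fix the unsupported content.
    revised = passage
    for snippet in evidence:
        agrees = call_llm(
            "Does the evidence support the passage? Answer yes or no.\n"
            f"Evidence: {snippet}\nPassage: {revised}"
        ).strip().lower().startswith("yes")
        if not agrees:
            revised = call_llm(
                "Minimally edit the passage so it is supported by the evidence, "
                "preserving everything else.\n"
                f"Evidence: {snippet}\nPassage: {revised}"
            )
    return RevisionResult(revised_text=revised, attribution=evidence)
```

The design choice worth noting is that edits are applied one evidence snippet at a time and constrained to be minimal, which is how the abstract's goal of "preserving the original output as much as possible" would be reflected in such a loop.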