Paper Title

Enhancing Self-Consistency and Performance of Pre-Trained Language Models through Natural Language Inference

Paper Authors

Eric Mitchell, Joseph J. Noh, Siyan Li, William S. Armstrong, Ananth Agarwal, Patrick Liu, Chelsea Finn, Christopher D. Manning

Paper Abstract

While large pre-trained language models are powerful, their predictions often lack logical consistency across test inputs. For example, a state-of-the-art Macaw question-answering (QA) model answers 'Yes' to 'Is a sparrow a bird?' and 'Does a bird have feet?' but answers 'No' to 'Does a sparrow have feet?'. To address this failure mode, we propose a framework, Consistency Correction through Relation Detection, or ConCoRD, for boosting the consistency and accuracy of pre-trained NLP models using pre-trained natural language inference (NLI) models without fine-tuning or re-training. Given a batch of test inputs, ConCoRD samples several candidate outputs for each input and instantiates a factor graph that accounts for both the model's belief about the likelihood of each answer choice in isolation and the NLI model's beliefs about pair-wise answer choice compatibility. We show that a weighted MaxSAT solver can efficiently compute high-quality answer choices under this factor graph, improving over the raw model's predictions. Our experiments demonstrate that ConCoRD consistently boosts accuracy and consistency of off-the-shelf closed-book QA and VQA models using off-the-shelf NLI models, notably increasing accuracy of LXMERT on ConVQA by 5% absolute. See https://ericmitchell.ai/emnlp-2022-concord/ for code and data.
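The sketch below illustrates, under stated assumptions, the inference objective the abstract describes: pick one candidate answer per question so that the sum of the base model's log-likelihoods (unary factors) and NLI-based pairwise compatibility scores (pairwise factors) is maximized. The candidate answers, log-probability values, and the hard-coded nli_compatibility function are all made up for illustration; a real system would call a pre-trained NLI model on question-answer statements, and the paper solves the resulting problem with a weighted MaxSAT solver rather than the brute-force enumeration used here.

```python
# Toy sketch of a ConCoRD-style factor-graph objective (not the paper's code).
# Unary factors: the base QA model's log-probability for each answer choice.
# Pairwise factors: compatibility scores between pairs of answer statements,
# which a pre-trained NLI model would normally provide.
from itertools import product

# Hypothetical candidate answers per question with made-up log-probs.
candidates = {
    "Is a sparrow a bird?":      [("Yes", -0.1), ("No", -2.5)],
    "Does a bird have feet?":    [("Yes", -0.2), ("No", -1.9)],
    "Does a sparrow have feet?": [("Yes", -1.4), ("No", -0.6)],
}

def nli_compatibility(stmt_a: str, stmt_b: str) -> float:
    """Pairwise factor: penalize contradictory answer statements.

    A real system would run an NLI model on the two statements; here one
    contradiction is hard-coded purely for illustration.
    """
    pair = {stmt_a, stmt_b}
    if pair == {"Does a bird have feet? Yes", "Does a sparrow have feet? No"}:
        return -3.0  # treated as a contradiction in this toy example
    return 0.0       # neutral: no constraint between the statements

def score(assignment):
    """Sum of unary log-probs plus all pairwise compatibility terms."""
    total = sum(logp for _, _, logp in assignment)
    stmts = [f"{q} {a}" for q, a, _ in assignment]
    for i in range(len(stmts)):
        for j in range(i + 1, len(stmts)):
            total += nli_compatibility(stmts[i], stmts[j])
    return total

# Enumerate one answer per question and keep the highest-scoring assignment.
questions = list(candidates)
best = max(
    ([(q, a, lp) for q, (a, lp) in zip(questions, choice)]
     for choice in product(*candidates.values())),
    key=score,
)
for q, a, _ in best:
    print(f"{q} -> {a}")
```

With these toy numbers, the contradiction penalty flips the base model's preferred "No" for "Does a sparrow have feet?" to "Yes", mirroring the consistency correction described in the abstract; the paper's weighted MaxSAT formulation makes the same kind of search efficient over larger batches.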
