Paper Title

GreaseLM: Graph REASoning Enhanced Language Models for Question Answering

Paper Authors

Xikun Zhang, Antoine Bosselut, Michihiro Yasunaga, Hongyu Ren, Percy Liang, Christopher D. Manning, Jure Leskovec

Paper Abstract

Answering complex questions about textual narratives requires reasoning over both stated context and the world knowledge that underlies it. However, pretrained language models (LM), the foundation of most modern QA systems, do not robustly represent latent relationships between concepts, which is necessary for reasoning. While knowledge graphs (KG) are often used to augment LMs with structured representations of world knowledge, it remains an open question how to effectively fuse and reason over the KG representations and the language context, which provides situational constraints and nuances. In this work, we propose GreaseLM, a new model that fuses encoded representations from pretrained LMs and graph neural networks over multiple layers of modality interaction operations. Information from both modalities propagates to the other, allowing language context representations to be grounded by structured world knowledge, and allowing linguistic nuances (e.g., negation, hedging) in the context to inform the graph representations of knowledge. Our results on three benchmarks in the commonsense reasoning (i.e., CommonsenseQA, OpenbookQA) and medical question answering (i.e., MedQA-USMLE) domains demonstrate that GreaseLM can more reliably answer questions that require reasoning over both situational constraints and structured knowledge, even outperforming models 8x larger.
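The abstract describes GreaseLM's core mechanism: at each layer, a transformer block updates the text tokens, a GNN updates the retrieved KG nodes, and a small interaction module exchanges information between the two modalities through a special interaction token (text side) and interaction node (graph side). Below is a minimal PyTorch sketch of one such fusion layer, not the authors' implementation: the class and parameter names (ModalityInteractionLayer, d_text, d_graph) are illustrative, and the single linear graph update stands in for the relation-aware GNN layer used in the paper.

```python
# A minimal sketch (not the authors' code) of one GreaseLM-style fusion layer.
# Assumptions: PyTorch; hidden sizes, layer counts, and all names are illustrative.
import torch
import torch.nn as nn


class ModalityInteractionLayer(nn.Module):
    """One fusion layer: a transformer block updates the text tokens, a GNN
    stand-in updates the KG nodes, then a two-way MLP mixes the interaction
    token (text side) with the interaction node (graph side)."""

    def __init__(self, d_text: int = 768, d_graph: int = 200):
        super().__init__()
        self.text_block = nn.TransformerEncoderLayer(
            d_model=d_text, nhead=12, batch_first=True)
        # Simplified stand-in for the relation-aware GNN layer over the KG.
        self.graph_block = nn.Linear(d_graph, d_graph)
        # Interaction unit: mixes the two modality-interaction representations.
        self.mint = nn.Sequential(
            nn.Linear(d_text + d_graph, d_text + d_graph),
            nn.GELU(),
            nn.Linear(d_text + d_graph, d_text + d_graph),
        )
        self.d_text, self.d_graph = d_text, d_graph

    def forward(self, text: torch.Tensor, nodes: torch.Tensor):
        # text:  (batch, seq_len, d_text); token 0 is the interaction token.
        # nodes: (batch, n_nodes, d_graph); node 0 is the interaction node.
        text = self.text_block(text)
        nodes = torch.relu(self.graph_block(nodes))
        # Concatenate the two interaction vectors, mix them, and split back.
        fused = self.mint(torch.cat([text[:, 0], nodes[:, 0]], dim=-1))
        text = torch.cat(
            [fused[:, : self.d_text].unsqueeze(1), text[:, 1:]], dim=1)
        nodes = torch.cat(
            [fused[:, self.d_text:].unsqueeze(1), nodes[:, 1:]], dim=1)
        return text, nodes


# Toy usage: stacking such layers lets information flow in both directions.
layer = ModalityInteractionLayer()
t = torch.randn(2, 16, 768)  # encoded question tokens (incl. interaction token)
g = torch.randn(2, 8, 200)   # retrieved KG subgraph nodes (incl. interaction node)
t, g = layer(t, g)
```

Stacking several of these layers is what lets structured knowledge ground the language representations while linguistic nuances (e.g., negation, hedging) inform the graph representations, as the abstract describes.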
