StepGame: A New Benchmark for Robust Multi-Hop Spatial Reasoning in Texts

Paper Title

StepGame: A New Benchmark for Robust Multi-Hop Spatial Reasoning in Texts

Authors

Zhengxiang Shi, Qiang Zhang, Aldo Lipani

Abstract

Inferring spatial relations in natural language is a crucial ability that an intelligent system should possess. The bAbI dataset tries to capture tasks relevant to this domain (tasks 17 and 19). However, these tasks have several limitations. Most importantly, they are limited to fixed expressions, they are limited in the number of reasoning steps required to solve them, and they fail to test the robustness of models to input that contains irrelevant or redundant information. In this paper, we present a new Question-Answering dataset called StepGame for robust multi-hop spatial reasoning in texts. Our experiments demonstrate that state-of-the-art models on the bAbI dataset struggle on the StepGame dataset. Moreover, we propose a Tensor-Product based Memory-Augmented Neural Network (TP-MANN) specialized for spatial reasoning tasks. Experimental results on both datasets show that our model outperforms all the baselines with superior generalization and robustness performance.
