A Dataset for Statutory Reasoning in Tax Law Entailment and Question Answering

Paper Title

A Dataset for Statutory Reasoning in Tax Law Entailment and Question Answering

Authors

Nils Holzenberger, Andrew Blair-Stanek, Benjamin Van Durme

Abstract

Legislation can be viewed as a body of prescriptive rules expressed in natural language. The application of legislation to facts of a case we refer to as statutory reasoning, where those facts are also expressed in natural language. Computational statutory reasoning is distinct from most existing work in machine reading, in that much of the information needed for deciding a case is declared exactly once (a law), while the information needed in much of machine reading tends to be learned through distributional language statistics. To investigate the performance of natural language understanding approaches on statutory reasoning, we introduce a dataset, together with a legal-domain text corpus. Straightforward application of machine reading models exhibits low out-of-the-box performance on our questions, whether or not they have been fine-tuned to the legal domain. We contrast this with a hand-constructed Prolog-based system, designed to fully solve the task. These experiments support a discussion of the challenges facing statutory reasoning moving forward, which we argue is an interesting real-world task that can motivate the development of models able to utilize prescriptive rules specified in natural language.
