Law Article-Enhanced Legal Case Matching: a Causal Learning Approach

Paper Title

Law Article-Enhanced Legal Case Matching: a Causal Learning Approach

Authors

Zhongxiang Sun, Jun Xu, Xiao Zhang, Zhenhua Dong, Ji-Rong Wen

Abstract

Legal case matching, which automatically constructs a model to estimate the similarity between source and target cases, plays an essential role in intelligent legal systems. Semantic text matching models have been applied to the task by treating the source and target legal cases as long-form text documents. These general-purpose matching models make predictions based solely on the texts of the legal cases, overlooking the essential role of law articles in legal case matching. In the real world, the matching results (e.g., relevance labels) are strongly affected by the law articles, because the contents and judgments of a legal case are fundamentally formed on the basis of law. In a causal sense, a matching decision is affected by the mediation effect of the law articles cited by the legal cases and by the direct effect of the key circumstances (e.g., detailed fact descriptions) in the legal cases. In light of this observation, this paper proposes a model-agnostic causal learning framework called Law-Match, under which legal case matching models are learned by respecting the corresponding law articles. Given a pair of legal cases and the related law articles, Law-Match considers the embeddings of the law articles as instrumental variables (IVs) and the embeddings of the legal cases as treatments. Using IV regression, the treatments can be decomposed into law-related and law-unrelated parts, which reflect the mediation and direct effects, respectively. These two parts are then combined with different weights to collectively support the final matching prediction. We show that the framework is model-agnostic, and that a number of legal case matching models can be applied as the underlying models. Comprehensive experiments show that Law-Match can outperform state-of-the-art baselines on three public datasets.
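The decomposition step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a plain linear first-stage regression solved in closed form with NumPy, random vectors standing in for real embeddings, and fixed scalar combination weights. The function names `iv_decompose` and `recombine`, the `ridge` term, and the weights 0.7 and 0.3 are all hypothetical choices made for illustration.

```python
import numpy as np


def iv_decompose(case_emb, law_emb, ridge=1e-6):
    """Split case embeddings (treatments) using law-article embeddings (IVs).

    Assumption: a linear first-stage regression case_emb ~ law_emb @ W,
    solved with a tiny ridge term for numerical stability.
    """
    d = law_emb.shape[1]
    gram = law_emb.T @ law_emb + ridge * np.eye(d)
    W = np.linalg.solve(gram, law_emb.T @ case_emb)
    law_related = law_emb @ W                # part explained by the law articles
    law_unrelated = case_emb - law_related   # residual part
    return law_related, law_unrelated


def recombine(law_related, law_unrelated, w_related=0.7, w_unrelated=0.3):
    """Weighted sum of the two parts; the weights here are illustrative constants."""
    return w_related * law_related + w_unrelated * law_unrelated


# Toy data: 32 cases, 8-dim law-article embeddings, 16-dim case embeddings.
rng = np.random.default_rng(0)
law_emb = rng.normal(size=(32, 8))
case_emb = law_emb @ rng.normal(size=(8, 16)) + 0.1 * rng.normal(size=(32, 16))

law_related, law_unrelated = iv_decompose(case_emb, law_emb)
combined = recombine(law_related, law_unrelated)
```

In this sketch the two parts sum back to the original embedding, and the residual is (up to the ridge term) uncorrelated with the law-article embeddings. The recombined representation would then be passed to whichever underlying matching model is in use, which is the sense in which the abstract calls the framework model-agnostic.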
