SelF-Eval: Self-supervised Fine-grained Dialogue Evaluation

Paper Title

SelF-Eval: Self-supervised Fine-grained Dialogue Evaluation

Authors

Longxuan Ma, Ziyu Zhuang, Weinan Zhang, Mingda Li, Ting Liu

Abstract

This paper introduces SelF-Eval, a novel self-supervised fine-grained dialogue evaluation framework. The core idea is to model the correlation between turn quality and the quality of the entire dialogue. We first propose a novel automatic data construction method that assigns fine-grained scores to arbitrary dialogue data. We then train SelF-Eval with a multi-level contrastive learning schema, which helps it distinguish between different score levels. Experimental results on multiple benchmarks show that SelF-Eval is highly consistent with human evaluations and outperforms state-of-the-art models. We also give a detailed analysis of the experiments. Our code is available on GitHub.
