Paper Title

WiC-TSV: An Evaluation Benchmark for Target Sense Verification of Words in Context

Authors

Anna Breit, Artem Revenko, Kiamehr Rezaee, Mohammad Taher Pilehvar, Jose Camacho-Collados

Abstract

We present WiC-TSV, a new multi-domain evaluation benchmark for Word Sense Disambiguation. More specifically, we introduce a framework for Target Sense Verification of Words in Context. Its uniqueness lies in two properties: the task is formulated as binary classification, which makes it independent of external sense inventories, and it covers various domains. This makes the dataset highly flexible for evaluating a diverse set of models and systems both within and across domains. WiC-TSV provides three different evaluation settings, depending on the input signals provided to the model. We set baseline performance on the dataset using state-of-the-art language models. Experimental results show that even though these models can perform decently on the task, there remains a gap between machine and human performance, especially in out-of-domain settings. WiC-TSV data is available at https://competitions.codalab.org/competitions/23683
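The binary formulation described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `TSVInstance` field names, the two toy examples, and the word-overlap heuristic are hypothetical and do not reflect the actual dataset schema or the paper's baselines. The point is only that each instance pairs a word in context with a textual description of one target sense and asks for a yes/no answer, so no external sense inventory is needed.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical instance layout; the real WiC-TSV files may use different fields.
@dataclass
class TSVInstance:
    context: str            # sentence containing the target word
    target: str             # the word whose sense is being verified
    sense_description: str  # textual description of the candidate sense
    label: bool             # True if the word is used in that sense

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "of", "and", "on", "in", "to", "at", "he", "she"}

def overlap_baseline(inst: TSVInstance) -> bool:
    """Toy heuristic: predict True if context and description share a content word."""
    ctx = set(inst.context.lower().split()) - STOPWORDS - {inst.target.lower()}
    desc = set(inst.sense_description.lower().split()) - STOPWORDS
    return len(ctx & desc) > 0

def accuracy(predict: Callable[[TSVInstance], bool],
             instances: List[TSVInstance]) -> float:
    """Fraction of instances whose predicted label matches the gold label."""
    if not instances:
        return 0.0
    correct = sum(predict(inst) == inst.label for inst in instances)
    return correct / len(instances)

# Made-up examples, not taken from the dataset.
examples = [
    TSVInstance("he sat on the bank of the river and watched the water",
                "bank", "sloping land beside a body of water", True),
    TSVInstance("she deposited the check at the bank",
                "bank", "sloping land beside a body of water", False),
]

print(accuracy(overlap_baseline, examples))
```

A real system would replace `overlap_baseline` with a trained classifier, for example one built on a pretrained language model, while the evaluation loop stays the same because the output is always a single boolean per instance.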
