Paper Title

Embeddings of Label Components for Sequence Labeling: A Case Study of Fine-grained Named Entity Recognition

Authors

Takuma Kato, Kaori Abe, Hiroki Ouchi, Shumpei Miyawaki, Jun Suzuki, Kentaro Inui

Abstract

In general, the labels used in sequence labeling consist of different types of elements. For example, IOB-format entity labels, such as B-Person and I-Person, can be decomposed into span (B and I) and type information (Person). However, while most sequence labeling models do not consider such label components, the shared components across labels, such as Person, can be beneficial for label prediction. In this work, we propose to integrate label component information as embeddings into models. Through experiments on English and Japanese fine-grained named entity recognition, we demonstrate that the proposed method improves performance, especially for instances with low-frequency labels.
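
The decomposition described in the abstract can be illustrated with a minimal Python sketch. It splits IOB labels into span and type components and builds each label's embedding from its components. The function names (`decompose`, `build_component_table`, `label_embedding`), the random initialization, and the choice to combine components by summation are assumptions made for illustration; the abstract does not specify how the authors compose or train the embeddings.

```python
import random


def decompose(label):
    """Split an IOB label such as 'B-Person' into (span, type).

    The outside label 'O' has no type component, so its type is None.
    """
    if label == "O":
        return ("O", None)
    span, _, entity_type = label.partition("-")
    return (span, entity_type)


def build_component_table(labels, dim, seed=0):
    """Create one vector per distinct component (hypothetical random init)."""
    rng = random.Random(seed)
    table = {}
    for label in labels:
        span, entity_type = decompose(label)
        keys = [("span", span)]
        if entity_type is not None:
            keys.append(("type", entity_type))
        for key in keys:
            if key not in table:
                table[key] = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    return table


def label_embedding(label, table):
    """Compose a label embedding by summing its component vectors.

    Summation is an assumption of this sketch, not a detail from the abstract.
    """
    span, entity_type = decompose(label)
    vec = list(table[("span", span)])
    if entity_type is not None:
        vec = [v + w for v, w in zip(vec, table[("type", entity_type)])]
    return vec


labels = ["O", "B-Person", "I-Person", "B-Location"]
table = build_component_table(labels, dim=4)
b_person = label_embedding("B-Person", table)
i_person = label_embedding("I-Person", table)
```

In this sketch, B-Person and I-Person share the same `("type", "Person")` vector, so whatever a model learns about Person from one label carries over to the other. This shared-parameter effect is one plausible reading of why the abstract reports gains for low-frequency labels.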
