Persian Ezafe Recognition Using Transformers and Its Role in Part-of-Speech Tagging

Paper Title

Persian Ezafe Recognition Using Transformers and Its Role in Part-Of-Speech Tagging

Paper Authors

Doostmohammadi, Ehsan; Nassajian, Minoo; Rahimi, Adel

Paper Abstract

Ezafe is a grammatical particle in some Iranian languages that links two words together. Despite the important information it conveys, it is almost never indicated in Persian script, which leads to mistakes in reading complex sentences and to errors in natural language processing tasks. In this paper, we experiment with different machine learning methods to achieve state-of-the-art results in the task of ezafe recognition. The transformer-based methods, BERT and XLMRoBERTa, achieve the best results, with the latter exceeding the previous state of the art by 2.68% in F1-score. We also use ezafe information to improve Persian part-of-speech tagging results, show that such information is not useful to transformer-based methods, and explain why that might be the case.

Paper Title