Paper Title

Transformers on Multilingual Clause-Level Morphology

Paper Authors

Acikgoz, Emre Can, Chubakov, Tilek, Kural, Müge, Şahin, Gözde Gül, Yuret, Deniz

Paper Abstract

This paper describes our winning systems for MRL: The 1st Shared Task on Multilingual Clause-level Morphology (EMNLP 2022 Workshop), designed by the KUIS AI NLP team. We present our work on all three parts of the shared task: inflection, reinflection, and analysis. We mainly explore transformers with two approaches: (i) training models from scratch in combination with data augmentation, and (ii) transfer learning via prefix-tuning on multilingual morphological tasks. Data augmentation significantly improves performance for most languages in the inflection and reinflection tasks. On the other hand, prefix-tuning a pre-trained mGPT model helps us adapt to the analysis task in low-data and multilingual settings. While transformer architectures with data augmentation achieved the most promising results for the inflection and reinflection tasks, prefix-tuning on mGPT achieved the highest results for the analysis task. Our systems took first place in all three tasks at MRL 2022.
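As a concrete illustration of approach (ii), the sketch below prefix-tunes a multilingual GPT for the analysis task, cast as conditional generation. The paper does not specify its implementation, so everything here is an assumption made for illustration: the Hugging Face `transformers` and `peft` libraries, the `ai-forever/mGPT` checkpoint, the number of virtual tokens, and the input/output serialization.

```python
# Hedged sketch of prefix-tuning mGPT (approach ii). Library choices, the
# checkpoint name, and hyperparameters are illustrative assumptions, not
# the authors' exact setup.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PrefixTuningConfig, TaskType, get_peft_model

model_name = "ai-forever/mGPT"  # a public multilingual GPT checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Prefix-tuning freezes the backbone and learns only a small set of
# continuous "virtual token" prefixes prepended to each attention layer,
# which is why it suits low-data, multilingual adaptation.
config = PrefixTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=20,  # illustrative value
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the prefix parameters train

# Analysis as conditional generation: inflected clause in, lemma plus
# feature tags out. The serialization format here is hypothetical.
example = "analyze: sevmedim -> sev V;NEG;PST;1;SG"
batch = tokenizer(example, return_tensors="pt")
loss = model(**batch, labels=batch["input_ids"]).loss  # standard LM loss
```

The design point is that only the virtual-token prefixes are updated while the multilingual backbone stays frozen, which is what makes the method viable when task data is scarce.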
