LinCE: A Centralized Benchmark for Linguistic Code-switching Evaluation

Paper title

LinCE: A Centralized Benchmark for Linguistic Code-switching Evaluation

Authors

Gustavo Aguilar, Sudipta Kar, Thamar Solorio

Abstract

Recent trends in NLP research have raised an interest in linguistic code-switching (CS); modern approaches have been proposed to solve a wide range of NLP tasks on multiple language pairs. Unfortunately, these proposed methods are hardly generalizable to different code-switched languages. In addition, it is unclear whether a model architecture is applicable for a different task while still being compatible with the code-switching setting. This is mainly because of the lack of a centralized benchmark and the sparse corpora that researchers employ based on their specific needs and interests. To facilitate research in this direction, we propose a centralized benchmark for Linguistic Code-switching Evaluation (LinCE) that combines ten corpora covering four different code-switched language pairs (i.e., Spanish-English, Nepali-English, Hindi-English, and Modern Standard Arabic-Egyptian Arabic) and four tasks (i.e., language identification, named entity recognition, part-of-speech tagging, and sentiment analysis). As part of the benchmark centralization effort, we provide an online platform at ritual.uh.edu/lince, where researchers can submit their results while comparing with others in real-time. In addition, we provide the scores of different popular models, including LSTM, ELMo, and multilingual BERT so that the NLP community can compare against state-of-the-art systems. LinCE is a continuous effort, and we will expand it with more low-resource languages and tasks.

